The languages we speak of fall into two categories: natural languages and artificial languages. A natural language is one formed in the course of human development; it is a medium for passing information between people. An artificial language is one deliberately designed for some purpose. A computer language is a kind of artificial language: one used to pass information between people and computers.
Exchanging information between people and computers requires a computer language. Every action a computer takes, step by step, is in fact the execution of a program written in a computer language. A program is a set of instructions for the computer to execute, and programs are written in a language that we have mastered. To control a computer and use it to solve problems, people must issue commands to it through a computer language. We call the process of writing programs programming, and computer languages are therefore also called programming languages.
Computer languages can be used to direct computers to solve practical problems. These may be numerical problems, whose objects of operation are strings of numeric symbols, or non-numerical problems such as sound and image processing, whose objects of operation are sounds and images. It is important to know that no computer language is universal: each has its own characteristics, strengths, and operating environment, as well as its own fields of application and objects of operation.
Soon after the electronic computer was introduced, people began to consider its use for non-numerical tasks, and machine translation was chosen as the first such task. This choice opened up an extremely broad field of non-numerical computer applications: many linguistic theories and methods, and many technical achievements, arose on its basis or under its inspiration. For example, problems such as input and output devices, large-capacity storage, speech recognition, and character recognition were all raised during the initial stage of machine translation. However, because machine translation is a relatively advanced form of artificial intelligence, it has not yet been put to practical or widespread use, while other areas of computational linguistics have developed greatly. Computer-based information retrieval was realized in the late 1960s, and intercontinental retrieval is now available via satellite. With computers, linguistic statistics have become routine. On the basis of statistical analysis, a large number of forward-order dictionaries, reverse dictionaries, and frequency dictionaries have been compiled, and various corpora have been established, which in turn has spurred the growth of computational methods in language study. At the same time, a large number of indexes and word indexes (concordances) have been prepared. The problem of processing large character sets has been solved, which makes information processing convenient for Chinese and other East Asian languages. Computer-assisted teaching is maturing and becoming widespread. The foundations of natural language understanding, an important branch of artificial intelligence, have been laid, and automatic character recognition, speech recognition, and speech synthesis are flourishing. Computers are also being applied more and more in experimental phonetics, dialectology, grammatical analysis, and lexicography.
Computational linguistics has made such considerable progress because society needs it. The world today is in the era of a new technological revolution, and modern language information processing systems based on electronic computers are taking shape around the world, which marks the arrival of a highly developed information society. Computational linguistics was born and has developed in order to take on this historical mission.
As computational linguistics has developed to the present day, its work can be summarized, according to its nature and complexity, under the following three headings:
1 Automatic arrangement: This is the work computers do best and is the most mature part of language processing. Counting, classifying, sorting, and editing word lists, indexes, and dictionaries, and building corpora and terminology databases, are all widely practiced. Because these technologies are quite mature, ready-made software packages are available for them.
2 Automatic analysis: This is a more complex kind of automatic language processing. An automatic analysis system works from specific linguistic information stored in the computer, and its purpose is to reach a predetermined conclusion, for example when the computer is used to check a dictionary or test a grammar. If the conclusion is incorrect, this shows that the dictionary or grammar is not yet complete, and that the original data or rules need to be revised or supplemented. Such systems are generally still at the stage of experimental research.
3 Automatic research: This is an even more complex kind of automatic language processing. An automatic research system works from general linguistic information stored in the computer, using means such as statistics, comparison, and classification. Some natural language understanding systems in artificial intelligence research are working toward this goal, but no fairly mature results have yet been achieved.
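The first category above, automatic arrangement, can be illustrated with a minimal Python sketch. It is not taken from any particular software package; the function names (tokenize, frequency_dictionary, reverse_dictionary, word_index) and the sample sentence are assumptions made for illustration, and the whitespace tokenizer is far simpler than what a real system would use.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in words if w]


def frequency_dictionary(tokens):
    """Word list ordered by descending frequency, ties broken alphabetically."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def reverse_dictionary(tokens):
    """Distinct words sorted by their spelling read from right to left."""
    return sorted(set(tokens), key=lambda w: w[::-1])


def word_index(tokens):
    """Map each word to the token positions at which it occurs."""
    index = defaultdict(list)
    for position, word in enumerate(tokens):
        index[word].append(position)
    return dict(index)


# Illustrative sample text (hypothetical).
sample = "The cat saw the dog. The dog saw the bird."
tokens = tokenize(sample)

print(frequency_dictionary(tokens))  # most frequent words first
print(reverse_dictionary(tokens))    # words grouped by their endings
print(word_index(tokens)["dog"])     # positions of "dog" in the token list
```

Sorting on the reversed spelling is what groups words with the same ending together, which is the usual purpose of a reverse dictionary.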
There are many kinds of computer languages; in general, they can be divided into machine languages, assembly languages, and high-level languages.
Computational linguistics can be described as the union of computing and linguistics. This union has borne rich fruit: besides the applications mentioned above, it has also influenced linguistic theory and method. The definition of language has been broadened: language is not only an important tool of communication among humans but also a means of communication between humans and machines. To meet the requirements of computer processing, the most prominent feature of computational linguistics is its demand that language be formalized, because only what has been formalized can be turned into algorithms and automated. In response to this demand, a series of automatic analysis methods for language information processing has been developed, including predictive analysis, dependency analysis, intermediary constituents, preference semantics, augmented transition networks, and conceptual dependency. These methods have been applied in machine translation and natural language understanding systems and have proved effective. The formalization of language is layered. Formalizing grammar is relatively simple, and a great deal of work has been done on it; formalizing semantics is a complex problem on which comparatively little work has been done. The better the problem of semantic formalization is solved, the more effective automatic language processing will become. Continuing to seek effective formal methods of structural and semantic analysis, studying the relationship between them, and examining their limitations in different systems are therefore key research topics in computational linguistics.
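To make the link between formalization and algorithms concrete, here is a minimal Python sketch of a recursive-descent parser with backtracking over a tiny context-free grammar. It is a deliberately simpler technique than the methods named above, such as augmented transition networks, and the grammar, the lexicon, and the function names (parse, match_sequence, analyze) are all invented for illustration.

```python
# Toy grammar and lexicon; both are hypothetical and cover only a few words.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {
    "the": "Det", "a": "Det",
    "dog": "N", "cat": "N",
    "sees": "V", "sleeps": "V",
}


def parse(symbol, words, start):
    """Yield (tree, next_position) for each way `symbol` matches from `start`."""
    if symbol not in GRAMMAR:
        # A word category: it matches exactly one word of that category.
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for rule in GRAMMAR[symbol]:
        for children, end in match_sequence(rule, words, start):
            yield (symbol, *children), end


def match_sequence(symbols, words, start):
    """Yield (subtrees, next_position) for a sequence of grammar symbols."""
    if not symbols:
        yield [], start
        return
    for tree, middle in parse(symbols[0], words, start):
        for rest, end in match_sequence(symbols[1:], words, middle):
            yield [tree] + rest, end


def analyze(sentence):
    """Return the first parse tree covering the whole sentence, else None."""
    words = sentence.lower().split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None


print(analyze("the dog sees a cat"))
print(analyze("dog the sees"))  # rejected by the grammar
```

Because the grammar is stated as explicit rules, accepting or rejecting a sentence becomes a mechanical search; nothing comparable is possible for a part of the language that has not yet been formalized.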
The fifth-generation computer requires that people give it hearing (recognition of spoken language) and stronger vision (automatic character recognition), the ability to speak (speech synthesis) and to take dictation (voice typing), and also the ability to understand natural language and to translate one natural language (or several) into another (or several others). Computational linguists therefore need to supply data such as physical parameters and language probabilities, along with various kinds of application software, so that together with other experts and engineers they can add "wings" to the computer and make it a truly "universal intelligent machine".
Accomplishing these tasks depends on the efforts and cooperation of the entire linguistics community. Although machine-oriented language research has its own particular character and must start afresh in many respects, practice has shown that the basic research of traditional linguistics is highly relevant to solving some of the new tasks. Traditional English-Chinese contrastive linguistics, for example, greatly facilitates English-Chinese machine translation. In this sense, computational linguistics can develop rapidly only by absorbing and transforming the results of traditional linguistics.
It is worth mentioning that machine translation is an important branch of artificial intelligence and was its first field of application. Judging by the machine translation systems that exist, however, translation quality is still far from the ultimate goal, and translation quality is the key to the success or failure of a machine translation system. The Chinese mathematician Professor Zhou Haizhong pointed out, in a paper reviewing fifty years of machine translation, that to improve translation quality the first problem to solve is that of language itself, not that of program design, and that relying on programs alone to build a machine translation system will certainly not improve translation quality. In addition, as long as it remains unclear how the human brain performs fuzzy recognition and logical judgment of language, machine translation cannot attain "faithfulness, expressiveness, and elegance" (xin, da, ya).
Research in computational linguistics and natural language information processing covers automatic language understanding (Language Understanding) and automatic language generation (Language Generation). The former identifies the syntactic structure from the surface form of a sentence, determines the semantic relations between its constituents, and finally arrives at the meaning the sentence expresses. The latter starts from the meaning to be expressed, selects words, builds the semantic and syntactic structure among the constituents according to the semantic relations between the words, and finally produces a sentence that conforms to grammar and logic.
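The two directions can be sketched as a pair of inverse functions. The following Python toy is only an illustration under strong assumptions: it handles a single sentence pattern (determiner, noun, verb, determiner, noun), and the vocabulary, the frame keys (agent, action, patient), and the function names understand and generate are hypothetical.

```python
# Hypothetical miniature vocabulary: surface verb form -> base form.
VERBS = {"chases": "chase", "sees": "see"}
NOUNS = {"dog", "cat", "bird"}
DETERMINERS = {"the", "a"}


def understand(sentence):
    """Map a 'Det N V Det N' sentence to a meaning frame, or None if it does not fit."""
    words = sentence.lower().rstrip(".").split()
    if len(words) != 5:
        return None
    det1, subject, verb, det2, obj = words
    if (det1 in DETERMINERS and subject in NOUNS and verb in VERBS
            and det2 in DETERMINERS and obj in NOUNS):
        # Surface order decides the roles: first noun is agent, second is patient.
        return {"agent": subject, "action": VERBS[verb], "patient": obj}
    return None


def generate(frame):
    """Realize a meaning frame as a sentence, always using the determiner 'the'."""
    surface_form = {base: form for form, base in VERBS.items()}
    words = ["the", frame["agent"], surface_form[frame["action"]],
             "the", frame["patient"]]
    return " ".join(words).capitalize() + "."


meaning = understand("The dog chases the cat.")
print(meaning)
print(generate(meaning))
```

Real systems must cope with ambiguity, varied sentence patterns, and word choice, none of which this sketch attempts; it only shows understanding moving from surface form to meaning and generation moving back.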
Like other disciplines, computational linguistics has two levels: scientific research and technical research. The aim of scientific research is to discover the inherent laws of language, to explore computational methods for language understanding and generation, and to build the basic resources for language information processing. Technical research is driven by application goals and designs and develops practical language information processing systems according to the actual needs of society.
The application goal of natural language information processing is to enable people and computers to communicate in natural language. Specifically, it is to build computer application systems of all kinds that handle natural language, such as machine translation, natural language understanding, automatic speech recognition and synthesis, automatic character recognition, computer-aided teaching, information retrieval, automatic text classification, automatic abstracting, information extraction from text, intelligent search on the Internet, and various electronic dictionaries and terminology databases.
With the spread of the Internet, society's demand for language information processing keeps growing, and there is an urgent need for automated means of handling massive amounts of language information. However, because of the limits of theoretical development in the discipline and the complexity of the Chinese language itself, research on computational linguistic theory and methods in China cannot yet provide sufficient support for the development of Chinese information processing applications. A long-standing characteristic of computational linguistics and natural language processing in China is that applied research and practical system development have clearer goals and relatively more investment, and have achieved some results, while research on basic theory and methods is comparatively weak. This remained the trend of research and development during 1998-2002. Among the application goals described above, research effort has been concentrated on text information retrieval, automatic document classification, automatic abstracting, automatic speech recognition, machine translation, and text information extraction and filtering. In addition, the construction of language resources and corpus-based methods of language analysis have received particular attention and have made relatively rapid progress.