
Disambiguation in machine translation

Word-sense disambiguation concerns finding a suitable translation when a word can have more than one meaning. The problem was first raised in the 1950s by Yehoshua Bar-Hillel.[1] He pointed out that without a “universal encyclopedia”, a machine would never be able to distinguish between the two meanings of a word.[2] Today there are numerous approaches designed to overcome this problem. They can be broadly divided into “shallow” approaches and “deep” approaches.

Shallow approaches assume no knowledge of the text. They simply apply statistical methods to the words surrounding the ambiguous word. Deep approaches presume a comprehensive knowledge of the word. So far, shallow approaches have been more successful.

The late Claude Piron, a long-time translator for the United Nations and the World Health Organization, wrote that machine translation, at its best, automates the easier part of a translator’s job; the harder and more time-consuming part usually involves doing extensive research to resolve ambiguities in the source text, which the grammatical and lexical exigencies of the target language require to be resolved:

Why does a translator need a whole workday to translate five pages, and not an hour or two? […] About 90% of an average text corresponds to these simple conditions. But unfortunately, there’s the other 10%. It’s that part that requires six [more] hours of work. There are the ambiguities one has to resolve. For instance, the author of the source text, an Australian physician, cited the example of an epidemic which was declared during World War II in a “Japanese prisoner of war camp”. Was he talking about an American camp with Japanese prisoners or a Japanese camp with American prisoners? The English has two senses. It’s necessary therefore to do research, maybe to the extent of a phone call to Australia.[3]

The ideal deep approach would require the translation software to do all the research necessary for this kind of disambiguation on its own, but this would require a higher degree of AI than has yet been attained. A shallow approach that simply guessed at the sense of the ambiguous English phrase that Piron mentions (based, perhaps, on which kind of prisoner-of-war camp is more often mentioned in a given corpus) would guess wrong fairly often. A shallow approach that instead asks the user about each ambiguity would, by Piron’s estimate, automate only about 25% of a professional translator’s job (roughly two hours, taking a workday as about eight hours), leaving the harder 75% (the six hours of ambiguity resolution) still to be done by a human.
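The two shallow strategies just described can be combined in a small sketch: guess the reading seen most often in a corpus, and hand the decision to a human when that reading is not dominant enough. The counts, the threshold `min_share` and the function name `choose_reading` are hypothetical and chosen only for illustration; they are not taken from Piron or from any real corpus.

```python
from collections import Counter


def choose_reading(observed_readings, min_share=0.9):
    """Return (most frequent reading, its share, whether to ask a human)."""
    if not observed_readings:
        raise ValueError("need at least one observed reading")
    counts = Counter(observed_readings)
    reading, frequency = counts.most_common(1)[0]
    share = frequency / len(observed_readings)
    # Below the threshold the guess is too risky, so defer to the user.
    return reading, share, share < min_share


# Made-up counts of how an ambiguous phrase was resolved in earlier texts.
observed = (["camp run by Japan"] * 7
            + ["camp holding Japanese prisoners"] * 3)

reading, share, ask_user = choose_reading(observed)
print(reading, round(share, 2), ask_user)
```

With these invented counts the majority reading covers only 70% of past cases, so always guessing it would be wrong about three times in ten if new texts followed the same distribution. Deferring to the user avoids those errors, but leaves the research itself to the human.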

Notes

  1. John Hutchins, “Milestones in machine translation – No.6: Bar-Hillel and the nonfeasibility of FAHQT”.
  2. Bar-Hillel (1960), “Automatic Translation of Languages”. Available online at http://www.mt-archive.info/Bar-Hillel-1960.pdf
  3. Claude Piron, Le défi des langues (The Language Challenge), Paris, L’Harmattan, 1994.

This guide is licensed under the GNU Free Documentation License. It uses material from Wikipedia.
