July 13 2017 in Language Connect by Global-Language
Machine translation makes mistakes in 95% of cases
How and when do we use Machine Translation? How many mistakes can you expect, and how well does it work for different language pairs?
A group of activists has recently carried out a study showing how helpless Machine Translation turns out to be. Specialists at a popular web resource performed one of the largest tests of the most popular MT tools, covering a large volume of material (nearly 38,000,000 words). In the experiment, the same samples were translated both with the help of MT and manually (by a human) for different language pairs. To present the results in a form that is easy to understand, the following comparative scale was proposed:
100% – the human translation of a sample fragment is identical to the MT output;
85-95% matching – the translated fragments are quite precise but still require minor editing;
50-75% matching – MT can be useful for automatically substituting certain words, but not as a complete fragment;
0% matching – the MT fragments match the human-translated samples by less than 50%.
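The scale above can be sketched in a few lines of Python. This is only an illustration: the study does not say how matching was measured, so the word-level similarity from the standard library's `difflib` is an assumption, and the function names `match_percent` and `match_band` are hypothetical. Scores that fall between the published bands (for example 80%) are reported as unspecified rather than guessed.

```python
from difflib import SequenceMatcher


def match_percent(human: str, machine: str) -> float:
    """Word-level similarity between two translations, as a percentage.

    Assumption: difflib's ratio stands in for the study's (unstated) metric.
    """
    matcher = SequenceMatcher(None, human.split(), machine.split())
    return matcher.ratio() * 100


def match_band(score: float) -> str:
    """Map a match percentage onto the four bands of the article's scale."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score == 100:
        return "identical"
    if 85 <= score <= 95:
        return "minor editing"
    if 50 <= score <= 75:
        return "word substitution only"
    if score < 50:
        return "no match"
    # The published scale leaves gaps (75-85% and 95-100%).
    return "unspecified"


if __name__ == "__main__":
    human = "the cat sat on the mat"
    machine = "the cat sat on a mat"
    score = match_percent(human, machine)
    print(f"{score:.1f}% -> {match_band(score)}")
```

A real evaluation would likely use an established MT metric rather than a generic sequence comparison; the point here is only to show how a percentage maps onto the four groups.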
Thus, each of the tested languages falls into one of these four groups. The results of the investigation turned out to be quite striking.
Machine translation works best for French, Portuguese, Spanish, and English.
These languages showed the highest frequency of matching between human-translated samples and MT segments: more than 20%, while 90% of translated phrases had at least some traces of matching.
Results were poor for Russian, Polish, and Korean: they showed a much lower frequency of matching, with only 5% full matches and 20-40% partial equivalence.
This can be explained by the grammatical and morphological peculiarities of these languages. The first group of languages has an analytic structure, i.e. a relatively fixed word order in which certain grammatical constructions, such as auxiliary verbs, play a specific role. Russian, Polish, and Korean, by contrast, have a synthetic structure, in which word endings and different word forms carry grammatical meanings such as tense and aspect.
Sadly, this clearly shows that modern machine translation engines cannot yet cope with even the easiest tasks. While technology can provide innovative devices for instant translation between certain languages, the precision of translation, as well as its application to dozens of more complex languages, remains an open question.
How can MT be made to work better? How can its algorithms be adjusted to hard languages? When will it be possible to rely on MT results? Unfortunately, these are questions that neither linguists nor engineers can answer so far.