The Rise of the Machines: What Google’s New Translation Engine Means for Your Next Project
When Google announced a few weeks ago that it would deploy its new Neural Machine Translation (NMT) system for its English – Chinese translations (18 million translations per day), it made a significant splash. Did Google actually crack the code and bridge the gap between human and machine translation, as the title of its research paper suggests? How will it impact the industry, and what does it mean for you, our customers?
Let’s have a closer look.
Machine translation has been around for a while. Patents for “translation systems” can be traced back as far as 1933. In the 1960s and 70s the United States, fueled by the Cold War and the Vietnam War, pushed several initiatives in that direction as well. Still, the road to accurate and commercially viable machine translation was long and rocky.
Even today, if you use Google Translate, for example, you’ll quickly learn that the result is rarely satisfactory and certainly can’t stand up to the scrutiny of a professional translator. But even with its flaws, machine translation has its place. When you need to translate huge amounts of content that ideally has a predictable vocabulary, and questions of style and grammar are secondary, machine translation will be your friend. Technical documentation that follows strict writing rules and style guides, content that is generated using controlled English, and informational content that doesn’t require “for publishing” translation quality can be good candidates. Speed and lower costs are the main upsides, while the downsides are a higher risk of contextual errors as well as fluency and grammatical issues.
For a quick and dirty solution, Google Translate already does a pretty good job, even though the results do not stand up to professional standards. So far, the company has been working with Phrase-Based Machine Translation (PBMT). With this method, the content is not translated word by word. Instead, it is broken down into small blocks of content, called phrases, which have nothing to do with linguistic phrases but are derived by statistical methods. Neural Machine Translation (NMT) works differently, as Google’s research scientists explain in their blog post announcing Google NMT.
They stress that whereas PBMT breaks sentences down into words and phrases to be translated independently, NMT considers the entire input sentence as a unit for translation. The advantage of that method is that it requires fewer engineering design choices, or, to put it the other way around, it has a higher self-learning capability. NMT is notorious, though, for missing rare words, and Google didn’t consider it fast or accurate enough to put it on its production servers for the world to use.
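To make the phrase-based idea a little more concrete, here is a deliberately simplified sketch in Python. It is not Google’s system: the phrase table, the German example entries, and the function name are all made up for illustration, and a real PBMT engine scores many competing phrase candidates statistically and can reorder them. The sketch only shows the core limitation described above, namely that each chunk is looked up and translated on its own, without regard to the rest of the sentence.

```python
# Toy phrase table. All entries are hypothetical and for illustration only;
# a real system learns millions of scored phrase pairs from data.
PHRASE_TABLE = {
    ("guten", "morgen"): "good morning",
    ("wie", "geht", "es", "dir"): "how are you",
    ("danke",): "thank you",
}


def translate_phrase_based(sentence, table=PHRASE_TABLE, max_len=4):
    """Greedy longest-match lookup: each chunk is translated independently."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest possible chunk first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if chunk in table:
                output.append(table[chunk])
                i += n
                break
        else:
            # No chunk matched: pass the unknown word through untranslated.
            output.append(words[i])
            i += 1
    return " ".join(output)


print(translate_phrase_based("Guten Morgen wie geht es dir"))
```

Because every chunk is handled in isolation, nothing in this approach can use the meaning of the whole sentence to choose between alternatives, which is the gap that NMT aims to close by treating the full sentence as one unit.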
Until now, that is. Google’s researchers are now confident that they’ve overcome the biggest limitations of NMT. They describe their secret sauce in considerable detail in their research paper, but let’s focus on the results they are presenting:
According to the Google researchers, their NMT system produces translations that are “vastly improved compared to the previous phrase-based production system.” Using sentences from Wikipedia and news sources, the number of translation errors was 55% to 85% lower for several major languages than with the previous method. The company is so optimistic about its approach that it plans to roll out the NMT system for other languages besides Chinese soon.
But while the gap between human and machine translation might indeed have become narrower, the machines are still by no means perfect, as Google itself acknowledges: “GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms, and translating sentences in isolation rather than considering the context of the paragraph or page.”
So for you, the customer, the basic rules of thumb we laid out in our blog post remain intact: Since humans can translate context, humor, irony, and idiomatic expressions, they excel whenever a high-quality, publication-ready translation is needed. Sales and marketing material, legal documentation, safety documentation, literary works, and content that can pose liability issues will remain the human domain. Branding and company messaging are also not well suited to being left in the hands of machines.
If you are in doubt about which approach is best for you and your next translation project, don’t hesitate to contact us.