Report written by Nadežda Jakubková.
Today, machine translation (MT) is so pervasive that — for many young or early-career localization professionals, at least — it’s hard to imagine a time without it. But such a time did exist. Those with a decade or two of language industry experience under their belt have, no doubt, witnessed firsthand MT’s evolution into the nearly omnipresent entity that it is today.
Even so, it may be surprising to learn that the history of MT dates back to the early 20th century, long before many of us were born. In 1933, the first patents for machine-assisted translation tools were issued in France and Russia, and we’ve been building on that technology ever since.
Along the way, developers and translators alike have learned some valuable lessons that are worth looking into today — especially as novel technologies like OpenAI’s ChatGPT draw the attention (both positive and negative) of more and more thought leaders in our industry.
The 1930s-1950s: The early days
Although human beings have long fantasized about machines that could miraculously translate text from one language to another, it wasn’t until the 1930s that such technology actually seemed like it could be a reality.
Georges Artsrouni and Petr Troyanskii received the first-ever patents for MT-like tools in 1933, just a couple of months apart, working completely independently of each other in France and Russia, respectively. These tools were quite rudimentary, especially in comparison to what we think of when we hear the term “MT” today. They worked by comparing dictionaries in the source and target languages, and as such could really only account for the root forms of words, not their various declensions and conjugations.
Troyanskii’s mechanical translation device, for example, required a typist to transcribe the target language words, an operator to annotate their grammatical function, and an editor to turn it all into a readable text in the target language. Without computers, the technology was little more than a glorified bilingual dictionary.
But the first general-purpose electronic computers were not far off — in the mid-1940s, researchers like Warren Weaver began to theorize about ways to use them to automate the translation process. In 1947, Weaver proposed the use of statistical methods, information theory, and wartime cryptography techniques to automate translation with electronic computers. And shortly thereafter, academic institutions began devoting resources to the development of MT technology — the Massachusetts Institute of Technology, for instance, appointed its first full-time MT researcher in 1951.
These efforts culminated in the now-famous Georgetown experiment of 1954, the first public demonstration of computer-powered MT technology. Researchers at Georgetown University partnered with IBM to create a tool that could translate Russian (albeit Russian that had been transliterated into Latin characters) into English. The researchers hand-selected 60 sentences to present to the public — though their tool translated these sentences adequately, the technology still left a lot to be desired when it came to everyday use.
Although we often hear grandiose, perhaps overly optimistic claims about human parity in MT today, it’s important to note that such claims are not at all new. The researchers on the Georgetown experiment, for example, claimed that they needed just five more years of hard work to perfect their tool, based on just 60 sentences translated from Russian into English alone (and romanized Russian at that!).
Now, of course, we know that this was not the case. More than half a century later, we still can’t quite say that our MT technology is perfect. The lesson learned here is to be careful about over-promising when it comes to MT — perfection is not always as close as it might seem.
The 1960s-1980s: The dawn of RBMT
Though their technology was far from perfect, it seemed like the researchers at Georgetown and IBM had generated some nice momentum for MT.
In the United States, that momentum came to a halt in the 1960s. In 1966, the Automatic Language Processing Advisory Committee (ALPAC) published a report concluding that MT was less efficient than human translation and too expensive to justify further research. MT research lost funding in the United States, but elsewhere, developers continued chugging along with their MT projects.
During this period, researchers experimented with a handful of different MT methods, with rule-based MT (RBMT) becoming the most popular. RBMT relies on explicit grammatical and lexical information about each language, operating on bilingual dictionaries and hand-written rules for a given language pair.
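To illustrate the principle, here is a toy sketch of a rule-based translator: a tiny bilingual lexicon plus one hand-written transfer rule. The dictionary entries, part-of-speech tags, and reordering rule are all invented for this example and do not reflect any historical system.

```python
# Toy rule-based translation: bilingual lexicon lookup plus one reordering rule.
# All entries, tags, and rules are invented for illustration.
LEXICON = {
    "the":   ("la",     "DET"),
    "white": ("blanca", "ADJ"),
    "house": ("casa",   "NOUN"),
}

def rbmt_translate(sentence: str) -> str:
    # Lexical transfer: look each word up in the bilingual dictionary.
    words = [LEXICON.get(w, (w, "UNK")) for w in sentence.lower().split()]
    # Structural transfer: Spanish adjectives usually follow the noun,
    # so swap any adjective-noun pair produced by the lookup.
    i = 0
    while i < len(words) - 1:
        if words[i][1] == "ADJ" and words[i + 1][1] == "NOUN":
            words[i], words[i + 1] = words[i + 1], words[i]
        i += 1
    return " ".join(word for word, _tag in words)

print(rbmt_translate("The white house"))  # -> la casa blanca
```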
Early RBMT systems include the Institut Textile de France’s TITUS and Canada’s METEO system, among others. And while US-based research certainly slowed down after the ALPAC report, it didn’t come to a complete stop — SYSTRAN, founded in 1968, utilized RBMT as well, working closely with the US Air Force for Russian-English translation in the 1970s.
Though it was the most prominent form of MT at the time, RBMT had several notable limitations. It was no doubt an improvement upon the technology of the Georgetown experiment, but it was time-consuming to build, since developers needed to manually input the rules of each language. Plus, it often generated inaccurate or awkward-sounding output, especially when the input was ambiguous or idiomatic.
In searching for ways to improve and upscale RBMT, some developers found solutions in different areas — in 1984, Makoto Nagao developed example-based MT in Japan, for instance. Although this method is not widely used today, it remains an example of the ingenuity of early MT researchers. That search also led other developers to the field of statistics, as we’ll see in the next section. From these pioneering researchers and developers, we’ve learned the importance of vigilance, scalability, and ongoing maintenance in MT systems.
The 1990s-2010s: More advanced methods
In their quest to improve RBMT, researchers developed another, more efficient method for MT: statistical MT (SMT).
In the 1990s, researchers at IBM developed a renewed interest in MT technology, publishing research on some of the first SMT systems in 1991. Unlike RBMT, SMT doesn’t require developers to manually input the rules of each language — instead, SMT engines learn from a bilingual corpus of text, extracting statistical patterns about how words and phrases in one language correspond to those in the other. Analysis of these corpora allows an SMT engine to identify the most likely translation for a given input — these models performed significantly better than RBMT and quickly became all the rage.
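As a hedged illustration of the statistical idea, the snippet below estimates word-translation probabilities by simple relative frequency from a handful of invented word-aligned pairs. Real SMT systems go much further, learning alignments with the IBM models and reweighting candidates with a language model.

```python
# Toy statistical translation: estimate translation probabilities by counting
# how often words co-occur in an (invented) word-aligned bilingual corpus.
from collections import Counter, defaultdict

aligned_pairs = [
    ("bank", "banque"), ("bank", "banque"), ("bank", "rive"),
    ("river", "rivière"), ("money", "argent"),
]

counts = defaultdict(Counter)
for src, tgt in aligned_pairs:
    counts[src][tgt] += 1

def translation_probs(src_word: str) -> dict:
    # Relative frequency: P(target | source) estimated from the counts.
    total = sum(counts[src_word].values())
    return {tgt: n / total for tgt, n in counts[src_word].items()}

print(translation_probs("bank"))
# roughly {'banque': 0.67, 'rive': 0.33}: the engine picks the most probable
# option, typically reweighted by a target-language model for fluency.
```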
And as electronic computers slowly became household items, so too did MT systems. SYSTRAN launched the first web-based MT tool in 1997, giving laypeople — not just researchers and language service providers — access to MT. Nearly a decade later, in 2006, Google launched Google Translate, which was powered by SMT from 2007 until 2016.
Alongside the development of SMT, we can find inklings of neural MT (NMT) as well. The same year that SYSTRAN launched its web MT tool, Ramon Neco and Mikel Forcada published the first paper on the “encoder-decoder” structure, paving the way for the development of NMT technology. In 2003, researchers at the University of Montreal developed a language model based on neural networks, but it wasn’t until 2014, with the development of the sequence-to-sequence (Seq2Seq) model, that NMT became a formidable rival for SMT.
After that, NMT quickly became the state of the art in MT — Google Translate adopted it in 2016. NMT engines use larger corpora than SMT and are more reliable when it comes to translating long strings of text with complex sentence structures. That said, not all that glitters is gold: NMT engines require a lot of time and computational resources, and they may struggle with domains that lie outside their training data.
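To make the encoder-decoder idea concrete, here is a minimal sequence-to-sequence sketch in Python, assuming the PyTorch library is available. The vocabulary size, dimensions, and token IDs are invented, and a production NMT system would add attention, training code, and vastly more data; the point is simply that the encoder compresses the source sentence into a hidden state and the decoder predicts target-vocabulary scores from it.

```python
# Minimal encoder-decoder (seq2seq) sketch with toy sizes; not a real NMT system.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hid_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src_ids):
        # The final hidden state is a fixed-size summary of the source sentence.
        _, hidden = self.rnn(self.embed(src_ids))
        return hidden

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hid_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tgt_ids, hidden):
        # Score the target vocabulary at each step, conditioned on the summary.
        output, _ = self.rnn(self.embed(tgt_ids), hidden)
        return self.out(output)

src = torch.tensor([[5, 12, 7, 2]])   # one source sentence as token IDs
tgt = torch.tensor([[1, 9, 4]])       # target prefix fed to the decoder
encoder, decoder = Encoder(vocab_size=100), Decoder(vocab_size=100)
logits = decoder(tgt, encoder(src))   # shape (1, 3, 100): scores per position
```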
Though NMT is a far cry from the days of Artsrouni and the Georgetown experiment, it’s important not to completely dispose of the old MT methods. SMT is still used today by developers to check the relevance of their training data, though RBMT is rarely, if ever, used on its own for practical purposes.
As MT has improved, though, we’ve also learned that combining different MT methods can yield better results: hybrid MT approaches can utilize methods like RBMT, SMT, and NMT in conjunction with one another to refine the translation. This is a particularly helpful approach for low-resource languages where there’s little training data available: an RBMT engine creates a rough translation that can be further improved by SMT and NMT engines later on in the process, as sketched below.
Neural systems are also trickier to debug than SMT — because NMT engines require so much data, it’s impossible to know all of the words and phrases that go into training them, creating a black-box problem. Perhaps the most critical lesson learned from the early days of NMT, then, is that large quantities of high-quality data are critical to developing good MT engines.
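A conceptual sketch of that hybrid pipeline might look like the following. The stage functions are placeholders standing in for real engines rather than references to any particular product.

```python
# Conceptual hybrid pipeline: a rule-based first pass whose output is refined
# by statistical and neural stages. Each stage function is a placeholder.
def rbmt_stage(source: str) -> str:
    return source  # placeholder: dictionary lookup and transfer rules go here

def smt_rescore(draft: str) -> str:
    return draft   # placeholder: phrase-table and language-model rescoring

def nmt_post_edit(draft: str) -> str:
    return draft   # placeholder: a neural model polishes the draft

def hybrid_translate(source: str) -> str:
    # Each stage hands its best guess to the next, so even a low-resource
    # language pair with little training data gets a usable first draft.
    return nmt_post_edit(smt_rescore(rbmt_stage(source)))

print(hybrid_translate("Example sentence to translate."))
```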
The 2020s: ChatGPT and beyond
Although large language models (LLMs) perform a lot of other functions besides translation, some thought leaders have presented tools like ChatGPT as the future of localization and, by extension, MT.
OpenAI’s GPT series, including ChatGPT and the recently launched GPT-4, is built on large-scale neural language models. Though their translation capabilities aren’t quite on par with state-of-the-art NMT, that’s not to say they can’t be improved — after all, these tools weren’t designed with translation in mind.
As localization teams incorporate this technology into their workflow — and some have already begun to do so — you can bet that the technology will become more and more specialized for our field. Plus, combining this technology with pre-existing MT technology might yield interesting results. ChatGPT is a decent editing tool that could be used alongside MT tools to touch up their output.
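As a hedged sketch of that idea, the snippet below simply wraps raw MT output in a post-editing prompt. The helper name and prompt wording are invented for illustration, and actually sending the prompt to whichever LLM a team uses is left out.

```python
# Hypothetical helper that builds a post-editing prompt around raw MT output;
# the reply from an LLM would then replace the machine translation.
def build_post_edit_prompt(source: str, mt_output: str, target_lang: str) -> str:
    return (
        f"You are a professional {target_lang} editor. Improve the fluency of "
        f"this machine translation without changing its meaning.\n"
        f"Source text: {source}\n"
        f"Machine translation: {mt_output}\n"
        f"Edited translation:"
    )

prompt = build_post_edit_prompt(
    source="The committee approved the budget.",
    mt_output="Le comité a approuvé le budget.",
    target_lang="French",
)
print(prompt)  # this string would be sent to the LLM of choice
```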
Moving forward, it’s important to take a grounded and principled approach to adopting and developing future technologies. By tracing the history of MT all the way back to its earliest incarnations, we can draw the following lessons: be careful about over-promising, since perfection is rarely as close as it seems; build and maintain MT systems with vigilance and scalability in mind; and remember that large quantities of high-quality data are critical to a good MT engine.