The Future of Machine Translation: A Recap


2020-06-09 01:20 Lilt



Over the last few decades, modern machine translation has improved steadily. From its beginnings in the 1940s to its contemporary advances, machine translation has undergone plenty of change. As it has improved, however, questions about its ability have been raised time and time again. In our recent webinar, The Future of Machine Translation, Lilt's CEO Spence Green spoke about the background of machine translation, the state of the art today, and what new developments and improvements we can expect in the future. The future of MT is not as cut and dried as it may seem, and human involvement remains crucial. At Lilt, we know just how important translators are, and we've built our ecosystem to provide them with the tools they need to be more efficient than ever. After all, there's a reason why over 70% of translators prefer to work with a system that augments their abilities instead of simply editing machine-translated content.

Spence first took a step back to trace the history of machine translation and how we've gotten where we are now. MT research started in the 1940s, though it wasn't what you might think. Early on, research focused on writing linguistic rules for translating from one language to another. While the results proved to be relatively accurate (depending on the linguistic representation), there are too many sentences in the world for that manual process to be feasible. The 1980s brought more computational power, and IBM started building systems that could learn rules from bilingual data. The next phase of MT was phrase-based systems: instead of word-to-word translation, they mapped chunks of text to chunks of text. These days, we're onto neural networks with many more parameters for translating data.

Raw MT quality has gotten much better over the years, but for businesses, raw output doesn't guarantee quality. Google Translate, for example, is fast and inexpensive, but its output quality is lower. It's commonly used in situations where speed and cost are the primary considerations over quality: for example, a business that has to translate enormous and ever-growing quantities of user-generated content. Airbnb users don't expect a perfect translation of a listing post. Often, the source text isn't perfect to begin with, so raw MT output from systems like Google Translate can be acceptable.

Figure: the Machine Translation plus Post-Editing workflow, in which content gets translated and then edited by a linguist.

Raw MT is still less suitable, however, for more professional content, like financial documentation, product information, and legal texts. MT plus Post-Editing (MTPE) adds a human review step after the MT system produces its translations. While that may be relatively inexpensive, it's a much slower process than raw MT output, and quality can still suffer. On top of that, MTPE is a process that translators strongly dislike, as they report finding the work unfulfilling. Research on the topic supports that claim: in a study by CSA Research, 89% of translators said they prefer to translate text rather than edit raw MT output. And while the quality of MTPE translations is usable, the result often sounds more literal, since the base translation is still created by a machine. Finally, because humans edit the raw output after the fact, there are no MT efficiency gains that benefit translators: the MT models don't improve over time from the translator's input, only when the models are manually retrained.

So if not MTPE, what's the future of machine translation?
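Before turning to that question, here is a minimal sketch of the MTPE workflow described above. The class and function names (StaticMTEngine, request_post_edit) are hypothetical and only illustrate the shape of the pipeline; the point to notice is that the linguist's corrections never flow back into the model.

```python
# Minimal, illustrative sketch of an MTPE pipeline. All names are hypothetical;
# this is not any vendor's real API.

class StaticMTEngine:
    """Stands in for a generic, pre-trained MT system that produces raw output."""

    def translate(self, source_text: str) -> str:
        # In practice this would call a trained model; here it's a stub.
        return f"<machine translation of: {source_text}>"


def request_post_edit(machine_output: str) -> str:
    """Stands in for a human linguist reviewing and correcting the raw output."""
    # In a real workflow this step is interactive; here we just pass the draft through.
    return machine_output


def mtpe_pipeline(documents: list) -> list:
    engine = StaticMTEngine()
    finished = []
    for source in documents:
        draft = engine.translate(source)   # step 1: raw MT output
        final = request_post_edit(draft)   # step 2: human post-editing
        finished.append(final)
        # Note: the linguist's corrections never reach the engine. The model
        # only changes when it is manually retrained offline, which is the
        # efficiency gap described above.
    return finished
```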
At Lilt, we firmly believe that AI will augment, not replace, human translators. While machines are great at automating repetitive tasks, they don't handle complex tasks nearly as well. Complex tasks that involve reasoning, context, and integrating information from many different sources of knowledge are where humans excel, and where the gap between human ingenuity and machine ability is clearest. The focus has already started, and will continue, to shift toward "human-in-the-loop" AI: systems in which a human-to-machine feedback loop helps improve the output over time. There are already plenty of examples of human-in-the-loop AI systems in the world today, from automotive to aerospace to medical.

Figure: Lilt's adaptive machine translation uses human feedback to train and update the system.

The figure shows the system for translation. The model is first trained with baseline and customer translation data. Once the engine provides translation suggestions, the human linguist can review them and provide immediate feedback. That feedback then updates the engine for future translation suggestions, and so on. The result? Future suggestions are more accurate because they come from a constantly updated model, giving translators efficiency improvements in both quality and speed. According to the same CSA Research study, 71% of linguists prefer working with an adaptive MT system like Lilt over editing raw MT output.

If you want to learn more about the future of machine translation and how human-in-the-loop AI is already making an impact on systems around the world, watch the on-demand webinar here. Get more of Spence's insights and understand how you can set your company up for the future with the right adaptive MT solution.
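To make the feedback loop described above concrete, here is a minimal sketch of a human-in-the-loop translation session. All names (AdaptiveMTEngine, suggest, update, review_by_linguist) are illustrative assumptions rather than Lilt's actual API; the sketch only shows where human feedback re-enters the model.

```python
# Minimal, illustrative sketch of a human-in-the-loop adaptive MT session.
# All names are assumptions for the purpose of this example.

class AdaptiveMTEngine:
    """Stands in for an adaptive MT model that can be updated incrementally."""

    def __init__(self, baseline_data: list, customer_data: list):
        # Step 1: the model starts from baseline plus customer translation data.
        self.training_data = list(baseline_data) + list(customer_data)

    def suggest(self, source_text: str) -> str:
        # Step 2: the engine proposes a translation (stubbed here).
        return f"<suggestion for: {source_text}>"

    def update(self, source_text: str, confirmed_translation: str) -> None:
        # Step 4: the confirmed segment is fed back into the model right away,
        # so the very next suggestion can benefit from it. This is the key
        # structural difference from MTPE, where feedback never reaches the model.
        self.training_data.append((source_text, confirmed_translation))


def review_by_linguist(source_text: str, suggestion: str) -> str:
    # Step 3: placeholder for the linguist accepting or correcting a suggestion;
    # in a real tool this step is interactive.
    return suggestion


def interactive_session(engine: AdaptiveMTEngine, segments: list) -> list:
    """Run the suggest -> review -> update loop over a list of source segments."""
    confirmed = []
    for source in segments:
        suggestion = engine.suggest(source)
        final = review_by_linguist(source, suggestion)
        engine.update(source, final)
        confirmed.append(final)
    return confirmed
```

Compared with the earlier MTPE sketch, the only structural change is the update call inside the loop, and that is where the efficiency gains for translators come from.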

