Machine Translation: Breaking Down the Hype and Fears

2020-08-25 15:50 Andovar

Machine Translation and the use of Machine Learning, Artificial Intelligence and Language Processing technologies have received a lot of attention in the last couple of years. The most recent hype cycle of note started with Google’s 2016 research paper about its Neural Machine Translation (NMT) system and the subsequent announcement that its flagship product, Google Translate, was switching to an NMT engine. The marked improvement in translation quality sparked a wave of media coverage and announcements from other players in the industry that they too were moving to NMT. These include Andovar’s Machine Translation (MT) partner Omniscien, who have not only built their own NMT engines but also created a set of tools to prepare content for MT and improve the output further.

Hype, fear and machine translation

Let’s start with a short definition of Neural Machine Translation, courtesy of Wikipedia:

Neural machine translation (NMT) is an approach to machine translation that uses a large artificial neural network to predict the likelihood of a sequence of words, typically modeling entire sentences in a single integrated model.

Understood? Good. Not understood? Even better. There’s a whole ocean of misinformation waiting for you.

Media coverage and discussion of MT are extremely polarized. On one hand, we have the MT providers and the mainstream media, which tend to repeat the providers' most exaggerated claims to get readers’ attention. You can recognize those articles by their repeated use of catch-phrases such as "Babel Fish", "singularity" or "Star Trek’s Universal Translator".

On the other hand, one doesn't need to search long to find the attitudes to MT expressed by translators. These typically fall into one of the following three categories:

Ridicule: "Look at this funny example of inaccurate MT I’ve found!"
Fear-mongering: "We’ll lose our jobs!"
Dismissal: "If you care about your customers at all, don’t use MT!"

So is an MT revolution just around the corner?
Will it render all translators unemployed, obliterate the need to learn languages and bring about world peace? Or is it all just snake oil for the gullible?

Anyone with any real knowledge of MT will reply that the truth lies in the middle.

The good, the bad and the undecided

Even the most fervent proponents of MT agree that it is not perfect all of the time, and even its most zealous opponents admit that it does surprisingly well some of the time. Machine Translation, whether Neural or not, does work well with some types of content and in some language pairs. Despite this, the middle ground remains elusive.

Instead of arguing whether MT is all good or all bad, let’s admit that it’s a bit of both (with the good parts being particularly good). What this means in practice is that it should be considered in all translation projects and either accepted or dismissed based on the results rather than preconceptions.

Here are some guidelines to consider for the application of MT:

Overarching all those is the purpose of the translation and the quality expectations that go with it. In some cases, an imperfect translation may be better than no translation, but in others (legal texts, for example) only the highest human quality will do.

All of this applies not only to whole projects, but also to their sections. Andovar recommends triaging the content: consider sending the legalese to professional translators, the product descriptions to MT with post-editing, and the structured technical documentation to MT only. After all, some MT output really is excellent.

Conclusion

MT's drawback is that it lacks the ability to discern cultural, local and social differences, which often makes human post-editing necessary. Output from an engine like Google Translate can still be inaccurate, depending on the nuances of the language.
At its base, MT is not as intuitive as some would believe, but it is still incredibly useful for translating large volumes of text into fairly comprehensible sentences.

Computer-aided translation (CAT) is a system that blends MT with human review. With CAT, human review is mandatory, which gives it the highest success rate in terms of accuracy. Where MT is not intuitive, human review can address the nature of language, ideally yielding natural-sounding communication and text. It is this blended approach that keeps humans involved and garners the best results, and it is one that we here at Andovar believe in strongly.

To read more about the various types of translation technologies available today, feel free to check out our Ultimate Guide to Translation Automation Technologies. In addition, to hear more about Machine Translation itself and its presence in the localization industry, follow this link to watch a video presentation on the topic by Andovar's Chief Executive Officer, Conor Bracken.

Andovar at your service

At Andovar, we are as interested as you are in the evolutionary steps of the internet and the world at large. We want to be right there at the forefront of technology disruptions, integrating new system updates as they become available. Our staff have become experts in the best systems available, and we are prepared to help you access the best hardware and software and transition to them smoothly in the future.

Andovar’s Language Technology Tools are here to make your life a little easier. Please feel free to get in touch if you have any questions about translation technologies or to see how we can help you with your next localization project!
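
As a rough illustration of the content triage recommended in the article (legalese to professional translators, product descriptions to MT with post-editing, structured technical documentation to MT only), the routing could be sketched in Python along these lines. The names `Workflow`, `Section`, `ROUTES`, `triage` and `plan`, the content-type labels, and the choice to send unknown content to a human translator are all assumptions made for this sketch, not part of any Andovar tool.

```python
from dataclasses import dataclass
from enum import Enum


class Workflow(Enum):
    HUMAN = "professional human translation"
    MT_POST_EDIT = "machine translation with post-editing"
    MT_ONLY = "machine translation only"


# Hypothetical mapping based on the three examples given in the article.
# A real project would define its own content types and routes.
ROUTES = {
    "legal": Workflow.HUMAN,
    "product_description": Workflow.MT_POST_EDIT,
    "technical_documentation": Workflow.MT_ONLY,
}


@dataclass
class Section:
    name: str
    content_type: str


def triage(section: Section) -> Workflow:
    """Pick a workflow for one section of a project."""
    # Assumption of this sketch: anything unrecognized goes to a human,
    # since the article does not say how to treat other content types.
    return ROUTES.get(section.content_type, Workflow.HUMAN)


def plan(sections: list[Section]) -> dict[str, Workflow]:
    """Triage every section of a project, keyed by section name."""
    return {section.name: triage(section) for section in sections}


if __name__ == "__main__":
    project = [
        Section("Terms of service", "legal"),
        Section("Catalog entries", "product_description"),
        Section("API guide", "technical_documentation"),
    ]
    for name, workflow in plan(project).items():
        print(f"{name}: {workflow.value}")
```

Keeping the routes in a plain dictionary means the decision stays visible and easy to revise as MT results come in for a given content type or language pair, in line with the article's advice to judge MT on results rather than preconceptions.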