Disruptive So Far: Looking Back at Neural Machine Translation (NMT) Within Natural Language Processing


2021-03-05 23:25 CSOFT



With news this week that Google has improved some of the key linguistic functionalities of its search engine’s machine learning algorithms, including an AI spell check improvement that Google’s head of search considers more important by itself than the previous five years’ progress, it is an interesting moment in the brief history of language-related AI to reflect on how things have advanced within the language services industry, particularly regarding neural machine translation (NMT).

Much has changed in the few years since NMT first made waves with its potential to revolutionize localization, even in the very nature of the questions people continue to ask about it. Three years ago, for instance, one significant concern was how neural MT could impact global language diversity. While concerns about technology’s unintended consequences for minorities are still very much in circulation, the question of AI’s inclusivity within language groups is now the more discussed ethical issue, as algorithms are now known to generate statistical in-groups and out-groups along demographic lines. The fact that this has not manifested in crises specific to the translation industry is partly indicative of the relatively moderate rollout of these capabilities, as well as of just how much of a performance gap remains between raw machine translation and human-inclusive models even several years onward. The sheer diversity of populations now interacting in the globalized world economy may be a greater challenge to NMT’s effectiveness than NMT is to global diversity, as the limitations of a machine-centric model for translation have come into clearer focus.

In terms of performance, one of the major things that has changed for the better is our understanding of how NMT works, and how to improve on raw machine translation. As during its infancy, NMT today requires intensive work from human linguists not only to iron out any linguistic flaws that may arise, but also to verify that the model itself is performing correctly. Whereas plain translation is about expressing something in the correct conjugations and declensions, machine translation post-editing (MTPE) is also about verifying the mechanism behind these choices, and is arguably the harder task. To reduce the burden on human linguists and engineers, focus has shifted toward the crucial element of MT training, whereby neural translation models are trained on linguistic datasets known to contain accurate translations in relevant subject matter areas.

As NMT practices continue advancing to steady enthusiasm across our industry, it is worth bearing in mind that natural language processing (NLP) capabilities now entering a ‘golden era’ will at some point enter practical applications specific to localization. When this happens, a paradigm shift will likely get underway as human linguists become less important to quality assurance. However, we may first see a significant uptick in the market for machine translation solutions as the world economy continues to weather crisis. As NMT is first and foremost a driver of cost-effectiveness and efficiency for high-volume translations, it is often the preferable method when budget concerns are decisive. NMT may not look quite as cutting-edge as it once did, but it is a more mature technology with an established role in localization strategy.

With a global network of linguists, subject matter experts, and engineers trained in the latest best practices for machine translation and linguistic review, CSOFT International can help companies realize cost-effective solutions meeting all of their translation requirements for entering new markets. You can learn more about our translation technologies and MTPE services at csoftintl.com!
