Machine Translation: Providing Context

2020-08-17 17:00 Nimdzi Insights


Some machine translation (MT) providers are holding out hope for MT systems that adapt to document context. Could this development eliminate the need for custom MT engines? Will context-enabled MT help MT achieve human parity? Will we still need to customize a few years from now? Let’s discuss further.

The Conference on Machine Translation added a “document-level MT” task in 2019: “We are particularly interested in approaches which consider the whole document. We invite submissions of such approaches for English to German and Czech, and for Chinese to English. We will perform document-level human evaluation for these pairs.” Assessing the effectiveness of document-level approaches will also be part of the 2020 conference, which will be held online on November 19-20, 2020.

For now, this approach may work well in research settings, but it is likely to become more widely used within the next few years. While some providers of customized MT try to make it easier to select data for customization (e.g., Microsoft Office 365 subscribers can use the documents in their cloud as monolingual customization data), this new level of context has been raising questions from investors and other interested parties about the need to develop new pieces of technology supporting customization.

Source: Nimdzi Language Technology Atlas, June 2020

Do NMT systems already adapt to document context?

There is a major discussion around whether MT, at least for certain language pairs, has reached human parity. “What is clear from research (e.g., Läubli et al., ‘Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation’) [is that] achieving human parity in MT has to be evaluated in document context, not just in sentence context,” says Achim Ruopp, Adjunct Professor at Georgetown University.

“This implies that the MT systems also have to translate sentences within document context, as human translators do if they have the document context available in their translation environment. Document-context-aware MT is something researchers have been working on for a while (e.g., Google's MacDuff Hughes mentioned it as a priority at AMTA 2016). But where researchers/MT suppliers are with this is not so clear, because of the issue of evaluation, both in methodology and evaluation data,” Ruopp continues.

Producing high-quality custom MT models requires some expertise and experimentation. Ruopp believes that this complexity is one reason for MT API providers to replace custom MT with systems that adapt dynamically to document context. Another reason is that MT providers need to supply the API features and underlying infrastructure to create, use, and maintain these custom models, which creates complexity on the provider side. And although MT providers are not complaining about this, it is still a significant factor that is reflected in the pricing of custom MT models.

Looking ahead, fully document-context-aware MT is expected to become a significant asset for the MT industry. At the moment, however, we are still in the phase of customized and multi-purpose MT solutions.