Machine Translation Post-Editing: The Two-Second Rule

2020-10-01 11:50 Nimdzi Insights

If you’re a driver, you’ve probably heard of the two-second rule. Staying at least two seconds behind any vehicle is considered a rule of thumb for drivers wanting to maintain a safe following distance at any speed. The two seconds don’t represent safe stopping distance but rather safe reaction time.

Reaction time for machine translation (MT)

Reaction time depends on several factors, including age and experience. For most people, it’s in the 0.2 to 2 second range. The same reaction time approach can be applied to machine translation post-editing (MTPE).

Source: “What Is a Safe Following Distance?,” Smart Motorist

When linguists deal with raw MT engine output, they sometimes spend too much time analyzing it and deciding whether or not it’s usable at all. Here’s where imposing some limits on decision-making time could be of help: if you spend two seconds looking at an MT segment (after familiarizing yourself with both the source and target text), and see that you cannot easily edit it, discard it and translate it from scratch (or use a lower fuzzy match from the translation memory).

Three-to-five second rule?

Research on efficient post-editing shows that it might be difficult to determine whether you should correct bad MT output or if it would be faster to delete and re-translate any segments of borderline quality. But if you want to be efficient, you shouldn’t spend more than two to three seconds determining this. As with driving, some experts consider two seconds to be the minimum time you should allow, but recommend applying a three-second rule instead. This means an extra margin of safety and confidence (both when driving and post-editing).

Before the Neural Machine Translation era began in 2014-2017, there was a lack of post-editing guidelines online. In 2009, TAUS addressed the post-editing dilemma with the recommendation: “In a customer support application, for example, MT users should avoid PE where possible, or limit it to error items that can be evaluated as correctable within 2 seconds. Otherwise PE becomes too expensive for the possible end user benefits.” In 2013, quoting the rules of Microsoft, Mesa-Lao (as quoted in A Comparative Study of Post-Editing Guidelines, 2016) provided suggestions on how to decide whether MT output should be recycled. Those included a “5-10 second evaluation” recommendation on making such decisions.

Avoid driving too slowly

Talking to MTPE practitioners in 2020, you’ll hear some of them consider five seconds to be an actual norm. Indeed, there are times when the two-second rule doesn’t apply (you could argue that leaving just two seconds of distance between your car and the one in front of you is still dangerous). It was designed for making decisions in a normal traffic situation under normal circumstances. Some MT segments may be particularly challenging to process and therefore require more time to decide whether ’tis nobler “to MTPE or not to MTPE.” At the same time, as with driving, thinking for too long about every decision makes you lose precious time. And what’s MT here for? Why, to save time and effort.
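
For readers who think in code, the time-boxed decision described above can be written down as a small sketch. This is only an illustration under stated assumptions, not a rule taken from TAUS, Microsoft, or any CAT tool: the `Segment` class, the `choose_action` function, and the action labels are hypothetical names, and the "looks editable" judgment is still made by the linguist. The default limit of two seconds comes from the article; the limit can be raised to three or five seconds for harder content.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    """Hypothetical container for one segment shown to the post-editor."""
    source: str
    mt_output: str
    tm_fuzzy_match: Optional[str] = None  # lower fuzzy match from the TM, if any


def choose_action(segment: Segment,
                  looks_editable: bool,
                  seconds_spent: float,
                  limit_seconds: float = 2.0) -> str:
    """Apply the time-boxed rule: post-edit only if the linguist judged the
    MT output easy to fix within the limit; otherwise discard it."""
    if looks_editable and seconds_spent <= limit_seconds:
        return "post-edit"
    # The MT output is discarded: fall back to a fuzzy match when one exists.
    if segment.tm_fuzzy_match is not None:
        return "use-fuzzy-match"
    return "translate-from-scratch"


# Usage: a quick, positive judgment keeps the MT output.
print(choose_action(Segment("Hello", "Bonjour"), looks_editable=True, seconds_spent=1.5))
```

Passing `limit_seconds=3.0` or `limit_seconds=5.0` models the more generous rules mentioned above without changing the logic.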