The Human Factor in Linguistic Quality Evaluation

2020-08-27 08:10 Nimdzi Insights

As we discussed in the June 2020 edition of the Nimdzi Language Tech Atlas, there are different kinds of tools that help minimize human error in localization. These include automated Quality Assurance tools (QA checkers) as well as proven solutions for in-context, in-country review (such as InContext Translation and QA by Lingoport, Rigi.io, or visualReview by translate5). To further help QA teams, companies like Microsoft developed tools such as MS Policheck, which produces reports on localized content for human evaluators to go through. The reports list potential issues with "offensive" or "contentious" terms. (A minimal sketch of this kind of automated term check appears at the end of this article.)

And yet, marketing and localization teams across the globe continue to call out review as an ongoing issue. Here's a common situation described in this podcast on maximizing the impact of localized content: "...I was in a meeting where my day job was actually providing LQA services, and they were saying, 'Wow, you're rating this other company really high, and yet our in-country stakeholders have so much feedback and they're not satisfied.'"

As it happens, even when automation is in place to help ensure linguistic quality, one can still end up with frustrated customers and offended users. This is where a manual approach to quality matters.

There are quite a few types of testing that involve human input and dedicated involvement (functionality testing, linguistic QA, regression testing, etc.), with culturalization testing being one of the most important. It takes human effort to check for anything that could be considered inappropriate, offensive, or unwittingly laughable in target locales.

Linguistic quality audit

Localization and testing companies like Alpha CRC also use a manual approach to localization auditing: all checks are done by auditors directly on the platforms any end user would use. Over the course of the audit, another important thing a machine can't yet help check is impression. As Alpha CRC puts it, impression is essentially checking the overall content in context. Suitability for the target audience, tone, style, and fluency are examples of other aspects a human tester keeps an eye on.

Linguistic quality audit. Source: Alpha CRC

Why is such an audit important?

All in all, the instructions, lists of potentially offensive terms, corporate glossaries, and style guides used during testing are the result of the work of one person (or one group). The problem, then, is that content quality rests on this single person's opinion. Testers and auditors are there to provide a second opinion, and anything questionable will result in a discussion. It's important to have multiple opinions and questions raised before release, so that they are not raised by end users once the product is live.

All these manual testing activities around culturalization and impression help create content that resonates with local audiences.
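
The automated checks mentioned earlier (QA checkers and report generators such as MS Policheck) essentially scan localized strings against curated term lists and hand the findings to a human evaluator. The snippet below is only a minimal, hypothetical sketch of that idea in Python: the term list, string IDs, and function names are invented for illustration and do not reflect how MS Policheck or any particular QA checker is actually implemented.

```python
# Hypothetical sketch of an automated term check: scan localized strings
# against a per-locale list of potentially contentious terms and produce
# findings for a human reviewer. All terms and data below are placeholders.
import re
from dataclasses import dataclass


@dataclass
class Finding:
    string_id: str
    term: str
    context: str


# Example term list for a target locale (placeholder entries).
CONTENTIOUS_TERMS = {"de-DE": ["placeholder_term_a", "placeholder_term_b"]}


def scan_strings(strings: dict, locale: str) -> list:
    """Return every occurrence of a listed term, with surrounding context,
    so a human reviewer can decide whether it is actually a problem."""
    findings = []
    for term in CONTENTIOUS_TERMS.get(locale, []):
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        for string_id, text in strings.items():
            for match in pattern.finditer(text):
                start = max(match.start() - 30, 0)
                context = text[start:match.end() + 30]
                findings.append(Finding(string_id, term, context))
    return findings


if __name__ == "__main__":
    localized = {"menu.title": "Ein Beispielsatz mit placeholder_term_a darin."}
    for f in scan_strings(localized, "de-DE"):
        # The tool only flags candidates; a human evaluator makes the final call.
        print(f"[{f.string_id}] flagged '{f.term}': ...{f.context}...")
```

Note that such a scanner can only flag candidate issues. Deciding whether a flagged term is genuinely offensive or contentious in context, or merely a false positive, remains exactly the kind of human judgment this article argues for.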