How does AI Ethics Impact Translation?

2021-01-04 18:20 TAUS

Much of the tech stack in today’s translation activity will be irrigated by AI. This means using machine learning algorithms and data to augment human productivity. It also increasingly requires that everyone involved is aware of the ethical implications of this technology.

So where exactly is the ethical problem in the translation process? AI’s job is to enrich (clients’) content by making it understandable to more people, so it can ideally make a positive impact, either as relevant information or as an economic good. If it fails to do this legally and ethically, then in the long run no one will use it. That’s the basic business deal!

Usually, ethical and/or legal issues around AI in relation to content processing, which was also touched upon in the Moral Machines blog article, focus on:

- the (mis)use of personal data inside a translation,
- the unacknowledged or illegal use of another owner’s language data,
- dangerous carbon footprints in compute, and
- the biases found in data selections concerning personal health, finance, hiring, and other sensitive domains and interfaces with people.

Note that in none of these cases is the act or fact of translation qua translation a critical component. These aspects of the data business therefore have no direct bearing on moral issues around AI and translation. But we live in a suspicious environment, and translation has now been dissected into distinct tasks, so it is important to recognize potential problems and apply relevant responses.

First, what is so special about translation as an action that could make it less problematic in an AI context? Here’s one obvious answer: translation involves the semantic transfer of a piece of content into another human language – i.e. a code switch in a data stream – so the process itself neither adds nor subtracts existing content. The meaning and associated implications and inferences (and of course the social/moral biases!) of the source remain “as is” in the target. The “content” you see in language A is what you get in language B.

If this code switch produces inaccuracies, they will be handled by well-known editing procedures, in the same way as a spelling or attribution error in any written or transcribed content. Translation mistakes are, therefore, not moral or legal failings. Let the public have fun with absurd MT errors, but let’s not mix messages!

Content can contain moral monstrosities

In fact, the ethical tradition in translation can be summed up by the famous “don’t shoot, we’re only the messengers”. Translators plead non-responsibility once they have accurately and appropriately translated or post-edited a target text, however intolerable or fake its content may be. The fact that content can contain moral monstrosities – e.g. the personal suffering interpreters experienced at the Nuremberg trials or South Africa’s Truth & Reconciliation Commission hearings – is not a translator’s fault or responsibility.

But fidelity to what is said or written is by definition part of that mission. Hence the outcry at the horrific murder in 1991 of the Japanese literary translator of an English novel that referred to a damaging myth in the history of Islam. He reproduced truth in his role as an intermediary on behalf of a teller of tales. In principle, a translation will always give access to the truth of what has been said/written; in this way, translators help enrich the human story for everyone. This is heavy stuff.
As a playful alternative, do check out the wonderfully subversive tale by the Hungarian writer Dezso Kosztolanyi in Kornel Esti, in which a kleptomaniac translator “steals” objects and money from fictional characters in the works he translates by changing the amounts mentioned in the original! A philosophical joke, but we need a little humor these days.

AI in identifying social bias

As translation becomes more deeply embedded in language data-driven processes, we shall increasingly see both source and target texts analyzed by intelligent tools capable of identifying “moral issues” and alerting stakeholders – for example, using word spotting to address the points noted by the European Guidelines for Trustworthy AI. Both humans and bots could then be tasked to seek out well-known biases in the linguistic expression of socially, medically or politically sensitive questions (including racial and gender inclusiveness), suggest corrections to dangerously ambiguous language, or collect data on signals relevant to improving quality/work evaluation. They will inevitably miss some of them, as tools tend to. Already, MT engines can be selected on the basis of their capacity to handle gender-friendly translation issues accurately. In general, then, systemic bias seems unlikely to become a major translation problem – partly because there are humans in the loop, and partly in spite of that!

The problem of user-generated social media content

One obvious hotspot to monitor is user-generated content from social media and online commentary, whose input quality cannot always be controlled upstream by translation buyers. Once again, a translated text could be automatically scanned for word/phrase signals of dangerous social bias, fake content, etc. (a rough sketch of such a check appears below). However, on many occasions, this “biased” content is precisely what is required as useful data for some translation buyers! They might use these findings to measure the recognition habits and expectations of their readers, so they can adapt their own messaging in future communications.

AI to monitor content consumption

Perhaps the most important development to recognize going forward is that AI will not just be augmenting bits and pieces of the translation chain; it will also be used to monitor the way that translated content is consumed. This does not (yet) raise major ethical issues for the industry unless privacy is breached, but it could generate end-customer concerns about the “rhetorical adequacy” – the social, semantic, and pragmatic fit – of the language expressed. The problem is that this kind of linguistic editing could penalize translations that fail to match the “signal generation quotient” (number of hits or reactions) achieved by the source. So communication surveillance is likely to be competitively tied to highly concrete, measurable results in terms of sales conversions, sentiment responses, and other data points. (See this blog for some background on the role of data and signals in the translation business.) “Ethical” checks will doubtless be built into these algorithms. So when managing teams of data annotators and translators, it will always be wise to inquire into target language sensibilities about content – just in case unexpected issues arise in local societies and/or target reader communities.
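To make the word/phrase scanning mentioned above a little more concrete, here is a minimal sketch of such a check in Python. The categories, term lists, and example segments are purely illustrative assumptions, not part of any real lexicon or of the tooling discussed in this article; a production workflow would rely on curated terminology resources, trained classifiers, and human review rather than a hard-coded watchlist.

```python
# Minimal word-spotting sketch: flag translated segments containing
# terms from an illustrative, hypothetical watchlist for human review.
import re
from typing import Dict, List

# Hypothetical signal terms, grouped by the kind of issue a reviewer
# might want to be alerted to (illustrative only).
SIGNAL_TERMS: Dict[str, List[str]] = {
    "gendered_language": ["chairman", "manpower", "salesman"],
    "health_sensitive": ["diagnosis", "disability"],
}

def spot_signals(segment: str) -> Dict[str, List[str]]:
    """Return the watchlist categories (and matched terms) found in a segment."""
    hits: Dict[str, List[str]] = {}
    for category, terms in SIGNAL_TERMS.items():
        matched = [
            t for t in terms
            if re.search(rf"\b{re.escape(t)}\b", segment, re.IGNORECASE)
        ]
        if matched:
            hits[category] = matched
    return hits

if __name__ == "__main__":
    translated_segments = [
        "The chairman will announce the new policy tomorrow.",
        "Our manpower requirements have doubled this year.",
        "The weather was pleasant throughout the conference.",
    ]
    for seg in translated_segments:
        flags = spot_signals(seg)
        if flags:
            # In practice the segment would be routed to a human reviewer,
            # not auto-corrected by the tool.
            print(f"REVIEW: {seg!r} -> {flags}")
```

Note that, in line with the article’s argument about humans in the loop, a check like this only flags segments for review; it does not decide what counts as acceptable content.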
It is surely the human value of the work done, and of the resulting opening up of access to information or knowledge for other people, that makes translation worthwhile in the first place. The risk, of course, is that the emerging interest in ethical AI, and in preferring data sources that represent the wealth of human experience and individual differences, will ultimately become just another opportunity for using machine learning to drive content. Our job will always be to value the power of human language, not the imitation machine that pretends to be our digital twin!