The Human Factor in the Development of Translation Software

2022-04-27 20:50 MemoQ

Humans and AI: Friends or Enemies?

Not all translation is suitable for machine translation. In other words, the world will always need “premium translation” that cannot be produced without human involvement.

End-users of translations are humans. The users of translation software are humans, too.

In any technology setup, the voluntary agent is always human. No AI is currently capable of using will to make a choice. AI can, however, with varying degrees of success, mimic, or rather recycle, decisions that certain humans have made in the past. As a result, no technology should ever purport to replace human agents, or fail to place the human agent at the center.

Stanisław Lem says in Summa Technologiae that “Every technology is actually an artificial extension of the innate tendency possessed by all living beings to gain mastery over their environment, or at least not to surrender to it in their struggle for survival” and he also suggests that humans use technology as another organ.

Building on this notion, we can think of any technology as an extension of the human body and/or the human mind. Technology gives its human users “superpowers.” It “augments” their capabilities so that, in the case of translation, humans can translate more in the same amount of time while maintaining good quality, or even improving it thanks to the computer’s unique ability to remember things precisely.

Humans are indeed not only the users of translation, but also the users, designers, and developers of translation software. This understanding came in handy when I was teaching humanities students to use translation software and had to tackle the technophobia that sometimes occurred among them. In my opinion, the prime source of technophobia is the idea that the computer is an alien or a futuristic robot: a contraption that has its own mind and will.
The use (and overuse) of the term ‘artificial intelligence’ does suggest this, but I usually ask my students to imagine a piece of software as yet another method of communication between humans: the designers/developers on one side and the users on the other. In this context, the software is both the tool and the channel.

I don’t intend to belittle attempts to create a fully automatic “translation machine”. There have always been dreams of effortlessly bridging language gaps, especially since Biblical heritage depicts the multitude of languages as punishment for humankind’s greed for power (cf. the Tower of Babel). The Bible also presents an image of the Pentecost where everyone hears the Apostles’ speech in their own tongue. Throughout the Middle Ages, there were countless efforts to find the perfect language, the one God allegedly spoke when creating the world (cf. Umberto Eco). Today we have concepts like the Babel fish and the Universal Translator, which would implement the ultimate and perfect machine translation.

I think these dreams are completely legitimate, and it is also worth trying to create technology that would make translation fully automatic. What is not right, because it is not truthful, is to claim that it is ready: that any machine is ready to replace human agents of translation, or that AI has reached “human parity”. There used to be a lot of talk about this, but I think by now it is obvious that those claims were exaggerated at best. In the scientific community, there are assertions and well-founded speculations (albeit without proof to date) that translation as a cognitive task is AI-complete. This means that, before we achieve human-equivalent machine translation, we need to achieve singularity, that is, human-equivalent (or superior) AI. We know that right now we don’t have it, and we don’t know if it can be created. We’re also not sure if it should be created.

Translation Software: The Human Factor

Until we have the Universal Translator, a lot of translation (the field of so-called “premium translation”) remains hard work for humans. This means that the purpose of at least some translation software continues to be making this work quicker and easier, so that translators, editors, and project managers don’t simply trudge through their work but thrive in their profession. Let’s face it, a lot depends on the actual tools they use. If this is the case, translation software is not all “language technology” (as in “natural language processing”) but also data management, text manipulation, user interfaces, quality assurance workflows, and the list goes on.

Thus, for the foreseeable future, there will be translation software that is built around human users of extraordinary knowledge. The task of such software is to make their work as efficient and enjoyable as possible. As we like to say, they should not simply trudge through their work but thrive in it, partially thanks to the technology they are using.

From the perspective of a software development organization, there are three ways to make this happen:

Invent new functionality.
Interview power users and develop new functionality based on what they tell you.
Go analytical: work from usage data, automate what can be automated, and introduce shortcuts.

No matter what you do, put a human face on it. As I mentioned earlier, software is a means of communication between developer and user. In this conversation, it is usually the developer who is the more active agent: the one who pushes ideas and implementations. It is therefore the developer’s responsibility to make this communication bidirectional, to listen to users, and to provide help when help is needed. A structured method of accepting feedback and high-quality human customer support are not a nice-to-have, an ‘overhead’, or an optional add-on. They are integral to the business.

Disruption for the sake of disruption?

I am also wary of trying to come up with ‘disruptive’ ideas. First, there is no agreement on what ‘disruptive’ means. In my opinion, a new method that saves a lot of time for the human user is not necessarily a disruptive feature at all. For a new development to be disruptive, it needs to fundamentally change the way we get something done. For example, an electric car does not disrupt transportation (although, in large enough numbers, it may disrupt the fossil fuel industry); teleportation would.

Disruptive developments are also unpredictable, even to the developers themselves. Whether or not a development becomes disruptive can also depend on the users, and human users’ adoption of new technology is relatively slow. It also takes a lot of attempts and a lot of failures to invent disruptive technology. To illustrate, watch the ‘Nothing Works’ speech by Jack Conte (co-founder of Patreon), whose main message is that someone may look successful, but you don’t know how many failures they had. The point is, don’t try to be disruptive for the sake of disruption. This may not be the best way forward for your users.

Ethical technology

So what is the responsibility of tech companies when it comes to ethical software development? What can you do as a company to put the human factor into your development process? Here are a few questions you can ask yourself, with some examples (mostly from the world of AI).

First and foremost, does your technology serve an honorable purpose? Even when you believe it does, who is benefiting from it (Cui prodest)?

Is your technology what you say it is? The very term ‘AI’ is a misnomer because it suggests to an outsider that they’re dealing with an entity with a mind and volition of its own, and it has neither. As Kate Crawford says in ‘Atlas of AI’, it is neither intelligent nor artificial.
AI is not intelligent because it copies previous human behavior (and it needs that previous human behavior to copy), and it is not entirely artificial because a single AI model may require years of human work collecting and preparing data.

Does your technology implement hidden agendas? Are there hidden costs for the user? Are there hidden gains for the developer or the operator? If the technology implements collaboration, does it create or facilitate an unfair advantage for the more powerful stakeholder(s)?

Does your technology collect or use data in illicit or disingenuous ways? In your privacy policy and your data processing agreement, do you disclose all the ways you collect and use data? If you use automatic anonymization, do you tell the truth about it (i.e., that it isn’t 100% precise)? Do you mix data from different customers or users so that you have enough to train and retrain your AI?

Is your technology climate-conscious, or do you throw deep-learning AI at every less-than-obvious problem? For example, it may be possible to create bilingually sensitive predictive typing using simple locally-trained statistics, or you can create the same by training a neural network. The problem with the latter is its cost: by “converting this energy consumption in approximate carbon emissions and electricity costs, the authors estimated that the carbon footprint of training a single big language model is equal to around 300,000 kg of carbon dioxide emissions. This is of the order of 125 round-trip flights between New York and Beijing.” (Payal Dhar, 2020)

Finally, human dignity. It also matters what kind of employer the developer is.

Sources:

Atlas of AI (Kate Crawford)
Artificial Intelligence Has an Enormous Carbon Footprint
The carbon impact of artificial intelligence
GDPR itself
My own blog from the past
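
To make the predictive-typing example from the “Ethical technology” section more concrete, here is a minimal sketch of what “simple locally-trained statistics” could look like. It is only an illustration under stated assumptions, not a description of how any real product works: the class name BilingualPredictor, the scoring weights, and the tiny English-Dutch sample data are all hypothetical. The idea is to count, over a translator’s own past segment pairs, which target words co-occur with which source words and which target words follow each other, and then to rank completions for the word being typed.

```python
from collections import Counter, defaultdict


class BilingualPredictor:
    """Toy predictive typing built from counts over past (source, target) pairs."""

    def __init__(self):
        # cooc[source_word][target_word]: times both appeared in the same segment pair
        self.cooc = defaultdict(Counter)
        # bigram[previous_target_word][target_word]: times one directly followed the other
        self.bigram = defaultdict(Counter)

    def train(self, pairs):
        """Update the counts from an iterable of (source, target) segment pairs."""
        for source, target in pairs:
            src_words = source.lower().split()
            tgt_words = target.lower().split()
            for s in set(src_words):
                for t in set(tgt_words):
                    self.cooc[s][t] += 1
            for prev, word in zip(tgt_words, tgt_words[1:]):
                self.bigram[prev][word] += 1

    def suggest(self, source, typed_words, prefix, k=3):
        """Rank known target words that start with `prefix` for this source segment."""
        scores = Counter()
        # Evidence 1: target words seen together with the words of the source segment.
        for s in set(source.lower().split()):
            for t, n in self.cooc.get(s, {}).items():
                if t.startswith(prefix):
                    scores[t] += n
        # Evidence 2: target words seen right after the last word already typed.
        if typed_words:
            prev = typed_words[-1].lower()
            for t, n in self.bigram.get(prev, {}).items():
                if t.startswith(prefix):
                    scores[t] += 2 * n  # assumed weight: word order counts double
        return [word for word, _ in scores.most_common(k)]


# Hypothetical sample data standing in for a translator's own past work.
pairs = [
    ("the memory is empty", "het geheugen is leeg"),
    ("open the memory", "open het geheugen"),
    ("the list is empty", "de lijst is leeg"),
]

predictor = BilingualPredictor()
predictor.train(pairs)

# The translator is working on "open the list", has typed "open de" and then "l".
print(predictor.suggest("open the list", ["open", "de"], "l"))  # ['lijst', 'leeg']
```

Everything here runs on the user’s machine with a few dictionaries of counts, which is the point of the comparison in the text. The price is quality: frequent words dominate raw co-occurrence counts, so a real implementation would need better statistics than this sketch uses.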