2021 According to TAUS


2020-12-22 19:20 TAUS


NO MORE VISION. NO MORE PREDICTIONS. The future is now. There is no longer any need to predict machine translation. It's here and it's working. Our message for the new year is therefore very pragmatic and single-minded: fair pay for the translators and data keepers!

Be careful what you wish for

TAUS started life in 2005 as a think tank, boldly predicting a future of Machine Translation. In 2008 we launched an industry-shared repository of language data - the TAUS Data Cloud - helping the early adopters of Statistical MT systems squeeze better performance out of their engines. Then came Neural MT, which boosted the quality of automatic translation further. COVID-19 and the resulting global crisis finished the job, leaving no doubt that technology will change our work and lives for good. Now that we are in the midst of these turbulent times, we realize that to become fully reliant on the machines, we need ever more data and ever better data: data that is most of the time linked to the human instrument of knowledge and understanding, language. Hence the emergence of a new sector: Language Data for AI.

They say: be careful what you wish for. It's true, we wished for MT technology to work this well. We envisioned how this revolutionary technology could open knowledge to all citizens in our global society and how it could contribute to our evolution as a human community. And now, we have to be wary of the consequences.

A world of inequalities

Revolutionary technology breakthroughs often lead to a rethinking of our ethical principles and a shake-up of our economic models. While we are getting closer to unlocking knowledge and sharing information with practically everyone around the globe, we realize that the fundamental ideal of equal access causes other inequalities to grow. One is misinformation and mistranslation caused by bias in the data that's used to train the models. How reliable is the information and knowledge we are being offered? How colored is the content by whoever controls the data and the algorithms? How complex it is to crack this ethical inequality problem in AI-driven translation in the long run is the subject of another TAUS article on Multilingual Morals, which will be published early in the New Year.

Here we zoom in on the more immediate economic consequences of the AI revolution for the translation profession as a business sector. If everything is pretty much automated, we have to ask ourselves the basic economic questions of where cost is incurred and where value is added. As we argued in our World-Readiness and Translation Economics article last year, we are heading towards a free machines model, where besides the cost of maintaining the IT infrastructure there is almost no variable cost involved for the enterprise buyers of translation. Of course, that only exists in an ideal world where the machines do perfect jobs. The reality is that human intelligence - linguistic and cultural interpretation - remains indispensable for the machines to constantly learn to do a better job.

Fair pay for the translators and data keepers

While job opportunities for professional translators may be shrinking as a result of the success of automated translation, the need for a new kind of worker is growing explosively. In the LD4AI report we refer to this trend as the rise of the global cultural professional. It's hard to frame the profile of this new worker. Unlike the professional translator, this worker does not need to be linguistically trained or experienced. The basic requirement is that s/he is deeply rooted in his/her local culture. These workers engage with their work givers through crowdsourcing platforms. They log in on the platforms to claim simple tasks such as transcribing a text, interpreting an image, recording a script and dozens of other human intelligence tasks. Millions of people around the world are joining this new workforce of crowdworkers, whom we like to refer to officially as the data keepers. Professional translators running low on their regular jobs may also be taking jobs as data keepers from time to time.

Rates for translation have always been under pressure, but in the crowdsourcing world market dynamics drive the rates further down: one dollar for a task that can easily take fifteen minutes of somebody's time. Is that fair? The workers have become anonymous and the competition is severe. Our economic model may just not be fit for this new age of AI. (See also this article in MIT Technology Review: AI needs to face up to its invisible-worker problem.)

So here is where our message for 2021 comes in: be wary of inequality in our ecosystem as a result of this fantastic AI revolution. Pay your translators, and the data keepers who keep your data in optima forma, fairly.

TAUS Fair Cooperation Principle

Full disclosure: TAUS is also in the business of creating and annotating data for AI. We are advocates of the Data-First Paradigm, which means that we believe that it makes total sense to develop and optimize your data for translation first (before doing actual translation production). We use NLP technology to clean data and to cluster and tune corpora for domain adaptation and customization of MT systems. And through our own Human Language Project platform we work with data keepers around the world to create new datasets for low-resource languages and domains, and we engage them to enrich and annotate the data. Customers hiring us for data services can trust that TAUS always pays the Human Language Project workers above the minimum wage thresholds in the respective countries and will never go to the bottom of the 'market'. We call this the Fair Cooperation Principle.

Opening the black box with the Data Marketplace

A further step that TAUS is taking towards a reform of the translation ecosystem in 2021 is launching the Data Marketplace. The Data Marketplace allows everyone who has invested in good quality translations for years to sell their data directly to the technology companies and enterprises that develop and train MT systems. This means that the players in our ecosystem that have put in the hard work now have an open channel to reap the monetary benefits from it. The translation industry has, over the years, been referred to by many as a black box because of its inherent lack of transparency. You dump in a project, a document to translate, but you have no idea what it takes and who is working on it. On the Data Marketplace we are opening the black box. We put the data keepers in the spotlight. After all, it is they who do the hard work and create the value.

See the success stories from Adéṣinà Ayẹni in Nigeria about his efforts to put his language Yorùbá onto the world stage of languages, and from TransLink in Russia, who see their early participation in the Data Marketplace as a great head start over other LSPs.

TAUS Content Branches Out

And so, with the beginning of the twenties, TAUS as a think tank branches out into topics covering the language data for AI sector. The TAUS writers team is expanding to bring you lots more good content and food for thought to broaden your perspectives on the language data for AI industry, data applications and best practices.

TAUS wishes you all an equally healthy and prosperous 2021!