Here’s What Happened at ATA’s Expert-in-the-Loop and AI Conference


2023-05-29 08:19 slator



The American Translators Association (ATA) held a virtual conference on May 20, 2023, focusing on machine translation (MT) and artificial intelligence (AI) and their implications for the way translators and interpreters work. The conference was moderated by Nora Díaz, Robert Sette, and Andy Benzo.
The keynote speech was delivered by Jay Marciano, current director of MT outreach and strategy at Lengoo and president of the Association for Machine Translation in the Americas (AMTA). Marciano opened with the idea of thinking about language in a mathematical way, “working with language algorithmically,” and walked attendees through a few examples of how this applies (e.g., the many pages of statistics produced from a single Wordle game and the billions of ways a deck of cards can be shuffled). The point of the examples was to show that language is far more complex than the calculations behind them, and yet humans can understand those calculations. Natural language processing (NLP) scientists had to work with this complexity to arrive at today’s technology, including MT and large language models (LLMs).
Marciano explained the application of generative AI to translation as having an LLM analyze existing translations in order to produce high-quality output. He then mentioned the emergence of AI-generated multilingual content, with which he welcomed attendees to the “post-post editing world.” Finally, Marciano provided a list of AI-related jobs for which linguists already have many applicable skills (e.g., data curator, terminologist, and language process analyst), even in the midst of great uncertainty. He also encouraged people to connect with others and to understand that their competition is not AI, but individuals who are better at using AI.
Jonathan Downie, an interpreting consultant, researcher, conference interpreter, and author, presented a session titled “Finding the Value in Human Translation and Interpreting When Machines Are So Good.” Downie began by giving attendees what he called a dose of reality, exemplified by technologies that can translate and interpret well. He also spoke about the parallel situation of language experts who have been underpaid and struggling for some time, as is the case with professionals in the audiovisual translation and interpreting sectors (supported by AVTE and ITI survey figures). Many linguists are diversifying inside and outside of their professions, or wondering whether it is time to do so or to leave their professions entirely, added Downie. According to Downie, technology vendors portray language differences as problems to be solved through “smart engineering,” whereas humans market their services as qualified and accurate, but also invisible. He added that being invisible does not help the linguists’ case against machines. Downie recommended using marketing messaging to convey how translators and interpreters make a difference, and reminded attendees that some work is definitely going to the machines and that linguists will need to systematize, specialize, or diversify.
Matthew Schlecht, a chemist and a scientific/medical polyglot translator, editor, and writer, gave attendees the perspective of a linguist working in today’s MT post-editing (MTPE). Schlecht explained the differences between light and full post-editing, then described his workflow for patent MTPE. He showed a few examples of the kinds of issues found in MT output in several language combinations, including different quality levels from MT engines such as Google Translate and DeepL. Schlecht also mentioned that segmentation can pose a problem in certain language pairs, such as English and Japanese.
Asked whether a linguist can make a living doing MTPE, Schlecht replied yes, noting that he has been doing this work for over six years. He also echoed the message of the two previous speakers: the survivors of a technological tidal wave will be those who adapt to change.
Yuri Balashov, an ATA-certified translator, professor of philosophy, and faculty fellow in the Institute for Artificial Intelligence at the University of Georgia, offered a presentation centered on topics such as domain adaptation and emerging trends in LLMs. The presentation started with a summary of the history of translation memory (TM) and MT up to today’s neural MT and transformers, and moved on to how humans work with these technologies.
On the subject of domain adaptation, Balashov stressed the value of translator specialization for making contextual decisions in machine-translated texts. He proposed that domain adaptation is easy for humans because they are able to discern multiple combined or connected meanings (drawing a comparison with neural networks). At present, added Balashov, MT has trouble with domain adaptation, even as numerous groups of researchers work on fine-tuning engines using various methods. Some of the largest MT engines, such as Google AutoML Translation, nevertheless show promising domain-specific results.
To illustrate his own experience with integrating MT into translation management systems (TMSs), Balashov described how he worked with DeepL and ModernMT, both of which he considered easy to integrate. He said that DeepL is superior to many other MT engines and attributed its performance to better data, not better algorithms. In his experience, ModernMT was superior at autocorrecting MT output. Balashov also gave examples of tone, terminology/glossary adaptation, and fuzzy match “repair” functionalities, as well as ways to easily incorporate certain MT engines into a few TMSs, such as memoQ. The presentation also included an overview of LLMs.
Balashov finished with a few closing quotes from industry experts and firms, including Slator’s Florian Faes.
The conference ended with a virtual town hall meeting that included Matthew Schlecht, Jost Zetzsche, Carola F. Berger, Daniel Sebesta, and Johanna Klemm. The discussion began with Zetzsche’s explanation of the differences between types of artificial intelligence, reminding attendees that the tools discussed all “fall into the category of narrow artificial intelligence,” defined as “the ability of a machine to process large amounts of data and make predictions exclusively on the basis of that data.” The panelists also discussed how the expertise of translators is still valid today, what comes next now that all data has already been used to train GPT-4, and the need to be knowledgeable about MT and not be afraid of change, among other topics.