Neural Machine Translation (NMT): Translating Emojis


2021-03-26 07:00 CSOFT


2016 was the year of Neural Machine Translation (NMT). After Facebook’s announcement in June 2016, and Google’s in September 2016, that their translation platforms were being powered by NMT, the tech and translation worlds went wild speculating on the future of cross-cultural communication and the relevance of human-powered translation in the coming years. (We have to admit, we did as well.)

But a year on from Facebook’s announcement, NMT hasn’t fully blossomed into the language-demystifying super-machine it was ordained to be. Google continues to announce every few weeks or so that it has added another language pair to its NMT-compatible list. However, according to Facebook Engineering Manager Necip Fazil Ayan, as of April 2017 Facebook was still only halfway to its goal of powering all translations across Instagram, Facebook, and Workplace with NMT.

What’s the holdup? The fickle human mind, that’s what’s up.

Pop culture references have always evaded dictionaries for some time after their initial adoption, and now, given the voracity and speed with which the internet consumes and discards new slang and references, machines can’t quite keep up. Ayan has pointed to “odd spellings, hashtags, urban slang, dialects, hybrid words, and emoticons” as the major hurdles for Neural Machine Translation.

And here we digress. It’s emoji time.

Emoji were invented in Japan in 1997/98 by Shigetaka Kurita, who was working at cellphone provider NTT Docomo. They were widely used in Japan for nearly ten years before crossing the Pacific to the West, but once they arrived, they took texting by storm. As their popularity has spread worldwide, linguists and laymen alike have questioned their impact on written language, and whether they even constitute an entirely new language.
A study by an Emory University undergrad has shown that not only were participants unable to translate sentences to and from strings of emoji with any level of consistency, they could not even agree on the meaning of a single emoji. Individual participants would even attach multiple meanings to a single emoji or change their concept of the emoji over the course of the study. These inconsistencies made it clear that these ideograms are nowhere near capable of communicating as precisely as a real language can.

With this debate out of the way, what are emoji useful for? Conveying emotion, primarily.

Like the first smiley faces created from punctuation marks, emoji were invented to convey information faster and to give cold, black-and-white texts a human component. Linguists have found that “[of] the 20 most frequently used emoji, nearly all are hearts, smilies, or hand gestures, the ones that emote”. So, when character limits and tired thumbs constrain us to sending only tiny snippets of text, emoji can help convey a heart’s worth of prose in just the width of a letter.

It’s clear that emoji aren’t a language, as it’s impossible to pin down their exact meanings. This has made them nearly impossible to translate, and it is why Facebook’s NMT has chosen to simply ignore them, along with those other pesky language inputs like hashtags and slang.

So, when it comes to the original question of whether or not machines can translate emoji, the answer is a resounding “no,” but neither can humans. Perhaps this is why it’s best to leave translating Facebook posts up to Facebook and the serious documents up to the professionals.