Emotion Is What You Need — Emotional Context Improves Translation Quality of LLMs

2024-08-15 10:50 Slator

In May 2024, researchers emphasized the crucial role that emotions play in human communication and introduced a new dataset designed to enhance speech-to-text and speech-to-speech translation by integrating emotional context into the translation process. In July 2024, Alibaba incorporated speech emotion recognition (SER) into its FunAudioLLM to retain original emotions in AI-powered interpreting.

Building on this, an August 6, 2024, paper by Charles Brazier and Jean-Luc Rouas from the University of Bordeaux demonstrated how to integrate emotional context into large language models (LLMs) to condition translation and improve quality. They argue that “conditioning the translation with a specific emotion would use a suitable vocabulary in the translation.”

This research builds on the authors’ previous work, which was the first to explore combining machine translation (MT) models with emotion information. Their earlier study demonstrated that adding emotion-related data to input sentences could enhance translation quality. In this latest study, Brazier and Rouas take the concept further by replacing the MT model used in their prior work with a fine-tuned LLM.

They introduced a pipeline in which emotional dimensions (arousal, dominance, and valence) are embedded into LLM prompts. They used a SER model to extract these dimensions from audio recordings and incorporated them into the LLM’s input prompts to guide the translation process.

To test this approach, they fine-tuned five LLMs for English-to-French translation and identified the best-performing model, Unbabel’s TowerBase-7B-v0.1, for further experimentation. For each input sentence, the SER model analyzed the corresponding audio to automatically estimate its emotional dimensions, which were then included in the translation prompts. Brazier and Rouas compared translation performance with and without the inclusion of emotional dimensions as extra information added to each input prompt.

According to the authors, integrating emotional data into the translation process resulted in “notable improvements” in BLEU and COMET scores compared to translations produced without emotion integration, especially when arousal was considered. The TowerBase-7B-v0.1 model showed the most significant performance gains when emotional context was included, suggesting that incorporating emotional context can lead to more accurate and contextually appropriate translations, especially in scenarios where emotion plays a crucial role.

“Incorporating emotion information into the translation process appears to enhance translation quality,” said Brazier and Rouas. They also plan to extend their method to speech translation.
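To make the pipeline concrete, here is a minimal sketch of how SER-estimated emotional dimensions could be injected into a translation prompt for TowerBase-7B-v0.1 via Hugging Face Transformers. The prompt template, the `build_prompt` helper, and the example arousal/dominance/valence values are illustrative assumptions, not the authors’ exact implementation.

```python
# Minimal sketch (not the paper's exact code): conditioning a translation
# prompt on emotional dimensions estimated by a SER model.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Unbabel/TowerBase-7B-v0.1"  # best-performing model in the study

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def build_prompt(source: str, arousal: float, dominance: float,
                 valence: float) -> str:
    """Prepend emotional dimensions to a plain English-to-French prompt.

    The template below is a hypothetical format for illustration only.
    """
    return (
        f"Emotional context: arousal={arousal:.2f}, "
        f"dominance={dominance:.2f}, valence={valence:.2f}.\n"
        f"Translate the following text from English into French.\n"
        f"English: {source}\n"
        f"French:"
    )

# In the study, these values would come from a SER model run on the audio
# corresponding to the input sentence; here they are hard-coded examples.
prompt = build_prompt("I can't believe we actually won!",
                      arousal=0.82, dominance=0.55, valence=0.91)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```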
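The with/without-emotion comparison reported by the authors can be approximated with standard metric tooling. The sketch below scores two hypothetical sets of hypotheses against references using sacrebleu and Unbabel’s COMET library; the sample sentences are placeholders, not data from the paper.

```python
# Hedged sketch of scoring translations produced with and without emotion
# conditioning. Sentences are placeholders, not examples from the paper.
import sacrebleu
from comet import download_model, load_from_checkpoint

sources = ["I can't believe we actually won!"]
references = ["Je n'arrive pas à croire que nous ayons vraiment gagné !"]
systems = {
    "without emotion": ["Je ne peux pas croire que nous avons gagné."],
    "with emotion": ["Je n'arrive pas à croire que nous ayons vraiment gagné !"],
}

# COMET is a trained neural metric; wmt22-comet-da is a public checkpoint.
comet_model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))

for name, hyps in systems.items():
    bleu = sacrebleu.corpus_bleu(hyps, [references]).score
    data = [{"src": s, "mt": h, "ref": r}
            for s, h, r in zip(sources, hyps, references)]
    comet = comet_model.predict(data, batch_size=8, gpus=0).system_score
    print(f"{name}: BLEU={bleu:.1f} COMET={comet:.3f}")
```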