Facebook Ramps Up Open Source Drive Into Speech Translation

2020-08-06 16:00 slator

Facebook continues to pour considerable resources into machine translation (MT), but, as a recent Thai translation snafu showed, language technology remains a major challenge for the social media giant. Alongside its work on quality estimation and various other initiatives, Facebook is pursuing two projects that share resources with the broader open source community, allowing developers to improve the technology.

In a July 2020 blog post, Facebook AI released CoVoST V2, a “massively multilingual” speech-to-text translation dataset. The original CoVoST was built on Mozilla’s Common Voice, a database of crowdsourced voice recordings. The new version offers 2,900 hours of speech, with speech translation data from 21 languages into English and from English into 15 languages.

“With CoVoST V2, our aim is to foster research into massive multilingual speech translation and move toward a single model that covers many language pairs,” the Facebook AI blog post stated. “We want no language left behind, and that’s why we’re open-sourcing CoVoST V2.”

Facebook AI’s other initiative, SimulEval, is “an easy-to-use and general evaluation toolkit for both simultaneous text and speech translation,” according to a July 31, 2020 paper. SimulEval simulates a real-time scenario and evaluates both translation quality and latency, which reflects how well the model can translate simultaneously. The toolkit supports quality metrics such as BLEU, TER, and METEOR, and also lets users define custom evaluation functions. Noting that the code will be released upon publication, the authors encouraged “future research […] to make use of this toolkit in order to obtain an accurate and standard comparison of the latency between different systems.”

Of course, Facebook is not the only major company exploring speech translation; the other tech giants are all doing so too. In a July 2020 paper, Apple, the world’s most valuable company, detailed recent research into speech transcription and translation. The paper was published just one month after Apple announced that the iPhone’s latest operating system, iOS 14, will include a Translate app.
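
For readers who want a concrete sense of the latency side of the evaluation described above, the sketch below computes Average Lagging, one latency measure commonly used in the simultaneous translation literature. The article does not say which latency metrics SimulEval reports, so the choice of metric is an assumption, and the function name `average_lagging` and its arguments are hypothetical illustrations rather than part of the SimulEval API.

```python
from typing import Sequence


def average_lagging(delays: Sequence[int], source_length: int) -> float:
    """Average Lagging for one sentence (illustrative sketch, not SimulEval code).

    delays[i] is the number of source tokens the system had read when it
    emitted target token i. source_length is the total number of source tokens.
    """
    if not delays or source_length <= 0:
        raise ValueError("need at least one target token and a non-empty source")

    # Ratio of target length to source length, used to rescale target positions.
    gamma = len(delays) / source_length

    total = 0.0
    for i, read_so_far in enumerate(delays):
        # How many source tokens the system is behind an ideal translator
        # that emits output in lockstep with the input.
        total += read_so_far - i / gamma
        # Stop at the first target token emitted after the whole source is read.
        if read_so_far >= source_length:
            return total / (i + 1)

    # Assumption: if the source is never fully read, average over all tokens.
    return total / len(delays)


# A system that waits for 3 source tokens, then alternates reading and writing.
print(average_lagging([3, 4, 5, 6, 6, 6], source_length=6))  # 3.0
# A system that waits for the full sentence before translating anything.
print(average_lagging([6, 6, 6, 6, 6, 6], source_length=6))  # 6.0
```

A lower value means the output trails the input by fewer tokens, which is why a toolkit of this kind reports latency next to a quality score such as BLEU: a system can trivially improve one at the expense of the other.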