TAUS Launches Data-Enhanced Machine Translation

2022-01-19 19:25 TAUS

Amsterdam, January 19, 2022 - TAUS, the one-stop language data shop established through deep knowledge of the language industry, globally sourced community talent, and in-house NLP expertise, launches a new service: Data-Enhanced Machine Translation (DEMT) on their Data Marketplace.

MT customization essentially requires two elements: an MT engine and training data. By combining both into a single online service, DEMT offers an end-to-end solution to those who wish to produce customized MT output for their specific domains, without the hassle of going through the actual MT training process.

Users can simply drop the file they would like to machine translate and select the datasets that they wish to be used in their customization. In the background, our technology processes the file through an Amazon Active Custom Translate integration by feeding the selected training dataset into the engine to produce a highly customized output. The translated file is then directly sent to the user’s inbox.

“Generic MT engines are widely available. But to ensure that MT can handle domain-specific content well, proper customization is key,” says Jaap van der Meer, Director at TAUS. “With the TAUS DEMT service, we have made customized, affordable and high-quality machine translation accessible to anyone, regardless of their expertise or access to relevant training data. Based on the independent analysis, BLEU score points are proven to increase by 15.3% on average in the Ecommerce, Medical and Financial domains with the TAUS DEMT.”

The impact of the training datasets available for the DEMT service has been independently evaluated by Polyglot Technology LLC. “In total, we evaluated 8 language pairs for the E-Commerce domain, 18 language pairs for the Medical/Pharma domain and 4 language pairs for the Financial domain,” says Achim Ruopp, Owner at Polyglot Technology. “The customization of Amazon Translate with TAUS Data always improved the BLEU score measured on the test sets by more than 6 BLEU points on average and 2 BLEU points at a minimum. These are significant improvements that demonstrate the superiority of this customized translation for the E-Commerce, Medical/Pharma and Financial domains over non-customized MT outputs.”

The detailed analysis can be downloaded here. The library of available datasets for the DEMT service is planned to grow. You can try the TAUS DEMT service here.

About TAUS

TAUS was founded in 2005 as a think tank with a mission to automate and innovate translation. Ideas transformed into actions. TAUS has become the one-stop language data shop, established through deep knowledge of the language industry, globally sourced community talent, and in-house NLP expertise. We create and enhance language data for the training of better, human-informed AI services. Our mission today is to empower global enterprises and their service and technology providers with data solutions that help them to communicate in all languages, faster, better, and more efficiently.

For more information, visit https://www.taus.net/
