TAUS MT Training Datasets Now Available on AWS Marketplace

2022-02-16 21:50 TAUS

Amsterdam, February 16, 2022 - TAUS, the one-stop language data shop, is pleased to announce that TAUS data products and data services are now available to users of Amazon Translate on the AWS Marketplace. This first step marks the beginning of a continuing collaboration between TAUS and Amazon Translate.

As of today, AWS customers can review and buy 30 bilingual corpora in the Ecommerce, Medical/Pharmaceutical and Finance domains. The datasets available are as follows:

9 datasets in the Retail & Wholesale Distribution/E-Commerce domain in the following language pairs: English (US) to Danish, Dutch, French, Finnish, German, Italian, Polish, Spanish, and Swedish.

17 datasets in the Pharmaceuticals & Biotechnology domain in the following language pairs: English (US) to Bulgarian, Czech, Danish, German, Greek, Spanish, Estonian, Finnish, French, Hungarian, Italian, Latvian, Dutch, Norwegian, Slovenian, and Swedish.

4 datasets in the Financial Services domain in the following language pairs: English (US) to Czech, Hungarian, Dutch, and Romanian.

Visit the TAUS kiosk on AWS Marketplace. In addition, TAUS has listed offers for data creation, data annotation, and related NLP services for AWS customers through the AWS Marketplace.

Domain-specific datasets are very useful for companies that need to customize MT engines. All of these new corpora have been evaluated by Polyglot Technology LLC, acting as an objective third-party MT training expert. “The customization of Amazon Translate with TAUS Data improved the BLEU score measured on the test sets by more than 6 BLEU points on average and 2 BLEU points at a minimum. These are significant improvements that demonstrate the superiority of this customized Amazon Translate Active Custom Translation for the Ecommerce, Medical/Pharma and Financial domains over non-customized Amazon Translate,” says Achim Ruopp, Owner at Polyglot Technology LLC.

A full report on the customization of Amazon Translate with the TAUS datasets is available.

TAUS has also completed an integration between Amazon Translate and the TAUS Data Marketplace. TAUS now provides the Data Enhanced MT (DEMT) service, a new layer on top of Amazon Translate that enables a customized MT service offered on a usage basis through the TAUS Data Marketplace.

“As AI-enabled translation becomes more and more mainstream, the quality of the language data powering the MT models takes on high importance,” says Jaap van der Meer, CEO at TAUS. “This collaboration with AWS allows TAUS to reach a much bigger audience with our data and data services.”

About TAUS

TAUS was founded in 2005 as a think tank with a mission to automate and innovate translation. Ideas transformed into actions. TAUS has become the one-stop language data shop, established through deep knowledge of the language industry, globally sourced community talent, and in-house NLP expertise. We create and enhance language data for the training of better, human-informed AI services.

Our mission today is to empower global enterprises and their service and technology providers with data solutions that help them to communicate in all languages, faster, better, and more efficiently.

For more information, visit https://www.taus.net/