Domain-Specific Training Data Generation for SYSTRAN


2021-02-19 18:50 TAUS


When the pandemic hit in 2020, TAUS created a starter kit in several languages for training high-quality translation models customized for the pandemic domain. SYSTRAN, a leading AI-based translation technology company, partnered with TAUS and used the kit's datasets to produce twelve translation models covering English to and from French, Spanish, German, Italian, Chinese and Russian. The models were made available on SYSTRAN Marketplace, where NMT models are offered to a network of language experts who train models in any language pair and domain. After training with the TAUS Corona datasets, the SYSTRAN engines improved by 18% on average across all twelve language pairs compared with the SYSTRAN baseline engines.