Term Extraction: XTM Blazes a Trail with New Norms


2020-11-21 19:30 Nimdzi Insights


New challenges brought about by doing business in our digital world demand new solutions. Some constants remain, however, without which a text and the quality of its translation would be less than satisfactory. One good example of such a constant is terminology and terminology management.

Terminology management includes a number of different aspects, but it usually starts with terminology extraction. As we wrote in 2018, if there’s no glossary, the first task is terminology mining (also called terminology harvesting or gathering).

What is term extraction?

“Term extraction” is understood as the formation of a list of terms whose translation should be consistent within the framework of a project. The result of extracting terminology is a list of terms, with their contexts, recorded in a glossary. Some extraction tools provide statistical solutions for gathering a list of terms for which translations do not yet exist. The translations are then created either in the course of a project or as a separate process, for example by delegating the task to a terminologist.

Bilingual Term Extraction Software

To speed up the process, some tools offer extraction of bilingual terms from reference files and from previous translations. SynchroTerm (part of the Terminotix Solution by LogiTerm), for example, automatically extracts terms, their equivalents, and contexts from file pairs in any format, bitexts, SDLXLIFF, XLIFF, or TMX files.

Most terminology management systems (TBS) feature term extraction functionality, but some rely on third-party extraction tools like MultiTerm Extract. The same situation is observed with translation management systems (TMS). This means that in a regular translation workflow inside a TMS, a linguist would probably use third-party statistical terminology extractors. However, there are TMS that offer built-in options for this process. You can have a look at some examples of such TMS by selecting the “terminology extraction” filter on the Nimdzi TMS feature overview page.

Four examples of mainstream TMS with term extraction capabilities. Source: Nimdzi TMS feature overview tool.

Bilingual terminology extraction productivity, rethought

Terminology management is an essential step in any successful translation project workflow, and the productivity norms used to measure it have been evolving. Earlier in 2020, we published a post about productivity in terminology management. It garnered attention from academic circles, representatives of which pointed out that the productivity metric used for the translation of a term should be lower than the one widely used within the localization industry.

Indeed, in some cases, five seconds for a terminologist to decide on a term candidate may be unrealistic, and an hour may not be enough to translate 50 terms into one target language. In other instances, though, even higher productivity rates for constructing terminology lists are already being achieved. For instance, Omniscien offers a solution with productivity already three times higher: its terminology extraction from subtitles and automatic terminology translation present options to the user, who then votes for the best suggestion. Of course, the machine may or may not be wrong, but, according to Omniscien, this scheme helps achieve a translation productivity rate of 180 terms per hour.

Another milestone in bilingual terminology extraction has recently been set by XTM. Its newly developed feature, available in XTM v12.4 and later, helps build terminology lists from existing translations with up to 90 percent accuracy.

Source: Process Innovation Challenge, LocWorld

Term extraction productivity gains: XTM sets the bar

“XTM is an innovative company, more so than many other TMS providers. It invests in linguistic intelligence. Innovation is not something you can put amongst TMS requirements, but if you were to do so, then XTM would score very well.”
István Lengyel, Belazy Ltd.

For its automatic extraction of bilingual terminology, XTM utilizes Big Data, AI, and advances in computational linguistic technology, including inter-language vector space. The feature already works for 50 languages, helping XTM customers save up to 80 percent of the time spent on glossary creation.

“The XTM AI team has developed a new technology to take a mundane and tedious process away from the terminologist. The bilingual term extraction performed during the alignment of the parallel source and target texts produces a spreadsheet with the data required to review and add terminology. One implication of this is that XTM users will see 80% productivity improvement over manual methods.”
Sara Basile, XTM International

XTM sells to both enterprises and language service providers (LSPs). This presents an opportunity for many different localization industry players to try this promising automated approach, which makes smart choices and helps tackle the challenge of aligning and extracting terminology in an efficient and innovative way.
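
For readers who want a concrete picture of the statistical term extraction described earlier, here is a minimal Python sketch. It is not the method used by XTM, SynchroTerm, MultiTerm Extract, or any other tool named in this article. It only illustrates the general idea: count word sequences across source segments, keep those that recur, and record one context for each. The function name `extract_term_candidates`, the small stopword list, and the frequency threshold are all assumptions made for illustration, and the tokenizer assumes plain English-like text.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real tools use fuller,
# language-specific lists).
STOPWORDS = {"a", "an", "and", "as", "the", "of", "in", "to", "is",
             "for", "with", "on", "by", "or", "be", "are"}


def extract_term_candidates(segments, max_len=3, min_freq=2):
    """Return (term, frequency, first_context) tuples, most frequent first."""
    counts = Counter()
    contexts = {}
    for segment in segments:
        # Lowercase and keep alphabetic tokens, allowing internal hyphens.
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", segment.lower())
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # A candidate may not start or end with a stopword.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                term = " ".join(gram)
                counts[term] += 1
                # Remember the first segment in which the term was seen.
                contexts.setdefault(term, segment)
    candidates = [(term, freq, contexts[term])
                  for term, freq in counts.items() if freq >= min_freq]
    # Order by frequency, then prefer longer terms, then alphabetically.
    candidates.sort(key=lambda c: (-c[1], -len(c[0].split()), c[0]))
    return candidates


if __name__ == "__main__":
    sample = [
        "Open the translation memory before you start.",
        "The translation memory stores approved segments.",
        "Export the translation memory as a TMX file.",
    ]
    for term, freq, context in extract_term_candidates(sample):
        print(f"{term}\t{freq}\t{context}")
```

The output is a candidate list, not a finished glossary. Nested candidates such as "translation" and "memory" appear alongside "translation memory", so a terminologist still has to accept or reject each entry, which is the review step the productivity figures above refer to.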