The ABCs of Morphology: False Positives in Terminology Management


2020-04-27 21:28 nimdzi


Morphology is the study of how words are formed: their roots, compounds, declensions, conjugations, and relations with other words. To analyze a word, morphology looks at its structure and its parts (e.g., stems, prefixes, suffixes). Some languages are more "highly inflected," meaning a word's form may change depending on grammatical case, gender, and number. Examples of highly inflected languages include the Slavic languages, Latin and the Romance languages, and certain Germanic languages.

In translation, morphology becomes crucial for terminology control. A single English word form that serves as a noun, a verb, and part of a compound noun can translate into multiple word forms in other languages. Take, for example, the English word “database.” You store data in a database, export from a database, and you have multiple databases. When you translate these word forms (database/databases) into an inflected language such as Russian, you may have up to 12 word forms in a single text (six cases, each in the singular and the plural). When you run a terminology check (with a tool that compares the translation against glossary entries stored in the nominative), you may get false positives in 10 of these cases, because the inflected forms do not literally match the glossary entry. This escalates into hours wasted going through terminology reports full of false positives that could have been avoided.

Because the strategy for morphological control differs from language to language, some language technology providers argue that morphology-related functionality is better supported in translation management systems and CAT tools. Others give morphology the attention it deserves, for example by using specifically developed morphological engines. There are also tools such as Term Morphology Editor, which helps prepare termbases for efficient term recognition.
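The false-positive problem described above can be illustrated with a minimal Python sketch. It is not how any particular tool works: the glossary entry, the sample segments, and the function name `naive_term_check` are all assumptions made up for illustration. The check looks for the exact nominative glossary form in each translation and flags every segment where it is absent.

```python
# Hypothetical glossary: English source term -> Russian target term,
# stored in the nominative singular, as a termbase typically would.
GLOSSARY = {"database": "база данных"}

# Illustrative segments (source, translation). Every translation uses the
# correct term, but three of them use an inflected form of it.
SEGMENTS = [
    ("This is a database.", "Это база данных."),                    # nominative
    ("Store data in a database.", "Храните данные в базе данных."),  # prepositional
    ("Export from a database.", "Экспорт из базы данных."),          # genitive
    ("Work with a database.", "Работайте с базой данных."),          # instrumental
]


def naive_term_check(segments, glossary):
    """Flag segments whose source contains a glossary term but whose
    translation does not contain the exact glossary form of the target term."""
    flagged = []
    for source, target in segments:
        for src_term, tgt_term in glossary.items():
            if src_term in source.lower() and tgt_term not in target.lower():
                flagged.append((source, target))
    return flagged


flagged = naive_term_check(SEGMENTS, GLOSSARY)
print(f"{len(flagged)} of {len(SEGMENTS)} segments flagged")
for source, target in flagged:
    print(f"  {source!r} -> {target!r}")
```

In this sample, only the nominative segment passes. The other three are reported as terminology errors even though the translator used the right term, which is exactly the kind of noise a reviewer then has to dismiss by hand.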
Some examples of how tools deal with morphology in terminology management: In Kaleidoscope’s quickTerm, the search for terminology can include stemming and decomposition: users can search in several languages simultaneously and benefit from the morphological stemming of checkTerm. Lingo24's termfinder has statistical forms for terms. memoQ's QTerm features prefix-based term matching. When Terminotix’s LogiTerm searches for a term, it searches for many different forms (plural/singular, gender, tense, etc.). Tilde uses a more sophisticated approach, dividing languages into groups and supporting the morphology of each group in a distinct way.

All these methods help control terminology more efficiently. When you run a terminology check, you get fewer false positives than you would in tools without morphology support, which saves time and effort on terminology maintenance.
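As a rough illustration of the prefix-based idea mentioned above, the following Python sketch matches a term by its invariable stem instead of its full nominative form. It does not reproduce any vendor's implementation: the `|` marker for the end of the stem, the function name `term_pattern`, and the sample sentences are assumptions chosen for this example.

```python
import re


def term_pattern(entry):
    """Compile a regex from a glossary entry in which '|' marks the end of
    a word's invariable stem (a hypothetical notation for this sketch)."""
    parts = []
    for word in entry.split():
        stem, marker, _ending = word.partition("|")
        # With a marker: match the stem plus any ending.
        # Without a marker: match the word exactly.
        parts.append(re.escape(stem) + (r"\w*" if marker else ""))
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


# "баз" is the stem of the head noun; "данных" does not change.
pattern = term_pattern("баз|а данных")

translations = [
    "Это база данных.",
    "Храните данные в базе данных.",
    "Экспорт из базы данных.",
    "Работайте с базой данных.",
    "Это хранилище данных.",  # a different term, so it should be reported
]

missing = [t for t in translations if not pattern.search(t)]
print(missing)
```

Here the four inflected forms are all recognized, and only the sentence that really uses a different term is reported. The trade-off is that a short stem can also match unrelated words that happen to begin the same way, so prefix matching exchanges some false positives for a risk of missed errors; approaches based on full morphological analysis aim to reduce that risk.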