The Importance of Translation Memory

2021-07-20 21:25 MemoQ

In the language industry, translation memory technology has been part of the standard toolset for most language service companies since the 1990s. Translation memory software began to be commercialized in the 1980s and hit the mainstream by the mid-90s. Many language professionals now see translation memory software (also known as TM tools or CAT tools, for computer-aided translation) as passé with the advent of other tools such as machine translation. Nothing could be further from the truth, though, and here’s why.

A translation memory tool:

- provides a standard interface for linguists
- simplifies the interaction with increasingly complex file types
- makes the sharing of work easier and more flexible
- provides equal access to translation memory and terminology databases
- streamlines project workflows
- secures and protects translations
- automates the translation process
- organizes complex projects
- allows for larger translation teams
- creates valuable linguistic resources

Translation memory tools provide the human translator a familiar and efficient interface for working with content to be translated. The example below shows a translation editor interface. The grid model is now the de facto standard for translation memory tools. In this example, the original document is a simple Microsoft Word file, but most translation memory tools can import myriad file types, including raw software code and other structured content types such as XML.

Translation memory tools and translation management systems are two types of technology that are often discussed together, since their functionality overlaps; however, it is important to understand the difference. Translation memory tools are technology that enables recording, storage, and recall of translated content in translation units. Translation management systems (TMS) are a much broader type of tool that includes, among other things, translation memory functionality.
A TMS may also have functions related to project management, terminology management, connectivity to third-party content management systems, and user access and management.

Figure 1. The standard translation grid has revolutionized translator productivity.

Standardizing the translator’s experience across file types is still a revolutionary feature of translation memory tools. Few translators possess both the linguistic and technical skills to translate content within the code of a website or mobile app, nor do many have all the software that is used to publish the hundreds of billions of words that are translated globally each year. As you can see from the additional examples below, translating raw XML is daunting compared to translating the same content after it has been imported into a translation memory environment.

Figure 2. Translating raw XML is cumbersome.

Figure 3. Translating the same XML content in a translation memory tool makes the task much easier.

To meet market demand for translated content, language service companies sometimes need multiple translators working simultaneously. Translation memory tools, especially those that are server-based, allow multiple translators to collaborate in real time. Robust tools enable the slicing and batching of substantial amounts of content.

In addition to distributing content, the server hosts centralized translation memory databases that translators can access in real time while translating. This means that a translation created by one translator can be reused by many to help them translate faster and maintain consistency. However, that is not all the server can share. It also hosts terminology databases, so technical terminology can be managed and shared among the translation team. This saves the translators from having to do extensive research to confirm which terms must be used to provide an accurate and consistent translation.
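To make the idea of recording, storing, and recalling translation units more concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the `TranslationUnit` and `TranslationMemory` names, the in-memory list standing in for a server database, the 0.75 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for this example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class TranslationUnit:
    """One stored source/target pair (hypothetical structure)."""
    source: str
    target: str


class TranslationMemory:
    """In-memory stand-in for a shared, server-hosted translation memory."""

    def __init__(self):
        self._units = []

    def add(self, source, target):
        # Record a confirmed translation so that other translators can reuse it.
        self._units.append(TranslationUnit(source, target))

    def lookup(self, text, min_score=0.75):
        """Return (score, unit) for the best match at or above min_score, else None."""
        best = None
        for unit in self._units:
            # Similarity between the new text and a stored source, from 0.0 to 1.0.
            score = SequenceMatcher(None, text.lower(), unit.source.lower()).ratio()
            if score >= min_score and (best is None or score > best[0]):
                best = (score, unit)
        return best


tm = TranslationMemory()
tm.add("Click Save to store your changes.",
       "Cliquez sur Enregistrer pour conserver vos modifications.")
tm.add("The file could not be opened.", "Le fichier n'a pas pu être ouvert.")

exact = tm.lookup("The file could not be opened.")          # identical source text
fuzzy = tm.lookup("Click Save to store all your changes.")  # similar, not identical
miss = tm.lookup("Restart the application.")                # nothing close enough

for label, hit in (("exact", exact), ("fuzzy", fuzzy), ("miss", miss)):
    print(label, "->", None if hit is None else (round(hit[0], 2), hit[1].target))
```

An identical sentence comes back with a score of 1.0, a slightly different sentence comes back as a partial match for the translator to revise, and unrelated text returns nothing, so the translator starts from scratch only in the last case.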
Prior to the advent of translation memory tools, the translation process consisted of moving files from one linguist to another depending on their role. It was common for a project manager to send files to a translator, who would translate them and deliver them back to the project manager, who would then send them to an editor to edit and revise the translation. The editor would then deliver the completed work back to the project manager, who would send it to a proofreader to finalize the translation prior to delivery. This translate, edit, proofread (“TEP”) workflow has been standard for publishing translations for centuries.

When working on a server, for example, you can avoid moving files as part of your workflow. Instead, linguists work on translation segments or groups of segments assigned to them depending on the segments’ status. This innovation offers unique benefits, such as overlapping workflows, in which an editor can review a segment as soon as a translator has completed it. Such a “dovetailed” workflow can shorten the translation production timeline by as much as 30%.

Server-based translation not only streamlines the translation workflow, but it also secures the translations. Because translations are captured in a translation database as they are completed, the risk of losing that content is mitigated once the translated segment is committed to the translation memory database.

Besides safeguarding the translated content itself, translation memory tools also safeguard the functional structure of the documents being translated. When files are imported into the translation memory software, the content is parsed into segments, which separates it from the layout of the document. The file’s structure is retained and cannot be changed by the translator. The translator can only affect the formatting of content within a sentence (for example, deciding which word should be bold or italic).
Beyond formatting, translators are responsible for placing any formatting tags (such as markup used in HTML or XML). The translation memory software will also check that these tags are all accounted for and properly recorded. These are critical quality assurance measures that guarantee the translated version of a file will function just like the original.

In addition to capturing content in real time, having all translations in a translation memory tool helps mitigate other issues like file corruption. Often in post-production (following translation) a file might become corrupted while being moved or during desktop publishing. Should this occur, it is quite easy to recreate the translation by using all the translated segments stored in the translation memory to automatically re-translate the original document with minimal rework. Prior to the use of translation memory, if file corruption occurred, a project manager would have to track down another version of the translation, with all the editor’s changes in place, to continue the process. If the cause of a mishap was a failure of the translator’s hard drive, the work could be lost for good.

Aside from standardizing the translation workflow, translation management systems can also automate that workflow. For example, they can automate the pre-translation of documents, the initiation of the translation, and the handoff from translation to editing, all without project manager intervention.

Most tools also enable a connection with a machine translation engine by way of an API. Such seamless connectivity allows for the use of both translation memory and machine translation within the translator’s interface to help speed up their work. Equipped with a well-established translation memory and an effective machine translation engine, a translator will rarely have to translate any content from scratch; instead, they only need to review and revise translation memory or MT engine matches.
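The pre-translation step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a real tool's behavior: the translation memory is reduced to a plain dictionary of exact matches, `pretranslate` is a hypothetical helper, and `fake_mt` is a placeholder standing in for a call to a machine translation engine's API.

```python
def pretranslate(segments, tm, machine_translate):
    """Fill each segment from the TM when possible, otherwise from MT.

    `tm` is a plain dict of source -> target (exact matches only) and
    `machine_translate` is any callable taking and returning a string.
    Both are stand-ins chosen for this sketch.
    """
    results = []
    for source in segments:
        if source in tm:
            # Reuse a translation that a human has already confirmed.
            results.append({"source": source, "target": tm[source], "origin": "TM"})
        else:
            # No stored match: fall back to a machine-generated draft.
            results.append({"source": source,
                            "target": machine_translate(source),
                            "origin": "MT"})
    return results


def fake_mt(text):
    # Placeholder for a real MT engine call; it only marks the text as a draft.
    return "[MT draft] " + text


tm = {"Save": "Enregistrer", "Cancel": "Annuler"}
rows = pretranslate(["Save", "Open file", "Cancel"], tm, fake_mt)
for row in rows:
    print(row["origin"], "|", row["source"], "->", row["target"])
```

Keeping the origin of every suggestion lets the translator see at a glance which segments come from confirmed translations and which are machine drafts that need closer review.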
Combining translation memory and machine translation in this way can typically double a translator’s productivity.

Organizationally, translation memory tools provide a robust environment to manage complex projects that may consist of thousands of files. Using functions that allow filtering and grouping of segments, it is possible to combine content thematically and assign it to specialist translators to help improve translation quality and efficiency.

The translation memory system also retains each source file’s position within a complex file structure. Translators have no way of altering the structure of file names while working.

The ability to manage and serve up centralized resources like translation memory and terminology databases plays a critical role not only in allowing multiple translators to work collaboratively in real time, but also in enabling large teams of them to do so. By having access to shared resources, translators can maintain better consistency. By properly configuring a server-based project, a project manager can enforce proper terminology usage and stylistic consistency, such as number and date formats, in the translation. Outside of a translation memory system, this would be a manual effort, which would make it virtually impossible for a large team to collaborate with consistency in mind.

In the modern context of translation automation, the most valuable contribution of translation memory systems is the creation of high-quality language resources, namely translation memories and terminology databases. Aside from the multiple benefits highlighted above, these bilingual resources are the veritable gold of translation production. Within the context of daily translation work, translation memory provides immediate returns to translation teams working on similar documents, usually for the same customer. In addition, for organizations looking to improve translation automation, translation memories can provide much longer-term returns as a basis for training machine translation engines.
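As a rough illustration of how a translation memory might be prepared for use as machine translation training data, the Python sketch below turns stored source/target pairs into a cleaned, de-duplicated list. The `build_parallel_corpus` helper, the length-ratio filter, and its threshold of 3.0 are assumptions made for this example; real data preparation is usually more involved.

```python
def build_parallel_corpus(units, max_length_ratio=3.0):
    """Turn (source, target) pairs into cleaned, de-duplicated training pairs."""
    seen = set()
    corpus = []
    for source, target in units:
        source, target = source.strip(), target.strip()
        if not source or not target:
            continue  # skip untranslated or empty units
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_length_ratio:
            continue  # lengths differ so much that the pair is probably misaligned
        if (source, target) in seen:
            continue  # keep each pair only once
        seen.add((source, target))
        corpus.append((source, target))
    return corpus


units = [
    ("Save", "Enregistrer"),
    ("Save", "Enregistrer"),                              # duplicate
    ("Cancel", ""),                                       # missing target
    ("OK", "Veuillez patienter pendant le chargement."),  # suspicious length ratio
    ("Open file", "Ouvrir le fichier"),
]
corpus = build_parallel_corpus(units)
for source, target in corpus:
    print(source, "\t", target)
```

Only the two clean, distinct pairs survive, which reflects the point that the quality of the translation memory matters more than its raw size.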
High-quality translation memories are the foundation for training specialized and accurate machine translation engines. Bilingual sources of data that establish the equivalence between the source and target languages enable machine learning and can have a meaningful impact on the quality of the generated content, even with smaller amounts of data.

Terminology databases (termbases) are highly valuable because they help linguists precisely manage critical terminology. Using the correct terminology for the subject matter being translated drastically improves accuracy, readability, and acceptance of the translation by the target audience. With machine translation engines that support the use of run-time glossaries, terminology databases can help improve the accuracy of machine translation engines as well.

Translation memory may seem like old technology given its ubiquity and longevity in the language industry. However, the age of this technology does not signal impending obsolescence. On the contrary, it underscores its critical position in the age of translation automation. Look at how your organization is using translation memory technology and whether it is getting the full benefit from it in terms of translator productivity, project management, and accelerated use of automation for creating new translations.