Walking the Technical Side of Localization: Orchestrating QA, OCR and DTP to Really Work for You

2021-11-02 10:12 Ciklopea

Fixing the Quality Issues in the Source

We touched upon the direct link between the quality of source materials and the quality of translation in our previous article. This theme also extends to the technical realm: poorly formatted, uneditable or poorly scanned source documents will definitely not improve the quality of the localized materials.

The main reason is that, regardless of whether we use an AI-powered machine translation engine with post-editing or traditional professional human translation, all the major phases of the localization process are performed in CAT (Computer-Assisted Translation) software, which can only be used with editable files.

CAT infrastructure is what boosts productivity, cuts costs and shortens turnaround time, so it is easy to see why the properties of the source files are one of the essential elements of the localization process.

When source files cannot be easily imported into CAT software, there are ways to process them so that the localization project can run smoothly, such as:

OCR or Optical Character Recognition

In certain cases, the materials for localization exist only in the form of scanned documents, or they include images or scanned elements such as seals, tables, or even written text. To make these files editable, technical localization teams perform OCR processing to generate a searchable and editable text file that will be used in the further phases of the process.

Alignment

When translated versions of documents exist (editable or uneditable) that were not generated in a CAT tool, it is possible to perform OCR where the files are uneditable and then run the alignment process to build translation memories, which enable faster, cheaper and leaner localization with improved lexical consistency.
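To make the alignment step more concrete, here is a minimal Python sketch of how a source document and its existing translation might be paired into translation memory units. It is an illustration under stated assumptions, not how any particular CAT tool works: the function names `split_sentences` and `align_by_position` and the `max_ratio` threshold are hypothetical, the sentence splitter is deliberately naive, and segments are paired purely by position. Production aligners use statistical or dictionary-based methods and can handle merged or split sentences.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def align_by_position(source_text, target_text, max_ratio=2.5):
    """Pair the n-th source sentence with the n-th target sentence.

    Assumes both texts contain the same number of sentences in the same
    order. Pairs whose lengths differ by more than max_ratio (a hypothetical
    threshold) are flagged so a linguist can review them.
    """
    src = split_sentences(source_text)
    tgt = split_sentences(target_text)
    if len(src) != len(tgt):
        raise ValueError(
            f"segment count mismatch: {len(src)} source vs {len(tgt)} target"
        )
    units = []
    for s, t in zip(src, tgt):
        ratio = max(len(s), len(t)) / max(1, min(len(s), len(t)))
        units.append({"source": s, "target": t, "needs_review": ratio > max_ratio})
    return units


if __name__ == "__main__":
    english = "Open the file. Save your work."
    croatian = "Otvorite datoteku. Spremite svoj rad."
    for unit in align_by_position(english, croatian):
        print(unit)
```

The flagged pairs are the ones a human would check before the memory is used, since a large length difference often means that two segments have drifted out of step.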
Development of Terminology Bases

It is always recommended to extract specific terms and phrases, from taglines to any project-related terms that require special attention, before the localization process begins, and to develop terminology bases that contain client-approved translations of these terms and phrases. This approach saves time and resources and ensures the consistency and quality of the localized materials.

Fixing the Quality Issues in the Target

The Quality Assurance (QA) process is performed on translated and reviewed files to find and resolve any errors that may have slipped through. The QA process is also computer-assisted and can be performed within the CAT tool or with independent QA applications.

The issues usually checked involve spelling, formatting, tags, numbers and consistency. Depending on the project requirements, there are various types and levels of quality assurance, and dedicated QA programs are developed for each project.

It should be noted that the QA process is performed on purely textual files before they are sent on for further processing, which depends on the file format. After this stage, further quality control steps may include:

DTP or Desktop Publishing + Proofreading

Even the best-laid translations may go awry once the translated text is imported into page layout software. The same text has different lengths in different languages (and different scripts), and there may be issues with text breaks, fonts, layouts and everything in between.

This is why DTP should be followed by proofreading, which enables the linguist team to check their translation in context and address any issues before the localized materials are published.

Software testing

When the subject of localization is software, testing of localized apps (or, alternatively, screenshots) is performed for the same reasons as proofreading: to enable the linguists and QA teams to check their work in context and address any issues.
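As an illustration of the computer-assisted QA checks described in this section, the following Python sketch compares the numbers and inline tags of a source segment with those of its translation. It is a simplified example under stated assumptions rather than the behavior of any specific QA application: the function name `qa_check` is hypothetical, tags are assumed to be HTML-like (many CAT tools use their own placeholder formats), and numbers are compared by their digits only so that locale-specific separators such as 1,000 and 1.000 are treated as equal.

```python
import re
from collections import Counter

# Digits optionally grouped by '.' or ',' (for example 30, 1,000 or 1.000,50).
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")
# Assumption: inline tags look like HTML, e.g. <b>, </b>, <br/>.
TAG_RE = re.compile(r"</?[A-Za-z][^<>]*>")


def _numbers(text):
    """Count numbers by their digits only, ignoring '.' and ',' separators.

    Limitation: this also treats 1.5 and 15 as equal, so a real check
    would need locale-aware number parsing.
    """
    return Counter(re.sub(r"[.,]", "", n) for n in NUMBER_RE.findall(text))


def _tags(text):
    """Count each inline tag exactly as written."""
    return Counter(TAG_RE.findall(text))


def qa_check(source, target):
    """Return a list of issue labels for one source/target segment pair."""
    issues = []
    if _numbers(source) != _numbers(target):
        issues.append("number mismatch")
    if _tags(source) != _tags(target):
        issues.append("tag mismatch")
    return issues


if __name__ == "__main__":
    src = "Press <b>Start</b> within 30 seconds."
    good = "Pritisnite <b>Start</b> u roku od 30 sekundi."
    bad = "Pritisnite <b>Start u roku od 20 sekundi."
    print(qa_check(src, good))
    print(qa_check(src, bad))
```

Checks like these only flag candidates for review. Whether a flagged difference is a real error, for instance a number that is legitimately spelled out in the target language, remains a decision for the linguist.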
Under One Roof

While each of these processes can be performed by different teams and by various organizations, the record shows that centralizing the entire localization process and the supporting processes provides higher quality and faster turnaround times, simply because linguist and technical teams within the same organization, working in accordance with harmonized rules and procedures, are essentially all working on the same task with the same goal.

This leaves little room for a “not my job” mentality, misunderstandings of the technical and language aspects of the source material, and similar weak spots that may lead to errors, delayed product launches or suboptimal localization.

Lest we forget, we localize content to localize the experience, and one of the fastest and safest ways to do it is by bringing the various talents whose services are required for localization under the same roof. Linguists then know how their translations will be used and have an opportunity to adapt their work accordingly and check it in context, while the tech people become aware of the importance of linguistic subtleties in their line of work. The result can only be high-quality localization.