A Beginner’s Guide to Machine Translation

2020-03-29 18:31 memsource

Looking for ways to cut translation costs and turnaround times without compromising quality? Machine translation might be your answer. Enter the world of machine translation with confidence thanks to our quick guide, which walks you through the basics of this growing translation technology.

What is Machine Translation?

Machine translation (MT) is automated translation by computer software. MT can be used to translate entire texts without any human input, or alongside human translators in a workflow known as machine translation post-editing.

MT started gaining traction in the early 1950s and has come a long way since. Currently, the value of the MT market is estimated at between USD 130 million and USD 400 million and, as the technology continues to improve, more and more companies are turning to MT to aid human translators and optimize the localization process.

Then and Now: How Has Machine Translation Evolved?

There are several different kinds of MT and, as the technology has progressed, the older systems have been replaced by newer technologies.

Rule-based machine translation (RBMT) is the forefather of MT and is now somewhat obsolete. It is based on sets of grammatical and syntactic rules and the phraseology of a language. RBMT links the structure of the source segment to the target segment, producing a result based on analysis of the rules of the source and target languages. The rules are developed by linguists, and users can add terminology to override the MT and improve the translation quality.

Statistical MT (SMT) started in the age of big data and uses large amounts of existing translated texts, together with statistical models and algorithms, to generate translations. This system relies heavily on available multilingual corpora, and an average of two million words are needed to train the engine for a specific domain, which can be time- and resource-intensive.
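
To make the statistical idea more concrete, here is a minimal sketch of how word-translation probabilities might be estimated from existing translated texts. It is an illustration only, not how any particular commercial engine works: it uses a simplified IBM Model 1-style alignment (no NULL word), and the function names `train_word_model` and `best_translation`, as well as the three-sentence German-English corpus, are assumptions made up for this example. A real SMT engine would need the millions of words mentioned above, plus phrase tables and a language model.

```python
from collections import defaultdict


def train_word_model(pairs, iterations=10):
    """Estimate t(target_word | source_word) from sentence pairs using EM.

    Simplified IBM Model 1-style aligner (no NULL word), for illustration only.
    """
    target_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(target_vocab)
    # Start with no preference: every co-occurring word pair is equally likely.
    t = {(w_t, w_s): uniform for src, tgt in pairs for w_t in tgt for w_s in src}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of links per source word
        for src, tgt in pairs:
            for w_t in tgt:
                # Share each target word among the source words in the sentence,
                # in proportion to the current probability estimates.
                norm = sum(t[(w_t, w_s)] for w_s in src)
                for w_s in src:
                    share = t[(w_t, w_s)] / norm
                    count[(w_t, w_s)] += share
                    total[w_s] += share
        # Re-estimate the probabilities from the expected counts.
        t = {(w_t, w_s): c / total[w_s] for (w_t, w_s), c in count.items()}
    return t


def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {w_t: p for (w_t, w_s), p in t.items() if w_s == source_word}
    return max(candidates, key=candidates.get)


# Hypothetical toy corpus: (source sentence, target sentence) as word lists.
corpus = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]

model = train_word_model(corpus)
for word in ["das", "Haus", "Buch", "ein"]:
    print(word, "->", best_translation(model, word))
```

Even with three sentence pairs, the model learns which words tend to translate each other purely from co-occurrence, which is why the amount and domain of the training data matter so much for SMT.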
Statistical MT is fast being overshadowed by newer technologies but, when using domain-specific data, SMT can still produce good-quality translations, especially in the technical, medical, and financial fields.

Neural MT (NMT) is a newer approach that is built on deep neural networks. A variety of network architectures are used in NMT but, typically, the network can be divided into two components: an encoder, which reads the input sentence and generates a representation suitable for translation, and a decoder, which generates the actual translation. In NMT, words and even whole sentences are represented as vectors of real numbers. Compared to the previous generation of MT, NMT generates output that tends to be more fluent and grammatically accurate. Overall, NMT is a major step forward in MT quality. However, NMT may lag slightly behind previous approaches when it comes to translating rare words and terminology.

Custom vs Generic Machine Translation

Machine translation engines are trained on different kinds of data. Generic MT engines, like Google Translate and Microsoft Translator, are built for general-purpose use and are not trained on data for a specific domain or topic. Custom MT engines, on the other hand, are more fine-tuned because they are trained on specific data, which results in more accurate MT output but also comes with a higher price tag.

Regardless of whether you are using a custom or a generic engine, it will need to be retrained from time to time to improve the results. Thanks to continuous development, this retraining process has improved. With adaptive MT, the system updates in real time based on edits made to the content, so it is constantly learning and improving.

The Pros and Cons of Machine Translation

So now you have a brief understanding of MT, but what does it mean for your translation workflow? How does it benefit you?

MT is incredibly fast.
It can translate into multiple languages at once, which drastically reduces the amount of manpower needed. Used in your localization process, MT can do the heavy lifting for translators and free up their valuable time, allowing them to focus on the more intricate aspects of translation. MT technology is developing rapidly and is constantly advancing towards producing higher-quality translations and reducing the need for post-editing.

There are many advantages to using MT, but we can't ignore the disadvantages. MT does not always produce perfect translations. Unlike human translators, computers can't understand context and culture, so MT can't be used to translate anything and everything.

When Should You Use Machine Translation?

In some situations MT alone is suitable, while in others a combination of MT and human translation is best. Sometimes MT is not suitable at all. MT is not a one-size-fits-all translation solution.

For large volumes of content, especially content with a short turnaround time, MT is very effective. If accuracy is not vital, raw MT (without human post-editing) can produce suitable translations at a fraction of the cost. Customer reviews, news monitoring, internal documents, and product descriptions are all good candidates.

When translating creative or literary content, MT is not a suitable choice. This can also be the case when translating culturally specific texts. A good rule of thumb is that the more complex your content is, the less suitable it is for MT.

That being said, combining MT with a post-editor opens the door to a wider variety of suitable content.

Which MT Engine Should You Use?

There is no specific MT engine for a specific kind of content. Generic MT engines are designed to be able to translate most types of content; with custom MT engines, however, the training data can be tailored to a specific domain or content type.

Ultimately, choosing an MT engine can be a long process.
You need to decide what kind of content you wish to translate, review security and privacy policies, run tests on text samples, and choose post-editors, among several other considerations. The key is to do your research before making a decision. And if you are using a translation management system (TMS), be sure it is able to support your chosen MT engine.

Using Machine Translation and a Translation Management System

You can use MT on its own, but to get the maximum benefit we suggest integrating it with a TMS. With these technologies combined, you will be able to leverage additional tools such as translation memories, term bases, and project management features to help streamline and optimize your localization strategy. You will have greater control over your translations and be able to analyze the effectiveness of your MT engine.
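
One simple way to run tests on text samples and get a rough sense of an engine's effectiveness is to measure how much a post-editor had to change the raw MT output. The sketch below assumes a word-level edit distance divided by the length of the post-edited text; the function names `word_edit_distance` and `post_edit_rate` and the sample sentences are hypothetical, and this is not the metric used by any particular TMS. Established metrics such as TER or BLEU are more refined versions of the same idea.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a word from hyp
                            curr[j - 1] + 1,      # insert a word from ref
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def post_edit_rate(mt_output, post_edited):
    """Share of words the post-editor changed (0.0 means no edits were needed)."""
    hyp = mt_output.lower().split()
    ref = post_edited.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return word_edit_distance(hyp, ref) / len(ref)


# Hypothetical test samples: (raw MT output, human post-edited version).
samples = [
    ("the contract is valid since two years", "the contract is valid for two years"),
    ("press the button for start", "press the button to start"),
    ("thank you for your order", "thank you for your order"),
]

rates = [post_edit_rate(mt, edited) for mt, edited in samples]
average_rate = sum(rates) / len(rates)
print(f"average post-edit rate: {average_rate:.2f}")
```

Running the same samples through each candidate engine and comparing the average rates gives a quick, if crude, basis for comparison; a lower rate suggests less post-editing effort for that type of content.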
Chinese translation post-edited by Wang Siqing (Sun Yat-sen University).