How does DeepL work?

2021-12-06 23:50 DeepL

We are frequently asked how it is that DeepL Translator often works better than competing systems from major tech companies. There are several reasons for this. Like most translation systems, DeepL Translator translates texts using artificial neural networks. These networks are trained on many millions of translated texts. However, our researchers have been able to make many improvements to the overall neural network methodology, mainly in four areas.

It is well known that most publicly available translation systems are direct modifications of the Transformer architecture. Of course, the neural networks of DeepL also contain parts of this architecture, such as attention mechanisms. However, there are also significant differences in the topology of the networks that lead to an overall significant improvement in translation quality over the public research state of the art. We see these differences in network architecture quality clearly when we internally train and compare our architectures and the best known Transformer architectures on the same data.

Most of our direct competitors are major tech companies, which have a history of many years developing web crawlers. They therefore have a distinct advantage in the amount of training data available.

We, on the other hand, place great emphasis on the targeted acquisition of special training data that helps our network to achieve higher translation quality. For this purpose, we have developed, among other things, special crawlers that automatically find translations on the internet and assess their quality.

In public research, networks are usually trained using the “supervised learning” method. The network is shown different examples over and over again. The network repeatedly compares its own translations with the translations from the training data. If there are discrepancies, the weights of the network are adjusted accordingly.
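The compare-and-adjust loop of supervised learning can be illustrated with a deliberately tiny sketch. It is a toy, not DeepL's architecture or training code: the one-weight "network", the `train` function, the learning rate, and the example data are all assumptions made for illustration. The model guesses an output, compares it with the reference from the training data, and nudges its weight in proportion to the discrepancy.

```python
# Toy illustration only: a single-weight "network" that should learn y = 3 * x.
# Real translation networks have billions of weights; the loop has the same shape.

def train(examples, learning_rate=0.01, epochs=200):
    """Repeatedly show the examples and adjust the weight after each one."""
    weight = 0.0
    for _ in range(epochs):
        for x, target in examples:
            prediction = weight * x        # the network's own output
            error = prediction - target    # discrepancy with the training data
            # Gradient of 0.5 * error**2 with respect to the weight is error * x.
            weight -= learning_rate * error * x
    return weight


# Hypothetical training data: inputs paired with the "correct" outputs.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
learned_weight = train(examples)
print(f"learned weight: {learned_weight:.6f}")
```

With these settings the weight settles very close to 3, the value that makes every prediction match its reference.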
We also use other techniques from other areas of machine learning when training the neural networks. This also allows us to achieve significant improvements.

Meanwhile, we (like our largest competitors) train translation networks with many billions of parameters. These networks are so large that they can only be trained in a distributed fashion on very large dedicated compute clusters.

However, in our research we attach great importance to the fact that the parameters of the network are used very efficiently. This is how we have managed to achieve a similar translation quality even with our smaller and faster networks. We can therefore also offer very high translation quality to users of our free service.

Of course, we are always on the lookout for very good mathematicians and computer scientists who would like to help drive development, further improve DeepL Translator, and break down language barriers around the world. If you also have experience with mathematics and neural network training, and if it fulfills you to work on a product that is used worldwide for free, then please apply to DeepL!

We’re excited to announce glossary support for the DeepL API!

Glossaries are available to both DeepL API Free and Pro subscribers. You can create an account here if you don’t already have one.

We first added glossary support for our web translator and desktop apps in 2020, and it’s become a much-loved feature amongst our users. The use cases are many and varied. Whether you’re translating… ...there’s often brand- and industry-specific vocabulary that you need to account for.

Glossaries allow you to specify your own translations for words and phrases, making it possible to customize your translations consistently and at scale so you can deliver the best possible results to your users without pushing a bunch of manual work onto your translation and content teams.
And it's an especially important capability for us to offer to our API users, who are often building automated translation workflows that, with the help of glossaries, can be automated even further.

Glossary support for the API will also make it possible for CAT tool providers with a DeepL plug-in to build glossary functionality into their products.

If you’d like to learn how to use glossaries in the API, or you’re ready to get started, you can check out the API documentation.

In the rest of this post, we’ll share a bit more about what you can expect from the feature.

Currently, glossaries for the API can be created for the same language combinations available for glossaries on the DeepL web translator and desktop apps:

Our team is working on adding more language combinations soon.

Glossaries created via the DeepL API are distinct from glossaries created via the DeepL web translator and desktop apps. This means that API glossaries can't be used on the web translator and desktop apps and vice versa.

And both DeepL API Free and Pro users can create up to 1000 glossaries. The maximum size limit for a glossary is 10 MB.

To wrap it up, here’s everything you’ll need to get started with glossaries in the API and to get help if you need it:

We’re excited to announce the release of our Python client library for the DeepL API. This is the first programming language-specific library we’ve built for the API, and our goal is to make it much easier for developers working with Python to build applications with DeepL.

Just how much easier?

Here's a text translation example. What used to be:

Is now:

We intend to support all functions of the DeepL API with the Python library, though support for new features may be added to the Python library after they're added to the API.

If that’s all you needed to hear, and you’d like to get started right away, you can find the Python client library and documentation here.
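For readers who want a feel for what a raw API call and a glossary upload involve, here is a minimal sketch using only the Python standard library. It is not the snippet from the original post, which is not reproduced here, and it does not use the Python client library itself. Several details are assumptions to check against the API documentation: the Free-plan endpoint URL, the `DeepL-Auth-Key` authorization header, the `text` and `target_lang` parameter names, and tab-separated glossary entries. The helper names `build_translate_request` and `glossary_to_tsv` are hypothetical, and applying the 10 MB limit to the encoded entries is this sketch's interpretation.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: DeepL API Free endpoint; other plans may use a different host.
TRANSLATE_URL = "https://api-free.deepl.com/v2/translate"
# From the post: a glossary may be at most 10 MB. Applying that limit to the
# UTF-8 encoded entries is an assumption of this sketch.
MAX_GLOSSARY_BYTES = 10 * 1024 * 1024


def build_translate_request(auth_key, text, target_lang):
    """Build (but do not send) a form-encoded POST request for one text."""
    body = urlencode({"text": text, "target_lang": target_lang}).encode("utf-8")
    return Request(
        TRANSLATE_URL,
        data=body,  # supplying data makes urllib issue a POST
        headers={"Authorization": f"DeepL-Auth-Key {auth_key}"},
    )


def glossary_to_tsv(entries):
    """Serialize {source term: target term} pairs as tab-separated lines."""
    lines = []
    for source, target in entries.items():
        for term in (source, target):
            # Tabs and newlines would corrupt the tab-separated layout.
            if not term or "\t" in term or "\n" in term:
                raise ValueError(f"invalid glossary term: {term!r}")
        lines.append(f"{source}\t{target}")
    tsv = "\n".join(lines)
    if len(tsv.encode("utf-8")) > MAX_GLOSSARY_BYTES:
        raise ValueError("glossary exceeds the 10 MB limit")
    return tsv


# Placeholder key and example terms; nothing is sent over the network here.
request = build_translate_request("your-auth-key", "Hello, world!", "DE")
glossary_tsv = glossary_to_tsv({"artist": "Künstler", "prize": "Gewinn"})
print(request.get_method(), request.full_url)
print(glossary_tsv)
```

Passing `request` to `urllib.request.urlopen` would send it, which needs a valid authentication key. A client library exists to wrap exactly this kind of request building, authentication, and error handling.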
(By the way: you'll need an API authentication key to use the Python library. If you don't yet have a key, you can sign up for a Free or Pro API plan here.)

In the rest of this post, we’ll share more about the Python library project and how you can get involved.

Python was an ideal fit for our first client library. It's one of the most widely used programming languages in the world, and it’s popular amongst our API user base. It’s also a language that we have a lot of experience with at DeepL.

That said, use cases for the DeepL API are wide ranging, and there are many languages beyond Python that are important to our users. We’re exploring other client libraries we might develop in the future, and if you have any feedback on the languages that are most important to you, please get in touch and let us know.

The Python client library is also DeepL’s first ever open source project. We’re always looking for new ways to learn from our users and ensure that our products are solving real problems for them.

We believe that open sourcing the project is the best way to foster transparent discussion about what users need and what problems they’re running into, and we encourage you to create an issue if you have ideas or feedback.

Of course, GitHub issues aren’t the right place for every challenge you might run into while working with the Python library. If your help request includes sensitive data, please contact our support team. We’re happy to help.

To recap, if you’d like to get started with DeepL’s Python client library, you can:

Thanks for reading. We’re excited to see what you build.

Have you ever wished for instant fluency in another language? Maybe you were traveling and couldn’t read a menu, or you met a new friend but didn’t speak the same language.

With our new mobile translation app for iOS, you can say goodbye to language barriers. No matter where you are or who you’re with, you can get your iOS device out and easily access DeepL’s accurate translations.
The app is accessible on iPhone and iPad and allows users to translate on the go. It supports all 26 DeepL languages and language variants, as well as other DeepL benefits like fast translations, quick language detection, and best-in-class translation quality.

Additionally, you can try the speech to text translation feature, which allows you to use your device’s microphone to translate speech input. It also works the other way around with the text to speech feature, which allows you to listen to your text input and its translation.

This is yet another exciting milestone for our team at DeepL as we continue to work towards eliminating language barriers and bringing cultures together.

Eager to start translating on your iOS device? You can download the app for free here.

We're excited to share that DeepL Translator now supports thirteen new European languages: Bulgarian, Czech, Danish, Estonian, Finnish, Greek, Hungarian, Latvian, Lithuanian, Romanian, Slovak, Slovenian, and Swedish.

Over the years, many people have asked us to add more languages to our translator so that they could enjoy DeepL’s accurate translations. We’re thrilled to be able to honor several of those requests at once with the release of the new languages.

With this launch, DeepL can now reach an estimated 105 million more native speakers around the world with natural-sounding, high-quality translations. This allows us to further our company vision of breaking down language barriers and bringing cultures closer together.

We’re continually working to ensure the translation quality for all languages remains exceptional and that the translation process is seamless for all our users. And if your preferred language isn’t offered yet, don’t worry, we will add more languages in the future.

From “Sziasztok” to “Hej”, we’re ready to say hello to our new users and are already looking forward to their feedback.
All thirteen languages are now available on DeepL.com, via the DeepL API, and in our desktop apps, DeepL for Windows and DeepL for Mac.

Ready to try out the new languages for yourself? Head over to DeepL Translator now to start translating.

DeepL has now made it easier to get English translations in your preferred style: DeepL Translator can now produce translations that reflect the particularities of American and British English.

Variations in spelling are covered: American neighbors become British neighbours, and they can call each other on their cell phones or on their mobile phones, depending on your preference.

Similarly, terms that have completely different equivalents across the pond are also taken into account. American cookies become British biscuits. That is, of course, unless they’re on your web browser.

This expanded control over DeepL Translator enables you to more quickly get translations that respect the language conventions that you prefer. To set the language variety you would like, simply select either “English (American)” or “English (British)” as the target language.

American and British English are now available as target languages on DeepL.com, via the DeepL API, and in our desktop applications, DeepL for Windows and DeepL for Mac. Combining this feature with other features, such as the Glossary, allows you to take full control of the English texts DeepL Translator produces for you.

*Please note that these two language variants are not currently available for translations from Japanese and Chinese. These two source languages will be added in the near future.

Since we added Japanese to DeepL Translator, we’ve been amazed by how quickly people in Japan have taken to using our services. We made it a priority to make our paid subscription service, DeepL Pro, available in Japan, and we can now announce that this has become a reality.
All DeepL Pro subscription plans, providing added data protection, unlimited translation capacity, API access, and more, are now available in Japan. Japanese customers also benefit from prices in Yen.

We continue to expand DeepL Pro coverage around the world and will add further countries and currencies in the near future.

What’s better than finding the exact right word to express yourself? Always finding the exact right word to express yourself! Our new Glossary feature gives you the power to determine exactly how DeepL Translator should translate terms for you.

Whether you translate legal contracts, technical manuals, or company newsletters, the Glossary can save the terms you use and the translations you like, ensuring consistency and saving you time.

All you need to do is click on a word in the translated text, click on your preferred formulation, and save the Glossary pair when asked. Alternatively, you can click on “Customization” and enter it manually.

You can create Glossary entries for nouns, verbs, adjectives, adverbs, and even multi-word entries, and DeepL Translator will adapt the grammar and formulation to accommodate your preferences. You always have the option to turn the Customization feature on or off, giving you full control over how and when the rules you create affect DeepL Translator. Free users of DeepL Translator can create a limited number of Glossary pairs, whereas DeepL Pro subscribers can create as many as they like!

Currently, Glossary pairs can be created for the following language combinations: German into English, French into English, English into German, and English into French. Further language combinations will be added in the near future, as will further options to customize DeepL Translator’s results and make the translations your own.

We are delighted to announce that DeepL Pro is now available in the Land of the Free and the True North!
This has been a longstanding goal of ours and we are very proud to welcome our American and Canadian friends.

All DeepL Pro subscription plans, providing added data protection, unlimited translation capacity, API access, and more, are available immediately. Additionally, we are very happy to allow our North-America-based clients to pay for their DeepL Pro subscriptions in their local currencies, US and Canadian dollars.

We’re constantly working to enter new markets and will be making DeepL Pro available in further countries in the near future.

Please note that while businesses in Quebec may sign up as of today, registration will be open for private citizens shortly.

Que legal! DeepL Translator is now also capable of translating into Brazilian Portuguese!

The Portuguese language, in all its varieties, is spoken all around the world by millions of people on several continents. The country with the most Lusophones is by far Brazil, with over 200 million people.

To reflect some of this linguistic diversity, our researchers have taught our algorithms to also produce translations in the variety of Portuguese used in Brazil. This allows DeepL Translator to capture the nuances and unique vocabulary of Brazil, as well as the language of millions more in Portugal, Angola, Mozambique, and elsewhere.

Some variations in the Portuguese language are clear to see in the translations of the phrase “The girl is having breakfast.”

Portuguese: A rapariga está a tomar o pequeno-almoço.
Brazilian Portuguese: A menina está tomando o café da manhã.

The terms for “the girl” (a rapariga/a menina) and “breakfast” (o pequeno-almoço/o café da manhã) reflect regional variations, as does the form of the verb tomar; Brazilian Portuguese prefers the gerund, whereas a construction using the infinitive is more common in Portugal and other countries.

The two variations of Portuguese are now available on DeepL.com.
To select the version you prefer, simply click on the language next to “Translate into” and select either “Portuguese” or “Portuguese (Brazilian).” Enjoy the many delights of the Portuguese-speaking world!