27 Web Application Localization Best Practices


2020-06-20 03:50 Lingua Greca


Creating a web app? It’s important to make sure what you’re building can be quickly and easily localized. Here’s a list of 27 web application localization best practices to help you on your way. Following these guidelines will make your content easier to localize, so that you can provide an equitable experience to users regardless of country or language.

Key Terms for Understanding Web Application Localization Best Practices

First, a quick note on terminology. We use a lot of nerdy vocabulary in the field of localization (localization definitions, and why we need to uncomplicate them, are a topic of their own).

The term localization comes from the term locale, which refers to the combination of the language a user speaks and the place they are from. So, localization is really just a fancy way of saying “making something appropriate for a given locale.”

Internationalization, on the other hand, is what makes localization possible in the first place. It takes place at the coding and development stage. To ensure that a product can be localized, it first has to be internationalized. So, internationalization is really just a fancy way of saying “making something localizable.”

Here’s a simple way to think about it: first comes internationalization (global-ready code), then comes localization (local experience).

1. Assume text will grow or shrink.

Some languages, like Spanish and German, might require more space for your text, in terms of the number of characters. Languages like Japanese and Chinese might require fewer characters. I generally suggest that your English layouts allow for up to 100% more text in other languages, as well as up to 50% less. The amount of text expansion or contraction you need to plan for also depends greatly on the type of content.
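As a rough illustration of that rule of thumb, the sketch below flags translations that exceed an expansion budget relative to the English source. The function names, the 100% default, and the sample strings are assumptions made for this example. Character count is also only a proxy for rendered width, which depends on the font and script.

```python
def expansion_ratio(source: str, translation: str) -> float:
    """Return the translation length relative to the source length.

    1.0 means the same number of characters, 2.0 means twice as many.
    """
    if not source:
        raise ValueError("source string must not be empty")
    return len(translation) / len(source)


def fits_budget(source: str, translation: str, max_growth: float = 1.0) -> bool:
    """Check a translation against a growth budget.

    max_growth=1.0 allows up to 100% more characters than the source,
    mirroring the rule of thumb in the text (an assumption; tune it per UI).
    """
    return expansion_ratio(source, translation) <= 1.0 + max_growth


# Illustrative strings only; they are not taken from the article.
labels = {
    "Save": {"de": "Speichern", "fr": "Enregistrer"},
    "Settings": {"de": "Einstellungen", "ja": "設定"},
}

for english, translations in labels.items():
    for locale, text in translations.items():
        ratio = expansion_ratio(english, text)
        status = "ok" if fits_budget(english, text) else "TOO LONG"
        print(f"{english!r} -> {locale}: {text!r} ({ratio:.2f}x) {status}")
```

In this sample, the four-letter label exceeds the budget in both German and French, while the longer label stays within it. A check like this could run over resource files in a build step, though a visual review is still needed to judge real layouts.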
If it’s a label or menu item that is only one word in English, it might take 4 or 5 words to say the same thing in another language. Or it might take the same number of words, but longer ones. Generally, the fewer words there are in English, the more you’ll need to allow for text expansion. Otherwise, the text will most likely be truncated.

Consider a button whose label doesn’t fit, so the user can’t see the remaining text. A phrase might take only 7 characters in English, but 12 in German and 18 in French. There are also cases in which German takes more space than French, and so on. This is why it’s important to build with flexibility in mind.

Why does this matter so much? One of the problems I’ve noticed over the years is that when translators face this challenge at the translation phase and don’t have enough space to fit a good translation in, they are forced to either keep the term in English or use a word that doesn’t really make sense. Either way, it’s a bad experience for your users.

In fact, if you don’t design in a way that accommodates this, you might actually be contributing to language decay in some languages by forcing them to adopt English terms that are less accessible to the majority of people. When we do that, we’re often imposing our English-centric norms onto them, and not allowing their own culture and language to shine. Worse yet, you might be limiting your potential user base to just those who can understand English terminology.

2. Allow for larger font sizes.

While some languages might require fewer characters, some actually need larger font sizes. In other words, a button you design on a page in English might need to grow wider in German, but for languages like Japanese and Chinese, the same box might need to grow taller.
I’m sorry, I know it might be harder to design this way. But if you can make your text boxes, buttons, and so on (basically anywhere that text might appear) flexible enough to accommodate the reality that languages are different, you’ll thank me later when users all over the world are raving about how well designed your experience is, no matter what the language!

3. Separate text from images.

This means no images with text embedded in them whenever possible. If you can create a “translatable” layer of text that is overlaid onto the image, using CSS or some other method, everything becomes much more localization-friendly.

This is actually super important. When you localize your content into another language, any English text left in the images tells the user that you designed this experience for someone else, who lives in another country and speaks a different language. It can make them feel like a second-class user, one who gets a lower-quality experience.

4. Make image files easy to swap or hide.

While it would be great to separate text from images in every single instance, that isn’t always realistic. There are times when you need to show product screenshots or charts, and you simply won’t be able to extract every bit of text from the images.

What can you do in these cases? Ideally, when you take screenshots, you take the same ones at the same time in every language your UI is available in. Or, you write two versions of the same content, one that references the image and one that doesn’t. Basically, you need a plan for how to swap those images out, or hide them, depending on which language and country you are targeting with the content.

5. Make image source files easy to access.

One more note on image files.
If you’re creating them in some sort of design software, make sure you give your localization partners access not only to the “final” file in the image format, but also to the source files in your design software (like InDesign, for example). Or, if you are using an online design tool (such as Canva), you might need to provide access so that the images can be recreated in other languages.

As for when to swap and when to hide, you don’t need to make that decision at the design or coding stage. Just keep in mind that if you’re putting in images where the text can’t be extracted, you might later need to swap or hide them, and design accordingly to allow that level of flexibility.

6. Resource your strings.

Ideally, string externalization, which means moving the strings from the code into resource files, becomes part of your company’s ongoing development processes. If not, it can be so time-consuming and difficult to do later on that your development teams simply will not agree to do it. It will become a cost-benefit discussion, and it will be hard to make the case that you should refactor your code in order to make it localizable.

To prevent all that unwanted re-work, don’t put development teams in this negative position. Make it easy for developers to externalize their strings and get them into resource files from the start. The more you can operationalize it as something that simply needs to happen before code gets deployed, the better able you are to build a truly global-friendly development process.

There are plenty of options for appropriate resource file formats, such as JSON, XML, YAML, or gettext. There is usually a de facto standard format that depends on the framework or programming language you’re using.

What happens if you skip this critical step? If strings don’t get externalized, they can’t be localized without writing additional code.
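To make the idea concrete, here is a minimal sketch of externalized strings using JSON catalogs. The keys, the `translate` helper, and the fall-back-to-English behavior are assumptions for illustration. In a real project the catalogs would live in per-locale resource files, and you would normally rely on your framework’s own i18n library.

```python
import json

# In a real project each catalog would be its own file (for example
# locales/en.json, a hypothetical path). They are inlined here so the
# sketch is self-contained.
CATALOGS = {
    "en": json.loads(
        '{"cart.checkout": "Check out", "cart.items": "Items in cart: {count}"}'
    ),
    "de": json.loads(
        '{"cart.checkout": "Zur Kasse", "cart.items": "Artikel im Warenkorb: {count}"}'
    ),
    "fr": json.loads('{"cart.checkout": "Passer la commande"}'),
}

DEFAULT_LOCALE = "en"


def translate(key: str, locale: str, **params: object) -> str:
    """Look up a string by key, falling back to the default locale.

    Application code contains only keys, never user-facing text.
    Placeholders are named so translators can reorder them freely.
    """
    catalog = CATALOGS.get(locale, {})
    template = catalog.get(key) or CATALOGS[DEFAULT_LOCALE][key]
    return template.format(**params)


print(translate("cart.checkout", "de"))
print(translate("cart.items", "de", count=3))
print(translate("cart.items", "fr", count=3))  # missing in fr, falls back to en
```

Because every user-facing string goes through one lookup, adding a language becomes a matter of adding a catalog, not of changing code.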
This means that any hard-coded strings will show up in their original language, no matter what language the user speaks.

You don’t want to send a negative message to the user. Yet the lack of localization-friendly development is a big reason why, elsewhere in the world, lots of US software companies have a reputation for creating US-centric software that falls short in other markets. You don’t want your company or your developers to have that kind of reputation. But more importantly, you don’t want users to have a bad experience.

6. Consider implementing a pseudo-localization step.

One of the most popular techniques for automatically checking for string externalization is something called pseudo-localization. In this process, you convert all the text in your application to another language, ideally one that uses a different character set, even if it’s just machine translation, to ensure the entire process can work from end to end.

The point is to test your code, build, and CI/CD process to make sure they can support localization before you actually invest in a release in other languages. It is a way to identify the challenges you’ll encounter with user-facing strings when you later decide to localize, and to keep you from committing code that might be hard to refactor.

With a pseudo-localization process, you essentially transform the resource files written by your developers and display what happens to the strings when a specific (fake) locale is passed to the application. Pseudo-localization will help you in many ways. For example, it can help you flag concatenation issues.

Also, I find that many developers love the idea of pseudo-localization because they get to see what their product will look like from the perspective of another language while they are building it. It opens their eyes to what their product will someday look like in Japanese or Finnish or Arabic.
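The transformation of resource files can be sketched in a few lines. This is one simple variant, not the only way to do it: it swaps ASCII letters for accented look-alikes, pads each string, and wraps it in brackets. The character mapping, the 40% padding, the bracket markers, and the leading CJK character (included to exercise encoding) are all assumptions chosen for illustration.

```python
import re

# Map plain ASCII letters to accented look-alikes so text stays readable
# but is obviously not the source language. The mapping is arbitrary.
ACCENTED = str.maketrans({
    "a": "á", "e": "é", "i": "í", "o": "ó", "u": "ú",
    "A": "Á", "E": "É", "I": "Í", "O": "Ó", "U": "Ú",
    "c": "ç", "n": "ñ", "C": "Ç", "N": "Ñ",
})

# Placeholders such as {count} must survive untouched.
PLACEHOLDER = re.compile(r"(\{[^{}]*\})")


def pseudo_localize(text: str, padding: float = 0.4) -> str:
    """Return a pseudo-localized version of one resource string."""
    parts = PLACEHOLDER.split(text)
    # Splitting on a capturing group leaves placeholders at odd indexes.
    converted = "".join(
        part if index % 2 else part.translate(ACCENTED)
        for index, part in enumerate(parts)
    )
    # Padding simulates text expansion; brackets make truncation visible.
    pad = "~" * round(len(text) * padding)
    return f"[漢 {converted}{pad}]"


def pseudo_localize_catalog(catalog: dict[str, str]) -> dict[str, str]:
    """Pseudo-localize every value in a key-to-string catalog."""
    return {key: pseudo_localize(value) for key, value in catalog.items()}


catalog = {"cart.checkout": "Check out", "cart.items": "Items in cart: {count}"}
for key, value in pseudo_localize_catalog(catalog).items():
    print(key, "=>", value)
```

With the pseudo-localized catalog loaded under a fake locale, any text that still appears without brackets and accents is probably hard-coded, and any label whose closing bracket is cut off has too little room.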
And, yes, you can even choose invented languages, like Dothraki or Klingon. I am all for whatever will get people to actually look at it on an ongoing basis! No matter what language you pick, I do recommend you throw a few Japanese kanji characters into the mix, to make sure you’re testing the encoding as well as the character sets.

There are also some commercially available tools that will automatically pseudo-localize all of your resource files every time you change them, or at a frequency you specify (daily, twice daily, weekly, etc.) depending on your needs.

7. Check for garbled characters.

Another great thing about pseudo-localization is that you can use it to check for garbled characters, also known as mojibake (文字化け). Localization professionals often use this word in English, but it comes from Japanese because, as you can probably imagine, Japanese users see garbled characters like this pretty frequently. So of course they created a word for it! Mojibake is what results when characters are systematically replaced with unrelated ones, usually because the target character set is not supported.

If you have garbled characters, chances are that your text is being decoded with a character encoding other than the intended one. If you see a generic replacement character like “�” show up in various places, it’s probably because the underlying byte sequence is considered invalid. This typically results from differences in encoding.

8. Train developers to consider multiple locales.

As you gain experience with internationalization, you’ll start to realize that you need to get developers to avoid creating programmatic functions and patterns that are built around just one locale.
Building something this way is what you might call a “locale-limiting architecture” or “locale-limiting design.”

Some might even call this way of building a product “ethnocentric,” which implies that a developer believes their own country and language are the center of the universe. Most developers don’t actually think that way, and in fact a huge number of them have international backgrounds themselves! But when they are building a product, it makes total sense that they think about the user and the use case with English (often American English) as the default. That’s user-centric design. And that is where they usually take the first steps away from creating “global-ready” code.

This type of “anti-global” development could be considered “locale-centric.” It basically means that the development team isn’t considering use cases from other countries and languages while building the product. And for product teams where localization is an afterthought, that makes it so much harder to take a product to the world!

The objection I always hear from developers is that building a global-ready product is a form of “premature optimization.” In reality, it’s smart economics. You take small steps today to avoid blowing your product’s chances of succeeding globally later. It’s a small up-front investment for huge dividends later on.

It’s like maintaining fast, reliable, automated tests and writing highly maintainable code from the start. Developers already optimize for the future by adopting clean coding practices and maintaining their work according to programming language standards. Internationalization should be no different.

9. Make sure your database schema supports international variants.

Often, even if the code itself, and even the database technology itself, can support an international use case and the strings can be localized, the underlying database schema might not.
This can be very tricky, and it is a good reason to consult with localization professionals at the earliest stage, ideally when you’re conducting UX research and before the first line of code gets written. Or, if you don’t have access to any, talk to international customers early on, during the design phase.

If you’re building many features that rely on a database, you need to ensure that the database itself will accommodate use cases from more than one country. Sometimes, there might not be any need to change your database structure. In other cases, simply adding a few fields can save you a huge amount of re-engineering work later on.

A typical example is a database field that is displayed to users in one language but was built in a way that makes it impossible to store or display multi-language variants. When this happens, localization simply isn’t possible. It’s blocked by the database itself! So, it’s helpful to think about the international use case, including how your database will be used, while you’re designing the experience. Remember, pseudo-localization will help you catch these issues.

10. Use standard and consistent file structures.

You can make localization much faster and easier if you use standard file formats (Java properties, XML, .NET resx, and so on). If you work in a custom development scenario, you can at least use a consistent, standards-based file format such as XLIFF.

Localization teams love standard file formats like the ones listed above. Their translation management systems can easily ingest and process them, so that the localized output also works. If you do this, you’ll save yourselves tons of re-engineering, and you can bypass much of the QA work you’d otherwise require. Bottom line: if you want expensive, time-consuming, human-centric localization with tons of human testing requirements, go ahead and use any old file format.
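Returning to the database schema point in item 9, one common way to leave room for multi-language variants is to keep user-facing text in a separate translations table keyed by locale. The sketch below uses SQLite from the Python standard library. The table names, column names, sample data, and the fall-back-to-English rule are assumptions for illustration, not a schema this article prescribes.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Language-neutral facts about the product live here.
    CREATE TABLE product (
        id  INTEGER PRIMARY KEY,
        sku TEXT NOT NULL UNIQUE
    );
    -- One row per product per locale holds the user-facing text.
    CREATE TABLE product_translation (
        product_id INTEGER NOT NULL REFERENCES product(id),
        locale     TEXT NOT NULL,
        name       TEXT NOT NULL,
        PRIMARY KEY (product_id, locale)
    );
    """
)
conn.execute("INSERT INTO product (id, sku) VALUES (1, 'MUG-001')")
conn.executemany(
    "INSERT INTO product_translation (product_id, locale, name) VALUES (?, ?, ?)",
    [(1, "en", "Coffee mug"), (1, "de", "Kaffeetasse"), (1, "ja", "マグカップ")],
)


def product_name(product_id: int, locale: str, fallback: str = "en") -> Optional[str]:
    """Return the name in the requested locale, else in the fallback locale."""
    row = conn.execute(
        """
        SELECT name FROM product_translation
        WHERE product_id = ? AND locale IN (?, ?)
        ORDER BY locale = ? DESC  -- prefer the exact locale match
        LIMIT 1
        """,
        (product_id, locale, fallback, locale),
    ).fetchone()
    return row[0] if row else None


print(product_name(1, "de"))
print(product_name(1, "fr"))  # no French row, so the English name is used
```

Compared with a single `name` column on the product table, this layout lets you add a language by inserting rows, with no schema change.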
