Content Marketing in the Age of NLP and BERT

2020-09-08 05:30 RWS Moravia Insights

Technological innovations like voice-based digital assistants have changed the ways that users search for and interact with content. Search engines like Google have sought to adapt to these changes by upgrading their algorithms to produce more relevant and user-friendly search results. Google has achieved this with its recent unveiling of BERT.

What is BERT?

BERT, short for Bidirectional Encoder Representations from Transformers, is Google’s relatively new neural network technique for natural language processing (NLP) pre-training. BERT was the product of research on transformers conducted by Jacob Devlin and his colleagues at Google in 2018. Transformers are models that process each word in relation to all the other words in a sentence, rather than looking at words one by one in sequential order.

BERT is the most significant change in search since Google released RankBrain in 2015. BERT was rolled out in October 2019 for English-language queries and was expanded to 70 languages in December of the same year. Here’s a history of Google updates from Moz.

How does BERT work?

Computers have never been good at understanding language. To improve their comprehension, researchers turned to NLP. Common NLP tasks include named entity recognition, classification and question answering, yet a significant limitation of earlier NLP models is that each is only suited to solving one specific language task.

BERT overcomes this limitation: it can essentially handle all NLP tasks. Researchers developed BERT through a technique known as masking, which involves hiding a random word in a sentence. BERT looks at the words before and after the masked word to predict what that word is (this is what makes it a bi-directional model). Through repetition, the system gets better at predicting the masked word and at understanding language in general. In essence, BERT helps train cutting-edge question and answer systems.
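To make the masking idea concrete, here is a minimal sketch using the open-source Hugging Face transformers library and a publicly released BERT checkpoint. The article does not mention any specific tooling, so treat this as an illustration of the technique rather than the system Google runs inside Search.

```python
# A minimal sketch of BERT-style masked-word prediction, using the open-source
# Hugging Face "transformers" library. Illustrative only; this is not the
# system Google runs inside Search.
from transformers import pipeline

# Load a pre-trained BERT model with its masked-language-model head.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# BERT reads the words on both sides of [MASK] and predicts the hidden word.
predictions = unmasker("The traveller needs a visa to enter the [MASK].")

# Print the top candidate words with their confidence scores.
for p in predictions:
    print(f"{p['token_str']}: {p['score']:.3f}")
```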
Why should we care about BERT?

The purpose of BERT, or any Google algorithm, is to help users find relevant and useful content. For instance, Google provides an illustration of search results before and after BERT for the search phrase “2019 Brazil traveller to USA need visa” (sic). Google points out that the word “to” is particularly important to understanding the meaning of the phrase: the traveller is looking to go from Brazil to the US, not the other way around. RankBrain and earlier Google algorithms were unable to understand the context of “to”, so they also returned results for US citizens travelling to Brazil. BERT, however, picks up on the nuance of “to” and only returns results for users looking to go to the US.

Google estimates that BERT will initially help Search better understand one in ten English-language searches in the US and will impact ranking for featured snippets. (Featured snippets are the most desired real estate in search rankings.) With so much at stake, marketers must change the ways that they write content and closely monitor the changes Google and other search engines apply to their algorithms, to ensure that their online assets employ the latest best practices for optimal visibility.

How to rank well with BERT

Following the implementation of BERT, it is more important than ever that marketers write quality content. In the past, marketers attempted to rank well by stuffing a number of high-value keywords into a piece of content. This SEO method produced results in terms of ranking but significantly impacted readability. With BERT, this kind of poorly written content will have difficulty finding its way to the top of the search rankings.

This is not to say that you should abandon SEO best practices. You will still want to adhere to foundational SEO techniques, including researching and transcreating keywords (particularly long-tail keywords), internal and external links, headings and so on. But you will want to use these strategies in a way that does not sacrifice the quality of your content.

So, how do you go about writing good content that ranks well in the BERT era? Start by focusing on the following six tactics.

Match questions to answers exactly

When crafting an answer, think of the one- or two-sentence response that Siri or Alexa would give that exactly matches the question. The basic format for answering a question should be [entity or subject matter] is [answer]. Imagine a user enters “where is Disney World?” The entity or subject is Disney World, so your content should include a simple answer to this question in the suggested format: “Disney World is located in Orlando, Florida”.
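To see why this exact-answer format helps, here is a small sketch, again using the Hugging Face transformers library with a publicly available BERT-style extractive question-answering model (an assumption for illustration, not the model behind Google Search): when the content states the answer plainly, the model can lift the answer span straight out of the text.

```python
# A small sketch of extractive question answering with a BERT-style model,
# using the open-source Hugging Face "transformers" library. Illustrative only;
# this is not how Google Search is implemented.
from transformers import pipeline

# Without an explicit model, the pipeline downloads a default
# SQuAD-fine-tuned, BERT-style question-answering model.
qa = pipeline("question-answering")

# Content written in the suggested [entity] is [answer] format.
content = (
    "Disney World is located in Orlando, Florida. "
    "The resort is open year-round and welcomes millions of visitors."
)

result = qa(question="Where is Disney World?", context=content)
print(result["answer"], result["score"])  # the extracted span and its confidence
```

The clearer and more self-contained the answer sentence is, the easier it is for a model like this to extract it, which is the same property that makes content a strong candidate for featured snippets and voice-assistant answers.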
Identify units and classifications

Pay attention to the meanings of words and whether they imply specific units and/or classifications. NLP will look for these when determining whether content contains the answer to a question. For example, consider the query “temperature to cook pork chops”. Fahrenheit and Celsius are both units of temperature, so the answer must include the number and the unit of measurement. A viable answer is “Cook all raw pork chops to a minimum internal temperature of 145°F”. Note that if you were writing for an audience in Europe, you would provide your answer in Celsius.

Get to the point

While long, flowing prose may sound nice and even have some artistic quality to it, BERT does not like it. One of the most important things you can do is answer a question as clearly and concisely as possible.

Use keywords naturally

We mentioned above that keyword stuffing impacts readability and gets you penalized by Google. With BERT, you really need to get rid of that practice. Instead, use individual keywords as you would in a normal conversation; they must not sound forced. Long-tail keywords have excellent SEO value, but they must be used in a way that caters to voice-based queries.

Use related keywords

Related keywords are those that commonly appear alongside a specific term. Using related keywords improves the relevancy of the target keyword. These terms should be identified before writing and sprinkled into the content naturally. You are not trying to rank for these related terms, but to improve the recognition or understanding of the target phrase. For example, if you were trying to rank for “content marketing”, you could include related terms like SEO, SERP, word count, meta tags and so on to make the page more relevant.

Answer all aspects of a question

Get into the habit of following a query all the way through to answering the follow-up or additional questions a user may have. In other words, your content should strive to answer all the questions that a user could pose when they conduct a search. And as mentioned before, the more precise and useful your content is, the better it will rank. For example, if the user searches for “checking account”, you may also want to address “types of checking accounts” and “how to open a checking account”.

Pay attention to the usage of certain words

In the Brazil-to-US traveller example mentioned above, we highlighted the significance of the word “to”. Some additional words to pay attention to include but, not, except, from, in and about. With BERT, the meanings of these words matter; make sure you are using them in a way that is consistent with the intent of your content.

BERT and other NLP technologies will continue to change the ways that Google and other search engines rank content, placing an increased emphasis on the user experience by returning search results that are more conversational and relevant. To optimize performance with BERT, it is more important than ever for marketers to deliver high-quality content. You don’t want to be behind the curve compared to your competitors. Get started by following the tips in this blog post, and continue to monitor the use of BERT and other NLP models to stay up to date on content marketing best practices.