Facebook and the Importance of Responsible AI


2021-10-29 09:00 Unbabel



Does the recent flurry of headlines about Facebook and the negative outcomes produced by its algorithms have you worried about the future and the implications of widespread AI usage? It's a rational response to an alarming news cycle. However, this situation shouldn't be interpreted as a death knell for the use of AI in human communications. It's more of a cautionary example of the disastrous consequences that can occur when AI is not used responsibly. Read on to learn more about ethical technology, data quality, and the significance of human-in-the-loop AI.

Facebook and dark AI

During a recent Senate hearing, a Facebook whistleblower testified about how the perils of language inequity can be multiplied exponentially when AI is used irresponsibly. Algorithms employed by Facebook prioritize engagement, which often leads to the spread of polarizing content (people love to engage with more extreme viewpoints). To combat the potential for polarization, Facebook relies on integrity and security systems to keep engagement-based ranking algorithms in check. But those systems only operate in certain languages, leaving users who speak other languages vulnerable.

"In the case of Ethiopia there are 100 million people and six languages. Facebook only supports two of those languages for integrity systems. This strategy of focusing on language-specific, content-specific systems for AI to save us is doomed to fail," the whistleblower said.

In many cases where the social media giant has neglected to remove inaccurate, hateful, and even violence-inciting content, that dangerous data is then used to train the next iteration of AI language models. Such models, having been fed large quantities of bad information, end up perpetuating these noxious linguistic patterns across the platform.

Facebook's decision (or nondecision) to allow its algorithms to run unchecked in many non-English-speaking countries is a harrowing example of "dark AI." Fueled by biased data and a lack of human oversight, AI can transform from an agent of positive exponential change into a sinister force capable of intentionally misleading and agitating populations en masse.

Combating AI bias by focusing on data quality

We believe that holding a high standard of data quality is essential for combating dark AI and reducing the capacity for technology to be a negative influence on society. While innovation in model development and evaluation is crucial for the evolution of AI, we must also place significant emphasis on the quality of the data used to train these models. As Google Brain co-founder Andrew Ng puts it, "Data is food for AI."

When researchers and engineers devalue data quality and rush to put a given model into production, this can lead to a phenomenon called "data cascades": an insidious process in which inadequate data eventually causes unanticipated downstream effects. As we have seen in the Facebook hearing, it's easy to overlook or conceal the fact that bad data can be the catalyst for a chain of distressing events.

The good news is that a growing number of organizations share our point of view that a data-centric approach to AI is critical for reducing unintended outcomes. Because the information used in machine learning is largely created or influenced by people, that data is capable of inheriting the biases of any humans who touch it. Biased data can ultimately reflect parts of our world that are untrue or that we are working to leave behind.

As an example of gender bias, if you type the sentence "The doctor spoke to the nurse" into Google Translate, the Portuguese translation will indicate that the doctor is male and the nurse is female. Of course, hundreds of years of historical texts contribute to the technology producing this outcome; it just doesn't reflect where we are now and how things have evolved for the better.
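To make the pattern concrete, here is a minimal sketch of running the same kind of probe against an open-source machine translation model rather than Google Translate. The Hugging Face transformers library and the Helsinki-NLP/opus-mt-tc-big-en-pt checkpoint are assumptions chosen purely for illustration, and the exact output depends on the model and version; the point is simply that a gender-neutral English sentence commonly comes back with a masculine doctor and a feminine nurse.

```python
# A rough sketch of probing a machine translation model for gendered defaults.
# Assumptions: the Hugging Face transformers library (plus torch and sentencepiece)
# is installed, and the public Helsinki-NLP/opus-mt-tc-big-en-pt checkpoint is used.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Helsinki-NLP/opus-mt-tc-big-en-pt"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# This multi-target checkpoint expects a language tag; ">>por<<" selects Portuguese.
source = [">>por<< The doctor spoke to the nurse."]
batch = tokenizer(source, return_tensors="pt", padding=True)
generated = model.generate(**batch)

for translation in tokenizer.batch_decode(generated, skip_special_tokens=True):
    # A typical result reads something like "O médico falou com a enfermeira",
    # assigning genders that the English source never specified.
    print(translation)
```

Running the same check across a longer list of occupation words gives a quick, if crude, picture of how strongly a given model defaults to stereotyped genders before it ever reaches production.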
Why humans are the key to responsible AI

Although AI technology has become incredibly advanced in the 21st century, we can't expect it to always be 100% accurate and to behave with the same rationality as a human being. That's why we think that human-in-the-loop AI is the best way to reduce the risk of the technology going rogue.

With Language Operations (LangOps), we have pioneered the use of large-scale, human-in-the-loop AI language translation. This approach combines the speed and efficiency of machine translation with a human translator's accuracy and ability to preserve cultural nuances. In our case, maintaining human intelligence and ethical judgment as part of the equation helps ensure that an organization's loyal customers are never misunderstood, overlooked, or offended, no matter what language they speak.

To mitigate the impact of potential bias and increase the quality of our data, we rely on our diverse community of editors to provide their perspectives from all across the globe. These one-of-a-kind humans review and refine machine translations, not only to ensure high-quality outcomes, but also to help train the AI engine so that it's better at understanding context and cultural differences in the future. At the end of the day, the more diverse humans there are in the loop to keep AI in check and teach it to behave more ethically, the better.

The true potential of language

Facebook's misuse of AI is a complex and challenging situation that has reached a breaking point after many years and many iterations of its technology. We're not saying we hold the answers to remedying such a difficult and pervasive problem. However, by raising awareness of some of the concepts that Unbabel and our partners support to further advance ethical technology, we may be able to help avoid a situation where other companies claim ignorance or turn a blind eye when their AI is doing more harm than good.

Interested in learning more about the importance of corporate responsibility and ethics in AI? Register for LangOps Universe 2021 to hear our VP of Engineering Jonathan Sowler discuss this topic in depth.

