Google Translate Not Ready for Use in Medical Emergencies But Improving Fast — Study


2021-03-16 20:50 slator


In recent years, communication between medical professionals and patients with limited English proficiency (LEP) has attracted attention and big bucks from a range of players across the US, from Google’s USD 100m investment in telehealth platform Amwell to AMN Healthcare’s USD 475m acquisition of remote interpreting provider Stratus Video.

As first covered by The Verge on March 9, 2021, a new study published in the Journal of General Internal Medicine has now shifted the focus from interpreting to translation. While general materials, such as information about medical conditions and diagnoses, are typically translated in advance into commonly spoken languages, patient-specific discharge instructions present a gap that is often bridged by machine translation.

The study is a joint effort between Dr Lisa C. Diamond of Memorial Sloan-Kettering Cancer Center in New York and, from Olive View-UCLA Medical Center in California, Drs Breena R. Taira and Vanessa Kreger and nurse practitioner Aristides Orue. Their goal: to objectively assess the accuracy of Google Translate for discharge instructions given to patients leaving the ER.

As the paper explains, past research on Google Translate in medical contexts has produced mixed results about usability, depending on the languages studied and the version of Google Translate’s algorithm available at the time. An article published in JAMA Internal Medicine in 2019 concluded that, for Spanish and Chinese, Google Translate could supplement but not replace written English instructions (interpreted for the LEP patient), and should include a warning about potentially inaccurate translations.

The UCLA / Sloan-Kettering study differentiates itself from others in a few key ways. The team deliberately analyzed translations into both widely spoken languages (Spanish, Chinese, Vietnamese, Filipino, and Korean) and languages of lesser diffusion that are commonly spoken in their area and thus likely to be encountered in the ER (e.g., Armenian and Farsi). Since Google Translate improves based on user feedback, it performs differently for languages of lesser diffusion than for languages with many users worldwide.

Another key difference is the selection of reviewers. The 20 volunteers who evaluated the English discharge instructions and translations were bilingual community members with no professional experience as linguists or as healthcare workers, which the researchers believe more accurately represents typical levels of patient comprehension of medical text. (Most other comparable studies assign this work to translators.)

The volunteers analyzed 20 free-form written patient discharge statements frequently used in the ER, and the corresponding translations into seven languages, for a total of 400 translations evaluated. Although mean scores for fluency, adequacy, meaning, and severity were high, they varied significantly by language. “Overall, [Google Translate] accurately conveyed the meaning of 330/400 (82.5%) instructions examined but the accuracy varied by language from 55 to 94%,” the authors wrote, describing some of the errors as “nonsensical.” As expected, Spanish and Chinese translations were most accurate (94 and 82%, respectively), while Armenian and Farsi had accuracy rates of 55 and 67.5%.

The bilingual reviewers also pointed out several language-specific issues that might further impede understanding, such as the different writing systems used for traditional versus simplified Chinese, as well as differences among variants of Farsi, including Dari and Tajik. Notably for Farsi, which is written from right to left, Google Translate initially displayed the text from left to right, rendering the translations illegible.

The accuracy rates for Chinese and Spanish translations, based on community volunteers’ assessments, nearly matched those established by professional translators in the 2019 JAMA Internal Medicine study. As the researchers put it: “Our accuracy rates for these two languages as assessed by volunteers from the community were almost identical […] to those of professional translators.” The researchers further noted: “This is important information for future work in this area as the difference between patient perception of machine translations and a professional translator’s perception has been an ongoing question.”

Citing the inconsistent performance between languages, the authors concluded, “Although the future of written translation in hospitals is likely machine translation, [Google Translate] is not ready for prime time use in the emergency department.” For the time being, best practice relies on professional interpreters.