4 Lessons Healthcare Teaches Us About Applying Large Language Models


2024-03-27 09:00



There’s been no shortage of new claims and ideas about what generative AI (GenAI) can, cannot, and should not do. Yet despite the hype, only a handful of real-world enterprise projects have applied the technology successfully. The healthcare industry is the exception, with a breadth of GenAI use cases under its belt: large language models (LLMs) for clinical decision support, patient journey trajectories, and efficient medical documentation, as well as tools that enable physicians to build best-in-class medical chatbots. Healthcare is making major strides in getting GenAI into production and showing immediate value. So, what can other practitioners take from healthcare’s best practices and lessons learned in applied AI? Here are four lessons from AI applications in healthcare.

The More Data, the Better

Many traditional healthcare LLMs consider only a patient’s diagnosis and age. But what if that were expanded to several multimodal records, such as demographics, clinical characteristics, vital signs, smoking status, past procedures, medications, and laboratory tests? Unifying these features creates a far more comprehensive view of the patient, and with it the potential for a more effective treatment plan.

Additional data can significantly improve model performance on downstream tasks such as disease progression prediction and disease sub-typing. Given the additional features and the interpretability they bring, LLMs can help physicians make more informed decisions about disease trajectories, diagnoses, and risk factors. It’s easy to see how this approach could be applied to a customer journey for marketers, or to risk assessment for insurance and financial companies; the potential is vast.

Clean Data Is Good Data

Combining structured data, like electronic health records and prescriptions, with unstructured data, like clinical notes, medical images, and PDFs, to create a complete view of a patient is critical.
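The unification step above can be sketched in a few lines. This is a minimal illustration, not any particular EHR integration: the record sources and field names (demographics, vitals, labs, medications) are hypothetical, and a real pipeline would add schema validation, de-duplication, and handling of unstructured sources.

```python
# A minimal sketch of unifying multiple record types into one patient view.
# Source names and fields are hypothetical, not from any specific EHR schema.

def unify_patient_view(demographics, vitals, labs, medications):
    """Merge per-patient records from several sources into one feature dict."""
    view = {}
    for source_name, records in [
        ("demographics", demographics),
        ("vitals", vitals),
        ("labs", labs),
        ("medications", medications),
    ]:
        for patient_id, fields in records.items():
            patient = view.setdefault(patient_id, {})
            # Prefix each field with its source to avoid name collisions.
            for key, value in fields.items():
                patient[f"{source_name}.{key}"] = value
    return view

# Example: two patients with partially overlapping records.
unified = unify_patient_view(
    demographics={"p1": {"age": 67, "sex": "F"}, "p2": {"age": 54, "sex": "M"}},
    vitals={"p1": {"bp_systolic": 142}},
    labs={"p1": {"hba1c": 7.9}, "p2": {"hba1c": 5.4}},
    medications={"p2": {"on_statin": True}},
)
```

The resulting per-patient dictionaries are the kind of wide, multimodal feature set the lesson describes: a model sees every available signal for a patient, and absent records (here, p2 has no vitals) simply leave gaps rather than blocking the merge.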
This data can then power a user-friendly interface, such as a chatbot, for gathering information about a patient or identifying a cohort of patients who are candidates for a clinical trial or research effort. It sounds straightforward, but privacy and data restrictions make this challenging for healthcare and other high-compliance environments. To get the most out of a chatbot while meeting regulatory requirements, healthcare users need solutions that turn noisy clinical data into a natural language interface that can answer questions automatically, at scale, and with full privacy. Because this cannot be achieved by simply applying an LLM or a retrieval-augmented generation (RAG) solution, it starts with a healthcare-specific data pre-processing pipeline. Other high-compliance industries, like law and finance, can take a page from healthcare’s book by preparing their data privately, at scale, on commodity hardware, and using models to query it.

Domain Experts Improve Accuracy

AI used to be only as useful as the data scientists and IT professionals behind enterprise-grade use cases. That is changing: no-code solutions are emerging, designed specifically for the most common healthcare use cases. The most notable is using LLMs to bootstrap task-specific models. Essentially, this lets domain experts start from a set of prompts and provide feedback that improves accuracy beyond what prompt engineering alone can deliver. The LLM’s outputs are then used to train small, fine-tuned models for that specific task.

This approach puts AI in the hands of domain experts, yields higher-accuracy models than LLMs deliver on their own, and runs cheaply at scale. It is particularly useful for high-compliance enterprises: no data sharing is required, and the zero-shot prompts and LLMs can be deployed behind an organization’s firewall.
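The bootstrapping workflow described above can be sketched as two steps: an LLM zero-shot-labels unlabeled text once, then a small, cheap model is trained on those weak labels. Everything here is a stand-in: the labeling function simulates an LLM call, the example task (smoking-status detection in clinical notes) is hypothetical, and the "fine-tuned model" is a toy word-frequency classifier rather than a real fine-tune.

```python
# A minimal sketch of bootstrapping a small task-specific model from an LLM.
# llm_zero_shot_label is a stand-in for a real LLM call; the task and notes
# are hypothetical examples, not from any real dataset.
from collections import Counter

def llm_zero_shot_label(text):
    """Stand-in for a zero-shot prompt such as:
    'Does this clinical note describe a smoker? Answer smoker or non-smoker.'"""
    return "smoker" if "smok" in text.lower() else "non-smoker"

def train_keyword_model(texts, labels):
    """Train a tiny model: per-label word frequencies (naive-Bayes-style)."""
    counts = {}
    for text, label in zip(texts, labels):
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def predict(model, text):
    """Score each label by how many of its training words appear in the text."""
    words = text.lower().split()
    scores = {label: sum(bag[w] for w in words) for label, bag in model.items()}
    return max(scores, key=scores.get)

# Step 1: the LLM labels unlabeled notes (the expensive part, run once).
notes = [
    "patient reports smoking one pack daily",
    "denies tobacco use, exercises regularly",
    "former smoker, quit in 2019",
    "no history of tobacco or alcohol use",
]
weak_labels = [llm_zero_shot_label(n) for n in notes]

# Step 2: train a small, cheap model on the LLM-labeled data.
model = train_keyword_model(notes, weak_labels)
```

After step 2, the LLM is no longer needed at inference time: the small model runs on commodity hardware behind the firewall, which is the cost and compliance win the lesson points to. Domain-expert feedback would slot in between the two steps, correcting weak labels before training.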
A full range of security controls, including role-based access, data versioning, and full audit trails, can be built in, making it simple for even novice AI users to track changes and keep improving models over time.

Ethical Development Builds Trust

Ensuring the reliability and explainability of AI-generated outputs is crucial to maintaining patient safety and trust in the healthcare system. Moreover, addressing inherent biases is essential for equitable access to AI-driven healthcare solutions across all patient populations. Collaborative efforts between clinicians, data scientists, ethicists, and regulatory bodies are necessary to establish guidelines for the responsible deployment of AI in healthcare and beyond.

It is for these reasons that the Coalition for Health AI (CHAI) was established. CHAI is a non-profit organization tasked with creating concrete guidelines and criteria for responsibly developing and deploying AI applications in healthcare. Working with the US government and the healthcare community, CHAI fosters a safe environment for deploying GenAI applications in healthcare, covering the specific risks and best practices to consider when building products and systems that are fair, equitable, and unbiased. Groups like CHAI could be replicated in any industry to ensure the safe and effective use of AI.

Conclusion

Healthcare is on the bleeding edge of GenAI, defined by a new era of precision medicine, personalized treatments, and improvements that will lead to better outcomes and quality of life. But this didn’t happen overnight; GenAI has been integrated into healthcare thoughtfully, addressing technical challenges, ethical considerations, and regulatory frameworks along the way. Other industries can learn a great deal from healthcare’s commitment to AI-driven innovations that benefit patients and society as a whole.
The above areas will be a focus of this year’s Healthcare NLP Summit: a free, virtual community event being held April 3-4 and highlighting real-world use cases of the technology.