This Week Health

Today: Mass General Brigham tests ChatGPT Clinical Decision Accuracy

"πΆβ„Žπ‘Žπ‘‘πΊπ‘ƒπ‘‡ π‘π‘Ÿπ‘œπ‘£π‘’π‘‘ 𝑏𝑒𝑠𝑑 𝑖𝑛 π‘šπ‘Žπ‘˜π‘–π‘›π‘” π‘Ž π‘“π‘–π‘›π‘Žπ‘™ π‘‘π‘–π‘Žπ‘”π‘›π‘œπ‘ π‘–π‘ , π‘€β„Žπ‘’π‘Ÿπ‘’ π‘‘β„Žπ‘’ 𝐴𝐼 β„Žπ‘Žπ‘‘ 𝟩𝟩% π‘Žπ‘π‘π‘’π‘Ÿπ‘Žπ‘π‘¦ 𝑖𝑛 π‘‘β„Žπ‘’ 𝑠𝑑𝑒𝑑𝑦, 𝑓𝑒𝑛𝑑𝑒𝑑 𝑖𝑛 π‘π‘Žπ‘Ÿπ‘‘ 𝑏𝑦 π‘‘β„Žπ‘’ π‘π‘Žπ‘‘π‘–π‘œπ‘›π‘Žπ‘™ 𝐼𝑛𝑠𝑑𝑖𝑑𝑒𝑑𝑒 π‘œπ‘“ πΊπ‘’π‘›π‘’π‘Ÿπ‘Žπ‘™ π‘€π‘’π‘‘π‘–π‘π‘Žπ‘™ 𝑆𝑐𝑖𝑒𝑛𝑐𝑒𝑠.

𝐼𝑑 π‘€π‘Žπ‘  π‘™π‘œπ‘€π‘’π‘ π‘‘-π‘π‘’π‘Ÿπ‘“π‘œπ‘Ÿπ‘šπ‘–π‘›π‘” 𝑖𝑛 π‘šπ‘Žπ‘˜π‘–π‘›π‘” 𝑑𝑖ffπ‘’π‘Ÿπ‘’π‘›π‘‘π‘–π‘Žπ‘™ π‘‘π‘–π‘Žπ‘”π‘›π‘œπ‘ π‘’π‘ , π‘€β„Žπ‘’π‘Ÿπ‘’ 𝑖𝑑 π‘€π‘Žπ‘  π‘œπ‘›π‘™π‘¦ 𝟨𝟒% π‘Žπ‘π‘π‘’π‘Ÿπ‘Žπ‘‘π‘’, π‘Žπ‘›π‘‘ 𝑖𝑛 π‘π‘™π‘–π‘›π‘–π‘π‘Žπ‘™ π‘šπ‘Žπ‘›π‘Žπ‘”π‘’π‘šπ‘’π‘›π‘‘ π‘‘π‘’π‘π‘–π‘ π‘–π‘œπ‘›π‘ , π‘’π‘›π‘‘π‘’π‘Ÿπ‘π‘’π‘Ÿπ‘“π‘œπ‘Ÿπ‘šπ‘–π‘›π‘” π‘Žπ‘‘ 𝟨πŸͺ% π‘Žπ‘π‘π‘’π‘Ÿπ‘Žπ‘π‘¦ π‘π‘Žπ‘ π‘’π‘‘ π‘œπ‘› π‘‘β„Žπ‘’ π‘π‘™π‘–π‘›π‘–π‘π‘Žπ‘™ π‘‘π‘Žπ‘‘π‘Ž π‘‘β„Žπ‘’ 𝐿𝐿𝑀 π‘€π‘Žπ‘  π‘‘π‘Ÿπ‘Žπ‘–π‘›π‘’π‘‘ π‘œπ‘›."



Today in Health IT: Mass General Brigham puts ChatGPT to the test on clinical decision accuracy. Today we're going to take a look at that study and see what the results were.

My name is Bill Russell. I'm a former CIO for a 16-hospital system and creator of This Week Health, a set of channels and events dedicated to leveraging the power of community to propel healthcare forward. We want to thank our show sponsors who are investing in developing the next generation of health leaders: Sure Test, Artisight, Parlance, CertifyHealth, Notable, and ServiceNow. Check them out at thisweekhealth.com/today.

Hey, one of the things you can do to help us out: share this podcast with a friend or colleague. Use it as a foundation for daily or weekly discussions on the topics that are relevant to you and the industry, and let them know they can subscribe wherever they listen to podcasts.

Okay, before we get into it, don't forget we have a great webinar this week: September 7th, 1:00 PM Eastern time, 10:00 AM Pacific time. It's part of our Leader Series, on our AI journey in healthcare thus far. We have three great guests: Michael Pfeffer from Stanford Health Care, Brent Lamm from UNC Health, and Chris Longhurst from UC San Diego Health. We're going to be talking about our AI journey thus far and what we are finding as we move forward through this path. And relevant to that is our topic for today.

All right. So we have a story that appeared in Healthcare IT News, which I haven't hit in a while, so it's interesting to actually get over there and see what they're writing about. This is Andrea Fox, August 29th, 2023, and the headline is "ChatGPT scores 72% in clinical decision accuracy," according to a Mass General Brigham study. The large language model's performance was steady across both primary and emergency care for all medical specialties, but it struggled with differential diagnosis, according to the new research by Mass General Brigham.

Let me go into it a little bit. The researchers put ChatGPT to the test to see if AI can work through an entire clinical encounter with a patient: recommending a diagnostic workup, deciding a course of action, and making a final diagnosis. Mass General Brigham researchers found the large language model to have impressive accuracy, despite limitations including possible hallucinations, which we know can happen. They put it through 36 published clinical vignettes and compared its accuracy on differential diagnosis, diagnostic testing, final diagnosis, and management based on patient age, gender, and case acuity.

Their findings: "No real benchmark exists, but we estimate this performance to be on the level of someone who just graduated from medical school, such as an intern or resident." And again, not to get too overly hyped on this, but this is a model that wasn't necessarily trained specifically on healthcare, and it's performing at the level of somebody who just graduated from medical school. That's from Dr. Marc Succi (S-U-C-C-I), associate chair of innovation and commercialization and strategic innovation leader at MGB, and executive director of its MESH Incubator innovation in operations research group. Wow, that's a big title.

The researchers said that ChatGPT achieved an overall accuracy of 71% in clinical decision-making across all 36 clinical vignettes. ChatGPT came up with possible diagnoses and made final diagnoses and care management decisions. They measured the popular LLM's accuracy on differential diagnosis, diagnostic testing, final diagnosis, and management in a structured, blinded process, awarding points for correct answers to the questions posed. Researchers then used linear regression to assess the relationship between ChatGPT's performance and the vignettes' demographic information.

According to the study, published this past week in the Journal of Medical Internet Research, ChatGPT proved best in making a final diagnosis, where the AI had 77% accuracy in the study, funded in part by the National Institute of General Medical Sciences. It was lowest-performing in making differential diagnoses, where it was only 60% accurate, and in clinical management decisions, underperforming at 68% accuracy based on the clinical data the LLM was trained on. The good news for those who have questioned whether ChatGPT can really outshine doctors' expertise: "ChatGPT struggled with differential diagnosis, which is the meat and potatoes of medicine, when a physician has to figure out what to do," Succi said.
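To make the methodology concrete, here is a minimal sketch of that kind of analysis: points-based scoring rolled up into an overall accuracy, followed by a linear regression of per-vignette performance on vignette characteristics. The data and field names here are my own illustrative assumptions, not the study's actual dataset or code.

```python
# Illustrative sketch of a points-based accuracy score plus a regression
# on vignette demographics. All numbers below are hypothetical toy data.
import numpy as np

# Hypothetical per-vignette results: (age, acuity 0/1, points earned, points possible)
vignettes = [
    (34, 0, 7, 10),
    (61, 1, 8, 10),
    (45, 0, 6, 10),
    (70, 1, 9, 10),
]

# Overall accuracy = total points earned / total points possible
earned = sum(v[2] for v in vignettes)
possible = sum(v[3] for v in vignettes)
accuracy = earned / possible
print(f"overall accuracy: {accuracy:.0%}")  # 75% for this toy data

# Linear regression of per-vignette score on demographics (age, acuity),
# analogous to testing whether performance varied with vignette characteristics.
X = np.array([[1.0, v[0], v[1]] for v in vignettes])  # intercept, age, acuity
y = np.array([v[2] / v[3] for v in vignettes])        # per-vignette accuracy
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("intercept, age, acuity coefficients:", np.round(coef, 3))
```

In a result like the study's, the demographic coefficients would be near zero, i.e., performance steady across patient age and case acuity.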
That is important because it tells us where physicians are truly experts and adding the most value: in the early stages of patient care, with little presenting information, when a list of possible diagnoses is needed. Before tools like ChatGPT can be considered for integration into medical or clinical care, more benchmark research and regulatory guidance are needed, according to Succi. MGB next is looking at whether AI tools can improve patient care and outcomes in hospitals' resource-constrained areas.

This is interesting. I look at this and I love the fact that this research is going on. I love the fact that we're building this body of knowledge to see where the value is added, either by the technology or by the clinician, so we can then optimize the overall use of the technology in a safe way across the board. This is going to be part of our learning curve, part of our AI journey, if you will, as we try to figure out where this can be used in a safe and effective manner.

And what's my "so what" on this? I am again surprised at the pace. It was really only a little over a year ago that ChatGPT became part of the conversation. Do you remember sitting around the Thanksgiving table or over Christmas break having conversations about ChatGPT, about whether you could believe what this thing was doing? And here we are, I don't know, nine months later, and we're talking about it actually diagnosing patients. It's that kind of thing that is really lending itself to the hype being created. Now, we have to stay measured as we look at these outcomes. We have to understand what is required in order to utilize this technology safely and effectively. And this is why we're talking a fair amount with decision makers about the policies they're putting in place, and the protections they're putting in place, on the use of these technologies. We're also pushing them on the other side, in terms of experimentation: where are you experimenting with this technology? So my "so what" is this: put the guardrails up, and then experiment as much as possible within your health system, given your budget and your resources.

All right, that's all for today. Don't forget to share this podcast with a friend or colleague. And we want to thank our channel sponsors who are investing in our mission to develop the next generation of health leaders: Sure Test, Artisight, Parlance, CertifyHealth, Notable, and ServiceNow. Check them out at thisweekhealth.com. Thanks for listening. That's all for now.

Want to tune in on your favorite listening platform? Don't forget to subscribe!

Transform Healthcare - One Connection at a Time

Β© Copyright 2023 Health Lyrics All rights reserved