September 24, 2023

Reproducibility of GPT4 in Healthcare - An Experiment

Summary

Bill Russell examines the reliability of generative AI in healthcare. Using OpenAI's function calling in a web app (gptexperiments.patient.dev), he submitted triage nurse blurbs and asked for the top five likely conditions. Even with the same input, the AI returned different answers from run to run. Of the diagnoses returned, 98% related to gallbladder disease. The Jaccard similarity across runs was reported as 3.41%, which the summary describes as moderate variation in the generated diagnoses. Russell sees potential for function calling in healthcare apps but worries about the variation, though he believes imperfect results can still be useful. Repeating the experiment at zero temperature produced less variation.
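For readers unfamiliar with the metric, Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The sketch below shows one way the overlap between repeated runs could be scored, assuming each run is reduced to a set of diagnosis labels and the scores are averaged over all pairs of runs. The summary does not say how the article's figure was actually computed, so the averaging step, the function names, and the sample diagnosis lists are all illustrative assumptions, not data from the experiment.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention: two empty sets count as identical
    return len(set_a & set_b) / len(union)


def mean_pairwise_jaccard(runs):
    """Average Jaccard similarity over every unordered pair of runs.

    Assumption: pairwise averaging is one plausible way to summarize
    run-to-run agreement; the article may have used something else.
    """
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical top-five lists from three runs on the same input.
# These labels are made up for illustration only.
runs = [
    ["cholecystitis", "cholelithiasis", "pancreatitis", "peptic ulcer", "hepatitis"],
    ["cholecystitis", "cholelithiasis", "pancreatitis", "gastritis", "appendicitis"],
    ["cholecystitis", "biliary colic", "pancreatitis", "peptic ulcer", "hepatitis"],
]

if __name__ == "__main__":
    print(f"mean pairwise Jaccard: {mean_pairwise_jaccard(runs):.3f}")
```

A score of 1.0 means every run returned the same set of conditions, and 0.0 means no two runs shared any condition. Because the comparison is on exact labels, near-synonyms such as two names for the same gallbladder condition count as different, which can push the score down even when the runs broadly agree.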


Explore Related Topics

AI & Machine Learning
EHR/EMR (Electronic Health Records/Electronic Medical Records)

