Reproducibility of GPT4 in Healthcare - An Experiment

September 24, 2023 | Source: LinkedIn

Bill Russell examines the reliability of generative AI in healthcare. Using OpenAI's function calling in a web app (gptexperiments.patient.dev), he submitted triage nurse blurbs and asked the model for the top five likely conditions. Despite identical input, the model returned varying answers across runs. Of the diagnoses returned, 98% related to gallbladder disease. The Jaccard similarity across runs was reported as 3.41%, which the article reads as moderate variation in the generated diagnoses. Russell sees potential for function calling in healthcare apps but is concerned about the variation, though he believes imperfect results can still be useful. Repeating the experiment with the temperature set to zero produced less variation.
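For readers unfamiliar with the metric, Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The sketch below shows one way to score agreement between repeated runs of a top-five diagnosis list. It is an illustration under stated assumptions, not Russell's method: the summary does not say how the figure was aggregated, so averaging over all pairs of runs is an assumption, and the diagnosis lists and function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|, treating both inputs as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention: two empty sets count as identical
    return len(set_a & set_b) / len(union)


def mean_pairwise_jaccard(runs):
    """Average Jaccard similarity over every pair of runs (assumed aggregation)."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical top-five outputs for the same triage blurb; not data from the article.
runs = [
    ["cholecystitis", "cholelithiasis", "pancreatitis", "peptic ulcer", "hepatitis"],
    ["cholecystitis", "cholelithiasis", "pancreatitis", "gastritis", "appendicitis"],
    ["cholecystitis", "biliary colic", "pancreatitis", "peptic ulcer", "hepatitis"],
]

if __name__ == "__main__":
    print(f"mean pairwise Jaccard: {mean_pairwise_jaccard(runs):.2%}")
```

Because the comparison is on exact strings, two runs that name the same condition differently ("cholelithiasis" versus "gallstones") count as disagreeing, so the score depends on how diagnoses are normalized before comparison.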
