GPT-4 outperformed 99.98% of simulated human readers in diagnosing complex clinical cases

November 28, 2023
Contributed by: Bill Russell
OpenAI's GPT-4 correctly diagnosed 52.7% of complex cases in a study, outperforming 99.98% of simulated medical journal readers. The model was tested on 38 cases from 2017 to 2023, and its answers were compared with 248,614 answers from journal readers. Most of the cases came from infectious disease, endocrinology, and rheumatology. Overall, GPT-4 outperformed human readers, diagnosing 57% of cases with good reproducibility. The model's accuracy did not appear to be linked to its training data. The researchers call for clinical trials to establish safety and efficacy, and they caution about the ethical implications of AI. Even allowing for the study's limitations, GPT-4 still outperformed 72% of human readers.
