Reproducibility of GPT4 in Healthcare - An Experiment

Source: LinkedIn

Bill Russell examines the reliability of generative AI in healthcare. Using OpenAI's function calling in a web app, gptexperiments.patient.dev, he submitted triage nurse blurbs and asked the model for the top five likely conditions. Given the same input, the model returned varying answers across runs. 98% of the diagnoses were related to gallbladder disease. The Jaccard similarity across runs was 3.41%, indicating moderate variation in the generated diagnoses. Russell sees potential for function calling in healthcare apps but is concerned about the variation, though he believes imperfect results can still be useful. Repeating the experiment with the temperature set to zero produced less variation.
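For readers unfamiliar with the metric, Jaccard similarity between two sets is the size of their intersection divided by the size of their union. The summary does not say how the per-run scores were aggregated, so the sketch below assumes a mean over all pairs of runs; the function names and the sample diagnosis lists are hypothetical and are not taken from the experiment.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets are identical
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(runs):
    """Average Jaccard similarity over every pair of runs (assumed aggregation)."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical top-five lists from three runs on the same triage blurb.
runs = [
    ["cholecystitis", "cholelithiasis", "pancreatitis", "peptic ulcer", "hepatitis"],
    ["cholecystitis", "cholelithiasis", "pancreatitis", "gastritis", "appendicitis"],
    ["cholecystitis", "biliary colic", "pancreatitis", "peptic ulcer", "hepatitis"],
]

score = mean_pairwise_jaccard(runs)
print(f"mean pairwise Jaccard similarity: {score:.2%}")
```

A score of 100% would mean every run returned exactly the same set of conditions, and lower scores mean the runs overlap less. Note that this comparison ignores ordering within each top-five list and treats differently worded diagnoses as distinct.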
