AI Chatbot Outperforms Clinicians in Diagnosis Probability

ARTIFICIAL intelligence (AI) chatbots, specifically the large language model (LLM) ChatGPT-4 (OpenAI, San Francisco, California, USA), outperformed human clinicians in probabilistic reasoning when estimating the probability of a diagnosis following a negative test result, according to a recent study led by Adam Rodman, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA.

Probabilistic reasoning, the ability to make decisions based on calculating odds, is a challenging aspect of diagnosis. In the study, researchers provided ChatGPT-4 with the same five clinical cases used in a national practitioner survey (n=553), covering conditions such as pneumonia, breast cancer, asymptomatic bacteriuria, coronary artery disease, and urinary tract infection. The chatbot adjusted its estimates after receiving test results for each case.
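To illustrate what updating an estimate after a test result involves, the following is a minimal sketch of the standard likelihood-ratio form of Bayes' theorem. It is not the method used by the study or by the chatbot; the function name and the pre-test probability, sensitivity, and specificity values are hypothetical and chosen only for illustration.

```python
def post_test_probability(pretest: float, sensitivity: float,
                          specificity: float, positive: bool) -> float:
    """Update a pre-test probability after a test result (illustrative only)."""
    for name, value in (("pretest", pretest), ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    # Likelihood ratio: how much the result shifts the odds of disease.
    if positive:
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        likelihood_ratio = (1.0 - sensitivity) / specificity

    # Convert probability to odds, apply the ratio, convert back.
    pre_odds = pretest / (1.0 - pretest)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical inputs: 20% pre-test probability, 90% sensitivity, 80% specificity.
after_negative = post_test_probability(0.20, 0.90, 0.80, positive=False)
after_positive = post_test_probability(0.20, 0.90, 0.80, positive=True)
print(f"After a negative result: {after_negative:.1%}")
print(f"After a positive result: {after_positive:.1%}")
```

With these assumed inputs, a negative result lowers the estimate from 20% to about 3%, while a positive result raises it to about 53%.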

The results indicated that the LLM exhibited less error than clinicians in both pre- and post-test probability estimates, particularly for negative test results, across all five cases. For example, in the case of asymptomatic bacteriuria, the LLM had a median pre-test probability of 26%, compared to 20% for clinicians, and a mean absolute error of 26.2, compared to 32.2 for clinicians.
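For readers unfamiliar with the metric, mean absolute error is the average distance between each estimate and a reference value. The sketch below shows the general calculation only; the study's exact procedure is not described here, and the function name, the estimates, and the reference value are hypothetical.

```python
from statistics import fmean


def mean_absolute_error(estimates: list[float], reference: float) -> float:
    """Average absolute distance between each estimate and a reference value."""
    if not estimates:
        raise ValueError("estimates must not be empty")
    return fmean(abs(estimate - reference) for estimate in estimates)


# Hypothetical probability estimates (in %) against a hypothetical reference of 5%.
example_estimates = [10.0, 30.0, 50.0]
print(mean_absolute_error(example_estimates, reference=5.0))
```

A lower value means the estimates sit closer, on average, to the reference probability.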

However, the LLM did not perform as well when faced with positive test results. It demonstrated greater accuracy than clinicians in two cases, similar accuracy in two cases, and less accuracy in one case.

Rodman highlighted that humans sometimes perceive a higher risk than exists after a negative test result, leading to unnecessary treatments, additional tests, and medications. While the LLM is not perfect, its ease of use and potential integration into clinical workflows could contribute to better decision-making.

The study emphasised the need for future research into the collective use of AI in healthcare. Despite limitations in the study design, such as a simple prompt strategy and inclusion of simplistic cases, the findings underscore the potential for AI to enhance diagnostic processes and decision-making in medical settings.
