EAN 2025: ChatGPT-4o Rivals Neurology Experts in Diagnosing and Managing Polyneuropathy - EMJ

GPT-4o, the latest ChatGPT model, improves diagnostic accuracy for polyneuropathy among non-specialist neurologists and matches experts in recommending confirmatory tests, according to new research presented at the 11th EAN Congress in Helsinki, Finland. 

Accurate diagnosis and management of polyneuropathies remain a challenge, particularly for non-specialists, due to the sheer variety of potential causes and the complexity of clinical presentation. Artificial intelligence models such as GPT-4o have shown promise in supporting clinical decision-making, but their real-world utility compared to human expertise has not been fully explored. 

In this international study, 100 confirmed polyneuropathy cases from tertiary care centres were presented to GPT-4o, which generated a leading diagnosis, two differential diagnoses, and a recommended confirmatory test using a zero-shot chain-of-thought prompt. The same cases were also reviewed by 26 neurologists (14 specialists and 12 non-specialists) from 19 centres in 10 countries, both before and after seeing GPT-4o’s suggestions.

GPT-4o demonstrated high inter-output reliability (Cohen’s kappa = 0.8, p < 0.001) and outperformed non-specialists in identifying the correct leading diagnosis (65.5% vs 54.4%, p = 0.007), though it remained less accurate than specialists (73.9%, p = 0.024). When including differential diagnoses, GPT-4o’s accuracy rose to 82%, again surpassing non-specialists (68.5%, p < 0.001) but still below specialists (88.1%, p = 0.042). The AI model matched specialists in recommending the appropriate confirmatory investigation (68.0% vs 67.3%, p = 0.874), and significantly outperformed non-specialists (45.3%, p < 0.001). Notably, non-specialists improved their diagnostic accuracy after reviewing GPT-4o’s output (from 54.4% to 57.0%, p = 0.007), while specialists showed only a minor, non-significant increase.
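For readers unfamiliar with the reliability statistic quoted above, Cohen's kappa measures how often two sets of labels agree beyond what chance alone would produce. The report does not describe how the study calculated it, so the following is only a minimal sketch of the standard formula; the function name and the diagnosis labels are hypothetical and are not study data.

```python
from collections import Counter


def cohens_kappa(run_a, run_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("both runs must be non-empty and of equal length")
    n = len(run_a)

    # Observed agreement: share of cases where both runs give the same label.
    observed = sum(a == b for a, b in zip(run_a, run_b)) / n

    # Chance agreement: for each label, the product of its frequency in each run.
    counts_a = Counter(run_a)
    counts_b = Counter(run_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both runs used one identical label throughout; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical leading diagnoses from two repeated model runs on six cases
# (illustrative labels only, not taken from the study).
run_1 = ["CIDP", "diabetic", "CIDP", "hereditary", "diabetic", "CIDP"]
run_2 = ["CIDP", "diabetic", "hereditary", "hereditary", "diabetic", "CIDP"]

print(round(cohens_kappa(run_1, run_2), 2))  # prints 0.75
```

In this toy example the two runs agree on five of six cases (83%), but a third of that agreement would be expected by chance, which is why kappa is lower than the raw agreement rate.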

These findings suggest that supervised use of GPT-4o could help bridge the expertise gap in the diagnosis and management of polyneuropathy, particularly in settings where specialist knowledge is scarce. For clinical practice, AI tools like GPT-4o could offer valuable second opinions, guide confirmatory testing, and support early triage or referral decisions, especially in rural or resource-limited environments. However, the model’s limitations, such as over-reliance on laboratory data and occasional misinterpretation of clinical details, highlight the need for careful clinical oversight. Ongoing research will determine the best ways to integrate such tools into practice, ensuring that AI augments rather than replaces clinical judgement.

Reference 

De Lorenzo A et al. ChatGPT-4o in diagnosis and management of real-life polyneuropathy cases: comparative analysis with neurologists. Abstract OPR-023. EAN Congress, 21-24 June, 2025.   

Each article is made available under the terms of the Creative Commons Attribution-Non Commercial 4.0 License.
