Can AI Match Human Therapists in Mental Healthcare?

Artificial intelligence (AI) is transforming the landscape of mental health care, with large language models (LLMs) such as ChatGPT showing considerable promise. These advanced AI systems, capable of understanding and generating nuanced language, are being explored for various clinical tasks, ranging from psychoeducation and drafting treatment plans to offering companionship and even conducting elements of therapy. As mental health services struggle to meet increasing global demand, particularly post-COVID-19, generative AI presents a potential means to fill critical gaps in care.

This systematic review assessed the capabilities of LLMs in simulating clinical competencies typically associated with trained mental health professionals. Findings suggest that LLMs are most proficient in delivering psychoeducation, explaining conditions and treatment options, yet their abilities in more complex tasks such as assessment, diagnosis, and culturally sensitive care remain limited. Most evaluations used zero-shot prompting, a method in which the AI is asked to respond to questions without any prior examples. While convenient, this approach limits the depth of evaluation and overlooks the gains possible with more advanced prompting techniques such as few-shot or chain-of-thought (CoT) prompting.
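To make the distinction concrete, the sketch below illustrates how the three prompting styles named above differ in structure. The question and example texts are hypothetical, chosen only for illustration; the functions simply assemble prompt strings and make no calls to any model or API.

```python
# Illustrative prompt construction for zero-shot, few-shot, and
# chain-of-thought (CoT) prompting. No model is invoked here.

QUESTION = ("A patient reports low mood and poor sleep for three weeks. "
            "What might this suggest?")

def zero_shot(question: str) -> str:
    # Zero-shot: the model is asked directly, with no worked examples.
    return f"Question: {question}\nAnswer:"

def few_shot(question: str, examples: list[tuple[str, str]]) -> str:
    # Few-shot: a handful of (question, answer) pairs precede the new question,
    # showing the model the expected format and register.
    shots = "\n".join(f"Question: {q}\nAnswer: {a}" for q, a in examples)
    return f"{shots}\nQuestion: {question}\nAnswer:"

def chain_of_thought(question: str) -> str:
    # CoT: the model is explicitly prompted to reason step by step
    # before giving its answer.
    return f"Question: {question}\nLet's think step by step:"

# Hypothetical worked example for the few-shot prompt.
examples = [
    ("A patient describes persistent worry about everyday events. "
     "What might this suggest?",
     "Possibly generalised anxiety; a structured clinical assessment "
     "would be needed to confirm."),
]

print(zero_shot(QUESTION))
print(few_shot(QUESTION, examples))
print(chain_of_thought(QUESTION))
```

The point of contrast: a zero-shot evaluation tests only the model's unguided response, whereas few-shot and CoT prompts can substantially change output quality, which is why the review notes that zero-shot-only studies may understate LLM capabilities.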

The review also found a disparity in evaluations: clinicians often viewed LLM responses positively, while users, particularly those from non-Western or non-English-speaking backgrounds, highlighted limitations in cultural understanding and relevance. This raises concerns around equity, as most LLMs are trained on data heavily biased towards Western, English-speaking contexts. Without diverse training datasets, GenAI tools risk misinterpreting culturally specific expressions of distress and providing inappropriate advice.

While GenAI shows potential in mental health applications, especially for expanding access and supporting professionals, more rigorous research is needed. This includes refining evaluation methods, incorporating diverse user feedback, and ensuring cultural competence. Only then can LLMs become trusted, inclusive tools capable of supporting mental health care in a safe and effective manner.

Reference

Wang L et al. Evaluating generative AI in mental health: systematic review of capabilities and limitations. JMIR Ment Health. 2025;12:e70014.


Each article is made available under the terms of the Creative Commons Attribution-Non Commercial 4.0 License.
