INTRODUCTION
Rapid advances in artificial intelligence (AI), especially large language models such as ChatGPT (OpenAI, San Francisco, California, USA), hold potential for various applications in healthcare.1–6 This study aims to assess the accuracy of ChatGPT in responding to post-operative patient enquiries after surgery for benign prostatic hyperplasia.
METHODS
Common post-operative questions were collected from discharge instruction booklets, online forums, and social media platforms. Surgeries of interest included transurethral resection of the prostate (TURP), simple prostatectomy, laser enucleation of the prostate, Aquablation, Rezum, greenlight photovaporisation of the prostate, Urolift, and iTIND. ChatGPT 3.5 outputs were graded by two independent senior urology residents using pre-defined evaluation criteria. A third senior reviewer resolved grading discrepancies. Response errors were categorised by type. Categorical variables were analysed using the chi-square test. Inter-rater agreement was measured using Cohen’s kappa coefficient.
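The two statistics named above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the grade labels (1 to 4), and the example counts are all hypothetical and chosen only to show how the quantities are computed.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters grading the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per grade.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical grade throughout
    return (observed - expected) / (1 - expected)


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / total
            stat += (obs - exp) ** 2 / exp
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical grades (1 = correct ... 4 = completely inaccurate), not study data.
grades_a = [1, 1, 2, 3, 1, 4, 1, 2]
grades_b = [1, 1, 2, 2, 1, 4, 1, 3]
kappa = cohens_kappa(grades_a, grades_b)

# Hypothetical 2 x 2 table: rows = procedure group, columns = correct / not correct.
stat, dof = chi_square_statistic([[10, 20], [20, 10]])
print(f"kappa = {kappa:.3f}, chi-square = {stat:.3f}, dof = {dof}")
```

The sketch stops at the test statistic; a p-value would in practice come from a statistics package (for example `scipy.stats.chi2_contingency`), which is likewise an assumption about tooling rather than something reported in the study.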
RESULTS
Two reviewers evaluated a total of 496 questions, of which 280 were excluded. Of the 216 graded responses, 78.2% were comprehensive and correct, 9.3% were incomplete or partially correct, 10.2% were misleading or contained a mix of accurate and inaccurate information, and 2.3% were completely inaccurate (Figure 1). A higher percentage of correct answers was observed for newer procedures (Aquablation, Rezum, iTIND) than for older procedures (TURP, simple prostatectomy). The most common errors were lack of context or incorrect information (36.6%).

Figure 1: Percentage of answers in the four different grading categories divided by procedure type.
AQUA: aquablation; G-PVP: greenlight photovaporisation of the prostate; LEP: laser enucleation of the prostate; Simple P: simple prostatectomy; TURP: transurethral resection of the prostate.
CONCLUSION
ChatGPT demonstrates promise in providing accurate post-operative guidance for patients undergoing benign prostatic hyperplasia surgeries. However, incomplete or misleading responses raise concerns about its current clinical applicability, emphasising the need for further research to enhance its accuracy and ensure patient safety.