Chat-Oscopy, are we there yet? Accuracy of ChatGPT-4o for the diagnosis
of common otological disorders
Abstract
Design: Evaluation of the diagnostic accuracy of ChatGPT, a publicly
available and easily accessible tool, in identifying common otological
pathologies using standardized otoscopic images.
Setting: In this prospective, uncontrolled observational study,
six common otological pathologies—serous otitis media, acute otitis
media, bullous myringitis, otitis externa, perforated tympanic membrane,
and chronic otitis with cholesteatoma—were selected. Additionally,
images of normal tympanic membranes were included. Ten standardized
images for each pathology were sourced. These images were analyzed by
ChatGPT-4 via its API, which was queried for the most accurate
diagnosis.
Results: ChatGPT-4 correctly diagnosed normal
tympanic membranes in 30% of cases, frequently misidentifying them as
serous otitis media. The AI accurately identified serous otitis media in
60% of cases, with the remaining misdiagnosed mainly as normal tympanic
membranes. Similarly, acute otitis media was correctly diagnosed 60% of
the time, often confused with serous otitis media. Bullous myringitis
was correctly identified in 40% of cases, commonly misdiagnosed as
acute otitis media. Otitis externa was correctly diagnosed in 40% of
cases but was frequently mistaken for cholesteatoma. The AI accurately
diagnosed perforated tympanic membrane in 20% of cases, with the
majority misidentified as cholesteatoma. Cholesteatoma had the highest
accuracy rate at 90%, with few misdiagnoses.
Conclusion: ChatGPT-4 demonstrates promise in accurately diagnosing otological
conditions such as cholesteatoma but has limitations, particularly in
distinguishing between similar pathologies. The findings underscore the
potential of AI as a diagnostic aid while highlighting the need for
cautious integration into clinical practice to avoid unnecessary tests
and misdiagnoses.
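
The per-category figures above can be pooled into a single overall accuracy. The following is a minimal sketch, not part of the study's own analysis. It assumes ten images in every category, including the normal tympanic membrane group, whose image count the abstract does not state explicitly. The names `REPORTED_ACCURACY_PCT`, `IMAGES_PER_CATEGORY`, and `pooled_accuracy` are illustrative.

```python
# Per-category accuracy (percent of images correctly diagnosed),
# copied from the Results section of the abstract.
REPORTED_ACCURACY_PCT = {
    "normal tympanic membrane": 30,
    "serous otitis media": 60,
    "acute otitis media": 60,
    "bullous myringitis": 40,
    "otitis externa": 40,
    "perforated tympanic membrane": 20,
    "cholesteatoma": 90,
}

# Assumption: ten images per category, including the normal group.
IMAGES_PER_CATEGORY = 10


def pooled_accuracy(accuracy_pct, n_per_category):
    """Return (correct, total, fraction) pooled over all categories."""
    # Convert each percentage back to a count of correctly diagnosed images.
    correct = sum(pct * n_per_category // 100 for pct in accuracy_pct.values())
    total = n_per_category * len(accuracy_pct)
    return correct, total, correct / total


if __name__ == "__main__":
    correct, total, fraction = pooled_accuracy(
        REPORTED_ACCURACY_PCT, IMAGES_PER_CATEGORY
    )
    print(f"{correct}/{total} images correct ({fraction:.1%})")
```

Under that assumption, the pooled figure is 34 of 70 images, or roughly 48.6%, which sits well below the 90% reported for cholesteatoma alone.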