AI chatbots make diagnostic errors in more than 80% of cases, study finds
Summary of the findings
A recent study published in *JAMA Network Open* and cited by the *Financial Times* showed that popular chatbots cannot accurately diagnose medical conditions when given limited data. In more than 80% of cases they produced incorrect diagnoses; only with a full symptom description did accuracy rise to 90%.
How the study was conducted
| Step | What was done |
| --- | --- |
| Case selection | 29 clinical scenarios drawn from reference literature. |
| Data transfer | Patient information was given to the chatbots progressively: medical history → examination results → laboratory tests. |
| Questions to the AI | The models were asked for a diagnosis; accuracy and completeness of answers were measured. |
Experiment participants
* 20 popular models from OpenAI, Anthropic, Google, xAI, DeepSeek.
* With incomplete data, more than 80% of diagnoses were incorrect.
* As the amount of information increased, accuracy improved: the best models exceeded 90%, and average error rates fell below 40%.
Developer reactions
| Company | Comment |
| --- | --- |
| Google & Anthropic | When asked for medical advice, the chatbots strongly recommend consulting a specialist. |
| OpenAI | Its terms of service state that the services are not intended to provide licensed medical advice. |
| xAI & DeepSeek | No comment provided. |
Some companies are developing specialized models: Google created AMIE, which shows good results, but its conclusions still require confirmation by a human physician, especially given the importance of visual assessment.
Conclusion
Chatbots can be useful as an auxiliary tool, but with limited information they often err. They are not currently an acceptable replacement for qualified medical professionals, though such models may help in regions without access to traditional medicine.