April 16, 2026
Euro News Source
Health

AI fails at primary patient diagnosis more than 80% of the time, study finds

By News Room, April 16, 2026

A recent study from researchers at Mass General Brigham has delivered a sobering message about the current state of artificial intelligence in medicine: while generative AI has made remarkable strides, it still lacks the nuanced reasoning required for safe, independent clinical use. The excitement surrounding AI chatbots as potential diagnostic aids is understandable, given their ability to process vast amounts of information. However, this research, published in JAMA Network Open, shows that these tools are not yet ready to shoulder the responsibility of patient care. The core finding is stark: when put through standardized medical scenarios, the most advanced large language models (LLMs) failed to produce an appropriate initial list of possible diagnoses, a process known as differential diagnosis, more than 80% of the time. The models did improve markedly once they were given complete patient data, but the early-stage failure rate underscores a critical gap between information retrieval and genuine clinical reasoning.

To understand this gap, it’s important to look at how the study was conducted. The research team didn’t just ask models to guess a final answer; they placed them in the unfolding reality of a clinical encounter. Using a novel assessment tool called PrIME-LLM, they evaluated 21 different LLMs, including the latest versions from leading developers like OpenAI, Google, and Anthropic, on 29 standardized clinical vignettes. Information was provided to the models in stages, mimicking how a doctor receives it: first basic demographics and symptoms, then physical exam findings, and finally laboratory results. This stepwise approach was designed to test the AI’s ability to navigate uncertainty and build a logical diagnostic pathway, just as a human physician must. Notably, to allow the simulation to continue, the models were given subsequent information even if they failed the initial differential step. That concession already separates the test from the unforgiving realities of clinical practice, where a missed initial hypothesis can lead a case astray.

The results revealed a fascinating and telling pattern. While many models ultimately achieved high accuracy in naming the final diagnosis once all the data was presented—with some top performers reaching over 90% success—they consistently faltered at the very beginning. As study author Arya Rao explained, the models act like brilliant test-takers who excel when the question is clear and all facts are on the page, but struggle with the “open-ended start of a case.” This initial stage, where symptoms are vague and overlapping, is precisely where the “art of medicine” resides. It requires a physician to generate a broad, thoughtful differential—a mental list of possibilities ranked by likelihood and danger—that guides all subsequent tests and questions. The AI’s poor performance here is not a minor bug; it points to a fundamental absence of the abductive and probabilistic reasoning that is the cornerstone of safe medical practice. The models can correlate, but they cannot yet truly hypothesize in the way a trained clinician does.

The study identified a cluster of top-performing models, including Grok 4, GPT-5, Claude 4.5 Opus, and several Gemini versions, with models optimized for reasoning showing the clearest advantages. A consistent trend was that all models improved when provided with structured data such as lab results in addition to free-text descriptions. This indicates the direction of travel: AI is becoming more sophisticated at integrating different kinds of clinical information. However, co-author Marc Succi emphasized the definitive conclusion: “Despite continued improvements, off-the-shelf large language models are not ready for unsupervised clinical-grade deployment.” The improvements, while impressive in a technical sense, have not yet delivered the advanced, reliable clinical reasoning needed for patient-facing applications. The AI remains a powerful pattern-recognition engine, but not a substitute for the integrative and ethical judgment of a human.

This leads to the study’s most critical and reassuring takeaway: the irreplaceable role of the human professional. The authors stress that these technologies demand a “human in the loop” with “very close oversight.” This isn’t just a technical safeguard; it’s an ethical imperative. Susana Manso García, an AI and digital health expert not involved in the study, echoed this, saying the findings carry a clear public message: although “artificial intelligence represents a promising tool, human clinical judgement remains indispensable.” The recommendation is unambiguous: the public should use health-oriented AI chatbots with extreme caution, viewing them as potential sources of information rather than diagnostic authorities. For any concrete health concern, consulting a qualified healthcare professional is the only safe course of action. AI may one day be a formidable diagnostic assistant, but the responsibility for final judgment must remain with a human who understands the full context of a patient’s life, history, and values.

In essence, this research provides a vital checkpoint in the rapid deployment of AI into healthcare. It tempers hype with rigorous evidence, showing us both the impressive capabilities and the profound limitations of current technology. The study charts a responsible path forward: one where AI’s strength in data synthesis and final-stage validation is leveraged to assist clinicians, perhaps by reviewing records or suggesting rare possibilities, but never by replacing the initial, creative, and uncertain diagnostic reasoning that defines the physician’s role. The journey toward trustworthy medical AI continues, but for now, the heartbeat of clinical care remains unequivocally human.

2026 © Euro News Source. All Rights Reserved.