Euro News Source

AI fails at primary patient diagnosis more than 80% of the time, study finds

By News Room, April 16, 2026

A recent study from researchers at Mass General Brigham delivers a sobering message about artificial intelligence in medicine: generative AI has made remarkable strides, but it still lacks the nuanced reasoning required for safe, independent clinical use. The excitement around AI chatbots as diagnostic aids is understandable, given their ability to process vast amounts of information. But the research, published in JAMA Network Open, indicates that these tools are not yet ready to take on responsibility for patient care. The core finding is stark: in standardized medical scenarios, even the most advanced large language models (LLMs) failed to produce an appropriate initial list of possible diagnoses, known as a differential diagnosis, more than 80% of the time. This early-stage failure persists even though the models improve significantly when given complete patient data, underscoring a critical gap between information retrieval and genuine clinical reasoning.

To understand this gap, it’s important to look at how the study was conducted. The research team didn’t just ask models to guess a final answer; they placed them in the dynamic, unfolding reality of a clinical encounter. Using a novel assessment tool called PrIME-LLM, they evaluated 21 different LLMs—including the latest versions from leading developers like OpenAI, Google, and Anthropic—on 29 standardized clinical vignettes. The simulation was telling. Information was provided to the models in stages, mimicking how a doctor receives information: starting with basic demographics and symptoms, then adding physical exam findings, and finally laboratory results. This stepwise approach was designed to test the AI’s ability to navigate uncertainty and build a logical diagnostic pathway, just as a human physician must. Notably, to allow the simulation to continue, the models were given subsequent information even if they failed the initial differential step—a concession that already separates them from the unforgiving realities of clinical practice, where a missed initial hypothesis can lead a case astray.
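The staged approach described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the PrIME-LLM tool itself, whose scoring rules are not described here. The names `Vignette`, `run_vignette`, `toy_model` and the `min_overlap` threshold are all hypothetical, and the "model" is a stand-in function rather than a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Stage order follows the article: history first, then exam, then labs.
STAGES = ["history", "physical_exam", "labs"]

# Hypothetical interface: a model takes the text revealed so far and
# returns a ranked list of candidate diagnoses.
Model = Callable[[str], List[str]]


@dataclass
class Vignette:
    stages: Dict[str, str]              # stage name -> text revealed at that stage
    acceptable_differential: Set[str]   # diagnoses a reviewer would accept early on
    final_diagnosis: str


def run_vignette(model: Model, vignette: Vignette, min_overlap: int = 2) -> Dict[str, bool]:
    """Reveal information stage by stage and score two things separately:
    the initial differential and the final diagnosis."""
    revealed: List[str] = []
    differential_ok = False
    answer: List[str] = []
    for stage in STAGES:
        revealed.append(vignette.stages[stage])
        answer = model("\n".join(revealed))
        if stage == "history":
            # Assumed scoring rule: enough overlap with an accepted list.
            overlap = set(answer) & vignette.acceptable_differential
            differential_ok = len(overlap) >= min_overlap
        # As in the study, later stages are revealed even after an early miss.
    final_ok = bool(answer) and answer[0] == vignette.final_diagnosis
    return {"differential_ok": differential_ok, "final_ok": final_ok}


def toy_model(prompt: str) -> List[str]:
    """Stand-in model that only gets the case right once lab data appears."""
    if "troponin" in prompt.lower():
        return ["myocardial infarction"]
    return ["gastroesophageal reflux"]


demo = Vignette(
    stages={
        "history": "58-year-old with chest pain radiating to the left arm.",
        "physical_exam": "Diaphoretic and tachycardic.",
        "labs": "Troponin markedly elevated.",
    },
    acceptable_differential={
        "myocardial infarction", "pulmonary embolism", "aortic dissection",
    },
    final_diagnosis="myocardial infarction",
)

outcome = run_vignette(toy_model, demo)
print(outcome)
```

The toy model mirrors the pattern the study reports: it misses the early differential but names the correct final diagnosis once all the data is on the page.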

The results revealed a fascinating and telling pattern. While many models ultimately achieved high accuracy in naming the final diagnosis once all the data was presented—with some top performers reaching over 90% success—they consistently faltered at the very beginning. As study author Arya Rao explained, the models act like brilliant test-takers who excel when the question is clear and all facts are on the page, but struggle with the “open-ended start of a case.” This initial stage, where symptoms are vague and overlapping, is precisely where the “art of medicine” resides. It requires a physician to generate a broad, thoughtful differential—a mental list of possibilities ranked by likelihood and danger—that guides all subsequent tests and questions. The AI’s poor performance here is not a minor bug; it points to a fundamental absence of the abductive and probabilistic reasoning that is the cornerstone of safe medical practice. The models can correlate, but they cannot yet truly hypothesize in the way a trained clinician does.

The study identified a cluster of top-performing models, including Grok 4, GPT-5, Claude 4.5 Opus, and several Gemini versions, with models optimized for reasoning showing the clearest advantages. All models improved when given additional structured data such as lab results, suggesting that AI is getting better at integrating different types of clinical information. However, co-author Marc Succi emphasized the conclusion: “Despite continued improvements, off-the-shelf large language models are not ready for unsupervised clinical-grade deployment.” The improvements, while technically impressive, have not yet produced the reliable clinical reasoning needed for patient-facing applications. The AI remains a powerful pattern-recognition engine, not a substitute for the integrative and ethical judgment of a human.

This leads to the study’s most critical and reassuring takeaway: the irreplaceable role of the human professional. The authors stress that these technologies demand a “human in the loop” with “very close oversight.” This isn’t just a technical safeguard; it’s an ethical imperative. Susana Manso García, an AI and digital health expert not involved in the study, echoed this, saying the findings carry a clear public message: artificial intelligence “represents a promising tool,” but “human clinical judgement remains indispensable.” The recommendation is unambiguous: the public should use health-oriented AI chatbots with extreme caution, viewing them as potential sources of information rather than diagnostic authorities. For any concrete health concern, consulting a qualified healthcare professional is the only safe course of action. AI may one day be a formidable diagnostic assistant, but the responsibility for final judgment must remain with a human who understands the full context of a patient’s life, history, and values.

In essence, this research provides a vital checkpoint in the rapid deployment of AI into healthcare. It tempers hype with rigorous evidence, showing us both the impressive capabilities and the profound limitations of current technology. The study charts a responsible path forward: one where AI’s strength in data synthesis and final-stage validation is leveraged to assist clinicians, perhaps by reviewing records or suggesting rare possibilities, but never by replacing the initial, creative, and uncertain diagnostic reasoning that defines the physician’s role. The journey toward trustworthy medical AI continues, but for now, the heartbeat of clinical care remains unequivocally human.
