A new AI model can forecast a person’s risk of diseases across their life

Artificial intelligence is opening new doors in healthcare, and one of the latest breakthroughs comes from Europe. Scientists have developed Delphi-2M, an AI model that can forecast a person’s risk of diseases across their lifetime. The innovation, inspired by large language models (LLMs) such as GPT-5, which powers ChatGPT, could mark a turning point in predictive medicine.
How the Model Works
Teams at the European Molecular Biology Laboratory (EMBL) in Cambridge and the German Cancer Research Centre in Heidelberg designed Delphi-2M by adapting the architecture of LLMs. Much like LLMs predict the next word in a sentence by spotting patterns in massive amounts of text, Delphi-2M was trained to predict the next likely diagnosis in a patient’s medical history.
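To make the next-word analogy concrete, here is a minimal sketch of next-diagnosis prediction. It is not Delphi-2M's method: it swaps the transformer for simple counts of which code follows which. The function names and the three toy histories are invented for illustration; the ICD-10 codes are real (I10 hypertension, E11 type 2 diabetes, I21 heart attack).

```python
from collections import Counter, defaultdict

def next_diagnosis_counts(histories):
    """Count how often each diagnosis code is directly followed by another."""
    counts = defaultdict(Counter)
    for history in histories:
        # Pair every diagnosis with the one that comes right after it.
        for current, following in zip(history, history[1:]):
            counts[current][following] += 1
    return counts

def most_likely_next(counts, code):
    """Return the most frequent follow-up diagnosis, or None if never seen."""
    if code not in counts:
        return None
    return counts[code].most_common(1)[0][0]

# Toy, made-up patient histories, each ordered in time.
histories = [
    ["I10", "E11", "I21"],
    ["I10", "E11"],
    ["I10", "I21"],
]
counts = next_diagnosis_counts(histories)
print(most_likely_next(counts, "I10"))  # E11
```

A real model conditions on the whole history rather than only the latest code, which is what the LLM-style architecture provides.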
A key modification was required: unlike words in a sentence, medical diagnoses don’t follow one another at regular intervals. For example, high blood pressure after pregnancy carries very different implications depending on whether the two events are weeks or years apart. To solve this, researchers replaced the position encoder used in LLMs with one that tracks a patient’s age, allowing the AI to account for the passage of time.
(Interestingly, in early tests the system sometimes predicted new diagnoses after a patient’s death, a glitch that was later fixed.)
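The idea of encoding age instead of position can be sketched as follows. The article does not describe the exact encoding Delphi-2M uses, so this is an assumption: it borrows the sinusoidal scheme common in transformers and feeds it a continuous age in years rather than a token index. The function name, `dim`, and `max_period` are hypothetical.

```python
import math

def age_encoding(age_years, dim=8, max_period=100.0):
    """Sinusoidal encoding of a continuous age (assumed scheme, not the paper's)."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    features = []
    for i in range(dim // 2):
        # Frequencies shrink geometrically, so early pairs react to small
        # age differences and later pairs to large ones.
        freq = 1.0 / (max_period ** (2 * i / dim))
        angle = age_years * freq
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features

# Two diagnoses that are adjacent in a record but years apart in age
# receive different encodings, which a plain token index would not capture.
print(age_encoding(30.0) != age_encoding(45.0))  # True
```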
Training and Validation
Delphi-2M was trained on data from 400,000 people in the UK Biobank, one of the most complete biological datasets in the world. The model analyzed the timing and sequence of ICD-10 codes, covering 1,256 different diseases. It was then validated on another 100,000 UK Biobank participants before being tested on an even larger dataset: health records of 1.9 million Danes dating back to 1978.
Testing on an independent national population with decades of records showed how well the model’s predictions carry over beyond the UK Biobank cohort, a check on how representative it is of real-world populations.
Accuracy and Performance
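Researchers measured the model’s performance using AUC (area under the receiver operating characteristic curve) scores, where 1 means the model perfectly separates people who go on to develop a condition from those who do not, and 0.5 is no better than random chance.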
Within five years of a diagnosis, Delphi-2M achieved an AUC of 0.76 on UK data and 0.67 on Danish data.
Predictions of strongly linked events (like sepsis followed by death) were highly accurate.
Predictions of random events (such as catching a virus) proved harder.
Over a 10-year forecast horizon, the average AUC dropped slightly, to about 0.7.
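For readers unfamiliar with AUC, the score can be read as the probability that a randomly chosen person who developed a disease was given a higher risk than a randomly chosen person who did not. The sketch below computes it that way on made-up numbers; the function name and data are illustrative and are not the researchers' evaluation code.

```python
def auc(labels, scores):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    non_cases = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    correct = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                correct += 1.0
            elif c == n:
                correct += 0.5
    return correct / (len(cases) * len(non_cases))

# Made-up example: 1 = developed the disease, 0 = did not.
labels = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, risks))  # 0.75
```

Here three of the four case/non-case pairs are ranked correctly, giving 0.75, close to the 0.76 reported on UK data.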
Challenges and Future Potential
While Delphi-2M shows promise, it is not ready for clinical use yet. The model must undergo rigorous testing to ensure it genuinely improves patient outcomes. This process could take years.
The team is already working on upgrades: incorporating medical images, genetic data, and other sophisticated datasets from UK Biobank to boost accuracy further.
Other AI Health Forecasters
Delphi-2M is not alone in this space. Other models include:
Foresight: Developed at King’s College London in 2024, though its expansion was paused after data approval concerns.
ETHOS: A model being built at Harvard University with similar predictive goals.
Why It Matters for Science
Even before its direct use in hospitals, Delphi-2M offers valuable insights for researchers. The model highlights disease clusters (conditions that often occur together), which could reveal new relationships between illnesses. This may lead to breakthroughs in understanding disease progression and prevention.
As Ewan Birney, geneticist at EMBL, puts it: “I’m like a kid in a candy shop.” The excitement is well-founded: predictive AI could help reshape healthcare, not just in treating illness, but in preventing it before it starts.