Researchers at Google DeepMind have developed an artificial intelligence model named AlphaGenome, designed to interpret the vast, non-coding regions of human DNA. This technology aims to accelerate the discovery of genetic links to complex diseases and could fundamentally change how scientists approach drug development and genetic therapies.
The model analyzes the 98% of the human genome that does not code for proteins, often called the "dark genome," to predict how genetic variations influence health. Early use by the scientific community suggests it could be a pivotal tool in understanding conditions like cancer, diabetes, and dementia.
Key Takeaways
- Google DeepMind has released AlphaGenome, an AI model that analyzes the non-coding or "dark" regions of DNA.
- The tool can predict the function of DNA sequences and the potential impact of single-letter mutations on gene activity.
- It is expected to help researchers identify the genetic causes of diseases like obesity, cancer, and high blood pressure.
- Over 3,000 scientists are already using the model for non-commercial research, accelerating lab work and potential drug discovery.
Decoding the Blueprint of Life
The human genome contains approximately three billion letters of DNA code, but only about 2% of this code forms the genes that create proteins for the body. The remaining 98%, long considered biological dark matter, plays a critical role in regulating when and how genes are activated.
Many genetic mutations linked to common diseases are found within this vast, non-coding expanse. Understanding their function has been a major challenge for geneticists. AlphaGenome addresses this by analyzing up to one million letters of DNA code at once, predicting which parts are genes and how the surrounding sequences control them.
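To give a sense of that scale, the sketch below shows one way a one-million-letter window could be centered on a variant of interest before being handed to a model. The helper name `context_window` and its clamping behavior at chromosome ends are illustrative assumptions, not part of AlphaGenome's actual interface.

```python
WINDOW = 1_000_000  # letters the model is reported to analyze at once


def context_window(position: int, chrom_length: int, window: int = WINDOW) -> tuple[int, int]:
    """Return a half-open [start, end) interval of `window` letters around `position`.

    Hypothetical helper: the interval is shifted inward when the variant sits
    near either end of the chromosome, so it always stays within bounds.
    """
    if not 0 <= position < chrom_length:
        raise ValueError("position lies outside the chromosome")
    window = min(window, chrom_length)      # short sequences get a smaller window
    start = position - window // 2          # try to center the variant
    start = max(0, min(start, chrom_length - window))  # clamp to chromosome ends
    return start, start + window


# A variant five million letters into a 100-million-letter chromosome.
start, end = context_window(5_000_000, 100_000_000)
```

With a centered window of this size, sequence up to 500,000 letters on either side of the variant is visible to the model in a single pass.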
Natasha Latysheva, a research engineer at DeepMind, explained the project's goal. "We see AlphaGenome as a tool for understanding what the functional elements in the genome do, which we hope will accelerate our fundamental understanding of the code of life," she stated.
A New Tool for Medical Research
Since being made available for non-commercial use last year, AlphaGenome has been adopted by thousands of scientists. The model helps researchers prioritize which genetic variants to investigate in the laboratory, potentially saving significant time and resources.
Targeting Obesity and Diabetes
Dr. Gareth Hawkes from the University of Exeter is among the researchers using the AI. His work focuses on genetic variants associated with obesity and diabetes, which are often located in the dark genome.
"They're directly impacting some important piece of biology that we don't really understand," Dr. Hawkes said. He noted that AlphaGenome provides predictions that guide his lab's experiments. "I wouldn't say the dark side of the genome is solved by AlphaGenome, but it's a big leap. I'm really excited."
Advancing Cancer Treatment
The AI is also being applied in oncology. Researchers are using it to distinguish between cancer-driving mutations and incidental ones. This could lead to more precise treatments that target the specific genetic errors fueling a tumor's growth.
By the Numbers
- 98% of the human genome is non-coding DNA, often called the "dark genome."
- 3 billion letters of DNA code make up the complete human genome.
- More than 3,000 scientists have used AlphaGenome for non-commercial research since its release.
- 1 million letters of code can be analyzed by the model at a time.
How AlphaGenome Works
Unlike large language models such as ChatGPT, which predict the next word in a sentence, AlphaGenome is a "sequence-to-function model." It was trained on extensive public databases of experiments from human and mouse cells to learn the relationship between a DNA sequence and its biological outcome.
The model can predict how changes in the DNA sequence affect gene expression—whether a gene is turned on or off—and gene splicing, a process that allows a single gene to produce multiple proteins. This predictive power is crucial for understanding the consequences of genetic mutations.
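The general idea of scoring a single-letter mutation with a sequence-to-function model can be sketched as follows: predict an output for the reference sequence, predict again with one letter swapped, and compare the two. Everything here is a toy assumption for illustration. In particular, `toy_predict_expression` simply counts an arbitrary motif and stands in for a real trained model; it does not reflect how AlphaGenome computes its predictions.

```python
from typing import Callable


def toy_predict_expression(sequence: str) -> float:
    """Stand-in for a sequence-to-function model (NOT AlphaGenome).

    Scores a sequence by counting occurrences of an arbitrary motif,
    purely so the example has something to predict.
    """
    motif = "TATA"
    hits = sum(
        1
        for i in range(len(sequence) - len(motif) + 1)
        if sequence[i:i + len(motif)] == motif
    )
    return float(hits)


def apply_snv(sequence: str, index: int, alt: str) -> str:
    """Return a copy of `sequence` with the letter at `index` replaced by `alt`."""
    if len(alt) != 1 or alt not in "ACGT":
        raise ValueError("alt must be a single letter: A, C, G or T")
    if sequence[index] == alt:
        raise ValueError("alt matches the reference letter, so nothing changes")
    return sequence[:index] + alt + sequence[index + 1:]


def variant_effect(
    sequence: str,
    index: int,
    alt: str,
    predict: Callable[[str], float] = toy_predict_expression,
) -> float:
    """Predicted change in output caused by the variant (alternate minus reference)."""
    return predict(apply_snv(sequence, index, alt)) - predict(sequence)


reference = "GGCTATAAGG"
disruptive = variant_effect(reference, 5, "C")  # breaks the motif
neutral = variant_effect(reference, 0, "A")     # leaves the motif intact
```

In this toy setup the first variant lowers the predicted score and the second leaves it unchanged, which mirrors how researchers might use such scores to decide which variants deserve follow-up in the lab.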
Pushmeet Kohli, vice president of science at Google DeepMind, sees this as part of a larger trend. "I think we are at the start of a new era of scientific progress, and AI is going to enable a number of different breakthroughs," he said.
Building on AlphaFold's Success
AlphaGenome follows DeepMind's earlier success with AlphaFold, an AI system that predicts the 3D structure of proteins. AlphaFold has been widely adopted by the scientific community, and its developers shared the 2024 Nobel Prize in Chemistry, demonstrating the transformative potential of AI in biological research.
Acknowledged Limitations and Future Steps
Experts in the field have praised the development. Dr. Robert Goldstone, head of genomics at the Francis Crick Institute, called the model "a major milestone" and "an incredible technical feat."
However, its developers and users acknowledge it is not yet perfect. Professor Ben Lehner of the Wellcome Sanger Institute, whose team tested the model extensively, confirmed it performs well but is "far from perfect."
Current limitations include reduced accuracy when predicting gene regulation over very long distances of more than 100,000 letters of code. The DeepMind team is also working to improve its accuracy across different cell types. A neuron and a heart cell share the same DNA, but they function differently because each cell type switches genes on and off in its own way, a complexity the model is still learning to master.
The ongoing refinement of AlphaGenome, combined with its current capabilities, positions it as a foundational technology that could significantly advance genetic medicine and our basic understanding of human biology.