Google presents an artificial intelligence tool that investigates the mysteries of the human genome


A Google logo is seen at a company research center in Mountain View, California, U.S., May 13, 2025. – Reuters

Google on Wednesday unveiled an artificial intelligence tool that its scientists say would help unravel the mysteries of the human genome and could one day lead to new treatments for diseases.

The AlphaGenome deep learning model was hailed by outside researchers as a “breakthrough” that would allow scientists to study and even simulate the roots of hard-to-treat genetic diseases.

While the first complete map of the human genome in 2003 “gave us the book of life, reading it remains a challenge,” Pushmeet Kohli, vice president of research at Google DeepMind, told reporters.

“We have the text,” he said, which is a sequence of three billion pairs of nucleotides represented by the letters A, T, C and G that make up DNA.

However, “understanding the grammar of this genome – what is encoded in our DNA and how it governs life – is the next critical frontier for research,” said Kohli, co-author of a new study in the journal Nature.

Only about 2% of our DNA contains instructions for making proteins, which are the molecules that build and run the body.

The other 98% was long dismissed as “junk DNA” as scientists struggled to understand what it was for.

However, this “non-coding DNA” is now believed to act as a conductor, directing how genetic information works in each of our cells.

These sequences also contain many variants that have been associated with diseases. It is these sequences that AlphaGenome aims to understand.

A million letters

The project is just one part of Google’s AI-powered scientific work, which also includes AlphaFold, the protein-structure model whose developers shared the 2024 Nobel Prize in Chemistry.

AlphaGenome was trained on data from public projects that measured non-coding DNA in hundreds of different types of cells and tissues in humans and mice.

A DNA double helix is seen in an undated artist’s illustration released by the National Human Genome Research Institute to Reuters on May 15, 2012. – Reuters

The tool is capable of analyzing long sequences of DNA and then predicting how each pair of nucleotides will influence different biological processes within the cell.

This includes where genes start and end and how much RNA (the molecules that transmit genetic instructions inside cells) is produced.

There are already other models that have a similar objective. However, they have to compromise, either by analyzing much shorter DNA sequences or by reducing the level of detail in their predictions, known as resolution.

Ziga Avsec, a DeepMind scientist and the study’s lead author, said long sequences of up to a million DNA letters were needed to understand the full regulatory environment of a single gene.

And the model’s high resolution allows scientists to study the impact of genetic variants by comparing differences between mutated and non-mutated sequences.
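For readers who want to see the idea in miniature, the comparison between mutated and non-mutated sequences can be sketched in a few lines of Python. This is only an illustration and not AlphaGenome's actual interface: `toy_predict` is a hypothetical stand-in that scores each letter by the share of G and C letters around it, and the function names and the ten-letter sequence are invented for the example.

```python
from typing import Callable, List

BASES = "ACGT"


def toy_predict(seq: str) -> List[float]:
    """Hypothetical stand-in for a trained model (NOT AlphaGenome).

    Returns one score per DNA letter: the fraction of G/C letters
    in a window of up to five letters centred on that position.
    """
    half = 2
    scores = []
    for i in range(len(seq)):
        window = seq[max(0, i - half): i + half + 1]
        scores.append(sum(base in "GC" for base in window) / len(window))
    return scores


def apply_variant(ref_seq: str, pos: int, alt: str) -> str:
    """Return a copy of ref_seq with the letter at pos replaced by alt."""
    if alt not in BASES:
        raise ValueError(f"unknown base: {alt!r}")
    if not 0 <= pos < len(ref_seq):
        raise IndexError("variant position outside the sequence")
    return ref_seq[:pos] + alt + ref_seq[pos + 1:]


def variant_effect(
    ref_seq: str,
    pos: int,
    alt: str,
    predict: Callable[[str], List[float]] = toy_predict,
) -> List[float]:
    """Per-letter difference: prediction(mutated) - prediction(reference)."""
    alt_seq = apply_variant(ref_seq, pos, alt)
    ref_track = predict(ref_seq)
    alt_track = predict(alt_seq)
    return [a - r for r, a in zip(ref_track, alt_track)]


# Invented example: swap the A at index 4 for a G and see which
# positions' predicted signal changes.
reference = "ATATATATAT"
effect = variant_effect(reference, pos=4, alt="G")
changed = [i for i, delta in enumerate(effect) if delta != 0]
```

In this toy case a single-letter change shifts the predicted signal only at the five positions whose window includes the edited letter. A model that reads up to a million letters at once could, in principle, surface effects much farther from the variant.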

“AlphaGenome can accelerate our understanding of the genome by helping to map where functional elements are and what their functions are at the molecular level,” said study co-author Natasha Latysheva.

The model has already been tested by 3,000 scientists in 160 countries and is freely available for non-commercial use, Google said.

“We hope that researchers will expand it with more data,” Kohli added.

‘Discovery’

Ben Lehner, a researcher at the University of Cambridge who was not involved in developing AlphaGenome but who tested it, said the model “really works very well.”

“Identifying the precise differences in our genomes that make us more or less likely to develop thousands of diseases is a key step toward developing better therapies,” he explained.

However, AlphaGenome “is far from perfect and there is still a lot of work to do,” he added.

“AI models are only as good as the data used to train them” and existing data is not very suitable, he said.

Robert Goldstone, head of genomics at the UK’s Francis Crick Institute, warned that AlphaGenome was “not a magic bullet for all biological questions.”

This is partly because “gene expression is influenced by complex environmental factors that the model cannot see,” he said.

However, the tool still represents a “major advance” that would allow scientists to “study and simulate the genetic roots of complex diseases,” Goldstone added.
