Researchers at Google DeepMind, the tech giant’s artificial intelligence arm, on Tuesday introduced a tool that predicts whether genetic mutations are likely to cause harm, a breakthrough that could help research into rare diseases.
The findings are “another step in recognizing the impact that AI is having in the natural sciences,” said Pushmeet Kohli, vice president for research at Google DeepMind.
The tool focuses on so-called “missense” mutations, in which a single letter of the genetic code is changed, altering one building block of the resulting protein.
A typical human has 9,000 such mutations throughout their genome; they can be harmless or cause diseases such as cystic fibrosis or cancer, or damage brain development.
To date, four million of these mutations have been observed in humans, but only two percent of them have been classified, either as disease-causing or benign.
In all, there are 71 million possible mutations of this kind. The Google DeepMind tool, called AlphaMissense, reviewed all of them and was able to classify 89 percent as either probably benign or probably pathogenic, with 90 percent accuracy.
A score was assigned to each mutation, indicating the risk of it causing disease (otherwise referred to as pathogenic).
The result: 57 percent were classified as probably benign and 32 percent as probably pathogenic, with the remaining 11 percent uncertain.
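As a rough back-of-the-envelope check, the percentages above can be turned into approximate counts. The short Python sketch below uses only the rounded figures quoted in this article; the constant and function names are illustrative, and the resulting counts are approximations derived from those rounded percentages, not numbers taken from the study itself.

```python
# Rounded figures as quoted in the article.
POSSIBLE_MISSENSE = 71_000_000    # all possible missense mutations
OBSERVED_IN_HUMANS = 4_000_000    # missense mutations observed in humans so far
PCT_PREVIOUSLY_CLASSIFIED = 2     # percent of observed mutations already classified
PCT_PROBABLY_BENIGN = 57          # percent of all possible mutations
PCT_PROBABLY_PATHOGENIC = 32      # percent of all possible mutations


def share(total: int, percent: int) -> int:
    """Return `percent` percent of `total`, using integer arithmetic."""
    return total * percent // 100


def summarize() -> dict:
    """Convert the article's percentages into approximate mutation counts."""
    pct_uncertain = 100 - PCT_PROBABLY_BENIGN - PCT_PROBABLY_PATHOGENIC
    return {
        "previously_classified": share(OBSERVED_IN_HUMANS, PCT_PREVIOUSLY_CLASSIFIED),
        "probably_benign": share(POSSIBLE_MISSENSE, PCT_PROBABLY_BENIGN),
        "probably_pathogenic": share(POSSIBLE_MISSENSE, PCT_PROBABLY_PATHOGENIC),
        "uncertain": share(POSSIBLE_MISSENSE, pct_uncertain),
        "pct_uncertain": pct_uncertain,
    }


if __name__ == "__main__":
    for label, value in summarize().items():
        print(f"{label}: {value:,}")
```

On these rounded figures, roughly 80,000 mutations had been classified before the tool was introduced, while the tool labels about 40.5 million as probably benign and about 22.7 million as probably pathogenic, leaving about 7.8 million uncertain.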
The database was made public and available to scientists, and an accompanying study was published on Tuesday in the journal Science.
AlphaMissense demonstrates “superior performance” compared with previously available tools, wrote experts Joseph Marsh and Sarah Teichmann in an article also published in Science.
“We should emphasize that the predictions were never really trained or never really intended to be used for clinical diagnosis alone,” said Jun Cheng of Google DeepMind.
“However, we do think that our predictions can potentially be helpful to increase the diagnosed rate of rare disease, and also potentially to help us find new disease-causing genes,” Cheng added.
Indirectly, this could lead to the development of new treatments, the researchers said.
The tool was trained on the DNA of humans and closely related primates, enabling it to recognize which genetic mutations are widespread.
Cheng said that in training, the tool takes in “millions of protein sequences and learns what a regular protein sequence looks like.”
It can then identify a mutation and estimate its potential for harm.
Cheng compared the process to learning a language.
“If we substitute a word from an English sentence, a person that is familiar with English can immediately see whether this word substitution will change the meaning of the sentence or not.”