Alphafold has made protein folding predictable

The artificial intelligence from Deepmind calculates in minutes how complex molecules fold. Early experience shows how radically this is changing biology.

Depending on their building blocks, proteins can take many different forms.

Mohammed AlQuraishi

Two groups once competed to decipher the human genome: Celera, a company founded by Craig Venter, and the publicly funded Human Genome Project. In February 2001, the American journal “Science” published the data from Venter’s team, and its British counterpart “Nature” published the results of the Human Genome Project on the same day.

Some may have been reminded of this when, a few months ago on July 15, the two leading science journals reported on a new breakthrough that the field is celebrating as much as it once celebrated the sequencing of the genome. This time, “Nature” presented the open-source software Alphafold from Deepmind, the company Google bought in 2014. In “Science”, researchers led by David Baker from the University of Washington in Seattle presented their freely accessible computer program Rosettafold.

One of the most difficult problems in modern biology

Both tools use artificial intelligence (AI) to calculate the exact three-dimensional shape of proteins from the sequence of their building blocks, the amino acids. This means that one of the most difficult problems in modern biology has been solved. The competition between private and public research has spurred the field on, says Swiss molecular biologist Beat Christen, who was an assistant professor at ETH Zurich until last August.

Deepmind had already won the Critical Assessment of Protein Structure Prediction (Casp) in 2018. In this competition, researchers use their computer programs to calculate protein structures that have already been determined experimentally but not yet published. Deepmind was not satisfied with the victory, however, because the calculated atomic positions were still imprecise.

The Alphafold software was therefore redesigned, with success: in the next Casp round in November 2020, the London company achieved a sensational result. For two-thirds of the specified proteins, the structure predicted by Alphafold matched the experimentally determined one almost perfectly. Agreement was also high for most of the other proteins.

The London-based AI company left the other participants far behind, but academic research caught up quickly. Baker’s group integrated AI approaches from Deepmind into its own software and, with Rosettafold, developed a program of comparable quality.

A boost for medicine and biotech

Proteins are the key molecules in almost all processes in living cells and viruses. They consist of long chains of amino acids strung together. The sequence is determined by the genes. As the protein building blocks attract or repel each other, the chains crumple into a ball. Specialists refer to this process as protein folding.

How a protein folds

The protein balls look chaotic, but they have a highly specific shape that is crucial for their function. Even the tiniest folding error can trigger diseases such as Alzheimer’s, Parkinson’s and cancer. “Only when we know the protein structures can we specifically develop drugs to treat such diseases,” says Christen. The AI software also makes it easier to analyze mutations.

A change in a gene results in a different amino acid sequence. Thanks to the precise structure prediction, conclusions can be drawn about the altered shape and function of the mutated protein. “Bridging the DNA level to the three-dimensional protein structure will massively accelerate biological research,” says Christen. He is not alone in his enthusiasm.
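How a single changed DNA letter alters the amino acid sequence can be illustrated with a minimal Python sketch. It is not part of Alphafold or Rosettafold. The codon table is deliberately partial, the function name `translate` is hypothetical, and the short sequence is only modelled on the start of the human beta-globin gene, where the well-known sickle-cell mutation turns the codon GAG (glutamate) into GTG (valine).

```python
# Partial codon table from the standard genetic code.
# Assumption: only the codons needed for this toy example are listed.
CODON_TABLE = {
    "ATG": "M", "GTG": "V", "CAT": "H", "CTG": "L",
    "ACT": "T", "CCT": "P", "GAG": "E",
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# Illustrative stretch of eight codons (hypothetical toy input).
normal = "ATGGTGCATCTGACTCCTGAGGAG"
# Single-letter change A -> T in the seventh codon: GAG becomes GTG.
mutant = "ATGGTGCATCTGACTCCTGTGGAG"

normal_protein = translate(normal)
mutant_protein = translate(mutant)

# Positions (0-based) at which the two protein sequences differ.
changed = [i for i, (a, b) in enumerate(zip(normal_protein, mutant_protein)) if a != b]

print(normal_protein)
print(mutant_protein)
print(changed)
```

Either sequence could then be handed to a structure predictor, and the two predicted shapes compared around the changed position.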

“Nature” ranks Deepmind researcher John Jumper, who leads the Alphafold project, among the ten most important researchers of the past year. “Science” chose AI-supported structure prediction as the scientific breakthrough of 2021.

In November, Deepmind CEO Demis Hassabis founded Isomorphic Labs under the umbrella of Google’s holding company Alphabet. The new company uses Alphafold to search for new pharmaceutical agents. Other researchers are already using the freely accessible AI to better understand antibiotic resistance or the mechanisms of Sars-CoV-2 infection.

Potential for environmental protection and industry

Medicine is not the only field to benefit from the new programs. Industrial enzymes that help to bleach jeans or to make chemical production more environmentally friendly are also proteins. New enzymes could make many industrial processes more sustainable and solve environmental problems. John McGeehan from the University of Portsmouth in England is developing enzymes to break down plastic. Alphafold speeds up his work by years, he says.

Determining the structure of a single protein in the laboratory takes months at the least, often fills an entire doctoral thesis, and sometimes fails altogether. A company has now managed to determine structures with an AI just as precisely, but in minutes to hours. Christen describes this as a wake-up call for academic research, not least because the London-based company Deepmind should now find it easy to poach talent.

At first glance, it is surprising that the company is not making money from Alphafold. The program runs free of charge on any laptop via a cloud computing tool, and can also be installed locally. Christoph Müller, group leader at the European Molecular Biology Laboratory (EMBL) in Heidelberg, considers free access only logical, since Deepmind trained its AI on public data. Over the past few decades, researchers have determined around 180,000 protein structures in painstaking experimental work. These served as the basis for the development of Alphafold.

In July, together with EMBL, Deepmind presented a freely accessible database with 350,000 calculated protein structures. The collection contains almost all proteins from humans and from organisms important to research, such as fruit flies, mice and coli bacteria. It was expanded to over 800,000 proteins in December and is to grow further this year until it covers most of the approximately 100 million known proteins.

Müller expects “a whole wave of new knowledge”, but concedes that Alphafold only makes predictions. Because the software supplies a confidence estimate for each amino acid, it is not necessary to check every structure. However, the function inferred from a protein’s shape should be validated in the laboratory.
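The per-amino-acid accuracy mentioned above can be read straight from a predicted structure file. Alphafold reports a confidence score called pLDDT on a scale from 0 to 100, and in its PDB-format output this score is stored in the column normally used for the B-factor. The following Python sketch shows one way to flag uncertain residues under these assumptions. The helper names, the toy records and the cutoff of 70 are illustrative choices, not part of Alphafold.

```python
def atom_line(serial, res_name, res_seq, plddt):
    """Build a fixed-column PDB ATOM record for a C-alpha atom.

    Toy helper with dummy coordinates (assumption: chain A, occupancy 1.00).
    """
    return (f"ATOM  {serial:>5}  CA  {res_name:>3} A{res_seq:>4}    "
            f"{0.0:>8.3f}{0.0:>8.3f}{0.0:>8.3f}{1.0:>6.2f}{plddt:>6.2f}")


def per_residue_confidence(pdb_text):
    """Map residue number -> score read from the B-factor column (cols 61-66)."""
    scores = {}
    for line in pdb_text.splitlines():
        # One C-alpha atom per residue is enough, since the score is per residue.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def low_confidence_residues(scores, threshold=70.0):
    """Return residue numbers whose score is below the (assumed) cutoff."""
    return sorted(res for res, score in scores.items() if score < threshold)


# Invented four-residue example; real files contain all atoms of every residue.
toy_pdb = "\n".join([
    atom_line(1, "MET", 1, 91.5),
    atom_line(2, "VAL", 2, 85.0),
    atom_line(3, "HIS", 3, 62.3),
    atom_line(4, "LEU", 4, 48.9),
])

scores = per_residue_confidence(toy_pdb)
uncertain = low_confidence_residues(scores)
print(uncertain)
```

Regions flagged this way are the ones most worth checking experimentally, while high-scoring regions can usually be taken at face value.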

How do protein complexes fold?

The AI still does not answer every question about protein structures. Müller’s group is researching how DNA is transcribed into RNA in our cell nuclei. As in many other life processes, molecular machines made up of several proteins are involved. Such complex structures cannot yet be calculated precisely, Müller explains. Nevertheless, knowing how the individual components fold makes it easier to interpret the overall structure.

The first attempts to predict the structure of relatively simple protein complexes already exist. Deepmind presented its Alphafold-Multimer program for this purpose in October. Baker’s competing group had already addressed the topic in its “Science” publication in July.

Another challenge is calculating proteins that can change their shape like the machines in the film “Transformers”. Viral proteins in particular are versatile. But other proteins are not rigid either, especially since they are constantly in contact with each other and with other biomolecules inside the cell, above all in the tightly packed membranes. Environmental conditions such as the pH value or the docking of a pharmaceutical agent also affect a protein’s shape.

So far, the AI software has not taken such dynamic processes into account, says the molecular biologist Müller. To understand the big picture, he examines thin cell sections using a special electron microscopy method. Not all details are clearly visible, says the EMBL researcher: “We therefore want to fit the calculated protein structures piece by piece into the image of the cell.”

His vision is an image of a cell at atomic resolution. Behind this lies the hope of seeing and understanding how life works. The sequencing of the genome was an important step in unraveling this mystery. It will be interesting to see how much insight the AI will now provide.
