An innovative machine learning method has been shown to rapidly predict multiple conformations of a protein. A new paper presents a method for predicting the relative populations of protein conformations using AlphaFold 2, an artificial intelligence method capable of accurately predicting protein structures. The work is expected to advance the understanding of protein dynamics and function. The authors note that the technique is accurate, fast, and cost-effective, and that it has the potential to revolutionize drug discovery by uncovering many more therapeutic targets. The work was published in Nature Communications in an article titled "High-throughput prediction of protein conformational distributions with subsampled AlphaFold2".
The work of Gabriel Monteiro da Silva, a Ph.D. candidate in molecular biology, cell biology, and biochemistry at Brown University, seeks to improve computational methods to model protein dynamics. In this study, he conducted experiments with AlphaFold 2.
Monteiro da Silva says that AlphaFold 2's accuracy has revolutionized protein structure prediction, but the method has limitations: it only lets scientists model proteins in a static state, at a single point in time. The authors elaborate on this point, writing that while AlphaFold 2 has demonstrated remarkable accuracy and speed, it was designed to predict the ground-state conformation of proteins and has a limited ability to predict conformational landscapes. In this study, they show that AlphaFold 2 can directly predict the relative populations of different protein conformations by subsampling multiple sequence alignments.
The researchers were able to manipulate the evolutionary signals of proteins and use AlphaFold 2 to rapidly predict multiple protein conformations, as well as how frequently each of these structures occurs.
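The idea of subsampling an alignment and tallying the resulting predictions can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy alignment, and the `toy_classify` stand-in are all hypothetical, and the step that would actually run a structure predictor and assign each prediction to a conformational state is replaced here by a trivial placeholder rule.

```python
import random
from collections import Counter
from typing import Callable, Sequence


def subsample_msa(msa: Sequence[str], depth: int, rng: random.Random) -> list[str]:
    """Keep the query (first row) and draw up to depth-1 other rows at random."""
    query, others = msa[0], list(msa[1:])
    k = min(max(depth - 1, 0), len(others))
    return [query] + rng.sample(others, k)


def estimate_populations(
    msa: Sequence[str],
    classify: Callable[[Sequence[str]], str],
    n_runs: int,
    depth: int,
    seed: int = 0,
) -> dict[str, float]:
    """Classify many random subsamples and return the fraction landing in each state."""
    rng = random.Random(seed)
    counts = Counter(classify(subsample_msa(msa, depth, rng)) for _ in range(n_runs))
    return {state: n / n_runs for state, n in counts.items()}


def toy_classify(sub_msa: Sequence[str]) -> str:
    """Placeholder for 'predict a structure, then label its state' (assumption).

    It labels a subsample by whether at least half of the non-query rows
    start with 'A'. A real workflow would run a structure predictor here.
    """
    others = sub_msa[1:]
    a_rows = sum(1 for row in others if row.startswith("A"))
    return "state_A" if others and a_rows * 2 >= len(others) else "state_B"


if __name__ == "__main__":
    # Toy alignment: the first row is the query sequence.
    msa = ["QAAA", "AAAA", "GAAA", "AAAA", "GGGG"]
    print(estimate_populations(msa, toy_classify, n_runs=200, depth=3))
```

The fractions returned by `estimate_populations` play the role of relative state populations in this toy setting: each random subsample carries a slightly different evolutionary signal, so repeated runs can land in different states.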
If you understand the multiple snapshots that make up the dynamics of proteins, then you can find multiple different ways to target proteins with drugs and treat diseases.
The researchers tested their method against nuclear magnetic resonance (NMR) experiments on two proteins with very different amounts of available sequence data: Abl1 kinase and granulocyte-macrophage colony-stimulating factor. They predicted changes in the relative populations of the proteins' states with more than 80% accuracy.
The researchers point out that existing computational methods are costly and time-consuming. Monteiro da Silva says: "They are expensive in terms of materials and infrastructure; they take a lot of time, and you can't really do these computations in a high-throughput way. On a larger scale, this is a problem because there's a lot to explore in the world of proteins: how protein dynamics and structure relate to little-known diseases, drug resistance, and emerging pathogens."
As for the next step, the research team is improving their machine learning approach to make it more accurate, more versatile, and more useful in a range of applications.