Google’s AI protein folder IDs structure where none seemingly existed – Ars Technica

Cartoon diagram of a three-dimensional protein structure.

For most proteins, structure is function. The complex three-dimensional shapes that proteins adopt create folds and pockets that can accomplish the remarkably improbable: driving chemical reactions that would otherwise never happen or binding to a single chemical inside the complex environment of a cell. Protein structure is so important that there’s an entire discipline, along with several well-developed approaches, to figuring out what a protein looks like when it’s all folded up into its active state.

But that’s only most proteins. Scientists have also found a growing catalog of intrinsically disordered proteins. Rather than having a set structure, intrinsically disordered proteins seem to have entire sections that can flap around in the breeze of Brownian motion and yet are critical to the protein’s function. People haven’t been sure whether these proteins temporarily adopt a specific structure in order to work or whether the disorder itself is critical for function.

Now, a new paper describes a case where two intrinsically disordered proteins induce specific structures in each other when they interact. And Google’s new AlphaFold AI software was critical to figuring out that structure.

Chasing disorder

Most studies of protein structures identify the positions of amino acids with a fairly high degree of certainty. However, many proteins had regions where these studies produced the equivalent of a blur, suggesting that part of the protein was in constant motion in the environment. A number of additional proteins also resisted structural studies entirely.

For many years, these were considered oddities that had little to do with each other. Eventually, people came around to the idea that this seemingly disordered state was not an experimental artifact but, rather, represented the proteins’ actual behavior and, in some cases, was essential to their function. The idea of intrinsically disordered proteins was a key conceptual breakthrough.

Since then, researchers have identified a number of ways these things function. In some cases, they can form a specific structure when interacting with a separate molecule. In others, they allow the formation of different structures depending on which molecule they’re interacting with. In still others, proteins seem to remain disordered even when functionally active. Figuring out which is the case for a given intrinsically disordered protein can be a serious challenge.

But that’s the challenge a group of researchers in Hefei, China, decided to tackle. They were interested in a protein called protein 4.1G, which, through its interactions with a protein called NuMA, is essential for cell division. The regions of both NuMA and protein 4.1G that mediate this interaction have been identified, and they’re both intrinsically disordered.

So how do you figure out what the proteins are doing?

Studying the unstructured

One part of the team was looking for function by mutating one of the two interacting intrinsically disordered regions. They did this by linking each of the two regions to one half of a protein that catalyzes a chemical reaction. If the regions interact and bring the two halves together, the reaction goes forward. The researchers then made mutations in one of the intrinsically disordered regions and measured the rate of the chemical reaction. This allowed them to identify specific locations in the disordered regions that were essential for the interaction.

Separately, they ran a molecular dynamics simulation, which attempts to compute the structural state of proteins based on physical features like interactions of opposite charges or the ability to interact with water in solution. But even over a relatively short time window (200 nanoseconds), the intrinsically disordered region switched among 15 different conformations.
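To give a rough sense of what counting conformations in a simulation can involve, here is a minimal sketch. It is not the researchers' method, which the article does not describe. It assumes each simulation frame is a list of atom coordinates, and it groups frames with a simple greedy clustering on root-mean-square deviation (RMSD). The function names, the cutoff, and the toy coordinates are all hypothetical, and a real analysis would superimpose the structures before comparing them.

```python
import math

def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points (no superposition)."""
    total = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))

def count_conformations(frames, cutoff):
    """Greedy 'leader' clustering: a frame founds a new cluster unless it lies
    within `cutoff` of the first frame of an existing cluster."""
    leaders = []
    for frame in frames:
        if not any(rmsd(frame, leader) <= cutoff for leader in leaders):
            leaders.append(frame)
    return len(leaders)

# Hypothetical two-atom "frames" for illustration only, not data from the paper.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0)],  # small shift from the first frame
    [(0.0, 0.0, 0.0), (0.0, 5.0, 0.0)],  # second atom has moved far away
    [(0.0, 0.0, 0.1), (0.0, 5.0, 0.1)],  # small shift from the third frame
]
n_clusters = count_conformations(frames, cutoff=0.5)
print(n_clusters)
```

The number of clusters depends heavily on the cutoff chosen, which is one reason a count like "15 conformations" is best read as a summary of how restless the region is rather than a precise tally.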

So, there was a lot of work for a limited amount of information. That’s where AlphaFold entered the picture.

The team set up AlphaFold to predict the configuration of protein complexes and tested one of the intrinsically disordered regions complexed with pieces of the second. These analyses consistently showed the intrinsically disordered region in a single configuration, one that made sense in light of the function-eliminating mutations identified in the earlier experiments.

The structure explains why things remain disordered outside of a complex. It shows that a key part of the complex involves an interaction of three sheets of amino acids that run antiparallel to each other—two of them from one protein, one from the second. So, the structure is impossible to form without both proteins being present.

To show that this predicted structure was relevant to the actual protein, the researchers used a second AI package to predict mutations that would stabilize it. Tests of these mutants found one where the complex was stable enough to obtain a crystal of the proteins, confirming that the AlphaFold-predicted structure was accurate. In addition, they used AlphaFold to identify other proteins that have regions that could potentially interact by a similar mechanism. Of the 38 potential partners tested, seven interacted.
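For a sense of scale, the hit rate implied by those last two numbers can be worked out in a couple of lines. The variable names here are just for illustration; the counts are the ones reported above.

```python
tested = 38      # candidate partners tested, as reported above
interacted = 7   # candidates that actually interacted

hit_rate = interacted / tested
print(f"{interacted}/{tested} = {hit_rate:.1%}")
```

That works out to a bit under one in five of the predicted partners.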

Reasonable disorder

While this structure worked to form a complex necessary for cell division, similar complexes can easily be formed by well-structured proteins. So, based on that alone, it’s unclear why evolution would favor the intrinsically disordered option instead. But it turns out that protein 4.1G acts a bit like a bridge, forming interactions with lots of proteins and bringing them together into a complex. Its intrinsically disordered nature gives it more flexibility in the partners it interacts with, allowing it to bring more partners into the complex.

The new study doesn’t mean that all intrinsically disordered proteins form these sorts of ordered structures under the right circumstances, though it does provide another example that this sort of thing is possible. And it provides a great example of how AlphaFold can give us a new tool to approach important biological questions that have been difficult to answer.

PNAS, 2023. DOI: 10.1073/pnas.2305603120
