Movers and SHAKERS
MIT Researchers are Learning to “Speak the Language” of Viruses
Artificial intelligence techniques built for machine language learning are being put to use to study viral evolution and help design effective vaccines. MIT researchers have developed a powerful new computational tool for predicting the mutations that allow viruses to “escape” human immunity or vaccines. Predicting how a virus is going to behave could save many lives each year, whether the virus is a coronavirus, influenza, rhinovirus, or HIV. The MIT team is tackling this challenge with the same kind of software that teaches machines human language.
The behavior they have focused on is mutation, which viruses undergo progressively over time. This is why viruses tend to become resistant to previously effective vaccines. It is also why influenza vaccines are “updated” annually and why an HIV vaccine has been so difficult to develop. The MIT researchers have devised a new way to model viral escape based on models that were originally developed to analyze language. The model has successfully predicted which sections of the viral surface are most prone to mutate in a way that enables the virus to escape detection by the immune system. It can also identify portions of a virus that are less likely to mutate. The sections that are least likely to change are the better targets for new vaccines.
Viral escape is the process that allows viruses to evade the host's immune system (including antibodies induced by vaccines). It occurs when the genetic material of the virus changes and the sequence of its proteins is altered as a result.
These ongoing changes to viral protein sequences are why vaccines can quickly become obsolete and then require new study and redesign to remain effective. The goal of scientists, including those studying viruses using language learning models, is to stay one step ahead of these parasites by focusing their attention on the parts of the virus least likely to mutate.
About MIT's Model
The model developed and optimized at MIT examines regions of a virus’s surface proteins and then forecasts, based on previously observed sequences, which parts have the highest probability of mutating. Identifying these portions and their genetic “language” has allowed the researchers to work out the best targets for a new vaccine or for modifications to existing ones.
Different Viruses, Different Languages
Each virus mutates at a different rate. The seasonal flu virus spins off different versions rapidly (in less than a year), and HIV mutates so quickly that it has so far prevented an effective vaccine. This is why these two virus types are able to escape the immune system with relative ease. Reading the language of each virus, by following its repeated patterns, has allowed researchers to predict where change is likely to occur. The stable portion of the virus, once identified, becomes the focus of research into cures and prevention.
To use viral language reading to model how the surface proteins of a virus mutate, scientists analyze existing gene sequences and observe how they change over time and across locations. After many observations, sophisticated models are used to create virtual simulations of the changes that could occur. The models used are based on those built for language.
Language models have shown themselves to be powerful because they can learn complex distributional structure and gain insight into function just from sequence variation. The model learns from each occurrence, co-occurrence, and sequence variation across the data.
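The idea of learning from sequence variation can be illustrated with a far simpler stand-in than a language model. The sketch below is not the MIT model; it only measures how much each position varies across a handful of aligned sequences, using Shannon entropy, and ranks the least variable positions. The function names and the toy sequences are made up for illustration.

```python
import math
from collections import Counter


def position_entropy(aligned_seqs):
    """Return the Shannon entropy (in bits) of each column of aligned sequences.

    Illustrative assumption: all sequences are already aligned and have
    the same length. A column where every sequence agrees has entropy 0.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned_seqs)
        total = sum(counts.values())
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies


def most_conserved(aligned_seqs, k=3):
    """Return the indices of the k least variable positions (ties by index)."""
    ent = position_entropy(aligned_seqs)
    return sorted(range(len(ent)), key=lambda i: (ent[i], i))[:k]


# Made-up toy sequences, not real viral data.
toy = ["MKTAY", "MKSAY", "MRTAY", "MKTGY"]
print(most_conserved(toy, k=2))
```

In this toy data the first and last positions never change, so they rank as the most conserved. A real language model goes much further, since it also accounts for the context in which each residue appears rather than treating positions independently.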
After training the model, the researchers applied it to sequences of the coronavirus spike protein, the HIV envelope protein, and the influenza hemagglutinin (HA) protein to see where it suggested escape mutations would be less likely to arise.
For the flu, the model suggested that the sequences least likely to mutate and produce viral escape were in the stalk of the HA protein. This matches the findings of recent studies that show antibodies that target the HA stalk can offer near-universal protection against any flu strain.
For coronaviruses, the model's analysis indicated that a part of the spike protein referred to as the S2 subunit is least likely to produce escape mutations. As an aside, there is not enough data on SARS-CoV-2 at this time to determine how rapidly it mutates.
In their studies of HIV, the scientists found that the V1-V2 hypervariable region of the envelope protein has several possible escape mutations; this is consistent with previous findings. On the positive side, they also identified sequences with a lower probability of mutations that allow immune system escape.
The Future of this Research
Mutations and "viral escape" remain the biggest challenge in the search for vaccines and viral treatments that stay effective year after year. Indications are that the future of virus research, and of fighting and preventing the infections viruses produce, lies in predicting and anticipating each virus's behavior. Adapting AI language learning models to recognize viral activity and forecast future activity is a novel approach, and it is producing useful results. It’s expected that this new use of the technology will contribute significantly to meeting the challenges of viral research.