Adaptations found in the secondary structure of RNA in the genes of two SARS-CoV-2 proteins


Modeling indicates that the RNA secondary structures in the genes encoding the Nsp4 and Nsp16 proteins of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) differ from those of related coronavirus species, which may affect some viral molecular processes.

With the dramatic spread of COVID-19, caused by the SARS-CoV-2 coronavirus, there has been a push to understand how the virus infects new hosts and what sets it apart from other coronaviruses.

One way to do this is to determine which parts of the viral genome have been favored by natural selection and now differ from the ancestral species, and which changes have been selectively removed from the genome.

Previous studies have found a mixture of selective evolution and elimination in the gene encoding the spike protein of SARS-CoV-2. The spike protein helps the virus invade and infect the host cell by binding to angiotensin-converting enzyme 2 (ACE2).

However, there are other critical processes in RNA viruses, like the coronavirus, which are not controlled by protein sequences. Standard tests for determining mutations can detect changes in viral proteins, but do not include the RNA molecules that the proteins interact with for different processes.

Mutations found in spike protein

To investigate coronavirus mutations, a new study conducted by scientists at Duke University and posted on the preprint server bioRxiv* used a computational method, adaptiPhy, which looks for an excess of nucleotide substitutions in parts of the viral genome relative to mutations that have no effect on the genome. Using adaptiPhy, the team identified regions in the genomes of different Sarbecovirus species from bat, pangolin and human hosts that may have been under positive selection, that is, selection for beneficial mutations. For SARS-CoV-2, they used around 5,000 genomic sequences from the NCBI Virus database.
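The idea of comparing substitutions in a region of interest against a neutral baseline can be illustrated with a toy calculation. The sketch below is not adaptiPhy's implementation, and the article does not describe the statistical test it uses; the function names and the example sequences are hypothetical, and a simple ratio of pairwise substitution rates stands in for the real analysis.

```python
def substitution_rate(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differences / compared


def rate_ratio(query_pair, neutral_pair) -> float:
    """Substitution rate in a query region divided by the rate in a
    region assumed (hypothetically) to evolve neutrally."""
    neutral_rate = substitution_rate(*neutral_pair)
    if neutral_rate == 0:
        raise ValueError("neutral region shows no substitutions")
    return substitution_rate(*query_pair) / neutral_rate


# Hypothetical aligned fragments, not real viral sequences.
query = ("ACGTACGTAC", "ACGAACGTTC")    # 2 differences in 10 sites
neutral = ("ACGTACGTAC", "ACGTACGTAT")  # 1 difference in 10 sites
ratio = rate_ratio(query, neutral)
```

A ratio well above 1 in this toy setting would mean the query region has accumulated more substitutions than the neutral baseline, which is the kind of signal a formal test for positive selection would then have to evaluate statistically.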

The team also studied changes in the structures of Nsp4 and Nsp16 proteins at the RNA and protein level using modeling.

Using different calculation methods, the researchers found that the strongest signal was for the gene that encodes the spike protein, which showed positive selection in all species tested. This is consistent with what other studies have reported.

In SARS-CoV-2, they found positive selection in four regions of the spike protein gene. One was a change in the entire structure of the receptor binding domain (RBD), which binds to host cells. Another was a change at a site required to infect lung cells.

This differs from the changes found in SARS-CoV and Bat-CoV-LYRa11. In these viruses, positive selection has occurred in regions involved in viral camouflage and in regions that allow the virus to enter the host cell.

These mutations suggest that viruses have adapted for different hosts, with SARS-CoV-2 adapting to bind to the ACE2 protein in various hosts.

Positive selection in proteins

The authors also found positive selection in the genes encoding two proteins, Nsp4 and Nsp16, which had not been reported before. In Nsp4, they found two amino acid substitutions, valine to alanine and valine to isoleucine. Modeling suggested that these changes did not have a significant impact on the secondary or tertiary structure of the protein in SARS-CoV-2 compared to other species. In Nsp16, they did not find such substitutions. Overall, none of these changes is likely to affect the structure or function of the proteins, the authors write.
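To see how a change at the nucleotide level may or may not alter the protein, the following minimal sketch translates a few codons using a fragment of the standard genetic code. The codons chosen are illustrative assumptions only; the article does not say which codons changed in Nsp4, and `classify_substitution` is a hypothetical helper.

```python
# Fragment of the standard genetic code (DNA alphabet), covering only
# valine (V), alanine (A) and isoleucine (I).
CODON_TABLE = {
    "GTT": "V", "GTC": "V", "GTA": "V", "GTG": "V",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "ATT": "I", "ATC": "I", "ATA": "I",
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Report whether a codon change alters the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    return f"{ref_aa}->{alt_aa}"


# Hypothetical single-nucleotide changes starting from a valine codon.
val_to_ala = classify_substitution("GTT", "GCT")  # second base T -> C
val_to_ile = classify_substitution("GTT", "ATT")  # first base G -> A
silent = classify_substitution("GTT", "GTC")      # third base T -> C
```

A synonymous change leaves the protein untouched but still alters the RNA sequence, which is why selection acting on RNA structure can go unnoticed in protein-level comparisons.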

It is therefore possible that the positive selection is due to changes in the structure and function of RNA.

At the RNA level, the Nsp16 gene has a single well-folded region, which is the only such region that is also conserved in other related coronavirus species. The Nsp4 gene has two fairly well-folded regions. Thus, it is likely that these folded structures are related to viral functions.

“Our minimal free energy (MFE) predictions reveal that the likely secondary structure of the RNA genome in the region of the Nsp4 and Nsp16 genes likely differs among the six species of coronavirus we examined,” the authors write.
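Minimum free energy prediction relies on thermodynamic energy models that are too large to reproduce here, and the article does not name the software the authors used. As a much simpler stand-in, the sketch below uses the classic Nussinov dynamic program, which maximizes the number of nested base pairs rather than minimizing free energy. The function name, the `min_loop` parameter and the example sequence are assumptions for illustration.

```python
def nussinov_max_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in an RNA sequence.

    Simplified stand-in for MFE folding: it counts pairs instead of
    summing energies. `min_loop` is the minimum number of unpaired
    bases enclosed by any pair.
    """
    seq = seq.upper().replace("T", "U")
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum pairs within the subsequence seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: base j stays unpaired
            for k in range(i, j - min_loop):  # case: base k pairs with j
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


# Hypothetical hairpin-like sequence, not taken from the viral genome.
hairpin_pairs = nussinov_max_pairs("GGGAAAUCCC")
```

Running the same function on homologous regions from different genomes would give a rough indication of whether their pairing potential differs, although only an energy-based method can support claims about the most likely structure.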

There were also differences between species in entropy in regions of positive selection, suggesting differences in the stability of the folded molecules.
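One common way to quantify this kind of structural uncertainty is the Shannon entropy of a base's pairing probabilities. The sketch below assumes that definition, which the article does not confirm as the authors' measure; the probabilities are hypothetical, and in practice they would come from a folding program's partition function.

```python
import math


def positional_entropy(pair_probs) -> float:
    """Shannon entropy (in bits) of one base's pairing behavior.

    `pair_probs` lists the probabilities that the base pairs with each
    possible partner; the remaining probability mass is treated as the
    base being unpaired.
    """
    total = sum(pair_probs)
    if any(p < 0 for p in pair_probs) or total > 1.0 + 1e-9:
        raise ValueError("pairing probabilities must be >= 0 and sum to <= 1")
    unpaired = max(0.0, 1.0 - total)
    outcomes = list(pair_probs) + [unpaired]
    return -sum(p * math.log2(p) for p in outcomes if p > 0)


# Hypothetical examples.
always_paired = positional_entropy([1.0])            # one certain partner
coin_flip = positional_entropy([0.5])                # paired half the time
four_way = positional_entropy([0.25, 0.25, 0.25])    # four equal outcomes
```

Low entropy means a base almost always adopts the same pairing state, consistent with a well-defined fold; high entropy means the region switches among alternative structures.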

“Taken together, these new results indicate that the folded regions of Nsp4 and Nsp16 in the SARS-Cov-2 genome may differ in shape from those of related coronaviruses,” the authors write.

However, how these changes, which are unique to SARS-CoV-2, relate to specific molecular functions cannot yet be determined, as the molecular functions of RNA secondary structures in coronaviruses are not yet well understood.

Since previous studies indicate that these regions have functional roles, the changes may affect genome or transcription functions. However, the true roles of these adaptations in the genes for the non-structural proteins Nsp4 and Nsp16 need to be investigated further through experiments.

*Important Notice

bioRxiv publishes preliminary scientific reports that are not peer reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.

