University of Washington: Understanding the Spike Protein of SARS-CoV-2

Image credit: Walls, Park et al. Cell (2020)

The paper we’re demystifying can be found here, if you’d like to follow along.

This is another enormous undertaking by a team of scientists from across the world. This paper was published in Cell on March 19, 2020 and is a big leap in the direction of the ancient adage: know thy enemy. Today’s team of science superheroes come from:

  1. Department of Biochemistry, University of Washington, Seattle
  2. Institut Pasteur & CNRS UMR, Unité de Virologie Structurale, Paris
  3. Vaccines and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle
  4. Department of Global Health, University of Washington, Seattle

(This paper focuses a lot on the spike protein of SARS-CoV-2. For more on the spike protein and how coronaviruses get into cells, check out this sidebar.)

In today’s paper the researchers involved put a lot of work into understanding the spike protein of SARS-CoV-2 (AKA 2019-nCoV, the novel coronavirus, and the reason you’re stuck at home right now). Using an enormous battery of scientific techniques, they discovered the following:

  1. ACE2 is a receptor for the spike protein of SARS-CoV-2 and the strong binding between the SARS-CoV-2 S protein and ACE2 could explain why the virus can jump from person to person so easily
  2. There is a ‘cut here’ line (in scientific terms, a novel cleavage site) in the S protein of SARS-CoV-2 that is both new and unexpected and changes how the S protein is made in infected cells. The researchers think this may contribute to the virus’ ability to infect new species or types of tissue
  3. The researchers used cryo-EM to determine the structure of part of the S protein trimer for SARS-CoV-2 and found that it can change its shape in ways similar to MERS and SARS
  4. The researchers found that antibodies from mice immunized with the SARS S protein could strongly reduce how much pseudovirus carrying the SARS-CoV-2 spike protein got into cells, potentially also reducing the rate at which the virus itself infects cells

Quick background:

Coronaviruses get into host cells by using the trans-membrane spike (S) protein. The S protein has two subunits, which you can think of as two lego blocks stuck together to make the larger protein. The two subunits are S1, which is responsible for recognizing and binding to the host cell receptor, and S2, which is responsible for fusing the virus and cell membranes together. (A membrane is the special coating surrounding cells and some viruses. Membranes are made of amphipathic molecules: molecules that have one part that likes water and one part that hates water.) In coronaviruses the spike protein exists as a homotrimer (homo = same, tri = three): three identical protein pieces that are joined together into the final functional form.

For many (but not all) coronaviruses, binding of the spike protein to the receptor is quickly followed by a host cell protease (a protein cutting machine) cutting the spike protein between the S1 and S2 subunits. For ALL coronaviruses, the spike protein is cut even further at a site called the S2′ site, which is just above the small amino acid chain responsible for fusing (or joining together) the virus and cell membranes. When the S2 subunit causes the virus and cell membranes to fuse together it lets the virus dump all of its genetic material into the cell and get on with infecting. 

It’s been suggested that this fusion of the virus and cell membranes, which is critical for infection to occur, requires this cutting of the spike protein. Previous groups (for example, Madu et al. 2009 and Millet and Whittaker, 2015) have suggested that this cutting actually prepares the S2 subunit to fuse the membranes by causing important and irreversible changes in the protein shape. Either way, it is accepted that in order for a coronavirus to enter a cell there has to be a combination of receptor binding and further processing (‘cutting’) of the S protein.

The S1 subunit (the one that recognizes the receptor) has different domains inside it. A protein domain is like a neighbourhood within a city district. It’s a small piece of the protein that can exist, function and evolve on its own. Different coronaviruses use different domains of the S protein to recognize different receptors. For example, MERS uses domain A (the SA domain) to recognize its attachment receptors and SB to bind an entry receptor. In contrast, SARS-CoV (SARS) directly uses SB to attach to and interact with its entry receptor. The entry receptor for SARS-CoV-2 is hACE2 (the human form of ACE2) on the surface of type II pneumocytes (a type of cell lining the alveoli in the lungs). SARS-CoV-2, like SARS, uses the B domain of its spike protein (SB) to recognize this receptor on host cells.

/quick background.

[I know, I know- it wasn’t very quick. Virology is a tough and complicated subject and I don’t pretend to be an expert. For more, detailed information on how SARS-CoV-2 and other coronaviruses get into cells, check out the full paper here (the introduction has very good background) or this link!]

Now that we’ve gone through that background, we can look back at the paper. This particular paper is a long one with several discoveries, so let’s go through them one discovery at a time.

  1. ACE2 is a receptor for the spike protein of SARS-CoV-2

The researchers wanted to figure out what elements were needed for SARS-CoV-2 to enter its target cell. In order to do this they turned to a system called the murine leukaemia virus (MLV) pseudotyping system. 

Another long set of words, so let’s break it down. Simply put, a viral pseudotyping system consists of a particular virus particle that’s been modified to display the envelope proteins of a different virus that the researchers are trying to study. In this case they engineered a mouse leukaemia virus to display the S proteins of either SARS-CoV (SARS) or SARS-CoV-2 (the virus that causes COVID-19). Following this the researchers compared how well the engineered viral particles were taken up into a cell line derived from African green monkeys that expresses ACE2 receptors (VeroE6 cells). Both pseudoviruses (the one displaying the SARS spike protein and the one displaying the SARS-CoV-2 spike protein) entered these cells equally well.

The researchers also took BHK cells (cells from hamster kidneys) and used a technique called transfection to make these cells express human ACE2 for a brief period of time. When the SARS-CoV-2 S-MLV virus (a mouse leukaemia virus with the novel coronavirus S protein) was tested on these cells, the researchers found that the virus could enter despite the temporary nature of the ACE2 expression.

From these results the researchers concluded that hACE2 is a functional receptor for SARS-CoV-2. They also noted that this finding was in line with other reported results, which is good because in science we always strive for replicability.

The researchers then used a technique called biolayer interferometry to study how strongly the S proteins bind to the hACE2 receptor. Biolayer interferometry is a technique that uses interference patterns in white light to tell whether or not a free ligand (in this case, the hACE2 protein) is bound to its partner (in this case, the S proteins), which is immobilized on a biosensor. From this the researchers found that the S protein of SARS-CoV-2 binds to hACE2 about as strongly as the S protein of the strains of SARS involved in the 2002-2003 outbreak. The strength of the binding between the S protein and its receptor is an indication of how easily the virus spreads; the SARS strains from the later resurgence in 2003-2004 had S proteins that bound the receptor more weakly, and as a result these strains had a lower pathogenicity and didn’t spread as easily. This relatively high affinity (strength of binding) between the SARS-CoV-2 S protein and hACE2 might explain why COVID-19 is spreading so easily.
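To make "strength of binding" a little more concrete, here is a minimal Python sketch of how affinity is commonly summarized for a simple 1:1 binding model: the dissociation constant K_D is the off-rate divided by the on-rate, and a smaller K_D means tighter binding. The rate constants below are made-up illustrative numbers, not values from the paper, and the paper's actual curve-fitting procedure is not reproduced here.

```python
def dissociation_constant(k_on, k_off):
    """Return K_D in molar for a simple 1:1 binding model.

    k_on  : association rate constant, in 1/(M*s)
    k_off : dissociation rate constant, in 1/s
    """
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


# Hypothetical rate constants, chosen only for illustration.
tight_binder = dissociation_constant(k_on=1.0e5, k_off=1.0e-4)  # about 1 nM
weak_binder = dissociation_constant(k_on=1.0e5, k_off=1.0e-2)   # about 100 nM

# The smaller K_D belongs to the pair that holds on more tightly.
print(f"tight binder K_D = {tight_binder:.2e} M")
print(f"weak binder  K_D = {weak_binder:.2e} M")
```

In this toy comparison both binders attach at the same rate, but the tight binder lets go a hundred times more slowly, so its K_D is a hundred times smaller.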

2. There is a novel cleavage site in the S protein of SARS-CoV-2 that is both new and unexpected and changes how the S protein is made in infected cells.

After analyzing the sequence of the SARS-CoV-2 spike protein the researchers found an insertion of four amino acids between the S1 and S2 boundaries. An insertion means that a set of amino acids just got shoved into the middle of the initial chain. For example if the initial amino acid sequence was ‘acat,’ a four amino acid insertion might look like ‘acrrrrat.’
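If you like to think in code, the toy example above can be written as a few lines of Python. This is only a sketch of what "insertion" means for a string of letters; the function name and the position are my own choices for illustration, and the sequence is the toy ‘acat’ from above, not the real spike protein sequence.

```python
def insert_residues(sequence, position, inserted):
    """Return `sequence` with `inserted` placed before index `position`."""
    if not 0 <= position <= len(sequence):
        raise ValueError("position must lie within the sequence")
    return sequence[:position] + inserted + sequence[position:]


original = "acat"                                # toy sequence from the text
mutated = insert_residues(original, 2, "rrrr")   # shove four letters into the middle

print(original)
print(mutated)
```

Everything before and after the insertion point is unchanged; the chain simply gets four residues longer.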

(For more on amino acids check out this sidebar).

The researchers found that this insertion led to the introduction of a new ‘cut here’ line that allows a protease called furin to cut the S protein at the S1/S2 site right after the S protein is made, and before the new virus has even left the cell that made it. This is different from SARS, where the S protein is only cleaved after the virus binds to its receptor on a cell that it is in the process of infecting.
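For a rough feel of how a ‘cut here’ line can be spotted in a sequence, here is a small Python sketch. It assumes the commonly cited minimal furin recognition pattern, arginine, any two residues, arginine (R-X-X-R in one-letter amino acid code), and scans a toy sequence for it. The sequences are invented for illustration and are not the real spike protein; real cleavage prediction takes more than a four-letter pattern into account.

```python
import re

# Minimal furin-like pattern R-X-X-R. The lookahead lets overlapping
# matches be reported as well.
FURIN_LIKE_MOTIF = re.compile(r"(?=(R..R))")


def find_furin_like_sites(sequence):
    """Return the 0-based start index of every R-X-X-R match."""
    return [match.start() for match in FURIN_LIKE_MOTIF.finditer(sequence.upper())]


toy_without_insertion = "ACAT"       # hypothetical sequence, no arginines
toy_with_insertion = "ACRRRRAT"      # same toy sequence with four arginines inserted

sites_before = find_furin_like_sites(toy_without_insertion)
sites_after = find_furin_like_sites(toy_with_insertion)

print(sites_before)
print(sites_after)
```

The point of the toy comparison is that a short insertion can create a protease recognition pattern that was not there before.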

Another way of putting the difference: SARS-CoV-2 cuts its S protein ahead of time, prepping it if you will. SARS, on the other hand, totes around a full S protein and only cleaves it after beginning the process of infecting a new cell.

This was so strange that the researchers wanted to know why it happened. To study this they constructed a mutated SARS-CoV-2 S protein that did not have this new ‘cut here’ line and put it into the membrane of their MLV pseudovirus.

The researchers expected that this mutated S protein would be incorporated into the pseudovirus membrane in its full, uncleaved (not cut) form. They found that their expectation was correct and that this new pseudovirus, SARS-CoV-2 Sfur/mut-MLV, had uncleaved S proteins after budding from the cells. This virus was still able to worm its way into cells! From this, the researchers concluded that this mutation (introducing a new ‘cut here’ site) in the S protein was not necessary for the virus to enter the cells.

Since this ‘cut here’ site is not necessary for the virus to enter a cell, the researchers worked to come up with alternate reasons why this mutation might have happened. Remember that mutations happen all the time, but usually mutations that are somehow helpful to the virus, cell or bacterium tend to be passed on to later generations. Since this cleavage site was retained in the virus, it probably has some benefit!

The researchers suggest that this novel cleavage site might help the virus infect a larger range of species or tissues and could potentially increase how transmissible it is (how easily it jumps from one host to another). They think this because proteases like furin are present in all sorts of cells and species, and other groups have reported effects on other viruses that support this suspicion (Millet and Whittaker, 2015, for example). Some very pathogenic bird influenzas, for instance, have sites that can be cut by furin-like proteases.

Figure 1 from the paper. A: How the pseudovirus with either the SARS S protein, the novel coronavirus S protein or the mutated novel coronavirus S protein (without the new cut site) enters VeroE6 cells, which are from African green monkeys. B: How the pseudovirus with either the normal S protein from the novel coronavirus or the mutated S protein from the novel coronavirus (without the new cut site) enters BHK cells (hamster kidney cells) that temporarily express hACE2. C: The sequences of several SARS-like viruses compared. D: A western blot showing that the research group has made a version of the novel coronavirus S protein that cannot be cut by the protease furin.

3. The researchers found a structure of part of the S protein trimer for SARS-CoV-2 and found that it can change its shape in ways similar to MERS and SARS

Finding a structure of a protein is very difficult, and to even get started the researchers had to make various small changes to the spike protein to make it stable enough to analyze. Following these minor changes they used cryo-EM to develop a 3D structure. Remember how the S1 subunit has different domains, including A and B? The B domain can take either a closed or an open form. Also remember that the S protein exists as a trimer: each S protein trimer has 3 S proteins, which means 3 A domains and 3 B domains. The researchers found that about half of the selected images showed trimers with one B domain in the open form and the other two closed. The other half of the images showed the trimers with all the B domains closed. This variation in the conformation (roughly, shape) is similar to what other groups have seen with SARS and MERS S protein trimers. Overall the group found that the ectodomain (the part that protrudes out of the virus membrane) of the SARS-CoV-2 S protein is very similar in structure to that of the SARS S protein.

From previous research in the field and the cryo-EM structures, the group suggests that the part of the SB domain that recognizes hACE2 is exposed only when the SB domain is in its open form. From this the researchers expect that the opening of the SB domain is important for the S protein to interact with hACE2 and start off the rest of the process, including the protease cutting and membrane fusion steps.

So why have a closed state at all? It’s possible that by randomly switching between closed and open states for the SB domain, the virus keeps its receptor-binding region hidden part of the time. This, combined with shields made of long sugar-based structures called glycans, may help the virus avoid recognition by the host immune system.

4. The researchers found that antibodies from mice immunized with the SARS S protein could strongly reduce how much pseudovirus carrying the SARS-CoV-2 spike protein got into cells, potentially also reducing the rate at which the virus itself infects cells

From the sequence mapping the researchers found that the S2 fusion machinery was relatively similar between SARS and SARS-CoV-2. They also found that the glycan ‘shields’ used by SARS were very closely conserved (aka, very similar) in SARS-CoV-2. These glycans are what typically ‘shield’ the fusion machinery from being recognized by antibodies. Because the two viruses use largely similar ‘shields’, the researchers suspected that the fusion machinery of SARS-CoV-2 would be about as accessible to antibodies as that of SARS, and would probably have the same weaknesses in its ‘armour.’

Based on this observation they decided to see whether being exposed to one virus might generate an antibody response (part of an immune response) to the other. To check this they took plasma from four mice that had been immunized with a stable form of the S protein from SARS and checked whether that plasma would slow down or prevent the SARS-CoV S-MLV or the SARS-CoV-2 S-MLV pseudoviruses from infecting cells.

The mouse plasma completely prevented SARS-CoV S-MLV from entering cells and reduced the amount of SARS-CoV-2 S-MLV entering cells by 90%. This suggests that antibodies raised against the SARS S protein can also act against the novel coronavirus, so immunity to one virus may provide some protection against the other, which is very good news indeed!
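A figure like "reduced by 90%" is simple arithmetic: compare how much pseudovirus gets in with plasma against how much gets in without it. Here is a minimal Python sketch of that calculation. The signal values are hypothetical numbers picked to reproduce the 90% and 100% figures mentioned above; they are not measurements from the paper, and the paper's own normalization may differ.

```python
def percent_inhibition(entry_with_plasma, entry_without_plasma):
    """Percentage reduction in entry relative to the no-plasma control."""
    if entry_without_plasma <= 0:
        raise ValueError("control entry signal must be positive")
    reduction = entry_without_plasma - entry_with_plasma
    return 100.0 * reduction / entry_without_plasma


# Hypothetical entry signals in arbitrary units, for illustration only.
control_signal = 1000.0
partial_block = percent_inhibition(100.0, control_signal)   # a tenth of control still gets in
complete_block = percent_inhibition(0.0, control_signal)    # nothing gets in

print(f"partial block:  {partial_block:.0f}% inhibition")
print(f"complete block: {complete_block:.0f}% inhibition")
```

So a 90% reduction means roughly one tenth as much pseudovirus entered the cells as in the untreated control.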

The fact that SARS-CoV antibodies worked against SARS-CoV-2 also indicates that it might be difficult to tell from serum tests alone whether a patient was exposed to SARS-CoV-2 or another similar SARS-like virus, which casts some doubt on serum-based testing.

Overall, this paper worked to give a good framework on how to find parts of the S glycoprotein that antibodies might recognize and that might be similar to other SARS-like viruses. This will support the many teams out there trying to develop a vaccine!

“Our results provide a structural framework to identify conserved and accessible epitopes across S glycoproteins that will support ongoing vaccine design efforts.”

Walls et al. Structure, Function and Antigenicity of the SARS-CoV-2 Spike Glycoprotein, Cell (2020)

This paper is a great example of a relatively small group of scientists doing really big research using techniques across the board. This paper provides a true wealth of information to the scientific community and this basic science foundation is one that can be built upon to bring about enormous medical innovation.
