Antigenic minimalism of SARS-CoV-2 is linked to surges in COVID-19 community transmission and vaccine breakthrough infections

Posted in medRxiv

May 31, 2021

Abstract: The raging COVID-19 pandemic in India and reports of “vaccine breakthrough infections” globally have raised alarm, mandating the characterization of the immunoevasive features of SARS-CoV-2. Here, we systematically analyzed over 1.3 million SARS-CoV-2 genomes from 178 countries and performed whole-genome viral sequencing from 53 COVID-19 patients, including 20 vaccine breakthrough infections. We identified 116 Spike protein mutations that increased in prevalence during at least one surge in SARS-CoV-2 test positivity in any country over a three-month window. Deletions in the Spike protein N-terminal domain (NTD) are highly enriched for these ‘surge-associated mutations’ (odds ratio = 18.2, 95% CI: 7.53-48.7; p = 1.465 × 10⁻¹⁸). In the recent COVID-19 surge in India, an NTD deletion (ΔF157/R158) increased over 10-fold in prevalence from February 2021 (1.1%) to April 2021 (15%). During the recent surge in Chile, an NTD deletion (Δ246-253) increased rapidly over 30-fold in prevalence from January 2021 (0.86%) to April 2021 (33%). Strikingly, these simultaneously emerging deletions associated with surges in different parts of the world both occur at an antigenic supersite that is targeted by neutralizing antibodies. Finally, we generated clinically annotated SARS-CoV-2 whole-genome sequences and identified deletions within this NTD antigenic supersite in a patient with vaccine breakthrough infection (Δ156-164), as well as other deletions from unvaccinated severe COVID-19 patients that could represent emerging deletion-prone regions. Overall, the expanding repertoire of NTD deletions throughout the pandemic and their association with case surges and vaccine breakthrough infections point to antigenic minimalism as an emerging evolutionary strategy for SARS-CoV-2 to evade immune responses.
This study highlights the urgent need to sequence viral genomes at a larger scale globally and to mandate that sequences are deposited with more granular and transparent clinical annotations to ensure that therapeutic development keeps pace with the evolution of SARS-CoV-2.
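
As an illustrative check of the fold-change figures quoted in the abstract, the short Python sketch below recomputes them from the reported prevalence percentages alone. It is not the authors' analysis pipeline. The function name `fold_change` and the dictionary layout are assumptions made for this example, and the inputs are the rounded values stated above, so the results are approximate.

```python
# Prevalence values (percent of sequenced genomes) as quoted in the abstract.
# Each entry holds (earlier month, later month).
reported = {
    "India": (1.1, 15.0),   # NTD deletion F157/R158, Feb 2021 to Apr 2021
    "Chile": (0.86, 33.0),  # NTD deletion 246-253, Jan 2021 to Apr 2021
}


def fold_change(start_pct: float, end_pct: float) -> float:
    """Return the ratio of the later prevalence to the earlier prevalence.

    Hypothetical helper for this example; not taken from the preprint.
    """
    if start_pct <= 0:
        raise ValueError("starting prevalence must be positive")
    return end_pct / start_pct


fold_changes = {
    country: fold_change(start, end)
    for country, (start, end) in reported.items()
}

for country, fc in fold_changes.items():
    print(f"{country}: {fc:.1f}-fold increase")
```

With the rounded inputs, the ratios come out at roughly 13.6 for India and 38.4 for Chile, consistent with the "over 10-fold" and "over 30-fold" wording.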


A.J. Venkatakrishnan, Praveen Anand, Patrick Lenehan, Pritha Ghosh, Rohit Suratekar, Abhishek Siroha, Dibyendu Roy Chowdhury, John C. O’Horo, Joseph D. Yao, Bobbi S. Pritt, Andrew Norgan, Ryan T. Hurt, Andrew D. Badley, John D. Halamka, Venky Soundararajan

nference, Cambridge, MA 02142, USA
nference Labs, Bengaluru, KA 560047, India
Mayo Clinic, Rochester, MN 55905, USA

Correspondence: Venky Soundararajan, A.J. Venkatakrishnan

The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.