Yi Zhang, Yang Zhou, Jiazhen Chen, Jing Wu, Xun Wang, Yumeng Zhang, Shiyong Wang, Peng Cui, Yuanyuan Xu, Yang Li, Zhongliang Shen, Tao Xu, Qiran Zhang, Jianpeng Cai, Haocheng Zhang, Pengfei Wang, Jingwen Ai, Ning Jiang, Chao Qiu, Wenhong Zhang, Vaccination Shapes Within-Host SARS-CoV-2 Diversity of Omicron BA.2.2 Breakthrough Infection, The Journal of Infectious Diseases, Volume 229, Issue 6, 15 June 2024, Pages 1711–1721, https://doi.org/10.1093/infdis/jiad572
Abstract
Low-frequency intrahost single-nucleotide variants of SARS-CoV-2 have been recognized as predictive indicators of selection. However, the impact of vaccination on the intrahost evolution of SARS-CoV-2 remains uncertain at present.
We investigated the genetic variation of SARS-CoV-2 in individuals who were unvaccinated, partially vaccinated, or fully vaccinated during Shanghai's Omicron BA.2.2 wave. We substantiated the connection between particular amino acid substitutions and immune-mediated selection through a pseudovirus neutralization assay or by cross-verification with the human leukocyte antigen–associated T-cell epitopes.
In contrast to those with immunologic naivety or partial vaccination, participants who were fully vaccinated had intrahost variant spectra characterized by reduced diversity. Nevertheless, the distribution of mutations in the fully vaccinated group was enriched in the spike protein. The distribution of intrahost single-nucleotide variants in immunocompetent individuals did not show notable signs of positive selection, in contrast to the adaptation observed in 2 immunocompromised participants with an extended period of viral shedding.
In SARS-CoV-2 infections, vaccine-induced immunity was associated with decreased diversity of within-host variant spectra, with milder inflammatory pathophysiology. The enrichment of mutations in the spike protein gene indicates selection pressure exerted by vaccination on the evolution of SARS-CoV-2.
The COVID-19 pandemic, caused by SARS-CoV-2, has led to billions of infections and millions of deaths [1]. Mass vaccination has diminished disease transmission and alleviated disease severity in patients. Nevertheless, vaccinated individuals may still encounter breakthrough infections as vaccination does not confer sterilizing immunity [2, 3]. One of the threatening consequences is that partial immunity may favor the emergence of variants that can escape immune recognition [4–6]. These well-adapted variants pose a potential threat to immunized persons and could have forward transmission [7]. However, it remains unclear whether these mutations are more likely to arise from breakthrough infection when compared with those from natural infection.
Mutations that arise in cases of SARS-CoV-2 infection could provide valuable information about the transmission chain, viral diversity, and evolution process. Deep sequencing technologies are a proficient tool to assemble the whole genome and elucidate the quasispecies of SARS-CoV-2 [8, 9]. Nevertheless, the predominant focus has been on variants that have become prevalent in the population, while mutations eliminated during transmission have been largely neglected. The high sequencing depth enables the exploration of intrahost single-nucleotide variants (iSNVs) occurring at rare frequencies. Our prior work has highlighted that the iSNVs of SARS-CoV-2 can serve as biomarkers reflecting disease pathogenesis during natural infection and forecast fixed mutations in the population [10].
In this study, we aimed to further explore SARS-CoV-2 iSNVs in breakthrough infections, which is crucial for elucidating the driving factors of virus evolution under immune selection pressure. We hypothesized that vaccination might exert an influence on distinct intrahost variant spectra in vaccinated and unvaccinated individuals. Our findings would contribute to advancing understanding of SARS-CoV-2 intrahost replication dynamics and selection driven by immune pressure, as well as genetic information of Omicron variants in Shanghai.
METHODS
Participants and Samples
In this prospective study, we collected samples from participants who visited Huashan Hospital and tested positive for SARS-CoV-2 from 13 March 2022 to 18 June 2022. Participants received a SARS-CoV-2 nasopharyngeal swab test (SARS-CoV-2 ZC-HX-201-2; Biogerm) based on reverse transcriptase polymerase chain reaction (PCR). Clinical data were collected, including symptoms, underlying diseases, viral shedding time, and SARS-CoV-2 vaccination doses. For the plasma pseudovirus neutralization tests, we collected serum samples from 5 participants 14 days after they received the third dose of the COVID-19 inactivated BBIBP-CorV vaccine, as well as from 5 patients with Omicron BA.2.2 subvariant breakthrough infections during this Omicron wave in Shanghai.
Ethical Statement and Consent to Publish
The protocol and informed consent were approved by the ethical committee of Huashan Hospital of Fudan University (2020-688 and 2021-749). Consent to publish the data was obtained from participants.
SARS-CoV-2 Amplicon and Long Noncoding RNA Sequencing
Amplicon sequencing was performed to obtain the whole genome of SARS-CoV-2 by using specific primers from the COVID-19 ARTIC library (version 4.1; Illumina). The qualified libraries were then sequenced with the NovaSeq 6000 Platform (Illumina) with a paired-end 150–base pair strategy, and a minimum of 3 GB of data per sample was generated. Three pairs of samples were subjected to replicate experiments and iSNV calling. After mapping to the SARS-CoV-2 Wuhan-Hu-1 strain reference sequence (accession NC_045512.2), we identified single-nucleotide polymorphisms (SNPs) and iSNVs (minor allele frequency ≥5%) for each specimen. The detailed calling methods are presented in the supplementary methods. The dN/dS ratio was calculated by SNPGenie. The phylogenetic tree was constructed by MEGA X and annotated in iTOL (version 4).
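The frequency threshold described above can be illustrated with a minimal Python sketch. It assumes per-position base counts are already available from the read alignment; the function name, the depth filter, and the rule that a SNP is a consensus base differing from the reference are illustrative assumptions, not the calling procedure from the supplementary methods.

```python
from collections import Counter

MIN_MAF = 0.05    # iSNV threshold stated in the text (minor allele frequency >= 5%)
MIN_DEPTH = 100   # hypothetical depth filter, not taken from the paper


def classify_site(ref_base, base_counts, min_maf=MIN_MAF, min_depth=MIN_DEPTH):
    """Label one genome position as 'iSNV', 'SNP', or None (no call).

    ref_base    -- base in the reference genome at this position
    base_counts -- mapping of base -> number of reads supporting it
    """
    counts = Counter(base_counts)
    depth = sum(counts.values())
    if depth < min_depth:
        return None  # too few reads to estimate a frequency
    ranked = counts.most_common()  # most frequent base first
    major_base = ranked[0][0]
    # Frequency of the second most common base (0 if only one base was seen).
    minor_freq = ranked[1][1] / depth if len(ranked) > 1 else 0.0
    if minor_freq >= min_maf:
        return "iSNV"  # a minor allele is present above the threshold
    if major_base != ref_base:
        return "SNP"   # assumed rule: consensus differs from the reference
    return None
```

For example, a position with 900 reads supporting the reference C and 100 reads supporting T has a minor allele frequency of 10% and would be labeled an iSNV under these assumptions.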
After excluding participants with underlying diseases and matching by body mass index and age, we performed long non-coding RNA sequencing in 9, 10, and 10 cases in the unvaccinated, fully vaccinated, and booster groups, respectively. The qualified libraries contained >40 million reads for each sample (supplementary methods).
Plasma Pseudovirus Neutralization
The in silico mean escape over selected antibodies for the sites was obtained from the data set (https://jbloomlab.github.io/SARS2_RBD_Ab_escape_maps/). Pseudovirus incorporating spike protein from either wild type (D614G) or variants (BA.2, BA.2.2, BA.2.2 + R346K, BA.2.2 + G446S, BA.2.2 + L455S, and BA.2.2 + S939F) was constructed with a DNA plasmid expressing the spike protein and G*ΔG–VSV (VSV G pseudovirus) in 293T cells. The plasma pseudovirus neutralization test was performed to detect neutralizing titers (supplementary methods).
Whole-Exome Sequencing and Human Leukocyte Antigen Analysis
Total DNA was extracted from nasopharyngeal swabs and sent for whole-exome sequencing. Human leukocyte antigen (HLA) typing was performed on all samples. We downloaded information on all SARS-CoV-2 epitopes and their corresponding HLA types from the IEDB database (https://www.iedb.org/).
Statistical Analyses
Categorical variables were described as numbers or proportions and compared with the Pearson chi-square test. Continuous variables were described as mean ± SD if they followed a Gaussian distribution according to the Kolmogorov-Smirnov test; otherwise, they were expressed as the median and IQR. The geometric mean titer (GMT) was calculated in the pseudovirus neutralization test. The Mann-Whitney U test was used to compare GMT between the groups. The distribution of mutations in different regions was calculated via Poisson regression. Risk factors of mutation numbers and the difference in iSNV numbers between vaccinated and unvaccinated participants were evaluated by multivariate Poisson regression. Spearman analysis assessed the association between differentially expressed genes (DEGs) and iSNV numbers, C > U/G > A numbers, and vaccination groups. P values <.05 were considered statistically significant. Statistical analysis was performed with Stata software (version 14.0). Figures were generated with GraphPad Prism (version 8) and RStudio (version 1.2).
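As an illustration of the GMT mentioned above, the following Python sketch computes a geometric mean as the exponential of the mean log titer. The function name and the input titers are hypothetical; the study itself reports using Stata for statistical analysis.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean of positive neutralization titers.

    Computed as exp(mean(log(titer))), which is less sensitive to a few
    very high titers than the arithmetic mean.
    """
    titers = list(titers)
    if not titers:
        raise ValueError("at least one titer is required")
    if any(t <= 0 for t in titers):
        raise ValueError("titers must be positive")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))
```

With hypothetical titers of 100 and 400, the arithmetic mean is 250, whereas the GMT is 200.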
RESULTS
Virus Evolution During the 3-Month Mass Outbreak in Shanghai
In late February 2022, a widespread transmission of SARS-CoV-2 occurred in Shanghai. This wave of infection ended in early June following the implementation of strict population-based screening and lockdown policies, ultimately resulting in approximately 0.65 million confirmed cases [11]. To elucidate the origin of this outbreak and track the evolution of the virus and its patterns of spread throughout the outbreak, we obtained the near full-length SARS-CoV-2 genome from 164 samples (Figure 1A). We included the following:
46, 74, 29, and 2 samples from 151 community cases in March, April, May, and June, respectively;
6 mixed screen swabs, with PCR cycle thresholds of the ORF1ab gene segment ranging from 11.16 to 30.50; and
7 sequential samples from 3 patients with prolonged COVID-19.
Of the 154 enrolled patients, 4 were hospitalized and 150 were from community screening. All cases were in the early stages of SARS-CoV-2 infection and were collected at the first positive test result in active surveillance (Table 1).
Clinical Characteristic | No. (%) or Median (IQR) |
---|---|
Male (n = 154) | 84 (52.60) |
Age, y (n = 154) | 46 (31–64) |
Vaccination status (n = 154) | |
Unvaccinated | 29 (18.83) |
Partially vaccinated | 7 (4.55) |
Fully vaccinated | 43 (27.92) |
Booster | 51 (33.12) |
Unknown | 24 (19.48) |
Sampling time (n = 162 samples included in phylogenetic tree) | |
March | 46 (28.40) |
April | 76 (46.91) |
May | 36 (22.22) |
June | 4 (2.47) |
Symptomatic before the first positive report | 28/54 (51.85) |
Viral shedding time, d (n = 41) | 12 (9–16) |
The viral genomes were amplified through tiling multiplex PCR, followed by next-generation sequencing on the Illumina NovaSeq platform. An average of 36.52 million reads was generated per sample, with a median read depth of 69713 and 99.03% genome coverage. The iSNVs were validated by replicate experiments (supplementary data), and the high sequencing depth guaranteed the reliable detection of iSNVs with a minor allele frequency of 5% in each sample.
The phylogenetic analysis (Figure 1B) included 162 qualified sequences and revealed that 158 viral genomes clustered on the Omicron BA.2.2 (n = 7) and BA.2.2.1 (n = 151) branches of the SARS-CoV-2 evolutionary tree. The other samples were 2 Omicron BA.2.3 and 2 BA.1.1 subvariants collected before April. The genomic data indicated that multiple independent introductions occurred in March and early April, while the massive outbreak in Shanghai primarily resulted from the sustained transmission of Omicron BA.2.2.1. In contrast, the transmission of the Omicron BA.1.1 and BA.2.3 subvariants was contained before it could spread widely.
Among these 158 sequences of the Omicron BA.2.2 lineage, we identified 103 SNPs and 146 iSNVs. These mutations were stochastically distributed across the viral genome, suggesting that laboratory contamination was marginal (Figure 1C, Supplementary Figure 1). The number of SNPs was not enriched in any specific region (Supplementary Figure 2).
Given that >70% of the residents in Shanghai had been vaccinated, we examined whether mutations accumulated over the 3 months of transmission, which might indicate immunity resistance induced by vaccination. In the set of samples from acute infection, no apparent convergent mutations were observed across the whole genome or on the spike protein gene. Neither the number of mutations nor the genetic divergence showed a significant increase in viral genomes collected in May and June when compared with those collected in March and April (Figure 1D). Although we found several community clusters, no prominent novel lineage emerged. We identified 12 mutation sites that were detected as SNPs and iSNVs. Four sites (G25563C, C19172T, T21765C, T8634C) became fixed SNPs in the population after detection as iSNVs with 5% to 50% frequency (Supplementary Figure 3).
The absence of apparent incremental evolution, even after approximately 12 to 17 generations in humans (as estimated from an incubation time of 3 days [12, 13]), suggests that the spread of immune-escape variants into the population would be rare. In contrast, SARS-CoV-2 exhibited adaptive viral evolution in 2 immunosuppressed patients with prolonged infection and persistent viral shedding for 37 and 52 days (Figure 1E and 1F).
Within-Host Mutations in Fully Vaccinated Individuals
As the 3 months of human transmission might not provide enough time for variants with a fitness advantage to establish prevalence at the population level, we focused on within-host mutations to determine whether some traits were associated with vaccination. The 130 participants infected with Omicron BA.2.2 with confirmed vaccination statuses were divided into 2 groups based on their vaccination record: 29 unvaccinated and 7 who received the first dose of vaccine were considered the unvaccinated/partially vaccinated group; 43 who received the primary vaccination series and 51 who received the booster were considered the fully vaccinated group. When compared with those in the unvaccinated/partially vaccinated group, iSNVs in the fully vaccinated group were characterized by several features. First, the number of iSNVs (1.92 vs 1.06, P = .093) and their minor allele frequency (0.158 vs 0.109, P = .003) were lower in the fully vaccinated group (values given as unvaccinated/partially vaccinated vs fully vaccinated), although only the difference in minor allele frequency reached statistical significance (Figure 2A and 2B). Second, when stratified by type of substitution, C > U and G > A (P = .085), especially C > U (P = .074), tended to be less represented, suggesting less inflammatory pathology-induced deamination of cytosines on viral genomic RNA [14] (Figure 2C–F). Third, a larger proportion of iSNVs fell in the spike gene (especially nucleotide positions 22 000–23 000, P = .022) in the fully vaccinated group than in the unvaccinated group (16.88% vs 4.35%, P = .040; Figure 2G). Finally, there was no indication of selection acting on within-host mutations at open reading frame regions (ORFs), as inferred by the ratio of nonsynonymous to synonymous substitutions (the dN/dS ratio; Figure 2H). The multivariate Poisson regression further confirmed that complete vaccination was associated with a lower number of iSNVs (incidence rate ratio [IRR], 0.492 [95% CI, .293–.827]; P = .007; Supplementary Table 1).
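The stratification by substitution type and genomic region can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the function names and the input format are assumptions, and the spike gene boundaries are the annotated S coordinates of the NC_045512.2 reference.

```python
SPIKE_START, SPIKE_END = 21563, 25384  # S gene coordinates in NC_045512.2


def substitution_type(ref, alt):
    """Return the substitution in RNA notation, e.g. ('C', 'T') -> 'C>U'."""
    # Sequencing output reports T where the RNA genome carries U.
    to_rna = {"T": "U"}
    return f"{to_rna.get(ref, ref)}>{to_rna.get(alt, alt)}"


def summarize_isnvs(isnvs):
    """Summarize a list of (position, ref, alt) tuples for one group."""
    isnvs = list(isnvs)
    n = len(isnvs)
    if n == 0:
        return {"n": 0, "deamination_like": 0, "spike_fraction": 0.0}
    types = [substitution_type(ref, alt) for _, ref, alt in isnvs]
    # C>U and G>A are the substitution types the text links to deamination.
    deamination_like = sum(t in ("C>U", "G>A") for t in types)
    in_spike = sum(SPIKE_START <= pos <= SPIKE_END for pos, _, _ in isnvs)
    return {
        "n": n,
        "deamination_like": deamination_like,
        "spike_fraction": in_spike / n,
    }


# The 4 sites named earlier in the Results, used here only as sample input.
example = [(25563, "G", "C"), (19172, "C", "T"), (21765, "T", "C"), (8634, "T", "C")]
summary = summarize_isnvs(example)
```

Applied per vaccination group, such a summary yields the counts that a test of proportions or a Poisson regression would then compare.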
Mutations Impair Neutralizing Antibody Potency or Escape T-Cell Immunity
The nonsynonymous mutations observed in this study may resist vaccine-induced immunity, including adaptive humoral and T-cell responses. We identified 7 nonsynonymous SNPs and 10 nonsynonymous iSNV sites on the spike gene (Figure 3A). The N-terminal domain included L54F, I68T, and G232C mutations (SNPs) and P174T, G199V, D253N, and Q314H/L mutations (iSNVs), and the receptor-binding domain (RBD) contained a C379S mutation. These mutations were identified in samples from unvaccinated and vaccinated patients (Supplementary Table 2). I68T was present in BA.2.55 (98.03%), BA.2.36 (91.34%), and BA.2.10.3 (93.48%) as a lineage-specific site. L455W was also observed in BA.1.23 (81.82%; Supplementary Figure 4).
To test whether the RBD mutations could affect the viral entry blockade by immunized sera, the escape ability of the mutations was calculated in silico and experimentally validated with a pseudovirus neutralization assay. The mean escape values from selected antibodies for sites R346, G446, and L455 at the RBD were predicted to be 0.08, 0.12, and 0.11, respectively [15]. To varying degrees, booster sera and convalescent sera from Omicron BA.2.2 breakthrough infections exhibited diminished neutralization efficacy against pseudovirus with the G446S, L455S, or S939F sites. The sera of patients who received a homologous booster or had BA.2.2 breakthrough infections had a neutralizing GMT of 279.9 against BA.2.2 + L455S, a 1.86-fold reduction relative to BA.2.2 (P = .002), and a GMT of 352.5 against BA.2.2 + S939F, a 1.47-fold reduction relative to BA.2.2 (P = .004; Figure 3B, Supplementary Figure 5). Considering that the representative BA.1 mutations in newly emerging sublineages (BF.7, BA.4.1.8, BA.4.6, etc) had immune evasion properties, we also evaluated the escape of BA.2.2 + R346K and BA.2.2 + G446S in vaccinated and breakthrough infection sera. We found that the neutralizing GMT was 623.7 against BA.2.2 + R346K, a 0.83-fold change relative to BA.2.2 (P = .492), and the GMT was 234.7 against BA.2.2 + G446S, a 2.21-fold reduction relative to BA.2.2 (P = .002).
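The fold values above relate each variant GMT to the GMT against BA.2.2. The Python sketch below assumes the usual definition (reference GMT divided by variant GMT) and uses it in reverse as a consistency check. The BA.2.2 reference GMT is not stated in this section, so the back-calculated values are illustrative only, and the function names are hypothetical.

```python
def fold_reduction(gmt_reference, gmt_variant):
    """Fold reduction of the variant titer relative to the reference titer."""
    if gmt_reference <= 0 or gmt_variant <= 0:
        raise ValueError("titers must be positive")
    return gmt_reference / gmt_variant


def implied_reference_gmt(gmt_variant, reported_fold):
    """Reference GMT implied by a variant GMT and its reported fold value."""
    return gmt_variant * reported_fold


# (variant GMT, reported fold relative to BA.2.2), as quoted in the text.
REPORTED = {
    "BA.2.2 + L455S": (279.9, 1.86),
    "BA.2.2 + S939F": (352.5, 1.47),
    "BA.2.2 + R346K": (623.7, 0.83),
    "BA.2.2 + G446S": (234.7, 2.21),
}

implied = {
    name: implied_reference_gmt(gmt, fold)
    for name, (gmt, fold) in REPORTED.items()
}
```

Under this definition, all 4 reported pairs imply a BA.2.2 reference GMT between roughly 517 and 521, so the fold values are mutually consistent. A fold value below 1, as for R346K, corresponds to a titer higher than that against BA.2.2.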
Mutations that escape T-cell recognition would be confined to the epitopes presented by the corresponding HLA. Therefore, we determined the HLA genotype of patients with nonsynonymous SARS-CoV-2 mutations. We examined whether the nonsynonymous mutations occurred in the reported epitopes of the same HLA allele in each patient. Of 98 nonsynonymous mutations, 13 were associated with the respective HLA allele, suggesting that some variants have advantages in escaping the T-cell response (Table 2).
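The cross-check between mutation position, epitope, and HLA type can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and matching alleles at the level of the first field (for example, A*02) is an assumption chosen because Table 2 pairs HLA-A*02:07 with an epitope reported for HLA-A*02:01; the authors' exact matching rule is not described here.

```python
def allele_group(allele):
    """Reduce an allele name to its first field, e.g. 'HLA-A*02:07' -> 'A*02'."""
    name = allele.strip().replace("HLA-", "")
    return name.split(":")[0]


def shares_allele_group(patient_alleles, reported_alleles):
    """True if any patient allele falls in the group of any reported allele."""
    patient_groups = {allele_group(a) for a in patient_alleles}
    for reported in reported_alleles:
        # Class II restrictions may be given as heterodimers,
        # e.g. 'HLA-DRA*01:01/DRB1*04:01'; check each chain separately.
        for chain in reported.split("/"):
            if allele_group(chain) in patient_groups:
                return True
    return False


def is_hla_associated(protein, position, patient_alleles, epitope):
    """Check one mutation against one epitope record.

    epitope -- dict with keys 'protein', 'start', 'end', 'alleles'
               (positions are 1-based amino acid coordinates, inclusive)
    """
    if protein != epitope["protein"]:
        return False
    if not epitope["start"] <= position <= epitope["end"]:
        return False
    return shares_allele_group(patient_alleles, epitope["alleles"])


# One row of Table 2 expressed as an epitope record.
rlfrksnlk = {
    "protein": "S",
    "start": 454,
    "end": 462,
    "alleles": ["HLA-A*03:01", "HLA-A*11:01"],
}
```

For the L455S iSNV, position 455 lies within the RLFRKSNLK epitope (454–462) and the patient's HLA-A*11:01 is among the reported restricting alleles, so the mutation would be flagged as HLA associated under these assumptions.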
Description | Starting Position | Ending Position | CDS Region | Mutation Site | Classification | Detected HLA Type in Patients With Virus Mutation | Reported Correlated MHC Molecule |
---|---|---|---|---|---|---|---|
QLSLPVLQV | 15 | 23 | ORF1ab | S17R | iSNV | HLA-A*02:07 | HLA-A*02:01; |
SANNCTFEYVSQPFLMD | 162 | 178 | S | P174T | iSNV | HLA-DRB1*04 | HLA-DRA*01:01/DRB1*04:01; |
EYVSQPFLMDLEGKQGN | 169 | 185 | S | P174T | iSNV | HLA-DRB1*04 | HLA-DRA*01:01/DRB1*04:01; |
KNIDGYFKIY | 195 | 204 | S | G199V | iSNV | HLA-A*02:01 | HLA-A*02:01; HLA-B*15:01; |
EKGIYQTSNFRVQPTES | 309 | 325 | S | Q314L | iSNV | HLA-DRB1*04 | HLA-DRA*01:01/DRB1*04:01; |
RLFRKSNLK | 454 | 462 | S | L455S | iSNV | HLA-A*11:01 | HLA-A*03:01; HLA-A*11:01; |
QSINFVRIIMRLWLC | 116 | 130 | ORF3a | W128L | iSNV | HLA-DRB1*09:01; HLA-DRB1*04:05 | HLA-DRB1*01:01; HLA-DRB1*03:01; HLA-DRB1*04:01; HLA-DRB1*04:05; HLA-DRB1*07:01; HLA-DRB1*08:02; HLA-DRB1*09:01; HLA-DRB1*11:01; HLA-DRB1*12:01; HLA-DRB1*15:01; HLA-DRB4*01:01; HLA-DRB5*01:01; |
QFAYANRNRFLYIIK | 36 | 50 | M | L46F | iSNV | HLA-DRB1*03:01 | HLA-DRB1*01:01;HLA-DRB1*03:01;HLA-DRB1*04:01;HLA-DRB1*04:05;HLA-DRB1*07:01;HLA-DRB1*08:02;HLA-DRB1*09:01;HLA-DRB1*11:01;HLA-DRB1*12:01;HLA-DRB1*13:02;HLA-DRB1*15:01;HLA-DRB3*01:01;HLA-DRB3*02:02;HLA-DRB5*01:01; |
RGHLRIAGHHLGRCD | 146 | 160 | M | G147R | iSNV | HLA-DRB1*04:05 | HLA-DRB1*01:01;HLA-DRB1*04:01;HLA-DRB1*04:05;HLA-DRB1*07:01;HLA-DRB1*08:02;HLA-DRB1*09:01;HLA-DRB1*11:01;HLA-DRB1*13:02;HLA-DRB1*15:01;HLA-DRB3*01:01;HLA-DRB3*02:02;HLA-DRB4*01:01;HLA-DRB5*01:01;HLA-DRB1*03:01;HLA-DRB1*12:01;HLA-DPA1*01:03/DPB1*04:01;HLA-DPB1*02:01;HLA-DQA1*01:01/DQB1*05:01;HLA-DQA1*05:01/DQB1*03:01; |
TQDLFLPFFSNVTWF | 51 | 65 | S | L54F | SNP | HLA-DRB1*15:01; | HLA-DRB1*01:01; HLA-DRB1*04:01; HLA-DRB1*15:01; |
QYIKWPWYI | 1208 | 1216 | S | I1216T | SNP | HLA-A*24:02; | HLA-A*24:02; |
YDANYFLCW | 141 | 149 | ORF3a | Y141H | SNP | HLA-B*44 | HLA-B*44:02; |
CLVGLMWLSYFIASF | 86 | 100 | M | F100L | SNP | HLA-DRB1*01:01; HLA-DRB1*15:01; | HLA-DRB1*01:01; HLA-DRB1*04:05; HLA-DRB1*07:01; HLA-DRB1*09:01; HLA-DRB1*12:01; HLA-DRB1*15:01; |
SYFIASFRLF | 94 | 103 | M | F100L | SNP | HLA-A*24:02; | HLA-A*24:02; |
IMLIIFWFSL | 23 | 32 | ORF7b | S31L | SNP | HLA-A*02:01 | HLA-A*02:01; HLA-A*02:02; |
IIFWFSLEL | 26 | 34 | ORF7b | S31L | SNP | HLA-A*02:01 | HLA-A*02:01; |
Abbreviations: CDS, coding sequence; HLA, human leukocyte antigen; iSNV, intrahost single-nucleotide variant; MHC, major histocompatibility complex; SNP, single-nucleotide polymorphism.
Relationship Among Mutations, Virus Replication Activity, and the Local Host Response
We included fully/booster-vaccinated and unvaccinated participants before and 6 months after infections (Supplementary Figure 6). Serum samples were obtained before vaccination, 180 days after 2 doses of vaccination, and 180 days after the third dose. The total antibody level of the booster group on day 180 was significantly higher than that of the 2-dose vaccination group on day 180 and that of the unvaccinated participants (433.4 vs 12.01 and 0.60 binding antibody units [BAU]/mL, both P < .001). We also had fully vaccinated, partially vaccinated, and unvaccinated cases during this outbreak. At 6 months after breakthrough infections, the median antibody level was 2074, 1358.0, and 16.10 BAU/mL in the fully vaccinated, partially vaccinated, and unvaccinated groups, respectively. The antibody level of the fully vaccinated group was significantly higher than that of the unvaccinated group (2074 vs 16.10 BAU/mL, P < .001).
To evaluate the local host response and viral replication activity in the swabs, we used RNA sequencing to identify subgenomic RNA (sgRNA) as reads in which a viral gene sequence follows the common leader sequence. sgRNA indicates active SARS-CoV-2 replication because only actively replicating SARS-CoV-2 produces sgRNA (Figure 4A). Thus, the ratio of sgRNA to total genomic RNA is an indicator of viral replication activity [16–18]. Although the activity did not significantly differ between the unvaccinated and fully vaccinated groups, there was a tendency for decreased activity in the vaccinated group (Figure 4B). Among the different sgRNAs, the nucleocapsid sgRNA was present in the highest proportion in all 3 groups (P < .001). For each sgRNA, no significant differences were observed among the vaccination groups (Figure 4B and 4C).
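The replication activity indicator described above can be expressed as a short Python sketch. It assumes that reads have already been assigned to each sgRNA species and to genomic RNA; the function name, the input layout, and the read counts are hypothetical.

```python
def replication_activity(sgrna_reads, genomic_reads):
    """Return (sgRNA-to-genomic ratio, proportion of each sgRNA species).

    sgrna_reads   -- mapping of sgRNA species (e.g. 'N', 'S') -> read count
    genomic_reads -- number of reads assigned to total genomic RNA
    """
    if genomic_reads <= 0:
        raise ValueError("genomic_reads must be positive")
    total_sgrna = sum(sgrna_reads.values())
    ratio = total_sgrna / genomic_reads
    # Share of each sgRNA species among all sgRNA reads.
    proportions = {
        species: (count / total_sgrna if total_sgrna else 0.0)
        for species, count in sgrna_reads.items()
    }
    return ratio, proportions


# Hypothetical counts for one sample.
ratio, proportions = replication_activity({"N": 60, "S": 20, "M": 20}, 1000)
```

In this hypothetical sample the sgRNA-to-genomic ratio is 0.1 and the nucleocapsid sgRNA accounts for 60% of sgRNA reads.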
Conversely, as vaccine-induced immunity prevents the development of severe disease, the comparative transcriptome analyses of the host genome revealed that the B-cell activation, T-cell activation, and differentiation responses were significantly increased in the fully vaccinated group (Figure 4D). Several immune response genes were correlated with the total number of iSNVs or the number of C > U/G > A substitutions, including genes involved in the innate immune response (eg, IFNG, IL1β, IL10, IFNA8, IFNA17, IFIT2, IFIT3, CCL4, CXCL10, TREM2) [19–21]. It is worth noting that there was a significant positive correlation between IL10 and C > U/G > A substitutions (P = .049). Additionally, several T-cell activation–related genes were upregulated in samples with more abundant iSNVs or C > U/G > A substitutions (CD8A, CD3D, CD274, DPP4, CD86, HLA-DRA, IL2RA). Subsets of genes related to epithelial cell injury were more highly expressed in samples with more iSNVs or C > U/G > A substitutions (S100A7, LCE3C, SPRR2D, SPINK, CALM5, EMP1). Together, these data indicated that iSNVs and C > U/G > A substitutions were more numerous in patients with more severe immunopathology.
DISCUSSION
In this study, we delineated 2 pathways through which novel mutations arise: selection within the human population during community spread and intrahost variation in chronic infection. The emergence of transmissible variants in communities has also been reported in other studies [22]. Although the sampling of SARS-CoV-2 was randomized in the Omicron wave, we observed the process of minor allele sites becoming fixed SNPs in 4 mutations. These 4 sites were detected in samples from unvaccinated and vaccinated patients, indicating that the selection could stem from naive natural infection or breakthrough infection. Although purifying selection limited iSNV transmission and fixation [8, 9, 23], immune pressure may drive the occurrence of these mutations.
Investigations have documented prolonged infection during an immunosuppressed disease state [24]; in such cases, treatment with convalescent plasma therapy or antiviral medication can lead to variable and potentially even widespread immune escape. As virus replication lasts for a longer time in patients who are immunocompromised, the intrahost viral evolution may become more apparent. Studies reported that rebound infections occur in patients with underlying hematologic diseases [25]. Moreover, 2% to 4% of rebound infections occurred after Paxlovid treatment in Omicron sublineages [26]. Closely monitoring immune escape variants via iSNVs would be beneficial when designing immunogens and developing neutralizing antibodies to prevent such variants and treat breakthrough infections. Variants generated from chronic infections may result in forward transmission to others [27], potentially leading to community spread. These findings underscore the need for prompt prevention and biosafety measures to control the potential spread of chronic infections into the community.
Our study is the first to validate human iSNVs and SNPs using serum from vaccinated participants and participants with BA.2 breakthrough infection. Although the GMT in the breakthrough group decreased, the positive neutralization rate remained 100.00%, indicating that the Omicron variant could elicit neutralizing antibodies cross-reactive to the prototype virus and be adopted for future vaccine development. With respect to the T-cell immune response, we identified several patients having HLA-associated iSNV or SNP mutations within a linear epitope, similar to the HLA-associated selection pressure reported in other studies [10, 28, 29].
Vaccinations have been used to build immunity and protect against SARS-CoV-2 infection, and several previous studies focused on the effect of the vaccine on infection and the prevention of severe infection [30–32]. For the first time, our results showed that vaccination could lead to fewer iSNVs and hypermutations. This observation may result from diminished viral replication and transcription in the vaccinees [33]. Additionally, 2 or 3 doses of vaccination significantly shortened the virus clearance time [33], thereby reducing the probability of variants emerging. Our study found that vaccination could boost humoral responses by inducing higher antibody levels and increasing the T cell–mediated immune response, which is in agreement with previous studies [34–36].
The enrichment of C > U/G > A hypermutations in unvaccinated patients highlighted host-mediated editing of the viral genome and was associated with high transcription and replication activities. Apolipoprotein B mRNA-editing enzyme catalytic polypeptide (APOBEC) commonly mediates C > U/G > A hypermutations by deaminating cytosines in viral RNA. SARS-CoV-2 evolution may have been driven by RNA editing to adapt to the human host. APOBEC3 is an interferon-stimulated gene and has been reported to inhibit viral replication [37–39]. The strong relationship between the number of iSNVs/hypermutations and the innate response genes was consistent with the high expression of several cytokines. These host responses were also consistent with earlier observations, including an interferon-induced antiviral response in APOBEC3 editing in patients with severe infection [40]. As part of adaptive immunity [41], APOBEC3-mediated mutagenesis could produce truncated viral proteins degraded by the cell's proteolytic processing pathways, thereby increasing antigen presentation and major histocompatibility complex class I binding.
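To make the C > U/G > A signature discussed above concrete, the following minimal sketch tallies a substitution spectrum from a list of iSNVs and reports the share falling in the C > U or G > A classes. The input format (reference base, alternate base) and the function names are assumptions for illustration; T in DNA-style variant calls is treated as U.

```python
from collections import Counter
from typing import Iterable, Tuple

VALID_BASES = {"A", "C", "G", "U"}


def normalize(base: str) -> str:
    """Upper-case a base and write T as U, since the genome is RNA."""
    return base.upper().replace("T", "U")


def substitution_spectrum(isnvs: Iterable[Tuple[str, str]]) -> Counter:
    """Count single-nucleotide substitutions by class, e.g. 'C>U'.

    Entries that are not single-base substitutions (indels, ambiguous
    bases, ref equal to alt) are skipped.
    """
    spectrum: Counter = Counter()
    for ref, alt in isnvs:
        r, a = normalize(ref), normalize(alt)
        if r not in VALID_BASES or a not in VALID_BASES or r == a:
            continue
        spectrum[f"{r}>{a}"] += 1
    return spectrum


def c_to_u_g_to_a_fraction(isnvs: Iterable[Tuple[str, str]]) -> float:
    """Return the fraction of substitutions that are C>U or G>A."""
    spectrum = substitution_spectrum(isnvs)
    total = sum(spectrum.values())
    if total == 0:
        return 0.0
    return (spectrum["C>U"] + spectrum["G>A"]) / total


if __name__ == "__main__":
    # Hypothetical iSNV calls for one sample as (reference, alternate) pairs.
    calls = [("C", "T"), ("G", "A"), ("A", "G"), ("c", "u")]
    print(substitution_spectrum(calls))
    print(c_to_u_g_to_a_fraction(calls))
```

A high fraction is only suggestive of deaminase-mediated editing; attributing it to APOBEC would additionally require sequence-context analysis, which is outside this sketch.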
Mutations in the spike protein can affect replication and pathogenic fitness in addition to immune evasion capacity. Several studies have shown that mutations in the spike protein change immune evasion capacity together with replication fitness [42, 43]. In transmissibility, Delta outcompeted Omicron in the absence of selection pressure, but this situation was altered when immune selection pressure was involved [44]. In this study, we identified that the spike mutations G446S, L455S, and S939F had stronger immune evasion phenotypes; however, replication and pathogenicity assays were not performed and could be completed in future work.
Several limitations should be noted in our study. The limited sample size might not fully reflect immune selection during this massive outbreak, and more sequences from distinct scenarios, as well as sequential samples, could be studied in the future. The number of samples analyzed by RNA sequencing was small because we used strict selection criteria. Although most patients received reverse transcriptase PCR tests during the early stage of the disease as a nonpharmaceutical public health intervention in Shanghai's BA.2.2 wave, the day of sampling relative to disease onset remained unclear, which may have influenced the interpretation of within-host dynamics.
In conclusion, our analysis revealed a unique pattern of intrahost variation that was significantly affected by vaccination. These mutations may lead to more severe immune escape or the emergence of new variants of concern. Intrahost genetic variation in SARS-CoV-2 showed an association between vaccine-induced immunity and reduced within-host virus diversity, as well as milder inflammatory pathophysiology. Continuous surveillance of SARS-CoV-2 variants should be implemented following the adjustment of prevention and control policies.
Supplementary Data
Supplementary materials are available at The Journal of Infectious Diseases online (http://jid.oxfordjournals.org/). Supplementary materials consist of data provided by the author that are published to benefit the reader. The posted materials are not copyedited. The contents of all supplementary data are the sole responsibility of the authors. Questions or messages regarding errors should be addressed to the author.
Notes
Author contributions. Y. Zhang, Y. Zhou, J. C., and J. W. performed PCR and sequencing. X. W. and P. W. performed pseudovirus neutralization tests. Y. Zhang, Y. Zhou, J. C., J. W., Y. Zhang, S. W., P. C., Z. S., Y. X., and T. X. collected RNA samples. Y. L., Q. Z., J. C., and H. Z. collected clinical data. W. Z., C. Q., N. J., and J. A. supervised this study. Y. Zhang and C. Q. drafted the manuscript. All authors have read the manuscript.
Financial support. This work was supported by the National Natural Science Foundation of China (82041010, 92169212, 82161138018); Shanghai Science and Technology Committee-Shanghai Municipal Science and Technology Major Project (HS2021SHZX001); Shanghai Science and Technology Committee (21NL2600100, 20dz2260100); and National Science and Technology Major Project (2023YFC3043500).
References
COVID-19 dashboard by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU). Accessed 8 August 2022. Available at: https://www.arcgis.com/apps/opsdashboard/index.html#bda7594740fd40299423467b48e9ecf6
Author notes
Y. Zhang, Y. Zhou, J. C., J. W., and X. W. contributed equally to the study.
J. A., N. J., C. Q., and W. Z. contributed equally to the study.
Potential conflicts of interest. All authors: No reported conflicts.
All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest.