Venky Soundararajan

Cambridge, Massachusetts, United States
14K followers 500+ connections

About

Building the World's largest biomedical augmented Intelligence (aI) software platform for…

Experience & Education

  • nference

Publications

  • Durability analysis of the highly effective BNT162b2 vaccine against COVID-19

    Proceedings of the National Academy of Sciences (PNAS) Nexus

    SARS-CoV-2 breakthrough infections have been increasingly reported in fully vaccinated individuals. We conducted a test-negative case-control study to assess the durability of protection after full vaccination with BNT162b2, defined as 14 days after the second dose, against polymerase chain reaction (PCR)-confirmed symptomatic SARS-CoV-2 infection, in a national medical practice between February 1, 2021 and August 22, 2021. We fit conditional logistic regression (CLR) models stratified on residential county and calendar time of testing to assess the association between time elapsed since vaccination and the odds of symptomatic infection or non-COVID-19 hospitalization (negative control), adjusted for several covariates. The primary population included 652 individuals who had a positive symptomatic test after full vaccination with BNT162b2 (cases) and 5,946 individuals with at least one negative symptomatic test after full vaccination (controls). The adjusted odds of symptomatic infection were higher 120 days after full vaccination versus at the date of full vaccination (Odds Ratio [OR]: 3.21, 95% confidence interval [CI]: 1.33-7.74). Importantly, the odds of infection were still lower 150 days after the first BNT162b2 dose as compared to 4 days after the first dose (OR: 0.3, 95% CI: 0.19-0.45), when immune protection approximates the unvaccinated status. Low rates of COVID-19 associated hospitalization or death in this cohort precluded analyses of these severe outcomes. The odds of experiencing a non-COVID-19 hospitalization decreased with time since vaccination, suggesting a possible underestimation of waning protection by this approach due to confounding factors. Taken together, these data constitute an early signal for waning protection against symptomatic illness while also providing reassurance that BNT162b2 continues to protect against symptomatic SARS-CoV-2 infection several months after full vaccination.

  • Surveillance of Safety of 3 Doses of COVID-19 mRNA Vaccination Using Electronic Health Records

    Journal of American Medical Association (JAMA) Network Open

    Results  Among 47 999 individuals who received 3-dose COVID-19 mRNA vaccines, 38 094 individuals (21 835 [57.3%] women; median [IQR] age, 67.4 [52.5-76.5] years) received BNT162b2 (79.4%) and 9905 individuals (5099 [51.5%] women; median [IQR] age, 67.7 [59.5-73.9] years) received mRNA-1273 (20.6%). Reporting of severe adverse events remained low after the third vaccine dose, with rates of pericarditis (0.01%; 95% CI, 0%-0.02%), anaphylaxis (0%; 95% CI, 0%-0.01%), myocarditis (0%; 95% CI, 0%-0.01%), and cerebral venous sinus thrombosis (no individuals) consistent with results from earlier studies. Significantly more individuals reported low-severity adverse events after the third dose compared with after the second dose, including fatigue (2360 individuals [4.92%] vs 1665 individuals [3.47%]; P < .001), lymphadenopathy (1387 individuals [2.89%] vs 995 individuals [2.07%]; P < .001), nausea (1259 individuals [2.62%] vs 979 individuals [2.04%]; P < .001), headache (1185 individuals [2.47%] vs 992 individuals [2.07%]; P < .001), arthralgia (1019 individuals [2.12%] vs 816 individuals [1.70%]; P < .001), myalgia (956 individuals [1.99%] vs 784 individuals [1.63%]; P < .001), diarrhea (817 individuals [1.70%] vs 595 individuals [1.24%]; P < .001), fever (533 individuals [1.11%] vs 391 individuals [0.81%]; P < .001), vomiting (528 individuals [1.10%] vs 385 individuals [0.80%]; P < .001), and chills (224 individuals [0.47%] vs 175 individuals [0.36%]; P = .01).

    Conclusions and Relevance  This study found that although third-dose vaccination against SARS-CoV-2 infection was associated with increased reporting of low-severity adverse events, risk of severe adverse events remained comparable with risk associated with the standard 2-dose regime. These findings suggest the safety of third vaccination doses in individuals who were eligible for booster vaccination at the time of this study.

  • Genetic alteration of human MYH6 is mimicked by SARS-CoV-2 polyprotein: mapping viral variants of cardiac interest

    Cell Death Discovery (Nature publishing group)

    Acute cardiac injury has been observed in a subset of COVID-19 patients, but the molecular basis for this clinical phenotype is unknown. It has been hypothesized that molecular mimicry may play a role in triggering an autoimmune inflammatory reaction in some individuals after SARS-CoV-2 infection. Here we investigate if linear peptides contained in proteins that are primarily expressed in the heart also occur in the SARS-CoV-2 proteome. Specifically, we compared the library of 136,704 8-mer peptides from 144 human proteins (including splicing variants) to 9,926 8-mers from all 17 viral proteins in the reference SARS-CoV-2 proteome. No 8-mers were exactly identical between the reference human proteome and the reference SARS-CoV-2 proteome. However, there were 45 8-mers that differed by only one amino acid when compared to the reference SARS-CoV-2 proteome. Interestingly, analysis of protein-coding mutations from 141,456 individuals showed that one of these 8-mers from the SARS-CoV-2 Replicase polyprotein 1a/1ab (KIALKGGK) is identical to a MYH6 peptide encoded by the c.5410C>A (Q1804K) genetic variation, which has been observed at low prevalence in Africans/African Americans (0.08%), East Asians (0.3%), South Asians (0.06%) and Latino/Admixed Americans (0.003%). Furthermore, analysis of 4.85 million SARS-CoV-2 genomes from over 200 countries shows that viral evolution has already resulted in 20 additional 8-mer peptides that are identical to human heart-enriched proteins encoded by reference sequences or genetic variants. Whether such mimicry contributes to cardiac inflammation during or after COVID-19 illness warrants further experimental evaluation. We suggest that SARS-CoV-2 variants harboring peptides identical to human cardiac proteins should be investigated as ‘viral variants of cardiac interest’.
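    The 8-mer comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`kmers`, `hamming`, `compare_kmers`) are hypothetical, the brute-force pairwise scan is an assumption chosen for clarity, and the two input strings are made-up toy sequences rather than MYH6 or SARS-CoV-2 sequences.

```python
from typing import Iterable


def kmers(seq: str, k: int = 8) -> set[str]:
    """Return every length-k substring (k-mer) of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def compare_kmers(human_seqs: Iterable[str], viral_seqs: Iterable[str], k: int = 8):
    """Find k-mers shared exactly, and pairs that differ by one residue."""
    human = set().union(*(kmers(s, k) for s in human_seqs))
    viral = set().union(*(kmers(s, k) for s in viral_seqs))
    exact = human & viral
    # Brute-force scan; acceptable for small libraries, slow for large ones.
    near = {(h, v) for h in human for v in viral if hamming(h, v) == 1}
    return exact, near


# Toy sequences (hypothetical, for illustration only).
toy_human = ["ACDEFGHIKL"]
toy_viral = ["ACDEFGHLKL"]
exact, near = compare_kmers(toy_human, toy_viral)
print(len(exact), "exact matches;", len(near), "one-mismatch pairs")
```

    For libraries on the scale reported in the abstract, an indexed approach (for example, hashing each 8-mer with one position masked) would likely be preferable to the quadratic scan used here.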

  • Genomic diversification of long polynucleotide fragments is a signature of emerging SARS-CoV-2 variants of concern

    PNAS Nexus

    With over 5 million SARS-CoV-2 genomes sequenced globally over the last 2 years, there is unprecedented data to decipher how competitive viral evolution results in the emergence of fitter SARS-CoV-2 variants. Much attention has been directed to studying how specific mutations in the Spike protein impact its binding to the ACE2 receptor or viral neutralization by antibodies, but there is limited knowledge of genomic signatures shared primarily by dominant variants. Here we introduce a methodology to quantify the genome-wide distinctiveness of polynucleotide fragments of various lengths (3- to 240-mers) that constitute SARS-CoV-2 lineage genomes. Compared to standard phylogenetic distance metrics and overall mutational load, the quantification of distinctive 9-mer polynucleotides provides a higher resolution of separation between variants of concern (Reference = 89, IQR: 65-108; Alpha = 166, IQR: 150-182; Beta = 130, IQR: 113-147; Gamma = 165, IQR: 152-180; Delta = 234, IQR: 216-253; and Omicron = 294, IQR: 287-315). The similar scoring of the Alpha and Gamma variants by our methodology is consistent with these strains emerging at approximately the same time and circulating in distinct geographical regions as dominant strains. Furthermore, evaluation of genomic distinctiveness for 1,363 lineages annotated in GISAID highlights that polynucleotide diversity has increased over time (R² = 0.37) and that VOCs show high distinctiveness compared to non-VOC contemporary lineages. To facilitate similar real-time assessments on the competitive fitness potential of future variants, we are launching a freely accessible resource for infusing pandemic preparedness with genomic inference (“GENI” – https://academia.nferx.com/GENI). This study demonstrates the value of characterizing new SARS-CoV-2 variants by their genome-wide polynucleotide distinctiveness and emphasizes the need to go beyond a narrow set of mutations at known functionally salient sites.

  • Comparative effectiveness of mRNA-1273 and BNT162b2 against symptomatic SARS-CoV-2 infection

    Med (Cell Press)

    Background:
    mRNA COVID-19 vaccines are safe and effective, but increasing reports of breakthrough infections highlight the need to vigilantly monitor and compare the effectiveness of these vaccines.

    Methods:
    We retrospectively compared protection against symptomatic infection conferred by mRNA-1273 and BNT162b2 at Mayo Clinic sites from December 2020 to September 2021. We used a test-negative case-control design to estimate vaccine effectiveness (VE) and to compare the odds of symptomatic infection after full vaccination with mRNA-1273 versus BNT162b2, while adjusting for age, sex, race, ethnicity, geography, comorbidities, and calendar time of vaccination and testing.

    Findings:
    Both vaccines were highly effective over the study duration (VE for mRNA-1273: 84.1%, 95% CI: 81.6-86.2%; VE for BNT162b2: 75.6%, 95% CI: 72.2-78.7%), but their effectiveness was reduced during July-September (VE for mRNA-1273: 75.6%, 95% CI: 70.1-80%; VE for BNT162b2: 63.5%, 95% CI: 55.8-69.9%) as compared to December-May (VE for mRNA-1273: 93.7%, 95% CI: 90.4-95.9%; VE for BNT162b2: 85.7%, 95% CI: 81.4-88.9%). Adjusted for demographic characteristics, clinical comorbidities, time of vaccination, and time of testing, the odds of experiencing a symptomatic breakthrough infection were lower after full vaccination with mRNA-1273 than with BNT162b2 (Odds Ratio: 0.60, 95% CI: 0.55-0.67).

    Conclusions:
    Both mRNA-1273 and BNT162b2 strongly protect against symptomatic SARS-CoV-2 infection. It is imperative to continue monitoring and comparing available vaccines over time and with respect to emerging variants to inform public and global health decisions.

  • Omicron variant of SARS-CoV-2 harbors a unique insertion mutation of putative viral or human genomic origin

    OSF Preprints (Peer-review ongoing)

    The emergence of a heavily mutated SARS-CoV-2 variant (B.1.1.529, Omicron) and its spread to 6 continents within a week of initial discovery has set off a global public health alarm. Characterizing the mutational profile of Omicron is necessary to interpret its shared or distinctive clinical phenotypes with other SARS-CoV-2 variants. We compared the mutations of Omicron with prior variants of concern (Alpha, Beta, Gamma, Delta), variants of interest (Lambda, Mu, Eta, Iota and Kappa), and all 1523 SARS-CoV-2 lineages constituting 5.4 million SARS-CoV-2 genomes. Omicron’s Spike protein has 26 amino acid mutations (23 substitutions, two deletions and one insertion) that are distinct compared to other variants of concern. Whereas the substitution and deletion mutations have appeared in previous SARS-CoV-2 lineages, the insertion mutation (ins214EPE) has not been previously observed in any SARS-CoV-2 lineage other than Omicron. The nucleotide sequence encoding for ins214EPE could have been acquired by template switching involving the genomes of other viruses that infect the same host cells as SARS-CoV-2 or the human transcriptome of host cells infected with SARS-CoV-2. For instance, given recent clinical reports of co-infections in COVID-19 patients with seasonal coronaviruses (e.g. HCoV-229E), single cell RNA-sequencing data showing co-expression of the SARS-CoV-2 and HCoV-229E entry receptors (ACE2 and ANPEP) in respiratory and gastrointestinal cells, and HCoV genomes harboring sequences homologous to the nucleotide sequence that encodes ins214EPE, it is plausible that the Omicron insertion could have evolved in a co-infected individual. There is a need to understand the function of the Omicron insertion and whether human host cells are being exploited by SARS-CoV-2 as an ‘evolutionary sandbox’ for host-virus and inter-viral genomic interplay.

  • Analysis of Effectiveness of Ad26.COV2.S Adenoviral Vector Vaccine for COVID-19

    JAMA Network Open

    Importance - Continuous assessment of the effectiveness and safety of the US Food and Drug Administration–authorized SARS-CoV-2 vaccines is critical to amplify transparency, build public trust, and ultimately improve overall health outcomes.

    Objective - To evaluate the effectiveness of the Johnson & Johnson Ad26.COV2.S vaccine for preventing SARS-CoV-2 infection.

    Design, Setting, and Participants - This comparative effectiveness research study used large-scale longitudinal curation of electronic health records from the multistate Mayo Clinic Health System (Minnesota, Arizona, Florida, Wisconsin, and Iowa) to identify vaccinated and unvaccinated adults between February 27 and July 22, 2021. The unvaccinated cohort was matched on a propensity score derived from age, sex, zip code, race, ethnicity, and previous number of SARS-CoV-2 polymerase chain reaction tests. The final study cohort consisted of 8889 patients in the vaccinated group and 88 898 unvaccinated matched patients.

    Exposure - Single dose of the Ad26.COV2.S vaccine.

    Main Outcomes and Measures - The incidence rate ratio of SARS-CoV-2 infection in the vaccinated vs unvaccinated control cohorts, measured by SARS-CoV-2 polymerase chain reaction testing.

    Results - The study was composed of 8889 vaccinated patients (4491 men [50.5%]; mean [SD] age, 52.4 [16.9] years) and 88 898 unvaccinated patients (44 748 men [50.3%]; mean [SD] age, 51.7 [16.7] years). The incidence rate ratio of SARS-CoV-2 infection in the vaccinated vs unvaccinated control cohorts was 0.26 (95% CI, 0.20-0.34) (60 of 8889 vaccinated patients vs 2236 of 88 898 unvaccinated individuals), which corresponds to an effectiveness of 73.6% (95% CI, 65.9%-79.9%) and a 3.73-fold reduction in SARS-CoV-2 infections.
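    The relationship between the incidence rate ratio (IRR) and the effectiveness figure quoted above can be sketched in a few lines of Python. This assumes the common definition VE = (1 − IRR) × 100; the function names are hypothetical. Because the inputs here are the rounded values from the abstract, the outputs only approximate the reported 73.6% (95% CI, 65.9%-79.9%) and 3.73-fold reduction, which presumably come from unrounded estimates.

```python
def vaccine_effectiveness(irr: float) -> float:
    """Vaccine effectiveness in percent, assuming VE = (1 - IRR) * 100."""
    return (1.0 - irr) * 100.0


def fold_reduction(irr: float) -> float:
    """Fold reduction in infection rate implied by an IRR (its reciprocal)."""
    return 1.0 / irr


# Rounded IRR and confidence bounds as quoted in the abstract.
irr, irr_low, irr_high = 0.26, 0.20, 0.34

ve = vaccine_effectiveness(irr)
# A higher IRR means lower effectiveness, so the CI bounds swap.
ve_low = vaccine_effectiveness(irr_high)
ve_high = vaccine_effectiveness(irr_low)

print(f"VE ~ {ve:.1f}% (95% CI ~ {ve_low:.1f}%-{ve_high:.1f}%)")
print(f"Fold reduction ~ {fold_reduction(irr):.2f}")
```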

  • High diversity in Delta variant across countries revealed via genome-wide analysis of SARS-CoV-2 beyond the Spike protein

    bioRxiv (Peer-review ongoing)

    The highly contagious Delta variant of SARS-CoV-2 has emerged as the new dominant global strain, and reports of reduced effectiveness of COVID-19 vaccines against the Delta variant are highly concerning. While there has been extensive focus on understanding the amino acid mutations in the Delta variant’s Spike protein, the mutational landscape of the rest of the SARS-CoV-2 proteome (25 proteins) remains poorly understood. To this end, we performed a systematic analysis of mutations in all the SARS-CoV-2 proteins from nearly 2 million SARS-CoV-2 genomes from 176 countries/territories. Six highly-prevalent missense mutations in the viral life cycle-associated Membrane (I82T), Nucleocapsid (R203M, D377Y), NS3 (S26L), and NS7a (V82A, T120I) proteins are almost exclusive to the Delta variant compared to other variants of concern (mean prevalence across genomes: Delta = 99.74%, Alpha = 0.06%, Beta = 0.09%, Gamma = 0.22%). Furthermore, we find that the Delta variant harbors a more diverse repertoire of mutations across countries compared to the previously dominant Alpha variant (cosine similarity: mean for Alpha = 0.94, S.D. for Alpha = 0.05; mean for Delta = 0.86, S.D. for Delta = 0.1; Cohen’s d for Alpha vs. Delta = 1.17, p-value < 0.001). Overall, our study underscores the high diversity of the Delta variant between countries and identifies a list of targetable amino acid mutations in the Delta variant’s proteome for probing the mechanistic basis of pathogenic features such as high viral loads, high transmissibility, and reduced susceptibility against neutralization by vaccines.
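    As a rough illustration of the cross-country cosine similarity mentioned above, the sketch below compares per-country mutation-prevalence vectors pairwise and averages the result. How the study actually constructed its vectors is not stated here, so the representation (one prevalence value per mutation), the function names, and the toy numbers are all assumptions.

```python
import math
from itertools import combinations
from statistics import mean


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(profiles: dict[str, list[float]]) -> float:
    """Average cosine similarity over all unordered pairs of countries."""
    pairs = combinations(profiles.values(), 2)
    return mean(cosine_similarity(u, v) for u, v in pairs)


# Hypothetical prevalence of three mutations in three countries.
toy_profiles = {
    "country_A": [0.90, 0.10, 0.00],
    "country_B": [0.80, 0.20, 0.10],
    "country_C": [0.20, 0.70, 0.60],
}
print(round(mean_pairwise_similarity(toy_profiles), 3))
```

    A lower mean pairwise similarity indicates that the mutation repertoire differs more from country to country, which is the sense in which the abstract calls Delta more diverse than Alpha.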

  • Counties with lower insurance coverage are associated with both slower vaccine rollout and higher COVID-19 incidence across the United States

    Vaccines

    Equitable vaccination distribution is a priority for outcompeting the transmission of COVID-19. Here, the impact of demographic, socioeconomic, and environmental factors on county-level vaccination rates and COVID-19 incidence changes is assessed. In particular, using data from 3142 US counties with over 328 million individuals, correlations were computed between cumulative vaccination rate and change in COVID-19 incidence from 1 December 2020 to 6 June 2021, with 44 different demographic, environmental, and socioeconomic factors. This correlation analysis was also performed using multivariate linear regression to adjust for age as a potential confounding variable. These correlation analyses demonstrated that counties with high levels of uninsured individuals have significantly lower COVID-19 vaccination rates (Spearman correlation: −0.460, p-value: <0.001). In addition, severe housing problems and high housing costs were strongly correlated with increased COVID-19 incidence (Spearman correlations: 0.335, 0.314, p-values: <0.001, <0.001). This study shows that socioeconomic factors are strongly correlated to both COVID-19 vaccination rates and incidence rates, underscoring the need to improve COVID-19 vaccination campaigns in marginalized communities.
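    The Spearman correlations reported above are rank correlations: each variable is converted to ranks, and the Pearson correlation of the ranks is taken. The standard-library sketch below shows that computation on made-up county values; the numbers and function names are hypothetical and do not come from the study, and no p-value is computed.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical county data: percent uninsured vs. percent vaccinated.
uninsured = [5, 10, 15, 20, 25]
vaccinated = [70, 65, 66, 50, 40]
rho = spearman(uninsured, vaccinated)
print(f"Spearman correlation: {rho:.2f}")
```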

  • Real-time analysis of a mass vaccination effort confirms the safety of FDA-authorized mRNA COVID-19 vaccines

    Med (Cell Press)

    Highlights
    Emergent clinical visits are not increased after receiving BNT162b2 or mRNA-1273
    Side effect reports are rare in EHR notes compared to clinical trials and V-safe
    Myalgia and arthralgia are increased after vaccination with BNT162b2 and mRNA-1273
    Severe adverse effects are rare among individuals receiving BNT162b2 or mRNA-1273

    Context and significance
    This is a study of the mRNA COVID-19 vaccines developed by Pfizer/BioNTech and Moderna. Although these vaccines have been shown to be safe and tolerated in clinical trials, it is important to confirm their safety profiles in practice. The results from this study show that individuals receiving these vaccines are likely to experience muscle and joint soreness, but they are not more likely to seek out emergent clinical care or experience severe medical events than unvaccinated individuals. As one of the largest real-world safety studies of COVID-19 vaccines to date, these data reinforce that we should continue expanding efforts to deliver more vaccines with high confidence in their safety.

    Methods:
    In this retrospective study, we deployed deep neural networks over a large EHR system to automatically curate the adverse effects mentioned by physicians in over 1.2 million clinical notes between December 1, 2020 and April 20, 2021. We compared notes from 68,266 individuals who received at least one dose of BNT162b2 (n = 51,795) or mRNA-1273 (n = 16,471) to notes from 68,266 unvaccinated individuals who were matched by demographic, geographic, and clinical features.

    Conclusions:
    This analysis of vaccine-related adverse effects from over 1.2 million EHR notes of more than 130,000 individuals reaffirms the safety and tolerability of the FDA-authorized mRNA COVID-19 vaccines in practice.

  • Diversity of Coronavirus Receptors

    Preprints.org

    Several recent surges in COVID-19 cases due to newly emerging variant strains of SARS-CoV-2 with greater transmissibility have highlighted the virus’s capability to directly modulate spike-ACE2 interactions and promote immune evasion by sterically masking the immunogenic epitopes. Recently, there have also been reports of the bidirectional transfer of coronavirus between different animal species and humans. The ability of coronavirus to infect and adapt to a wide range of hosts can be attributed to new variants that modify the molecular recognition profile of the spike protein (S protein). The receptor-binding domain of the spike protein specifically interacts with key host receptor molecules present on the host cell membranes to gain entry into the host and begin the infection cycle. In this review, we discuss the molecular, structural, and functional diversity associated with the coronavirus receptors across their different phylogenetic lineages and its relevance to various symptomatology in the rapid human-to-human infection in COVID-19 patients, tropism, and zoonosis. Despite this seeming diversity of host receptors, there may be some common underlying mechanisms that influence the host range, virus transmissibility, and pathogenicity. Understanding these mechanisms may be crucial not only in controlling the ongoing pandemic but also in stopping the resurgence of such virus threats in the future.

  • Mapping each pre-existing condition’s association to short-term and long-term COVID-19 complications

    Nature Digital Medicine

    Understanding the relationships between pre-existing conditions and complications of COVID-19 infection is critical to identifying which patients will develop severe disease. Here, we leverage ~1.1 million clinical notes from 1803 hospitalized COVID-19 patients and deep neural network models to characterize associations between 21 pre-existing conditions and the development of 20 complications (e.g. respiratory, cardiovascular, renal, and hematologic) of COVID-19 infection throughout the course of infection (i.e. 0–30 days, 31–60 days, and 61–90 days). Pleural effusion was the most frequent complication of early COVID-19 infection (89/1803 patients, 4.9%) followed by cardiac arrhythmia (45/1803 patients, 2.5%). Notably, hypertension was the most significant risk factor associated with 10 different complications including acute respiratory distress syndrome, cardiac arrhythmia, and anemia. The onset of new complications after 30 days is rare and most commonly involves pleural effusion (31–60 days: 11 patients, 61–90 days: 9 patients). Lastly, comparing the rates of complications with a propensity-matched COVID-negative hospitalized population confirmed the importance of hypertension as a risk factor for early-onset complications. Overall, the associations between pre-COVID conditions and COVID-associated complications presented here may form the basis for the development of risk assessment scores to guide clinical care pathways.

  • Antigenic minimalism of SARS-CoV-2 is linked to surges in COVID-19 community transmission and vaccine breakthrough infections

    medRxiv (Peer-review Ongoing)

    The raging COVID-19 pandemic in India and reports of “vaccine breakthrough infections” globally have raised alarm mandating the characterization of the immuno-evasive features of SARS-CoV-2. Here, we systematically analyzed 1.57 million SARS-CoV-2 genomes from 187 countries/territories and performed whole-genome viral sequencing from 53 COVID-19 patients, including 20 vaccine breakthrough infections. We identified 89 Spike protein mutations that increased in prevalence during at least one surge in SARS-CoV-2 test positivity in any country over a three-month window. Deletions in the Spike protein N-terminal domain (NTD) are highly enriched for these ‘surge-associated mutations’ (Odds Ratio = 41.8, 95% CI: 6.36-1758, p-value = 7.7e-05). In the recent COVID-19 surge in India, an NTD deletion (ΔF157/R158) increased over 10-fold in prevalence from February 2021 (1.1%) to April 2021 (15%). During the recent surge in Chile, an NTD deletion (Δ246-253) increased rapidly over 30-fold in prevalence from January 2021 (0.86%) to April 2021 (33%). Strikingly, these simultaneously emerging deletions associated with surges in different parts of the world both occur at an antigenic supersite that is targeted by neutralizing antibodies. Finally, we generated clinically annotated SARS-CoV-2 whole genome sequences and identified deletions within this NTD antigenic supersite in a patient with vaccine breakthrough infection (Δ156-164) and other deletions from unvaccinated severe COVID-19 patients that could represent emerging deletion-prone regions. Overall, the expanding repertoire of Spike protein deletions throughout the pandemic and their association with case surges and vaccine breakthrough infections point to antigenic minimalism as an emerging evolutionary strategy for SARS-CoV-2 to evade immune responses.

  • COVID-19 vaccines dampen genomic diversity of SARS-CoV-2: Unvaccinated patients exhibit more antigenic mutational variance

    medRxiv (Peer-review Ongoing)

    Variants of SARS-CoV-2 are evolving under a combination of immune selective pressure in infected hosts and natural genetic drift, raising a global alarm regarding the durability of COVID-19 vaccines. Here, we conducted longitudinal analysis over 1.8 million SARS-CoV-2 genomes from 183 countries or territories to capture vaccination-associated viral evolutionary patterns. To augment this macroscale analysis, we performed viral genome sequencing in 23 vaccine breakthrough COVID-19 patients and 30 unvaccinated COVID-19 patients for whom we also conducted machine-augmented curation of the electronic health records (EHRs). Strikingly, we find the diversity of the SARS-CoV-2 lineages is declining at the country-level with increased rate of mass vaccination (n = 25 countries, mean correlation coefficient = −0.72, S.D. = 0.20). Given that the COVID-19 vaccines leverage B-cell and T-cell epitopes, analysis of mutation rates shows neutralizing B-cell epitopes to be particularly more mutated than comparable amino acid clusters (4.3-fold, p < 0.001). Prospective validation of these macroscale evolutionary patterns using clinically annotated SARS-CoV-2 whole genome sequences confirms that vaccine breakthrough patients indeed harbor viruses with significantly lower diversity in known B cell epitopes compared to unvaccinated COVID-19 patients (2.3-fold, 95% C.I. 1.4-3.7). Incidentally, in these study cohorts, vaccinated breakthrough patients also displayed fewer COVID-associated complications and pre-existing conditions relative to unvaccinated COVID-19 patients. This study presents the first known evidence that COVID-19 vaccines are fundamentally restricting the evolutionary and antigenic escape pathways accessible to SARS-CoV-2. The societal benefit of mass vaccination may consequently go far beyond the widely reported mitigation of SARS-CoV-2 infection risk and amelioration of community transmission, to include stemming of rampant viral evolution.

  • Anemia during SARS-CoV-2 infection is associated with rehospitalization after viral clearance

    iScience (Cell Press)

    Highlights:
    • Patients rehospitalized after SARS-CoV-2 clearance have distinct laboratory test profiles
    • Rehospitalized patients have lower hemoglobin before and during SARS-CoV-2 infection
    • Rehospitalized patients are more likely to experience anemia during active infection

    Summary
    Patients with COVID-19 can experience symptoms and complications after viral clearance. It is important to identify clinical features of patients who are likely to experience these prolonged effects. We conducted a retrospective study to compare longitudinal laboratory test measurements (hemoglobin, hematocrit, estimated glomerular filtration rate, serum creatinine, and blood urea nitrogen) in patients rehospitalized after PCR-confirmed SARS-CoV-2 clearance (n = 104) versus patients not rehospitalized after viral clearance (n = 278). Rehospitalized patients had lower median hemoglobin levels in the year prior to COVID-19 diagnosis (Cohen's D = −0.50; p = 1.2 × 10⁻³) and during their active SARS-CoV-2 infection (Cohen's D = −0.71; p = 4.6 × 10⁻⁸). Rehospitalized patients were also more likely to be diagnosed with moderate or severe anemia during their active infection (Odds Ratio = 4.07; p = 4.99 × 10⁻⁹). These findings suggest that anemia-related laboratory tests should be considered in risk stratification algorithms for patients with COVID-19.

  • Cerebral Venous Sinus Thrombosis is not Significantly Linked to COVID-19 Vaccines or Non-COVID Vaccines in a Large Multi-State Health System

    Journal of Stroke & Cerebrovascular Diseases

    Objective
    To assess the association of COVID-19 vaccines and non-COVID-19 vaccines with cerebral venous sinus thrombosis (CVST).
    Materials and method
    We retrospectively analyzed a cohort of 771,805 vaccination events across 266,094 patients in the Mayo Clinic Health System between 01/01/2017 and 03/15/2021. The primary outcome was a positive diagnosis of CVST, identified either by the presence of a corresponding ICD code or by an NLP algorithm which detected positive diagnosis of CVST within free-text clinical notes. For each vaccine we calculated the relative risk by dividing the incidence of CVST in the 30 days following vaccination by that in the 30 days preceding vaccination.
    Results
    We identified vaccination events for all FDA-approved COVID-19 vaccines including Pfizer-BioNTech (n = 94,818 doses), Moderna (n = 36,350 doses) and Johnson & Johnson - J&J (n = 1,745 doses). We also identified vaccination events for 10 common FDA-approved non-COVID-19 vaccines (n = 771,805 doses). There was no statistically significant difference in the incidence rate of CVST in the 30 days before and after vaccination for any vaccine in this population. We further found the baseline CVST incidence in the study population between 2017 and 2021 to be 45 to 98 per million patient years.
    Conclusions
    This real-world evidence-based study finds that CVST is rare and is not significantly associated with COVID-19 vaccination in our patient cohort. Limitations include the rarity of CVST in our dataset, a relatively small number of J&J COVID-19 vaccination events, and the use of a population drawn from recipients of a SARS-CoV-2 PCR test in a single health system.
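    The before/after relative risk described in the methods above can be illustrated with a minimal sketch. The function names and the counts in the usage line are hypothetical and chosen only for illustration; they are not taken from the study, and the sketch does not reproduce the study's significance testing.

```python
def incidence(cases: int, vaccination_events: int) -> float:
    """Fraction of vaccination events with a CVST diagnosis in a given window."""
    if vaccination_events <= 0:
        raise ValueError("vaccination_events must be positive")
    return cases / vaccination_events


def relative_risk(cases_after: int, cases_before: int, vaccination_events: int) -> float:
    """Incidence in the 30 days after vaccination divided by the incidence
    in the 30 days before, over the same set of vaccination events."""
    before = incidence(cases_before, vaccination_events)
    if before == 0:
        # The ratio is undefined when no cases occurred in the pre-vaccination window.
        raise ValueError("no cases before vaccination; relative risk is undefined")
    return incidence(cases_after, vaccination_events) / before


# Hypothetical counts, for illustration only (not data from the study).
rr = relative_risk(cases_after=3, cases_before=2, vaccination_events=100_000)
print(f"relative risk = {rr:.2f}")
```

    A ratio near 1 indicates similar incidence in the two windows. For an outcome as rare as CVST the pre-vaccination count can easily be zero, which is why the sketch treats that case explicitly instead of dividing by zero.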

  • A Literature-Derived Knowledge Graph Augments the Interpretation of Single Cell RNA-seq Datasets

    Genes

    Technology to generate single cell RNA-sequencing (scRNA-seq) datasets and tools to annotate them have advanced rapidly in the past several years. Such tools generally rely on existing transcriptomic datasets or curated databases of cell type defining genes, while the application of scalable natural language processing (NLP) methods to enhance analysis workflows has not been adequately explored. Here we deployed an NLP framework to objectively quantify associations between a comprehensive set of over 20,000 human protein-coding genes and over 500 cell type terms across over 26 million biomedical documents. The resultant gene-cell type associations (GCAs) are significantly stronger between a curated set of matched cell type-marker pairs than the complementary set of mismatched pairs (Mann-Whitney p = 6.15 × 10⁻⁷⁶, r = 0.24; Cohen’s D = 2.6). Building on this, we developed an augmented annotation algorithm (single cell Annotation via Literature Encoding, or scALE) that leverages GCAs to categorize cell clusters identified in scRNA-seq datasets, and we tested its ability to predict the cellular identity of 133 clusters from nine datasets of human breast, colon, heart, joint, ovary, prostate, skin, and small intestine tissues. With the optimized settings, the true cellular identity matched the top prediction in 59% of tested clusters and was present among the top five predictions for 91% of clusters. scALE slightly outperformed an existing method for reference data driven automated cluster annotation, and we demonstrate that integration of scALE can meaningfully improve the annotations derived from such methods. Further, contextualization of differential expression analyses with these GCAs highlights poorly characterized markers of well-studied cell types, such as CLIC6 and DNASE1L3 in retinal pigment epithelial cells and endothelial cells, respectively.

  • Building a best-in-class automated de-identification tool for electronic health records through ensemble learning

    Patterns (Cell Press)

    Highlights
    • An ensemble approach to automated de-identification of unstructured clinical text
    • Our approach leverages advances in deep learning along with heuristics
    • Detected personally identifiable information is replaced with suitable surrogates
    • Patient data are de-identified at scale to accelerate medical discovery

    The bigger picture:
    Clinical notes in electronic health records convey rich historical information regarding disease and treatment progression. However, this unstructured text often contains personally identifiable information such as names, phone numbers, or residential addresses of patients, thereby limiting its dissemination for research purposes. The removal of patient identifiers, through the process of de-identification, enables sharing of clinical data while preserving patient privacy. Here, we present a best-in-class approach to de-identification, which automatically detects identifiers and substitutes them with fabricated ones. Our approach enables de-identification of patient data at the scale required to harness the unstructured, context-rich information in electronic health records to aid in medical research and advancement.

    Summary
    Here, we describe an automated de-identification system that employs an ensemble architecture, incorporating attention-based deep-learning models and rule-based methods, supported by heuristics for detecting PII in EHR data. Detected identifiers are then transformed into plausible, though fictional, surrogates to further obfuscate any leaked identifier. Our approach outperforms existing tools, with a recall of 0.992 and precision of 0.979 on the i2b2 2014 dataset and a recall of 0.994 and precision of 0.967 on a dataset of 10,000 notes from the Mayo Clinic. The de-identification system presented here enables the generation of de-identified patient data at the scale required for modern machine-learning applications to help accelerate medical discoveries.

  • Case fatality rates for COVID-19 are higher than case fatality rates for motor vehicle accidents for individuals over 40 years of age

    medRxiv (Peer-review Ongoing)

    The death toll of the COVID-19 pandemic has been unprecedented, due to both the high number of SARS-CoV-2 infections and the seriousness of the disease resulting from these infections. Here, we present mortality rates and case fatality rates for COVID-19 over the past year compared with other historic leading causes of death in the United States. Among the risk categories considered, COVID-19 is the third leading cause of death for individuals 40 years old and over, with an overall annual mortality rate of 325 deaths per 100K individuals, behind only cancer (385 deaths per 100K individuals) and heart disease (412 deaths per 100K individuals). In addition, for individuals 40 years old and over, the case fatality rate for COVID-19 is greater than the case fatality rate for motor vehicle accidents. In particular, for the age group 40-49, the relative case fatality rate of COVID-19 is 1.5 fold (95% CI: [1.3, 1.7]) that of a motor vehicle accident, demonstrating that SARS-CoV-2 infection may be significantly more dangerous than a car crash for this age group. For older adults, COVID-19 is even more dangerous, and the relative case fatality rate of COVID-19 is 29.4 fold (95% CI: [23.2, 35.7]) that of a motor vehicle accident for individuals over 80 years old. On the other hand, motor vehicle accidents have a 4.5 fold (95% CI: [3.9, 5.1]) greater relative case fatality rate compared to COVID-19 for the age group of 20-29 years. These results highlight the severity of the COVID-19 pandemic especially for adults above 40 years of age and underscore the need for large-scale preventative measures to mitigate risks for these populations. Given that FDA-authorized COVID-19 vaccines have now been validated by multiple studies for their outstanding real-world effectiveness and safety, vaccination of all individuals who are over 40 years of age is one of the most pressing public health priorities of our time.

  • Female-male differences in COVID vaccine adverse events have precedence in seasonal flu shots: a potential link to sex-associated baseline gene expression patterns

    medRxiv (Peer-review Ongoing)

    Nearly 150 million doses of FDA-authorized COVID vaccines have been administered in the United States. Sex-based differences of adverse events remain poorly understood, mandating the need for real-world investigation from Electronic Health Records (EHRs) and broader epidemiological data sets. Based on an augmented curation of EHR clinical notes of 31,064 COVID-vaccinated individuals (19,321 females and 11,743 males) in the Mayo Clinic, we find that nausea and vomiting were documented significantly more frequently in females than males after both vaccine doses (nausea: dose 1 RR = 1.67, p < 0.001; dose 2 RR = 2.2, p < 0.001; vomiting: dose 1 RR = 1.58, p < 0.001; dose 2 RR = 1.88, p = 3.4 × 10⁻²). Conversely, fever, fatigue, and lymphadenopathy were more common in males after the first dose vaccination (fever RR = 0.62, p = 8.65 × 10⁻³; fatigue RR = 0.86, p = 2.89 × 10⁻²; lymphadenopathy RR = 0.61, p = 3.45 × 10⁻³). Analysis of the Vaccine Adverse Events Reporting System (VAERS) database further confirms that nausea comprises a larger fraction of total reports among females than males (RR: 1.58; p < 0.001), while fever comprises a larger fraction of total reports among males than females (RR: 0.84; p < 0.001). Importantly, increased reporting of nausea and fever among females and males, respectively, is also observed for prior influenza vaccines in the VAERS database, establishing that these differences are not unique to the recently developed COVID-19 vaccines. Investigating the mechanistic basis underlying these clinical findings, an analysis of bulk RNA-sequencing data from 12,158 human blood samples (8626 female, 3532 male) reveals 85 genes that are not only significantly different in their gene expression between females and males at baseline, but also have established literature-based associations to COVID-19 as well as the vaccine-related adverse events of clinical consequence.

  • Pre-existing conditions are associated with COVID-19 patients’ hospitalization, despite confirmed clearance of SARS-CoV-2 virus

    eClinicalMedicine (Lancet publication)

    Abstract
    Background
    Consecutive negative SARS-CoV-2 PCR test results are being considered to estimate viral clearance in COVID-19 patients. However, there are anecdotal reports of hospitalization from protracted COVID-19 complications despite such confirmed viral clearance, presenting a clinical conundrum.
    Methods
    We conducted a retrospective analysis of 222 hospitalized COVID-19 patients to compare those that were readmitted post-viral clearance (hospitalized post-clearance cohort, n = 49) with those that were not re-admitted post-viral clearance (non-hospitalized post-clearance cohort, n = 173) between February and October 2020. In order to differentiate these two cohorts, we used neural network models for the ‘augmented curation’ of comorbidities and complications with positive sentiment in the Electronic Hospital Records physician notes.

    Findings
    In the year preceding COVID-19 onset, anemia (n = 13 [26.5%], p-value: 0.007), cardiac arrhythmias (n = 14 [28.6%], p-value: 0.015), and acute kidney injury (n = 7 [14.3%], p-value: 0.030) were significantly enriched in the physician notes of the hospitalized post-clearance cohort.
    Interpretation
    Overall, this retrospective study highlights specific pre-existing conditions that are associated with higher hospitalization rates in COVID-19 patients despite viral clearance and motivates follow-up prospective research into the associated risk factors.

  • Plasma IL-6 levels following corticosteroid therapy as an indicator of ICU length of stay in critically ill COVID-19 patients

    Cell Death Discovery (Nature publishing group)

    Intensive care unit (ICU) admissions and mortality in severe COVID-19 patients are driven by “cytokine storms” and acute respiratory distress syndrome (ARDS). Interim clinical trial results suggest that the corticosteroid dexamethasone displays better 28-day survival in severe COVID-19 patients requiring ventilation or oxygen. In this study, 10 out of 16 patients (62.5%) that had an average plasma IL-6 value over 10 pg/mL post administration of corticosteroids also had worse outcomes (i.e., ICU stay >15 days or death), compared to 8 out of 41 patients (19.5%) who did not receive corticosteroids (p-value = 0.0024). Given this potential association between post-corticosteroid IL-6 levels and COVID-19 severity, we hypothesized that the glucocorticoid receptor (GR or NR3C1) may be coupled to IL-6 expression in specific cell types that govern cytokine release syndrome (CRS). Examining single-cell RNA-seq data from BALF of severe COVID-19 patients and nearly 2 million cells from a pan-tissue scan shows that alveolar macrophages, smooth muscle cells, and endothelial cells co-express NR3C1 and IL-6, motivating future studies on the links between the regulation of NR3C1 function and IL-6 levels.

  • Enoxaparin is associated with lower rates of mortality than unfractionated Heparin in hospitalized COVID-19 patients

    eClinicalMedicine (Lancet publication)

    Background
    Coagulopathies are a major class among COVID-19 associated complications. Although anticoagulants such as unfractionated Heparin and Enoxaparin are both being used for therapeutic mitigation of COVID associated coagulopathy (CAC), differences in their clinical outcomes remain to be investigated.
    Methods
    We analyzed records of 1,113 patients in the Mayo Clinic Electronic Health Record (EHR) database who were admitted to the hospital for COVID-19 between April 4, 2020 and August 31, 2020, including 19 different Mayo Clinic sites in Arizona, Florida, Minnesota, and Wisconsin. Among this patient population, we compared cohorts of patients who received different types of anticoagulants, including 441 patients who received unfractionated Heparin and 166 patients who received Enoxaparin. Clinical outcomes at 28 days were compared, and propensity score matching was used to control for potential confounding variables including: demographics, comorbidities, ICU status, chronic kidney disease stage, and oxygenation status. Patients with a history of acute kidney injury and patients who received multiple types of anticoagulants were excluded from the study.

    Findings
    We find that COVID-19 patients administered unfractionated Heparin but not Enoxaparin have higher rates of 28-day mortality (risk ratio: 4.3; 95% Confidence Interval [C.I.]: [1.8, 10.2]; p-value: 8.5e−4, Benjamini-Hochberg [BH] adjusted p-value: 2.1e−3), after controlling for potential confounding factors.
    Interpretation
    This study emphasizes the need for mechanistically investigating differential modulation of the COVID-associated coagulation cascades by Enoxaparin versus unfractionated Heparin.

  • Exploratory analysis of immunization records highlights decreased SARS-CoV-2 rates in individuals with recent non-COVID-19 vaccinations

    Scientific Reports (Nature publishing group)

    Clinical studies are ongoing to assess whether existing vaccines may afford protection against SARS-CoV-2 infection through trained immunity. In this exploratory study, we analyze immunization records from 137,037 individuals who received SARS-CoV-2 PCR tests. We find that polio, Haemophilus influenzae type-B (HIB), measles-mumps-rubella (MMR), Varicella, pneumococcal conjugate (PCV13), Geriatric Flu, and hepatitis A/hepatitis B (HepA–HepB) vaccines administered in the past 1, 2, and 5 years are associated with decreased SARS-CoV-2 infection rates, even after adjusting for geographic SARS-CoV-2 incidence and testing rates, demographics, comorbidities, and number of other vaccinations. Furthermore, age, race/ethnicity, and blood group stratified analyses reveal significantly lower SARS-CoV-2 rate among black individuals who have taken the PCV13 vaccine, with relative risk of 0.45 at the 5 year time horizon (n: 653, 95% CI (0.32, 0.64), p-value: 6.9e−05). Overall, this study identifies existing approved vaccines which can be promising candidates for pre-clinical research and Randomized Clinical Trials towards combating COVID-19.

  • Long-term SARS-CoV-2 RNA shedding and its temporal association to IgG seropositivity

    Cell Death Discovery (Nature publishing group)

    Longitudinal characterization of SARS-CoV-2 PCR testing from COVID-19 patient’s nasopharynx and its juxtaposition with blood-based IgG-seroconversion diagnostic assays is critical to understanding SARS-CoV-2 infection durations. Here, we retrospectively analyze 851 SARS-CoV-2-positive patients with at least two positive PCR tests and find that 99 of these patients remain SARS-CoV-2-positive after 4 weeks from their initial diagnosis date. For the 851-patient cohort, the mean lower bound of viral RNA shedding was 17.3 days (SD: 7.8), and the mean upper bound of viral RNA shedding from 668 patients transitioning to confirmed PCR-negative status was 22.7 days (SD: 11.8). Among 104 patients with an IgG test result, 90 patients were seropositive to date, with mean upper bound of time to seropositivity from initial diagnosis being 37.8 days (95% CI: 34.3–41.3). Our findings from juxtaposing IgG and PCR tests thus reveal that some SARS-CoV-2-positive patients are non-hospitalized and seropositive, yet actively shed viral RNA (14 of 90 patients). This study emphasizes the need for monitoring viral loads and neutralizing antibody titers in long-term non-hospitalized shedders as a means of characterizing the SARS-CoV-2 infection lifecycle.

  • Benchmarking evolutionary tinkering underlying human–viral molecular mimicry shows multiple host pulmonary–arterial peptides mimicked by SARS-CoV-2

    Cell Death Discovery (Nature publishing group)

    The hand of molecular mimicry in shaping SARS-CoV-2 evolution and immune evasion remains to be deciphered. Here, we report 33 distinct 8-mer/9-mer peptides that are identical between SARS-CoV-2 and the human reference proteome. We benchmark this observation against other viral–human 8-mer/9-mer peptide identity, which suggests generally similar extents of molecular mimicry for SARS-CoV-2 and many other human viruses. Interestingly, 20 novel human peptides mimicked by SARS-CoV-2 have not been observed in any previous coronavirus strains (HCoV, SARS-CoV, and MERS). Furthermore, four of the human 8-mer/9-mer peptides mimicked by SARS-CoV-2 map onto HLA-B*40:01, HLA-B*40:02, and HLA-B*35:01 binding peptides from human PAM, ANXA7, PGD, and ALOX5AP proteins. This mimicry of multiple human proteins by SARS-CoV-2 is made salient by single-cell RNA-seq (scRNA-seq) analysis that shows the targeted genes significantly expressed in human lungs and arteries; tissues implicated in COVID-19 pathogenesis. Finally, HLA-A*03 restricted 8-mer peptides are found to be shared broadly by human and coronaviridae helicases in functional hotspots, with potential implications for nucleic acid unwinding upon initial infection. This study presents the first scan of human peptide mimicry by SARS-CoV-2, and via its benchmarking against human–viral mimicry more broadly, presents a computational framework for follow-up studies to assay how evolutionary tinkering may relate to zoonosis and herd immunity.
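    The core operation behind the 8-mer/9-mer identity scan described above, finding peptides of a fixed length that occur in both a viral and a human protein, can be sketched as a set intersection over sliding windows. This is only an illustrative sketch: the function names and the two toy sequences are hypothetical, and the abstract does not specify how the actual proteome-wide scan was implemented.

```python
def kmers(sequence: str, k: int) -> set[str]:
    """All distinct length-k substrings (k-mers) of an amino-acid sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_kmers(viral: str, human: str, k_values=(8, 9)) -> dict[int, set[str]]:
    """For each k, the peptides that appear identically in both sequences."""
    return {k: kmers(viral, k) & kmers(human, k) for k in k_values}


# Toy sequences invented for illustration; not real viral or human proteins.
viral_protein = "MKTAYIAKQRQISFVK"
human_protein = "GGTAYIAKQRQLLL"

for k, peptides in shared_kmers(viral_protein, human_protein).items():
    print(k, sorted(peptides))
```

    A scan across whole proteomes would apply the same idea to every viral and human protein, typically by indexing all human k-mers once and looking up each viral k-mer against that index.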

  • Inference from longitudinal laboratory tests characterizes temporal evolution of COVID-19-associated coagulopathy (CAC)

    eLife

    Temporal inference from laboratory testing results and triangulation with clinical outcomes extracted from unstructured electronic health record (EHR) provider notes is integral to advancing precision medicine. Here, we studied 246 SARS-CoV-2 PCR-positive (COVIDpos) patients and propensity-matched 2460 SARS-CoV-2 PCR-negative (COVIDneg) patients subjected to around 700,000 lab tests cumulatively across 194 assays. Compared to COVIDneg patients at the time of diagnostic testing, COVIDpos patients tended to have higher plasma fibrinogen levels and lower platelet counts. However, as the infection evolves, COVIDpos patients distinctively show declining fibrinogen, increasing platelet counts, and lower white blood cell counts. Augmented curation of EHRs suggests that only a minority of COVIDpos patients develop thromboembolism, and rarely, disseminated intravascular coagulopathy (DIC), with patients generally not displaying platelet reductions typical of consumptive coagulopathies. These temporal trends provide fine-grained resolution into COVID-19 associated coagulopathy (CAC) and set the stage for personalizing thromboprophylaxis.

  • Augmented curation of clinical notes from a massive EHR system reveals symptoms of impending COVID-19 diagnosis

    eLife

    Understanding temporal dynamics of COVID-19 symptoms could provide fine-grained resolution to guide clinical decision-making. Here, we use deep neural networks over an institution-wide platform for the augmented curation of clinical notes from 77,167 patients subjected to COVID-19 PCR testing. By contrasting Electronic Health Record (EHR)-derived symptoms of COVID-19-positive (COVIDpos; n = 2,317) versus COVID-19-negative (COVIDneg; n = 74,850) patients for the week preceding the PCR testing date, we identify anosmia/dysgeusia (27.1-fold), fever/chills (2.6-fold), respiratory difficulty (2.2-fold), cough (2.2-fold), myalgia/arthralgia (2-fold), and diarrhea (1.4-fold) as significantly amplified in COVIDpos over COVIDneg patients. The combination of cough and fever/chills has 4.2-fold amplification in COVIDpos patients during the week prior to PCR testing and, together with anosmia/dysgeusia, constitutes the earliest EHR-derived signature of COVID-19. This study introduces an Augmented Intelligence platform for the real-time synthesis of institutional biomedical knowledge. The platform holds tremendous potential for scaling up curation throughput, thus enabling EHR-powered early disease diagnosis.

  • Knowledge synthesis of 100 million biomedical documents augments the deep expression profiling of coronavirus receptors

    eLife

    The COVID-19 pandemic demands assimilation of all biomedical knowledge to decode mechanisms of pathogenesis. Despite the recent renaissance in neural networks, a platform for the real-time synthesis of the exponentially growing biomedical literature and deep omics insights is unavailable. Here, we present the nferX platform for dynamic inference from over 45 quadrillion possible conceptual associations from unstructured text, and triangulation with insights from single-cell RNA-sequencing, bulk RNA-seq and proteomics from diverse tissue types. A hypothesis-free profiling of ACE2 suggests tongue keratinocytes, olfactory epithelial cells, airway club cells and respiratory ciliated cells as potential reservoirs of the SARS-CoV-2 receptor. We find the gut as the putative hotspot of COVID-19, where a maturation correlated transcriptional signature is shared in small intestine enterocytes among coronavirus receptors (ACE2, DPP4, ANPEP). A holistic data science platform triangulating insights from structured and unstructured data holds potential for accelerating the generation of impactful biological insights and hypotheses.

  • SARS-CoV-2 strategically mimics proteolytic activation of human ENaC

    eLife

    Molecular mimicry is an evolutionary strategy adopted by viruses to exploit the host cellular machinery. We report that SARS-CoV-2 has evolved a unique S1/S2 cleavage site, absent in any previous coronavirus sequenced, resulting in the striking mimicry of an identical FURIN-cleavable peptide on the human epithelial sodium channel α-subunit (ENaC-α). Genetic alteration of ENaC-α causes aldosterone dysregulation in patients, highlighting that the FURIN site is critical for activation of ENaC. Single cell RNA-seq from 66 studies shows significant overlap between expression of ENaC-α and the viral receptor ACE2 in cell types linked to the cardiovascular-renal-pulmonary pathophysiology of COVID-19. Triangulating this cellular characterization with cleavage signatures of 178 proteases highlights proteolytic degeneracy wired into the SARS-CoV-2 lifecycle. Evolution of SARS-CoV-2 into a global pandemic may be driven in part by its targeted mimicry of ENaC-α, a protein critical for the homeostasis of airway surface liquid, whose misregulation is associated with respiratory conditions.

  • Engineering and optimising deaminase fusions for genome editing

    Nature Communications

    Precise editing is essential for biomedical research and gene therapy. Yet, homology-directed genome modification is limited by the requirements for genomic lesions, homology donors and the endogenous DNA repair machinery. Here we engineered programmable cytidine deaminases and tested whether we could introduce site-specific cytidine to thymidine transitions in the absence of targeted genomic lesions. Our programmable deaminases effectively convert specific cytidines to thymidines with 13% efficiency in Escherichia coli and 2.5% in human cells. However, off-target deaminations were detected more than 150 bp away from the target site. Moreover, whole genome sequencing revealed that edited bacterial cells did not harbour chromosomal abnormalities but demonstrated elevated global cytidine deamination at deaminase intrinsic binding sites. Therefore programmable deaminases represent a promising genome editing tool in prokaryotes and eukaryotes. Future engineering is required to overcome the processivity and the intrinsic DNA binding affinity of deaminases for safer therapeutic applications.

  • Global connectivity of hub residues in Oncoprotein structures encodes genetic factors dictating personalized drug response to targeted Cancer therapy

    Nature Scientific Reports

    The efficacy and mechanisms of therapeutic action are largely described by atomic bonds and interactions local to drug binding sites. Here we introduce global connectivity analysis as a high-throughput computational assay of therapeutic action, inspired by the Google PageRank algorithm that unearths the most “globally connected” websites from the information-dense World Wide Web (WWW). We execute short timescale (30 ps) molecular dynamics simulations with high sampling frequency (0.01 ps), to identify amino acid residue hubs whose global connectivity dynamics are characteristic of the ligand or mutation associated with the target protein. We find that unexpected allosteric hubs – up to 20 Å from the ATP binding site, but within 5 Å of the phosphorylation site – encode the Gibbs free energy of inhibition (ΔGinhibition) for select protein kinase-targeted cancer therapeutics. We further find that clinically relevant somatic cancer mutations implicated in both drug resistance and personalized drug sensitivity can be predicted in a high-throughput fashion. Our results establish global connectivity analysis as a potent assay of protein functional modulation. This sets the stage for unearthing disease-causal exome mutations and motivates forecasting of clinical drug response on a patient-by-patient basis. We suggest incorporation of structure-guided genetic inference assays into pharmaceutical and healthcare Oncology workflows.

  • A voxel-based Monte Carlo model of drug release from bulk eroding nanoparticles

    Journal of Nanoscience and Nanotechnology

  • Multifunctional nanoscale platforms for targeting of the cancer cell immortality spectrum

    Macromolecular Rapid Communications

  • Nanotechnology for Targeting Cancer

    Handbook of Nanophysics: Nanomedicine and Nanorobotics (CRC PRESS) - Chapter 34

  • FDA-authorized mRNA COVID-19 vaccines are effective per real-world evidence synthesized across a multi-state health system

    Med (Cell Press)

    Highlights
    BNT162b2 and mRNA-1273 are highly effective in preventing SARS-CoV-2 infection
    The real-world effectiveness of BNT162b2 is 86.1% (95% CI: 82.4%–89.1%)
    The real-world effectiveness of mRNA-1273 is 93.3% (95% CI: 85.7%–97.4%)
    Both vaccines reduce the risk of COVID-19-related hospitalization and ICU admission

    Context and significance
    This is a study of the COVID-19 vaccines developed by Pfizer/BioNTech and Moderna. Although these vaccines have been shown to be effective in clinical trials, it is important to confirm that they work well in practice. The results from this study show that these vaccines are effective in reducing the risk of COVID-19 infection, COVID-19-associated hospitalization, and COVID-19-associated ICU admission. As one of the largest real-world evidence studies of COVID-19 vaccine effectiveness in the United States to date, this study provides strong evidence that COVID-19 vaccines work well in practice.

    Findings
    The real-world vaccine effectiveness of preventing SARS-CoV-2 infection was 86.1% (95% confidence interval [CI]: 82.4%–89.1%) for BNT162b2 and 93.3% (95% CI: 85.7%–97.4%) for mRNA-1273. BNT162b2 and mRNA-1273 were 88.8% (95% CI: 75.5%–95.7%) and 86.0% (95% CI: 71.6%–93.9%) effective in preventing COVID-19-associated hospitalization. Both vaccines were 100% effective (95% CI for BNT162b2: 51.4%–100%; 95% CI for mRNA-1273: 43.3%–100%) in preventing COVID-19-associated ICU admission.

    Conclusions
    BNT162b2 and mRNA-1273 are effective in a real-world setting and are associated with reduced rates of SARS-CoV-2 infection and decreased burden of COVID-19 on the healthcare system.

  • Real-time biomedical knowledge synthesis of the exponentially growing world wide web using unsupervised neural networks

    bioRxiv (Peer-review ongoing)

    Decoding disease mechanisms for addressing unmet clinical need demands the rapid assimilation of the exponentially growing biomedical knowledge. This knowledge is either inherently unstructured and non-conducive to current computing paradigms or siloed into structured databases requiring specialized bioinformatics. Despite the recent renaissance in unsupervised neural networks for deciphering unstructured natural languages and the availability of numerous bioinformatics resources, a holistic platform for real-time synthesis of the scientific literature and seamless triangulation with deep omic insights and real-world evidence has not been advanced. Here, we introduce the nferX platform that makes the highly unstructured biomedical knowledge computable and supports the seamless visual triangulation with statistical inference from diverse structured databases. The nferX platform will accelerate and amplify the research potential of subject-matter experts as well as non-experts across the life science ecosystem (https://academia.nferx.com/).

  • Three doses of COVID-19 mRNA vaccination are safe based on adverse events reported in electronic health records

    Recent reports on waning of COVID-19 vaccine-induced immunity have led to the approval and roll-out of additional dose and booster vaccinations. At-risk individuals are receiving additional vaccine dose(s), in addition to the regimen that was tested in clinical trials. The risks and the adverse event profiles associated with these additional vaccine doses are currently not well understood. Here, we performed a retrospective study analyzing vaccine-associated adverse events using electronic health records (EHRs) of individuals who have received three doses of mRNA-based COVID-19 vaccines (n = 47,999). By comparing symptoms reported in 2-week time periods after each vaccine dose and in a 2-week period before the 1st vaccine dose, we assessed the risk associated with 3rd dose vaccination, for both BNT162b2 and mRNA-1273. Reporting of severe adverse events remained low after the 3rd vaccine dose, with rates of pericarditis (0.01%, 0%-0.02% 95% CI), anaphylaxis (0.00%, 0%-0.01% 95% CI), myocarditis (0.00%, 0%-0.01% 95% CI), and cerebral venous sinus thrombosis (no cases), consistent with earlier studies. Significantly more individuals (p-value < 0.05) report low-severity adverse events after their 3rd dose compared with after their 2nd dose, including fatigue (4.92% after 3rd dose vs 3.47% after 2nd dose), lymphadenopathy (2.89% vs 2.07%), nausea (2.62% vs 2.04%), headache (2.47% vs 2.07%), arthralgia (2.12% vs 1.70%), myalgia (1.99% vs 1.63%), diarrhea (1.70% vs 1.24%), fever (1.11% vs 0.81%), vomiting (1.10% vs 0.80%), and chills (0.47% vs 0.36%). Our results show that although 3rd dose vaccination against SARS-CoV-2 infection led to increased reporting of low-severity adverse events, risk of severe adverse events remained comparable to the standard 2-dose regimen. This study provides support for the safety of 3rd vaccination doses of individuals who are at high risk of severe COVID-19 and breakthrough infection.


Patents

  • Engineered polypeptide agents for targeted broad spectrum influenza neutralization

    Issued US8637456

    The present invention provides novel agents for broad spectrum influenza neutralization. The present invention provides agents for inhibiting influenza infection by binding to the influenza virus and/or hemagglutinin (HA) polypeptides and/or HA receptors, and reagents and methods relating thereto. The present invention provides a system for analyzing interactions between infolds and the interaction partners that bind to them.

  • Agents for influenza neutralization

    Filed US 13/829,675

    The present invention provides antibodies (e.g., monoclonal antibodies, human antibodies, humanized antibodies, etc.), which bind to multiple influenza strains. Such antibodies are useful, for example, in the prophylaxis, treatment, diagnosis, and/or study of influenza.

  • Systems, methods, and computer readable media for visualization of semantic information and inference of temporal signals indicating salient associations between life science entities

    US20180082197A1

    Disclosed systems, methods, and computer readable media can detect an association between semantic entities and generate semantic information between entities. For example, semantic entities and associated semantic collections present in knowledge bases can be identified. A time period can be determined and divided into time slices. For each time slice, word embeddings for the identified semantic entities can be generated; a first semantic association strength between a first semantic entity input and a second semantic entity input can be determined; and a second semantic association strength between the first semantic entity input and semantic entities associated with a semantic collection that is associated with the second semantic entity can be determined. An output can be provided based on the first and second semantic association strengths.


Honors & Awards

  • 2018 MedTech Boston 40 UNDER 40 Healthcare Innovator

    MedTech Boston, Personal Connected Health Alliance

    https://medtechboston.medstro.com/blog/2018/09/18/the-2018-medtech-boston-40-under-40-healthcare-innovators/

  • MIT Outstanding Undergraduate Student Mentor Award

    Massachusetts Institute of Technology (MIT)

    MIT UROP Mentorship Award
