We read with great interest the article by Xia et al [1], published in this issue of The Journal of Infectious Diseases (JID), which employed Mendelian randomization to estimate the effect of serum levels of ticagrelor (or, more specifically, steady-state area under the curve of ticagrelor or its major active metabolite AR-C124910XX) on the risk of infections. As noted in a recent article in JID [2], causal inference has become a mainstream branch of statistics that attempts to test for the impact of an intervention on an outcome using study data. Causal inference recognizes that, when experimentation is impossible, the treatment allocation may not be independent of other factors that predict the outcome. Rather, there may be common factors that affect both the treatment and the outcome. For instance, in a nonrandomized study, sicker individuals may receive a more aggressive treatment, and sicker individuals may also be less likely to experience a full recovery; untangling differences in recovery rates that are due to the treatment versus other factors such as the baseline level of health requires statistical adjustment. Frequently, such adjustment is direct—for example, via choosing pairs of individuals, each one having received one of 2 competing treatments, where the individuals are matched with respect to initial health status, or by a regression analysis where the health status measure is included as a covariate in the regression model. However, for such adjustment strategies to be successful in removing any confounding bias (ie, the bias that arises due to a variable that predicts both the treatment and the outcome and thus distorts the treatment effect), all confounding variables must be recorded and available to the analyst. Unfortunately, this cannot always be ensured. In such circumstances, an alternative causal approach can be employed: an instrumental variables (IV) analysis.
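
The confounding scenario described above can be made concrete with a small simulation. The sketch below is purely illustrative: the probabilities, effect sizes, and variable names are assumptions chosen for the example, not values from any study. It contrasts a naive comparison of treated and untreated individuals with a direct adjustment that stratifies on the recorded baseline health status.

```python
import random

def simulate(n=50_000, true_effect=1.0, seed=1):
    """Simulate (sick, treated, outcome) rows for a hypothetical nonrandomized study."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        sick = rng.random() < 0.5                        # confounder: baseline health
        treated = rng.random() < (0.8 if sick else 0.2)  # sicker people are treated more often
        # Outcome (higher = better recovery): treatment helps, sickness hurts
        outcome = true_effect * treated - 2.0 * sick + rng.gauss(0, 1)
        rows.append((sick, treated, outcome))
    return rows

def diff_in_means(rows):
    """Mean outcome among the treated minus mean outcome among the untreated."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)

rows = simulate()

# Naive comparison: ignores that treated individuals were sicker to begin with
naive = diff_in_means(rows)

# Direct adjustment: compare within strata of the confounder, then average
# the stratum-specific differences, weighting by stratum size
adjusted = 0.0
for status in (False, True):
    stratum = [r for r in rows if r[0] == status]
    adjusted += diff_in_means(stratum) * len(stratum) / len(rows)

print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}, truth: 1.00")
```

In this simulated setting the naive contrast is badly biased, while the stratified contrast recovers the assumed effect. The adjustment works only because the confounder was recorded, which is exactly the requirement that may fail in practice.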

As outlined by Goetghebeur et al and previously highlighted in JID [2, 3], learning about causal relationships requires the explicit formalization of all definitions (eg, of the population of interest, the exposure, the outcome, the timeframe, and so on), the target causal effect, and the method of estimation with all the associated assumptions on data availability. Causal graphs, also called directed acyclic graphs, are useful tools for encoding beliefs about proposed relationships between variables and for centering the analyses on a single treatment or exposure. In Figure 1, we display an assumed data-generating mechanism, similar to Figure 1 in [1], that lends itself to an IV analysis. The analysis is so called because it relies on the existence of an instrument, or instrumental variable, which must satisfy 3 conditions: (i) the instrument has a causal effect on the exposure, (ii) there are no common causes of the instrument and the outcome, and (iii) the instrument affects the outcome only through its effect on the treatment and has no direct influence on the outcome. Condition (ii) states that there is no confounding of the effect of the instrument on the outcome; condition (iii) is called the exclusion restriction. In Figure 1, condition (ii) is reflected in the absence of any variable with a direct effect on both the outcome and the instrument, while condition (iii) is reflected in the absence of any direct arrow from the instrument to the outcome. Note, however, that the effect of the treatment on the outcome is confounded because of the presence of a common cause of both the treatment and the outcome.
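
The 3 conditions can be checked mechanically once an assumed causal graph is written down. The following minimal sketch encodes a graph resembling the setting of Figure 1B as a dictionary of arrows; the node names and the helper functions `ancestors`, `without`, and `check_iv_conditions` are hypothetical and introduced only for illustration. Such a check verifies the assumed graph, not the real world: whether the graph itself is correct remains a matter of subject-matter judgment.

```python
def ancestors(node, edges):
    """All nodes with a directed path into `node` (the graph must be acyclic)."""
    parents = {p for p, children in edges.items() if node in children}
    found = set(parents)
    for p in parents:
        found |= ancestors(p, edges)
    return found

def without(node, edges):
    """Copy of the graph with `node` and all arrows into or out of it removed."""
    return {p: [c for c in children if c != node]
            for p, children in edges.items() if p != node}

def check_iv_conditions(instrument, exposure, outcome, edges):
    # (i) the instrument is a cause of the exposure
    relevance = instrument in ancestors(exposure, edges)
    # (ii) no variable causes both the instrument and, by a path that
    #      does not run through the instrument, the outcome
    common = ancestors(instrument, edges) & ancestors(outcome, without(instrument, edges))
    # (iii) once the exposure is removed, no directed path is left
    #       from the instrument to the outcome
    exclusion = instrument not in ancestors(outcome, without(exposure, edges))
    return {"relevance": relevance,
            "no_common_cause": not common,
            "exclusion": exclusion}

# Assumed graph: each key points to the variables it directly affects
graph = {
    "snp": ["ticagrelor_auc"],
    "confounder": ["ticagrelor_auc", "infection"],
    "ticagrelor_auc": ["infection"],
    "infection": [],
}
result = check_iv_conditions("snp", "ticagrelor_auc", "infection", graph)
print(result)

# Adding a direct arrow from the SNP to the outcome violates condition (iii)
pleiotropic = dict(graph, snp=["ticagrelor_auc", "infection"])
print(check_iv_conditions("snp", "ticagrelor_auc", "infection", pleiotropic))
```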

Figure 1. Example of a causal diagram showing a simple setting with a treatment or exposure (eg, serum levels of ticagrelor) whose effect on the outcome is confounded (eg, because of underlying health status, diet, socioeconomic status, and other variables that are predictive of both cardiovascular disease, and thus use of ticagrelor, and infection status). The instrument, or instrumental variable, is one or more variables that causally affect the exposure level but have no direct effect on the outcome. In these diagrams, the direction of an arrow is indicative of a causal relationship, where changing the level or value of the variable at the origin of the arrow will result in changes in the variable to which it points. A, Generic setting in which an instrumental variables analysis could be used. B, Setting of the study of Xia et al [1], where “confounder” may include factors such as underlying health status, etc. Abbreviations: AUC, area under the curve; SNP, single-nucleotide polymorphism.

Instruments may, in general, be difficult to identify; however, there are some special circumstances in which they arise. The first is a randomized controlled trial with imperfect compliance. In such a setting, the randomly assigned treatment is the instrument, and the treatment actually taken is the exposure. Clearly, when the instrument is a randomization process, there is no confounding of the effect of the randomization on the outcome (condition (ii)), nor is there a direct effect of the randomization on the outcome other than through the treatment taken (condition (iii)); furthermore, it is reasonable to suppose that most participants in a trial will comply with treatment, such that randomization causally affects the treatment taken (condition (i)). The second setting in which an instrument may be plausibly identified is one in which genetic variants predict the exposure but are otherwise unrelated to the outcome. In this case, the IV analysis is often referred to as a Mendelian randomization (MR) analysis; the genetic variant acts as a random allocation mechanism, with the randomness inherent to the assortment of parental genes during meiosis.
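
The trial setting can be sketched in a short simulation. Everything below is an illustrative assumption: the compliance probabilities, the effect size of 2.0, and the variable names are invented for the example, and the treatment effect is taken to be the same for everyone. The IV estimate is the effect of assignment on the outcome divided by the effect of assignment on the treatment taken, and it is compared with a naive "as-treated" contrast.

```python
import random

def mean(values):
    return sum(values) / len(values)

rng = random.Random(7)
true_effect = 2.0
assigned, taken, outcome = [], [], []
for _ in range(100_000):
    u = rng.gauss(0, 1)       # unmeasured health status
    z = rng.random() < 0.5    # randomized assignment (the instrument)
    # Only those assigned can take the treatment; healthier people comply more often
    a = z and rng.random() < (0.9 if u > 0 else 0.6)
    y = true_effect * a + 1.5 * u + rng.gauss(0, 1)
    assigned.append(z)
    taken.append(a)
    outcome.append(y)

# Effect of assignment on the outcome (intention-to-treat contrast)
itt = (mean([y for z, y in zip(assigned, outcome) if z])
       - mean([y for z, y in zip(assigned, outcome) if not z]))
# Effect of assignment on the treatment actually taken
compliance = (mean([a for z, a in zip(assigned, taken) if z])
              - mean([a for z, a in zip(assigned, taken) if not z]))
iv_estimate = itt / compliance

# Naive comparison by treatment taken, which is confounded by u
as_treated = (mean([y for a, y in zip(taken, outcome) if a])
              - mean([y for a, y in zip(taken, outcome) if not a]))

print(f"IV: {iv_estimate:.2f}, as-treated: {as_treated:.2f}, truth: {true_effect:.2f}")
```

Because compliance depends on unmeasured health status, the as-treated contrast overstates the effect in this simulation, whereas the ratio based on the randomized assignment does not.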

The simplest form of an instrumental variable analysis is a 2-stage regression approach, which can be used for a continuous outcome measure within a single dataset. It combines statistics derived from (1) the estimated association between the instrument and the outcome (note that this association is due only to the path from the instrument to the outcome via the treatment of interest) and (2) the association between the instrument and the exposure itself. Xia et al [1] use a 2-sample MR analysis (see, eg, [4, 5]), which leverages the fact that the estimates in (1) and (2) need not be obtained from the same dataset. In a 2-sample MR, one database is used to identify the genetic variants, or single-nucleotide polymorphisms (SNPs), that are associated with the exposure of interest; in the present case, these are SNPs associated with serum levels of ticagrelor and its major active metabolite. The second database is then used to assess the relationship between the SNPs and the outcome, in this case infections. Under assumption (iii) above, any relationship observed in the second database between the SNPs and the outcome can be attributed to a causal effect of the ticagrelor serum levels on the risk of infections; the specific estimate of the causal effect is typically calculated as a ratio of the effect of the instrument on the outcome (effect of SNPs on infection) to the effect of the instrument on the exposure (effect of SNPs on ticagrelor serum levels). A major advantage of this 2-sample approach is that public data can be used to perform an MR analysis. In the study of Xia et al [1], the associations between the instruments (genetic variants) and the exposure (ticagrelor plasma levels) were obtained from a genome-wide association study of patients treated with ticagrelor, while the associations between the instruments and the different outcomes were obtained from 2 other large biobanks, the UK Biobank and FinnGen.
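
The ratio estimate and the 2-sample idea can be illustrated as follows. This is a simplified sketch under stated assumptions: a single SNP, a continuous outcome (the outcomes in [1] are infections, which would call for a different outcome model), and invented effect sizes. The function names `slope` and `draw_sample` are hypothetical. The SNP-exposure association is estimated in one simulated sample and the SNP-outcome association in a second, independent sample.

```python
import random

def slope(x, y):
    """Ordinary least-squares slope from regressing y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def draw_sample(n, rng, true_effect=0.5):
    """Simulate genotype, exposure, and outcome for n individuals."""
    g, x, y = [], [], []
    for _ in range(n):
        gi = sum(rng.random() < 0.3 for _ in range(2))  # allele count: 0, 1, or 2
        u = rng.gauss(0, 1)                             # unmeasured confounder
        xi = 0.4 * gi + u + rng.gauss(0, 1)             # exposure
        yi = true_effect * xi + u + rng.gauss(0, 1)     # continuous outcome
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y

rng = random.Random(42)
g1, x1, _ = draw_sample(100_000, rng)    # sample 1: only SNP and exposure are used
g2, x2, y2 = draw_sample(100_000, rng)   # sample 2: SNP and outcome are used

beta_snp_exposure = slope(g1, x1)        # association (2), from sample 1
beta_snp_outcome = slope(g2, y2)         # association (1), from sample 2
ratio_estimate = beta_snp_outcome / beta_snp_exposure

# Confounded comparison: regress the outcome directly on the exposure
naive = slope(x2, y2)

print(f"ratio: {ratio_estimate:.2f}, naive: {naive:.2f}, truth: 0.50")
```

With several SNPs, the SNP-specific ratios would be combined, for example by inverse-variance weighting; that step is omitted here.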

MR analyses, and 2-sample MR in particular, have gained an important foothold in causal analyses over the past decade. While IV analysis is not commonly employed in articles published in JID, Xia et al provide a welcome introduction to this method, leveraging large datasets to ensure adequate power.

All 3 IV assumptions must hold to obtain valid results. In the MR context, the third assumption requires that there be no horizontal pleiotropy, meaning that the genetic variant(s) have no effect on disease except through their effect on the exposure. In MR studies, the associations between genetic variants and exposures are often very weak. To increase power, multiple genetic variants are often used simultaneously in an MR analysis. Fortunately, newer MR methods exist that do not require the absence of horizontal pleiotropy for every genetic variant. For example, the weighted median method only requires that at least 50% of the genetic variants are valid instruments.
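
To show why a median-based estimate tolerates some invalid instruments, the sketch below compares an inverse-variance weighted average of SNP-specific ratio estimates with a weighted median. All summary statistics are invented for illustration: 5 valid SNPs share a true ratio of 0.5, and 2 SNPs carry an added pleiotropic effect. The function `weighted_median` is a simplified, hypothetical implementation that interpolates between the cumulative weights of the sorted estimates; standard errors, usually obtained by bootstrap, are omitted. Note that the weighted version of the method is usually stated in terms of at least half of the total weight coming from valid instruments.

```python
def weighted_median(estimates, weights):
    """Weighted median with linear interpolation between sorted estimates."""
    pairs = sorted(zip(estimates, weights))
    total = sum(w for _, w in pairs)
    points, cumulative = [], 0.0
    for b, w in pairs:
        # Place each estimate at the midpoint of its share of the total weight
        points.append((b, (cumulative + w / 2) / total))
        cumulative += w
    if points[0][1] >= 0.5:
        return points[0][0]
    for (b_lo, p_lo), (b_hi, p_hi) in zip(points, points[1:]):
        if p_lo <= 0.5 <= p_hi:
            return b_lo + (b_hi - b_lo) * (0.5 - p_lo) / (p_hi - p_lo)
    return points[-1][0]

# Hypothetical summary statistics for 7 SNPs
beta_exposure = [0.10, 0.12, 0.08, 0.11, 0.09, 0.10, 0.12]
pleiotropy = [0.0, 0.0, 0.0, 0.0, 0.0, 0.05, 0.05]   # last 2 SNPs are invalid
se_outcome = [0.01] * 7
beta_outcome = [0.5 * bx + p for bx, p in zip(beta_exposure, pleiotropy)]

ratios = [by / bx for by, bx in zip(beta_outcome, beta_exposure)]
weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]

ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
wmedian = weighted_median(ratios, weights)

print(f"IVW: {ivw:.3f}, weighted median: {wmedian:.3f}, truth: 0.500")
```

In this constructed example the weighted average is pulled toward the 2 pleiotropic SNPs, whereas the weighted median stays at the value shared by the valid ones.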

Two-sample MR analyses are easy to carry out with publicly available databases; however, the validity of any findings relies crucially on the 3 assumptions needed for IV analyses. In particular, the assumptions about pleiotropy made by the chosen methods must hold. Xia et al tested, to the extent possible, whether the exclusion restriction is met, and adjusted the significance level to account for examining 5 different (though related) infection outcomes. It is exciting to see modern methods of causal inference gaining traction within the journal, applied with care and with clear and direct statements of assumptions and data requirements (in this case, making explicit use of the Strengthening the Reporting of Observational Studies in Epidemiology using Mendelian Randomization [STROBE-MR] checklist [6]; an alternative tool is the critical appraisal checklist of Davies et al [7]). As with any nonexperimental study, caution is warranted in interpreting the findings, which should be viewed as contributing to the body of evidence on a particular causal relationship rather than as a definitive answer.
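
As an illustration of adjusting the significance level for 5 outcomes, the sketch below applies a Bonferroni correction. This is one common choice and is shown only as an assumption; the specific adjustment used in [1] is not restated here. The outcome labels and P values are hypothetical.

```python
alpha = 0.05
n_outcomes = 5

# Bonferroni: divide the overall significance level by the number of tests
threshold = alpha / n_outcomes

# Hypothetical P values, one per infection outcome
p_values = {
    "outcome_1": 0.004,
    "outcome_2": 0.030,
    "outcome_3": 0.200,
    "outcome_4": 0.008,
    "outcome_5": 0.650,
}

significant = {name: p < threshold for name, p in p_values.items()}
for name, flag in significant.items():
    print(f"{name}: P = {p_values[name]:.3f}, significant after adjustment: {flag}")
```

Because the 5 outcomes are related, a Bonferroni threshold is conservative; it controls the chance of any false-positive finding at the cost of some power.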

Notes

Financial support. E. E. M. M. acknowledges funding from a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada; is supported by a Chercheur de mérite career award from the Fonds de recherche du Québec, Santé; and holds a Canada Research Chair in Statistical Methods for Precision Medicine from the Canadian Institutes of Health Research.

References

1. Xia M, Wu Q, Wang Y, Peng Y, Qian C. Associations between ticagrelor use and the risk of infections: a Mendelian randomization study. J Infect Dis 2024; https://doi.org/10.1093/infdis/jiae177.

2. Moodie EEM. Causal inference and confounding: a primer for interpreting and conducting infectious disease research. J Infect Dis 2023; 228:365–7.

3. Goetghebeur E, le Cessie S, De Stavola B, Moodie EEM, Waernbaum I. Tutorial: formulating causal questions and principled statistical answers. Stat Med 2020; 39:4922–48.

4. Lawlor DA. Commentary: two-sample Mendelian randomization: opportunities and challenges. Int J Epidemiol 2016; 45:908–15.

5. Hartwig FP, Davies NM, Hemani G, Davey Smith G. Two-sample Mendelian randomization: avoiding the downsides of a powerful, widely applicable but potentially fallible technique. Int J Epidemiol 2016; 45:1717–26.

6. Skrivankova VW, Richmond RC, Woolf BAR, et al. Strengthening the Reporting of Observational Studies in Epidemiology using Mendelian Randomization: the STROBE-MR statement. J Am Med Assoc 2021; 326:1614–21.

7. Davies NM, Holmes MV, Davey Smith G. Reading Mendelian randomisation studies: a guide, glossary, and checklist for clinicians. Br Med J 2018; 362:k601.

Author notes

Potential conflicts of interest. The authors: No reported conflicts.

Both authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/pages/standard-publication-reuse-rights)