-
Understanding and avoiding the "weights of regression": Heterogeneous effects, misspecification, and longstanding solutions
Authors:
Chad Hazlett,
Tanvi Shinkre
Abstract:
Researchers in many fields endeavor to estimate treatment effects by regressing outcome data (Y) on a treatment (D) and observed confounders (X). Even absent unobserved confounding, the regression coefficient on the treatment reports a weighted average of strata-specific treatment effects (Angrist, 1998). Where heterogeneous treatment effects cannot be ruled out, the resulting coefficient is thus not generally equal to the average treatment effect (ATE), and is unlikely to be the quantity of direct scientific or policy interest. The difference between the coefficient and the ATE has led researchers to propose various interpretational, bounding, and diagnostic aids (Humphreys, 2009; Aronow and Samii, 2016; Sloczynski, 2022; Chattopadhyay and Zubizarreta, 2023). We note that the linear regression of Y on D and X can be misspecified when the treatment effect is heterogeneous in X. The "weights of regression", for which we provide a new (more general) expression, simply characterize how the OLS coefficient will depart from the ATE under the misspecification resulting from unmodeled treatment effect heterogeneity. Consequently, a natural alternative to suffering these weights is to address the misspecification that gives rise to them. For investigators committed to linear approaches, we propose relying on the slightly weaker assumption that the potential outcomes are linear in X. Numerous well-known estimators are unbiased for the ATE under this assumption, namely regression-imputation/g-computation/T-learner, regression with an interaction of the treatment and covariates (Lin, 2013), and balancing weights. Any of these approaches avoids the apparent weighting problem of the misspecified linear regression, at an efficiency cost that will be small when there are few covariates relative to sample size. We demonstrate these lessons using simulations in observational and experimental settings.
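The contrast described in this abstract can be illustrated with a minimal simulation sketch. It is not the authors' code: the data-generating process, the function names (`simulate`, `plain_ols_coef`, `interacted_coef`), and all numeric values are assumptions chosen for illustration. There is one binary covariate, the stratum effects are 1 and 4 (so the ATE is 2.5), and the treatment probability differs by stratum. With strata weighted in proportion to P(X=x)·Var(D|X=x), as in the Angrist (1998) result, the plain OLS coefficient should land near 0.305/0.17 ≈ 1.79, while the regression with a treatment-by-centered-covariate interaction (Lin, 2013) should recover roughly 2.5.

```python
import numpy as np

def ols(design, y):
    """Least-squares coefficients for y regressed on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def simulate(n=200_000, seed=0):
    """Hypothetical data: binary X, heterogeneous effect, no hidden confounding."""
    rng = np.random.default_rng(seed)
    x = rng.binomial(1, 0.5, size=n)           # two equally sized strata
    p_treat = np.where(x == 1, 0.9, 0.5)       # treatment probability varies by stratum
    d = rng.binomial(1, p_treat)
    tau = np.where(x == 1, 4.0, 1.0)           # stratum effects, so ATE = 2.5
    y = 1.0 + 2.0 * x + tau * d + rng.normal(size=n)
    return x, d, y

def plain_ols_coef(x, d, y):
    """Coefficient on D from the regression of Y on D and X (no interaction)."""
    design = np.column_stack([np.ones_like(y), d, x])
    return ols(design, y)[1]

def interacted_coef(x, d, y):
    """Coefficient on D when D is interacted with mean-centered X."""
    xc = x - x.mean()
    design = np.column_stack([np.ones_like(y), d, xc, d * xc])
    return ols(design, y)[1]

x, d, y = simulate()
plain = plain_ols_coef(x, d, y)
interacted = interacted_coef(x, d, y)
print(f"plain OLS coefficient:  {plain:.3f}")
print(f"interacted coefficient: {interacted:.3f}")
```

Centering X before forming the interaction is what makes the coefficient on D readable as an ATE estimate; without centering, the same coefficient would estimate the effect in the X = 0 stratum only.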
Submitted 5 March, 2024;
originally announced March 2024.
-
Causal progress with imperfect placebo treatments and outcomes
Authors:
Adam Rohde,
Chad Hazlett
Abstract:
In the quest to make defensible causal claims from observational data, it is sometimes possible to leverage information from "placebo treatments" and "placebo outcomes" (or "negative outcome controls"). Existing approaches employing such information focus largely on point identification and assume (i) "perfect placebos", meaning placebo treatments have precisely zero effect on the outcome and the real treatment has precisely zero effect on a placebo outcome; and (ii) "equiconfounding", meaning that the treatment-outcome relationship where one is a placebo suffers the same amount of confounding as does the real treatment-outcome relationship, on some scale. We instead consider an omitted variable bias framework, in which users can postulate non-zero effects of placebo treatment on real outcomes or of real treatments on placebo outcomes, and the relative strengths of confounding suffered by a placebo treatment/outcome compared to the true treatment-outcome relationship. Once postulated, these assumptions identify or bound the linear estimates of treatment effects. While applicable in many settings, one ubiquitous use-case for this approach is to employ pre-treatment outcomes as (perfect) placebo outcomes. In this setting, the parallel trends assumption of difference-in-difference is in fact a strict equiconfounding assumption on a particular scale, which can be relaxed in our framework. Finally, we demonstrate the use of our framework with two applications, employing an R package that implements these approaches.
Submitted 23 October, 2023;
originally announced October 2023.
-
Real Effect or Bias? Best Practices for Evaluating the Robustness of Real-World Evidence through Quantitative Sensitivity Analysis for Unmeasured Confounding
Authors:
Douglas Faries,
Chenyin Gao,
Xiang Zhang,
Chad Hazlett,
James Stamey,
Shu Yang,
Peng Ding,
Mingyang Shan,
Kristin Sheffield,
Nancy Dreyer
Abstract:
The assumption of no unmeasured confounders is a critical but unverifiable assumption required for causal inference, yet quantitative sensitivity analyses to assess the robustness of real-world evidence remain underutilized. The lack of use is likely due in part to the complexity of implementation and to the often specific and restrictive data requirements of each method. With the advent of sensitivity analysis methods that are broadly applicable, in that they do not require identification of a specific unmeasured confounder, along with publicly available code for implementation, roadblocks toward broader use are decreasing. To spur greater application, here we present best practice guidance to address the potential for unmeasured confounding at both the design and analysis stages, including a set of framing questions and an analytic toolbox for researchers. The questions at the design stage guide the researcher through steps evaluating the potential robustness of the design while encouraging the gathering of additional data to reduce uncertainty due to potential confounding. At the analysis stage, the questions guide researchers in quantifying the robustness of the observed result, giving them a clearer indication of the robustness of their conclusions. We demonstrate the application of the guidance using simulated data based on a real-world fibromyalgia study, applying multiple methods from our analytic toolbox for illustration purposes.
Submitted 13 September, 2023;
originally announced September 2023.
-
Finding quadruply imaged quasars with machine learning. I. Methods
Authors:
A. Akhazhanov,
A. More,
A. Amini,
C. Hazlett,
T. Treu,
S. Birrer,
A. Shajib,
P. Schechter,
C. Lemon,
B. Nord,
M. Aguena,
S. Allam,
F. Andrade-Oliveira,
J. Annis,
D. Brooks,
E. Buckley-Geer,
D. L. Burke,
A. Carnero Rosell,
M. Carrasco Kind,
J. Carretero,
A. Choi,
C. Conselice,
M. Costanzi,
L. N. da Costa,
M. E. S. Pereira, et al. (46 additional authors not shown)
Abstract:
Strongly lensed quadruply imaged quasars (quads) are extraordinary objects. They are very rare in the sky -- only a few tens are known to date -- and yet they provide unique information about a wide range of topics, including the expansion history and the composition of the Universe, the distribution of stars and dark matter in galaxies, the host galaxies of quasars, and the stellar initial mass function. Finding them in astronomical images is a classic "needle in a haystack" problem, as they are outnumbered by other (contaminant) sources by many orders of magnitude. To solve this problem, we develop state-of-the-art deep learning methods and train them on realistic simulated quads based on real images of galaxies taken from the Dark Energy Survey, with realistic source and deflector models, including the chromatic effects of microlensing. The performance of the best methods on a mixture of simulated and real objects is excellent, yielding area under the receiver operating curve in the range 0.86 to 0.89. Recall is close to 100% down to total magnitude i~21 indicating high completeness, while precision declines from 85% to 70% in the range i~17-21. The methods are extremely fast: training on 2 million samples takes 20 hours on a GPU machine, and 10^8 multi-band cutouts can be evaluated per GPU-hour. The speed and performance of the method pave the way to apply it to large samples of astronomical sources, bypassing the need for photometric pre-selection that is likely to be a major cause of incompleteness in current samples of known quads.
Submitted 20 September, 2021;
originally announced September 2021.
-
Kpop: A kernel balancing approach for reducing specification assumptions in survey weighting
Authors:
Erin Hartman,
Chad Hazlett,
Ciara Sterbenz
Abstract:
With the precipitous decline in response rates, researchers and pollsters have been left with highly non-representative samples, relying on constructed weights to make these samples representative of the desired target population. Though practitioners employ valuable expert knowledge to choose what variables, $X$, must be adjusted for, they rarely defend particular functional forms relating these variables to the response process or the outcome. Unfortunately, commonly used calibration weights -- which make the weighted mean of $X$ in the sample equal that of the population -- only ensure correct adjustment when the portions of the outcome and of the response process left unexplained by linear functions of $X$ are independent. To alleviate this functional form dependency, we describe kernel balancing for population weighting (kpop). This approach replaces the design matrix $\mathbf{X}$ with a kernel matrix, $\mathbf{K}$, encoding high-order information about $\mathbf{X}$. Weights are then found to make the weighted average row of $\mathbf{K}$ among sampled units approximately equal that of the target population. This produces good calibration on a wide range of smooth functions of $X$, without relying on the user to decide which $X$ or what functions of them to include. We describe the method and illustrate it by application to polling data from the 2016 U.S. presidential election.
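The core idea in this abstract, replacing $\mathbf{X}$ with a kernel matrix and choosing weights so that the sample's weighted average row matches the population's, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the kpop implementation: the Gaussian kernel, the bandwidth default, the use of the leading six left singular vectors of $\mathbf{K}$ as balancing features, the entropy-balancing solver, and every function name are hypothetical choices made here. The toy data over-sample units with large |x|, a response pattern that depends on x only through a nonlinear function.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Kernel matrix with entries exp(-||a_i - b_j||^2 / bandwidth)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / bandwidth)

def tilted_weights(z, lam):
    """Weights proportional to exp(z @ lam), normalized to sum to one."""
    s = z @ lam
    w = np.exp(s - s.max())                    # subtract the max for stability
    return w / w.sum()

def entropy_balance(features, target, max_iter=200, tol=1e-8):
    """Exponential-tilting weights whose weighted feature means equal target.
    Solved by damped Newton steps on the convex dual (an assumed solver)."""
    z = features - target                      # goal: weighted mean of z is zero
    lam = np.zeros(z.shape[1])

    def dual(l):
        s = z @ l
        return s.max() + np.log(np.exp(s - s.max()).sum())

    for _ in range(max_iter):
        w = tilted_weights(z, lam)
        grad = w @ z                           # current weighted imbalance
        if np.abs(grad).max() < tol:
            break
        hess = (z * w[:, None]).T @ z - np.outer(grad, grad)
        step = np.linalg.solve(hess + 1e-10 * np.eye(lam.size), grad)
        t = 1.0                                # backtracking line search
        while t > 1e-8 and dual(lam - t * step) > dual(lam) - 1e-4 * t * (grad @ step):
            t /= 2.0
        lam = lam - t * step
    return tilted_weights(z, lam)

def kernel_population_weights(x_sample, x_pop, n_components=6, bandwidth=None):
    """Balance leading singular vectors of K between sample and population."""
    if bandwidth is None:
        bandwidth = 2.0 * x_sample.shape[1]    # assumed default, not from the paper
    x_all = np.vstack([x_sample, x_pop])
    k = gaussian_kernel(x_all, x_sample, bandwidth)   # columns: sampled units
    u, _, _ = np.linalg.svd(k, full_matrices=False)
    feats = u[:, :n_components] * np.sqrt(len(x_all)) # rescale entries to O(1)
    n_s = len(x_sample)
    return entropy_balance(feats[:n_s], feats[n_s:].mean(axis=0))

# Toy data: units with |x| > 1 are four times as likely to respond.
rng = np.random.default_rng(1)
x_pop = rng.normal(size=(1000, 1))
candidates = rng.normal(size=(2000, 1))
responds = rng.random(2000) < np.where(np.abs(candidates[:, 0]) > 1.0, 0.8, 0.2)
x_sample = candidates[responds]

weights = kernel_population_weights(x_sample, x_pop)
pop_mean_sq = (x_pop[:, 0] ** 2).mean()
gap_raw = abs((x_sample[:, 0] ** 2).mean() - pop_mean_sq)
gap_kernel = abs(weights @ x_sample[:, 0] ** 2 - pop_mean_sq)
print(f"gap in mean of x^2, unweighted:     {gap_raw:.3f}")
print(f"gap in mean of x^2, kernel weights: {gap_kernel:.3f}")
```

The check at the end compares the mean of x², a function the user never supplied, between the weighted sample and the population. Balance on such functions is only approximate, and its quality depends on the assumed bandwidth and number of components.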
Submitted 2 March, 2024; v1 submitted 16 July, 2021;
originally announced July 2021.
-
Kernel Balancing: A flexible non-parametric weighting procedure for estimating causal effects
Authors:
Chad Hazlett
Abstract:
In the absence of unobserved confounders, matching and weighting methods are widely used to estimate causal quantities including the Average Treatment Effect on the Treated (ATT). Unfortunately, these methods do not necessarily achieve their goal of making the multivariate distribution of covariates for the control group identical to that of the treated, leaving some (potentially multivariate) functions of the covariates with different means between the two groups. When these "imbalanced" functions influence the non-treatment potential outcome, the conditioning on observed covariates fails, and ATT estimates may be biased. Kernel balancing, introduced here, targets a weaker requirement for unbiased ATT estimation, specifically, that the expected non-treatment potential outcome is equal for the treated and control groups. The conditional expectation of the non-treatment potential outcome is assumed to fall in the space of functions associated with a choice of kernel, implying a set of basis functions in which this regression surface is linear. Weights are then chosen on the control units such that the treated and control groups have equal means on these basis functions. As a result, the expectation of the non-treatment potential outcome must also be equal for the treated and control groups after weighting, allowing unbiased ATT estimation by subsequent difference in means or an outcome model using these weights. Moreover, the weights produced are (1) precisely those that equalize a particular kernel-based approximation of the multivariate distribution of covariates for the treated and control, and (2) equivalent to a form of stabilized inverse propensity score weighting, though the approach does not require assuming any model of the treatment assignment mechanism. An R package, KBAL, is provided to implement this approach.
Submitted 30 April, 2016;
originally announced May 2016.