Thinking twice about sum scores

D McNeish, MG Wolf - Behavior Research Methods, 2020 - Springer
Abstract
A common way to form scores from multiple-item scales is to sum responses of all items. Though sum scoring is often contrasted with factor analysis as a competing method, we review how factor analysis and sum scoring both fall under the larger umbrella of latent variable models, with sum scoring being a constrained version of a factor analysis. Despite similarities, reporting of psychometric properties for sum scored or factor analyzed scales is quite different. Further, if researchers use factor analysis to validate a scale but subsequently sum score the scale, this employs a model that differs from the validation model. By framing sum scoring within a latent variable framework, our goal is to raise awareness that (a) sum scoring requires rather strict constraints, (b) imposing these constraints requires the same type of justification as any other latent variable model, and (c) sum scoring corresponds to a statistical model and is not a model-free arithmetic calculation. We discuss how unjustified sum scoring can have adverse effects on validity, reliability, and qualitative classification from sum score cut-offs. We also discuss considerations for how to use scale scores in subsequent analyses and how these choices can alter conclusions. The general goal is to encourage researchers to more critically evaluate how they obtain, justify, and use multiple-item scale scores.
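The claim that sum scoring is a constrained model rather than a model-free calculation can be illustrated with a minimal sketch. It treats a scale score as a weighted composite of item responses, with sum scoring as the special case where every item weight is fixed to 1. The item weights, the responses, and the helper `weighted_score` are hypothetical and chosen only for illustration; they are not taken from the article. Real factor score estimation derives its weights from a fitted model, not from values picked by hand.

```python
# Sketch: sum scoring as a weighted composite with all weights fixed to 1.
# All numbers below are hypothetical and purely illustrative.

UNIT_WEIGHTS = [1.0, 1.0, 1.0, 1.0]      # the equal-weight constraint implied by sum scoring
UNEQUAL_WEIGHTS = [0.9, 0.7, 0.4, 0.2]   # assumed weights from a model where items differ


def weighted_score(responses, weights):
    """Return the weighted composite: the sum of weight * response over items."""
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    return sum(w * x for w, x in zip(weights, responses))


# Two hypothetical respondents on a 4-item scale with different response patterns.
person_a = [4, 4, 1, 1]
person_b = [1, 1, 4, 4]

# Under unit weights (sum scoring) the two respondents are indistinguishable.
sum_a = weighted_score(person_a, UNIT_WEIGHTS)
sum_b = weighted_score(person_b, UNIT_WEIGHTS)

# Under unequal weights the same response patterns yield different scores.
w_a = weighted_score(person_a, UNEQUAL_WEIGHTS)
w_b = weighted_score(person_b, UNEQUAL_WEIGHTS)

if __name__ == "__main__":
    print("sum scores:     ", sum_a, sum_b)
    print("weighted scores:", w_a, w_b)
```

In this sketch, both respondents receive the same sum score, yet their weighted scores differ once the items are allowed to contribute unequally. Whether the equal-weight constraint is appropriate for a given scale is an empirical question that the sketch does not answer.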