Abstract
To facilitate comparisons across follow-up studies that have used different measures of effect size, we provide a table of effect size equivalencies for the three most common measures: ROC area (AUC), Cohen's d, and r. We outline why AUC is the preferred measure of predictive or diagnostic accuracy in forensic psychology or psychiatry, and we urge researchers and practitioners to use numbers rather than verbal labels to characterize effect sizes.
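The kind of equivalence the abstract describes can be sketched with the standard textbook conversions among the three measures. This is a minimal illustration, not necessarily the procedure used to build the article's table. It assumes normally distributed scores with equal variances in the two outcome groups, and it treats the base rate of the outcome as a parameter (0.5 by default). The function names are hypothetical.

```python
import math
from statistics import NormalDist

# Assumption: scores are normal with equal variance in both outcome groups.
# Function names are illustrative and do not come from the article.

def d_to_auc(d: float) -> float:
    """AUC implied by Cohen's d: Phi(d / sqrt(2))."""
    return NormalDist().cdf(d / math.sqrt(2))

def auc_to_d(auc: float) -> float:
    """Inverse of d_to_auc; auc must lie strictly between 0 and 1."""
    return math.sqrt(2) * NormalDist().inv_cdf(auc)

def d_to_r(d: float, base_rate: float = 0.5) -> float:
    """Point-biserial r implied by d when a proportion base_rate has the outcome."""
    pq = base_rate * (1.0 - base_rate)
    return d / math.sqrt(d * d + 1.0 / pq)

def r_to_d(r: float, base_rate: float = 0.5) -> float:
    """Inverse of d_to_r for the same base rate; requires |r| < 1."""
    pq = base_rate * (1.0 - base_rate)
    return r / math.sqrt(pq * (1.0 - r * r))

if __name__ == "__main__":
    # Print a few rows of equivalent values at a 50% base rate.
    for d in (0.2, 0.5, 0.8):
        print(f"d={d:.2f}  AUC={d_to_auc(d):.3f}  r={d_to_r(d):.3f}")
```

Because r depends on the base rate while AUC does not, the same d maps to a smaller r as the base rate moves away from 0.5 under these assumptions.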
Additional information
Strictly speaking, d values pertain only to variables scored on an interval scale. When the nondichotomous variable is ordinally scaled, r or AUC should be used. Nevertheless, the values in Table 1 allow one to compare the relative magnitudes of effects across studies that have reported any of the three effect size measures.
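To illustrate why AUC remains usable with ordinal scores, the sketch below computes it directly from pairwise comparisons: the proportion of (positive case, negative case) pairs in which the positive case scores higher, with ties counted as one half. Only the ordering of the scores matters, so no interval-scale assumption is needed. The function name and the example data are hypothetical.

```python
from typing import Sequence

def auc_from_scores(positives: Sequence[float], negatives: Sequence[float]) -> float:
    """Nonparametric AUC: probability that a randomly chosen positive case
    outscores a randomly chosen negative case, counting ties as 0.5."""
    if not positives or not negatives:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

if __name__ == "__main__":
    # Made-up ordinal risk ratings for cases with and without the outcome.
    with_outcome = [3, 4, 5]
    without_outcome = [1, 2, 3]
    print(auc_from_scores(with_outcome, without_outcome))
```

Any order-preserving rescaling of the scores (for example, squaring positive ratings) leaves this value unchanged, which is not true of d.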
About this article
Cite this article
Rice, M.E., Harris, G.T. Comparing Effect Sizes in Follow-Up Studies: ROC Area, Cohen's d, and r. Law Hum Behav 29, 615–620 (2005). https://doi.org/10.1007/s10979-005-6832-7