2018 Dec 21;7:e36163. doi: 10.7554/eLife.36163.

Why we need to report more than 'Data were Analyzed by t-tests or ANOVA'



Tracey L Weissgerber et al. eLife.

Abstract

Transparent reporting is essential for the critical evaluation of studies. However, the reporting of statistical methods for studies in the biomedical sciences is often limited. This systematic review examines the quality of reporting for two statistical tests, t-tests and ANOVA, for papers published in a selection of physiology journals in June 2017. Of the 328 original research articles examined, 277 (84.5%) included an ANOVA or t-test or both. However, papers in our sample were routinely missing essential information about both types of tests: 213 papers (95% of the papers that used ANOVA) did not contain the information needed to determine what type of ANOVA was performed, and 26.7% of papers did not specify what post-hoc test was performed. Most papers also omitted the information needed to verify ANOVA results. Essential information about t-tests was also missing in many papers. We conclude by discussing measures that could be taken to improve the quality of reporting.

Keywords: analysis of variance; human biology; medicine; meta-research; statistics; systematic review; t-test; transparency.


Conflict of interest statement

TW, OG, VG, NM, SW: No competing interests declared.

Figures

Figure 1. Systematic review flow chart.
The flow chart illustrates the selection of articles for inclusion in this analysis at each stage of the screening process.
Figure 2. Many papers lack the information needed to determine what type of ANOVA was performed.
The figure illustrates the proportion of papers in our sample that reported information needed to determine what type of ANOVA was performed, including the number of factors, the names of factors, and the type of post-hoc tests. The top panel presents the proportion of all papers that included ANOVA (n = 225). 'Sometimes' indicates that the information was reported for some ANOVAs but not others. The bottom row examines the proportion of papers that specified whether each factor was between- vs. within-subjects. Papers are subdivided into those that reported using repeated measures ANOVA (n = 41) and those that did not report using repeated measures ANOVA (n = 184). RM: repeated measures.
Figure 3. Why it matters whether investigators use a one-way vs two-way ANOVA for a study design with two factors.
The two-way ANOVA allows investigators to determine how much of the variability explained by the model is attributed to the first factor, the second factor, and the interaction between the two factors. When a one-way ANOVA is used for a study with two factors, this information is missed because all variability explained by the model is assigned to a single factor. We cannot determine how much variability is explained by each of the two factors, or test for an interaction. The simulated dataset includes four groups – wild-type mice receiving placebo (closed blue circles), wild-type mice receiving an experimental drug (open blue circles), knockout mice receiving placebo (closed red circles) and knockout mice receiving an experimental drug (open red circles). The same dataset was used for all four examples, except that means for particular groups were shifted to show a main effect of strain, a main effect of treatment, an interaction between strain and treatment, or no main effects and no interaction. One- and two-way (strain × treatment) ANOVAs were applied to illustrate differences between how these two tests interpret the variability explained by the model.
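The split of explained variability described in this caption can be checked numerically. Below is a minimal pure-Python sketch using an invented, balanced 2 × 2 dataset (all values are hypothetical, not taken from the paper's simulation). In a balanced design, the between-groups sum of squares that a one-way ANOVA assigns to its single four-level factor decomposes exactly into strain, treatment, and interaction components.

```python
from statistics import mean

# Hypothetical 2 x 2 dataset (strain x treatment), n = 4 mice per cell.
# All values invented for illustration.
cells = {
    ("WT", "placebo"): [10.1, 9.8, 10.4, 9.7],
    ("WT", "drug"):    [12.0, 11.6, 12.3, 11.9],
    ("KO", "placebo"): [10.2, 9.9, 10.5, 10.0],
    ("KO", "drug"):    [14.1, 13.8, 14.4, 13.9],
}

n = 4  # observations per cell (balanced design)
grand = mean(v for vals in cells.values() for v in vals)

def level_mean(factor_index, level):
    """Mean of all observations at one level of one factor."""
    return mean(v for key, vals in cells.items()
                if key[factor_index] == level for v in vals)

strains = ["WT", "KO"]
treatments = ["placebo", "drug"]

# Main-effect sums of squares: deviations of level means from the grand
# mean, weighted by the number of observations per level (2 cells x n).
ss_strain = 2 * n * sum((level_mean(0, s) - grand) ** 2 for s in strains)
ss_treat  = 2 * n * sum((level_mean(1, t) - grand) ** 2 for t in treatments)

# Between-groups sum of squares over the four cell means -- this is all
# the variability a one-way ANOVA assigns to its single factor.
ss_between = n * sum((mean(vals) - grand) ** 2 for vals in cells.values())

# In a balanced design, the interaction is what remains after the two
# main effects are removed from the between-groups variability.
ss_interaction = ss_between - ss_strain - ss_treat

print(f"SS strain      = {ss_strain:.4f}")
print(f"SS treatment   = {ss_treat:.4f}")
print(f"SS interaction = {ss_interaction:.4f}")
print(f"SS between     = {ss_between:.4f}  (one-way lumps all three together)")
```

The one-way between-groups term equals the sum of the three two-way components, illustrating the caption's point: the one-way analysis sees the same explained variability but cannot say how much belongs to strain, treatment, or their interaction.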
Figure 4. Additional implications of using a one-way vs two-way ANOVA.
This figure compares key features of one- and two-way ANOVAs to illustrate potential problems with using a one-way ANOVA for a design with two or more factors. When used for a study with two factors, the one-way ANOVA incorrectly assumes that the groups are unrelated, generates a single p-value that does not provide information about which groups are different, and does not test for interactions. The two-way ANOVA correctly interprets the study design, which can increase power. The two-way ANOVA also allows for the generation of a set of p-values that provide more information about which groups may be different, can test for interactions, and may eliminate the need for unnecessary post-hoc comparisons. This figure uses an experimental design with four groups (wild-type mice receiving placebo, wild-type mice receiving an experimental drug, knockout mice receiving placebo and knockout mice receiving an experimental drug). See Figure 3 for a detailed explanation of the material in the statistical implications section. KO: knockout; WT: wild-type; Pla: placebo.
Figure 5. Why it matters whether investigators used an ANOVA with vs. without repeated measures.
This figure highlights the differences between ANOVA with vs. without repeated measures and illustrates the problems with using an ANOVA without repeated measures when the study design includes longitudinal or non-independent measurements. These two tests interpret the data differently, test different hypotheses, use information differently when calculating the test statistic, and give different results.
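The "different results" point can be made concrete with a small pure-Python sketch (all data invented for illustration): four hypothetical subjects measured at three time points, where subjects differ substantially from one another but each shows a similar upward trend. A repeated-measures ANOVA removes the between-subject variability before forming its error term; an ANOVA without repeated measures leaves that variability in the error term, where it can swamp the time effect.

```python
from statistics import mean

# Hypothetical longitudinal data: 4 subjects x 3 time points.
# Large between-subject differences, consistent within-subject trend.
data = [
    [10, 12, 14],   # subject 1
    [20, 21, 24],   # subject 2
    [30, 33, 34],   # subject 3
    [40, 42, 45],   # subject 4
]
n_sub, n_time = len(data), len(data[0])

grand = mean(v for row in data for v in row)
time_means = [mean(row[t] for row in data) for t in range(n_time)]
sub_means = [mean(row) for row in data]

ss_total = sum((v - grand) ** 2 for row in data for v in row)
ss_time = n_sub * sum((m - grand) ** 2 for m in time_means)
ss_sub = n_time * sum((m - grand) ** 2 for m in sub_means)

# Repeated-measures ANOVA: subject-to-subject differences are removed
# from the error term before testing the time effect.
ss_error_rm = ss_total - ss_sub - ss_time
f_rm = (ss_time / (n_time - 1)) / (ss_error_rm / ((n_time - 1) * (n_sub - 1)))

# ANOVA without repeated measures: subject differences stay in the
# error term, inflating it.
ss_within = ss_total - ss_time
f_ind = (ss_time / (n_time - 1)) / (ss_within / (n_sub * n_time - n_time))

print(f"F with repeated measures:     {f_rm:.2f}")
print(f"F ignoring repeated measures: {f_ind:.2f}")
```

With this invented dataset the repeated-measures F is large while the independent-groups F is below 1: the two tests interpret the same numbers very differently, which is why readers need to know which one was used.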
Figure 6. Why papers need to contain sufficient detail to confirm that the appropriate t-test was used.
This figure highlights the differences between unpaired and paired t-tests by illustrating how these tests interpret the data differently, test different hypotheses, use information differently when calculating the test statistic, and give different results. If the wrong t-test is used, the result may be misleading because the test will make incorrect assumptions about the experimental design and may test the wrong hypothesis. Without the original data, it is very difficult to determine what the result should have been (see Figure 7).
Figure 7. Differences between the results of statistical tests depend on the data.
The three datasets use different pairings of the values shown in the dot plot on the left. The comments on the right side of the figure illustrate what happens when an unpaired t-test is inappropriately used to compare paired, or related, measurements. We expect paired data to be positively correlated – two paired observations are usually more similar than two unrelated observations. The strength of this correlation will vary. We expect observations from the same participant to be more similar (strongly correlated) than observations from pairs of participants matched for age and sex. Stronger correlations result in greater discrepancies between the results of the paired and unpaired t-tests. Very strong correlations between paired data are unusual but are presented here to illustrate this relationship. We do not expect paired data to be negatively correlated – if this happens it is important to review the experimental design and data to ensure that everything is correct.
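The effect of correlation on the two tests can be sketched in a few lines of pure Python. The data below are invented for illustration: five hypothetical paired measurements (e.g. before/after treatment in the same participants) chosen so that pairs are strongly positively correlated. The paired test analyzes the within-pair differences; the unpaired test treats the columns as unrelated groups, so the large between-subject spread inflates its standard error.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired measurements in five participants (values invented).
before = [5, 8, 11, 14, 17]
after  = [6, 9, 13, 15, 19]
n = len(before)

# Paired t-test: test whether the mean within-pair difference is zero.
diffs = [a - b for a, b in zip(after, before)]
t_paired = mean(diffs) / (stdev(diffs) / sqrt(n))

# Unpaired (Student's) t-test with equal group sizes: pooled variance
# includes the large between-subject spread that pairing would remove.
sp2 = (stdev(before) ** 2 + stdev(after) ** 2) / 2
t_unpaired = (mean(after) - mean(before)) / sqrt(sp2 * 2 / n)

print(f"paired t   = {t_paired:.2f}")   # large: consistent within-pair change
print(f"unpaired t = {t_unpaired:.2f}") # small: pairing information discarded
```

Here the paired t-statistic is more than ten times the unpaired one for identical numbers, mirroring the figure's point that the discrepancy between the two tests grows with the strength of the correlation between paired observations.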
Figure 8. Few papers report the details needed to confirm that the result of the ANOVA was correct.
This figure reports the proportion of papers with ANOVAs (n = 225) that reported the F-statistic, degrees of freedom and exact p-values. 'Sometimes' indicates that the information was reported for some ANOVAs contained in the paper but not for others.
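When the F-statistic, degrees of freedom and group summary statistics are all reported, a reader can reconstruct a one-way ANOVA result and check it. The sketch below uses entirely hypothetical numbers (three groups of ten, with means and SDs of the kind a paper might report) to show the arithmetic of that verification.

```python
# Hypothetical reported summaries for a one-way ANOVA: three groups,
# each n = 10, with these means and sample SDs (all values invented).
means = [5.0, 6.0, 7.5]
sds = [1.2, 1.1, 1.3]
ns = [10, 10, 10]

k = len(means)
n_total = sum(ns)
grand = sum(n * m for n, m in zip(ns, means)) / n_total

# Reconstruct the sums of squares from the reported summaries alone.
ss_between = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
ss_within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))

# Degrees of freedom follow directly from the design, so a reader can
# also check that the reported df match the number of groups and animals.
df_between, df_within = k - 1, n_total - k
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A paper reporting this hypothetical analysis should give F with 2 and 27 degrees of freedom and a matching value; a mismatch in either the F-statistic or the degrees of freedom signals a problem. None of this checking is possible when, as Figure 8 shows, papers omit these details.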
