Proc Natl Acad Sci U S A. 2013 Nov 26;110(48):19313-7.

doi: 10.1073/pnas.1313476110. Epub 2013 Nov 11.

Revised standards for statistical evidence

Valen E Johnson¹

PMID: 24218581
PMCID: PMC3845140
DOI: 10.1073/pnas.1313476110

Valen E Johnson. Proc Natl Acad Sci U S A. 2013.

Affiliation

¹ Department of Statistics, Texas A&M University, College Station, TX 77843-3143.

Abstract

Recent advances in Bayesian hypothesis testing have led to the development of uniformly most powerful Bayesian tests, which represent an objective, default class of Bayesian hypothesis tests that have the same rejection regions as classical significance tests. Based on the correspondence between these two classes of tests, it is possible to equate the size of classical hypothesis tests with evidence thresholds in Bayesian tests, and to equate P values with Bayes factors. An examination of these connections suggests that recent concerns over the lack of reproducibility of scientific studies can be attributed largely to the conduct of significance tests at unjustifiably high levels of significance. To correct this problem, evidence thresholds required for the declaration of a significant finding should be increased to 25-50:1, and to 100-200:1 for the declaration of a highly significant finding. In terms of classical hypothesis tests, these evidence standards mandate the conduct of tests at the 0.005 or 0.001 level of significance.
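The link between test size and evidence threshold described in the abstract can be illustrated for the simplest case. The following is a minimal sketch, assuming a one-sided z test with known variance, for which the uniformly most powerful Bayesian test with evidence threshold γ rejects when z > sqrt(2 ln γ). That relation comes from the UMPBT construction and is not stated in the abstract itself; the function names are illustrative only.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def evidence_threshold(alpha: float) -> float:
    """Bayes factor threshold gamma matched to a one-sided z test of size alpha.

    Assumes the UMPBT rejection region z > sqrt(2 * ln(gamma)), so that
    gamma = exp(z_alpha**2 / 2), where z_alpha is the upper-alpha quantile.
    """
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    return exp(z_alpha ** 2 / 2.0)


def size_of_test(gamma: float) -> float:
    """Size of the one-sided z test matched to evidence threshold gamma (> 1)."""
    z_crit = sqrt(2.0 * log(gamma))
    return 1.0 - STD_NORMAL.cdf(z_crit)


if __name__ == "__main__":
    for alpha in (0.05, 0.01, 0.005, 0.001):
        gamma = evidence_threshold(alpha)
        print(f"alpha = {alpha:<6} gamma is about {gamma:.1f}:1")
```

Under this assumption, a test of size 0.05 corresponds to an evidence threshold of only about 4:1, while sizes 0.005 and 0.001 give thresholds that fall inside the 25-50:1 and 100-200:1 ranges quoted in the abstract. Other test families (t, χ², binomial) follow different curves, so this sketch should not be read as covering them.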

Conflict of interest statement

The author declares no conflict of interest.

Figures

**Fig. 1.**
Evidence thresholds and size of corresponding significance tests. The UMPBT and significance tests used to construct this plot have the same (z, χ², and binomial tests) or approximately the same (t tests) rejection regions. The smooth curves represent, from *Top* to *Bottom*, t tests based on 20, 30, and 60 degrees of freedom, the z test, and the χ² test on 1 degree of freedom. The discontinuous curves reflect the correspondence between tests of a binomial proportion based on 20, 30, or 60 observations when the null hypothesis is p₀ = 0.5.

**Fig. 2.**
P values versus UMPBT Bayes factors. This plot depicts approximate Bayes factors derived from 765 t statistics reported by Wetzels et al. (20). A breakdown of the curvilinear relationship between Bayes factors and P values occurs in the lower right portion of the plot, which corresponds to t statistics that produce Bayes factors that are near their maximum value.

**Fig. 3.**
Histogram of P values that were less than 0.05 and reported in ref. .

Comment in

Reproducibility issues in science, is P value really the only answer?
Gaudart J, Huiart L, Milligan PJ, Thiebaut R, Giorgi R. Proc Natl Acad Sci U S A. 2014 May 13;111(19):E1934. doi: 10.1073/pnas.1323051111. Epub 2014 Apr 23. PMID: 24760820.
Revised evidence for statistical standards.
Gelman A, Robert CP. Proc Natl Acad Sci U S A. 2014 May 13;111(19):E1933. doi: 10.1073/pnas.1322995111. Epub 2014 Apr 23. PMID: 24760821.
Adaptive revised standards for statistical evidence.
Pericchi L, Pereira CA, Pérez ME. Proc Natl Acad Sci U S A. 2014 May 13;111(19):E1935. doi: 10.1073/pnas.1322191111. Epub 2014 Apr 23. PMID: 24760822.
Reply to Gelman, Gaudart, Pericchi: More reasons to revise standards for statistical evidence.
Johnson VE. Proc Natl Acad Sci U S A. 2014 May 13;111(19):E1936-7. doi: 10.1073/pnas.1400338111. PMID: 24940581.

Cited by

The excitability of ipsilateral motor evoked potentials is not task-specific and spatially distinct from the contralateral motor hotspot.
Seusing N, Strauss S, Fleischmann R, Nafz C, Groppa S, Muthuraman M, Ding H, Byblow WD, Lotze M, Grothe M. Exp Brain Res. 2024 Aug;242(8):1851-1859. doi: 10.1007/s00221-024-06851-6. Epub 2024 Jun 6. PMID: 38842754.
Altered cortical synaptic lipid signaling leads to intermediate phenotypes of mental disorders.
Tüscher O, Muthuraman M, Horstmann JP, Horta G, Radyushkin K, Baumgart J, Sigurdsson T, Endle H, Ji H, Kuhnhäuser P, Götz J, Kepser LJ, Lotze M, Grabe HJ, Völzke H, Leehr EJ, Meinert S, Opel N, Richers S, Stroh A, Daun S, Tittgemeyer M, Uphaus T, Steffen F, Zipp F, Groß J, Groppa S, Dannlowski U, Nitsch R, Vogt J. Mol Psychiatry. 2024 May 28. doi: 10.1038/s41380-024-02598-2. Online ahead of print. PMID: 38806692.
Impact of redefining statistical significance on P-hacking and false positive rates: An agent-based model.
Fitzpatrick BG, Gorman DM, Trombatore C. PLoS One. 2024 May 16;19(5):e0303262. doi: 10.1371/journal.pone.0303262. eCollection 2024. PMID: 38753677.
Effectiveness of Meditation-based Interventions on Health Problems Caused by COVID-19 Pandemic: Narrative Review.
Tseng AA. Int J Yoga. 2023 May-Aug;16(2):72-78. doi: 10.4103/ijoy.ijoy_112_23. Epub 2023 Nov 21. PMID: 38204779. Review.
On the use of receiver operating characteristic curve analysis to determine the most appropriate p value significance threshold.
Habibzadeh F. J Transl Med. 2024 Jan 4;22(1):16. doi: 10.1186/s12967-023-04827-8. PMID: 38178182.

References

1. Zimmer C. (April 16, 2012) A sharp rise in retractions prompts calls for reform. NY Times, Science Section.
2. Naik G. (December 2, 2011) Scientists’ elusive goal: Reproducing study results. Wall Street Journal, Health Section.
3. Begg CB, Mazumdar M. Operating characteristics of a rank correlation test for publication bias. Biometrics. 1994;50(4):1088–1101.
4. Duval S, Tweedie R. Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics. 2000;56(2):455–463.
5. Ioannidis JP. Contradicted and initially stronger effects in highly cited clinical research. JAMA. 2005;294(2):218–228.

Grants and funding

R01 CA158113/CA/NCI NIH HHS/United States
