Revised standards for statistical evidence
- PMID: 24218581
- PMCID: PMC3845140
- DOI: 10.1073/pnas.1313476110
Abstract
Recent advances in Bayesian hypothesis testing have led to the development of uniformly most powerful Bayesian tests, which represent an objective, default class of Bayesian hypothesis tests that have the same rejection regions as classical significance tests. Based on the correspondence between these two classes of tests, it is possible to equate the size of classical hypothesis tests with evidence thresholds in Bayesian tests, and to equate P values with Bayes factors. An examination of these connections suggests that recent concerns over the lack of reproducibility of scientific studies can be attributed largely to the conduct of significance tests at unjustifiably high levels of significance. To correct this problem, evidence thresholds required for the declaration of a significant finding should be increased to 25-50:1, and to 100-200:1 for the declaration of a highly significant finding. In terms of classical hypothesis tests, these evidence standards mandate the conduct of tests at the 0.005 or 0.001 level of significance.
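The correspondence between significance levels and evidence thresholds can be illustrated with a small sketch. It assumes the simplest case, a one-sided z-test with known variance, and it assumes that the uniformly most powerful Bayesian test with evidence threshold γ rejects the null hypothesis when z > sqrt(2 ln γ). That relation is not stated in the abstract and is an assumption here; the function names `evidence_threshold` and `significance_level` are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

# Standard normal distribution used for the one-sided z-test.
_STD_NORMAL = NormalDist()


def evidence_threshold(alpha: float) -> float:
    """Evidence threshold gamma matched to a one-sided z-test of size alpha.

    Assumption (not from the abstract): the Bayesian test rejects when
    z > sqrt(2 * ln(gamma)), so matching rejection regions gives
    gamma = exp(z_alpha**2 / 2).
    """
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha)
    return math.exp(z_alpha * z_alpha / 2.0)


def significance_level(gamma: float) -> float:
    """Inverse mapping: size of the one-sided z-test matched to gamma > 1."""
    if gamma <= 1.0:
        raise ValueError("gamma must exceed 1 for a non-trivial test")
    z = math.sqrt(2.0 * math.log(gamma))
    return 1.0 - _STD_NORMAL.cdf(z)


if __name__ == "__main__":
    for alpha in (0.05, 0.005, 0.001):
        print(f"alpha = {alpha}: evidence threshold ~ {evidence_threshold(alpha):.1f}:1")
```

Under these assumptions, a test at the 0.005 level maps to an evidence threshold inside the 25-50:1 range and a test at the 0.001 level maps to one inside the 100-200:1 range, whereas the conventional 0.05 level corresponds to odds of only about 4:1. Other test statistics (t, chi-squared) would need their own mappings, which this sketch does not cover.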
Conflict of interest statement
The author declares no conflict of interest.
Figures
![Fig. 1.](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/3845140/bin/pnas.1313476110fig01.gif)
![Fig. 2.](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/3845140/bin/pnas.1313476110fig02.gif)
![Fig. 3.](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/3845140/bin/pnas.1313476110fig03.gif)
Comment in
- Reproducibility issues in science, is P value really the only answer? Proc Natl Acad Sci U S A. 2014 May 13;111(19):E1934. doi: 10.1073/pnas.1323051111. Epub 2014 Apr 23. PMID: 24760820.
- Revised evidence for statistical standards. Proc Natl Acad Sci U S A. 2014 May 13;111(19):E1933. doi: 10.1073/pnas.1322995111. Epub 2014 Apr 23. PMID: 24760821.
- Adaptive revised standards for statistical evidence. Proc Natl Acad Sci U S A. 2014 May 13;111(19):E1935. doi: 10.1073/pnas.1322191111. Epub 2014 Apr 23. PMID: 24760822.
- Reply to Gelman, Gaudart, Pericchi: More reasons to revise standards for statistical evidence. Proc Natl Acad Sci U S A. 2014 May 13;111(19):E1936-7. doi: 10.1073/pnas.1400338111. PMID: 24940581.