Revisions to confidence intervals for proportions containing a theoretically impossible value (zero)

deleted 1 character in body

edited Jul 4 at 16:55

This is a hypothetical question asked purely out of curiosity; it is not tied to an actual problem I have. I'm aware of the related question "What should I do when a confidence interval includes an impossible range of values?", but I think the details of what I have in mind are different.

Say I want to estimate the proportion of something in a population. I know from qualitative studies that this "something" definitely exists in the population, even though it is rare. (I have no information on how rare, beyond the couple of observations described in the few qualitative studies on the subject.) I plan to compute binomial confidence intervals (e.g. Wilson confidence intervals) to obtain a range of plausible proportions in the population.

However, after randomly sampling a few thousand observations from the population, I fail to identify any of this "something" in my sample, so the confidence interval includes 0, even though I know for a fact that zero is not a possible value and that there is nothing wrong with my sampling method (other than a sample size that is apparently not large enough).

Example with R, where "lwr.ci" is the computed lower bound of the CI:

> library("DescTools")
> BinomCI(0, 50000, conf.level = 0.95, sides = "two.sided", method = "wilson")

      est  lwr.ci        upr.ci
[1,]    0       0  7.682327e-05
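
As far as I can tell, the zero lower bound is not a rounding artifact. Assuming `BinomCI` implements the standard Wilson score interval without continuity correction (an assumption on my part, I have not checked the package source), with $\hat p = x/n$ and $z$ the normal quantile for the chosen confidence level, the bounds are

$$
\frac{\hat p + \frac{z^2}{2n}}{1 + \frac{z^2}{n}} \;\pm\; \frac{z}{1 + \frac{z^2}{n}} \sqrt{\frac{\hat p\,(1 - \hat p)}{n} + \frac{z^2}{4n^2}} .
$$

Setting $\hat p = 0$ makes the center and the half-width coincide:

$$
\frac{z^2}{2\,(n + z^2)} \;\pm\; \frac{z^2}{2\,(n + z^2)}
\quad\Longrightarrow\quad
\text{lower} = 0, \qquad \text{upper} = \frac{z^2}{n + z^2} .
$$

With $z \approx 1.96$ and $n = 50000$, the upper bound is about $3.8415 / 50003.8415 \approx 7.68 \times 10^{-5}$, which agrees with the output above. So under this formula the lower bound is exactly 0 whenever no events are observed, whatever the sample size.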

What are some ways to solve this problem and compute an interval estimate that does not include 0 in the first place? Is this a situation where I should use credible intervals? (If so, what would be some sound ways to define the priors?)

Became Hot Network Question

occurred Jul 3 at 17:31

example in R, in case my explanation isn't clear

edited Jul 3 at 13:28

Coris
