$\begingroup$

I don't have much formal grounding in statistics, so sorry if this doesn't make much sense. But:

What are the differences between using a t-distribution to generate confidence intervals for small samples and using a Wilson score confidence interval? Can they even both be used for this purpose, or am I misunderstanding one (or both)? Are t-intervals more appropriate in certain situations and Wilson intervals in others?


Just to confirm my understanding, given $$\bar{x} = \frac{x_1 + \cdots + x_n}{n}$$

we use a t-score interval if we believe $x_i \sim\mathcal{N}(\mu,\sigma^2)$, and we use a Wilson interval if we believe $x_i \sim \mathcal{B}(1,p)$?

$\endgroup$

1 Answer

$\begingroup$

Are t-intervals more appropriate in certain situations and Wilson intervals in others?

Precisely this. They apply to two different situations: the first to (at least approximately) normally distributed values, the second to proportions based on binomially distributed counts.

There are a number of t-intervals, but generally speaking they are all used when you are trying to construct an interval for a population mean with unknown variance (so both the mean and the variance are estimated from the sample). Discussion of a basic case is here.

The Wilson score interval, on the other hand, is used when dealing with proportions. It's for constructing an interval for a population proportion (a proportion is itself a kind of mean, but one where the variance is a function of the mean). It's one of a number of intervals used for binomial population proportions, and it applies where the underlying data are counts.
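
For concreteness, the usual textbook forms of the two intervals are sketched below. The notation is an assumption added for illustration: $s$ is the sample standard deviation, $t_{n-1,\,1-\alpha/2}$ is the upper $\alpha/2$ quantile of a t-distribution with $n-1$ degrees of freedom, $\hat p$ is the sample proportion, and $z$ is the corresponding standard normal quantile.

One-sample t-interval for a mean:

$$\bar x \pm t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt n}$$

Wilson score interval for a proportion:

$$\frac{\hat p + \frac{z^2}{2n} \pm z\sqrt{\frac{\hat p(1-\hat p)}{n} + \frac{z^2}{4n^2}}}{1+\frac{z^2}{n}}$$

Note that the t-interval needs a separate variance estimate $s$, while the Wilson interval uses only $\hat p$ and $n$.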


A common way to derive intervals is via pivotal quantities (a good term to search on here; there are a number of answers that discuss simple examples). A pivotal quantity is a function of observations and unobservable parameters whose probability distribution does not depend on the unknown parameters.

So in the case of a t-interval, the interval is based on t-distributions because $\frac{\bar x -\mu}{s/\sqrt n}$* is a pivotal quantity which has a t-distribution.

* or a structurally similar statistic for other t-intervals

That this has a t-distribution relies on the independence of $\bar x$ and $s$, which you have under normality. In the case of a proportion, you don't have a separate estimate of variance; you only have the proportion itself, and the variance is a function of it. The independence isn't there, so there's no basis on which to construct a t-interval.

Instead, the Wilson interval is obtained by inverting a score test, whose test statistic is asymptotically normal.
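
As a rough illustration of the difference, here is a minimal Python sketch of both intervals using only the standard library. The function names `wilson_interval` and `t_interval` are hypothetical, chosen for this example. Because the standard library has no t-distribution, the t critical value is passed in by the caller (for instance from a table) rather than computed.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def wilson_interval(successes, n, conf=0.95):
    """Wilson score interval for a binomial proportion (hypothetical helper)."""
    # Two-sided standard normal quantile, e.g. about 1.96 for conf=0.95.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def t_interval(xs, t_crit):
    """One-sample t-interval for a mean (hypothetical helper).

    t_crit is the t quantile with len(xs) - 1 degrees of freedom,
    supplied by the caller (an assumption of this sketch).
    """
    n = len(xs)
    x_bar = mean(xs)
    s = stdev(xs)  # sample standard deviation, n - 1 denominator
    half = t_crit * s / sqrt(n)
    return x_bar - half, x_bar + half


# Binary data (counts): 5 successes out of 10 trials.
print(wilson_interval(5, 10))

# Roughly normal measurements: n = 5, so 4 degrees of freedom;
# 2.776 is the tabulated 97.5% t quantile for 4 df.
print(t_interval([1.0, 2.0, 3.0, 4.0, 5.0], t_crit=2.776))
```

The Wilson function needs only the count and the sample size, whereas the t-interval function needs the raw observations so that it can estimate the variance separately from the mean.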

$\endgroup$
  • $\begingroup$ What about the two makes them more appropriate for different situations? Is it just that the Wilson interval is derived from a binomial distribution that makes it more appropriate for proportions? Why can't we simply use t-intervals for proportions as well? Are there assumptions being made in constructing a t-interval that Wilson intervals are able to account for better? $\endgroup$
    – alecbz
    Commented Mar 27, 2014 at 21:48
  • $\begingroup$ alecbenzer -- See my edits. $\endgroup$
    – Glen_b
    Commented Mar 27, 2014 at 22:20
  • $\begingroup$ "which you have under normality" You mean the normality of $\bar{x}$, or are we assuming normality of the $x_i$ that we're averaging to get $\bar{x}$ ? $\endgroup$
    – alecbz
    Commented Mar 27, 2014 at 22:39
  • $\begingroup$ ok nvm I think I missed your first sentence before -- so we are assuming that the $x_i$ are normally distributed when using a t-interval? Not just that $\bar{x}$ is roughly normal (which I thought was the idea with confidence intervals in general) $\endgroup$
    – alecbz
    Commented Mar 27, 2014 at 22:49
  • $\begingroup$ That $\bar x$ is approximately normal only gives you that the numerator of a t-statistic is approximately normal. It doesn't give you that $s^2$ is approximately scaled chi-square or that $\bar x$ and $s$ are independent, so we don't obtain a t-distribution for the pivotal quantity. Instead we could rely on Slutsky's theorem to show that a t-like quantity will be asymptotically normal; we have no good basis for asserting it's $t$-distributed (though we might use, say, simulation in particular instances and observe that it's something close to a $t$). $\endgroup$
    – Glen_b
    Commented Mar 27, 2014 at 22:54
