Interval Estimation

Definition

In statistics, interval estimation is the use of sample data to calculate an interval of possible (or probable) values of an unknown population parameter, in contrast to point estimation, which yields a single number. Neyman (1937) identified interval estimation ("estimation by interval") as distinct from point estimation ("estimation by unique estimate"). In doing so, he recognised that then-recent work quoting results as an estimate plus or minus a standard deviation showed that interval estimation was the problem statisticians actually had in mind.

Confidence Interval

In statistics, a confidence interval (CI) is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval (i.e., it is calculated from the observations), in principle different from sample to sample, that frequently includes the parameter of interest when the experiment is repeated. How frequently the observed interval contains the parameter is determined by the confidence level (also called the confidence coefficient).

A confidence interval with a particular confidence level is intended to give the assurance that, if the statistical model is correct, then taken over all the data that might have been obtained, the procedure for constructing the interval delivers an interval that includes the true value of the parameter in the proportion of cases set by the confidence level. More specifically, the term "confidence level" means that, if confidence intervals are constructed across many separate data analyses of repeated (and possibly different) experiments, the proportion of such intervals that contain the true value of the parameter will approximately match the confidence level; this is guaranteed by the reasoning underlying the construction of confidence intervals.

A confidence interval does not predict that the true value of the parameter has a particular probability of being in the confidence interval given the data actually obtained.
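The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: it assumes a normal population whose standard deviation is known, so that the interval can be built from a standard normal percentile using the Python standard library alone. The function name `coverage` and all parameter values are hypothetical choices, not part of the original text.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu=10.0, sigma=2.0, n=25, level=0.90, trials=20000, seed=0):
    """Fraction of simulated intervals that contain the true mean mu.

    Assumption: sigma is known, so each interval is
    xbar +/- z * sigma / sqrt(n), with z a standard normal percentile.
    """
    rng = random.Random(seed)
    # For a central interval at the given level, split the remaining
    # probability equally between the two tails.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials


# The printed proportion should be close to the nominal level of 0.90.
print(coverage())
```

Each simulated interval either contains the true mean or does not; the confidence level describes the proportion of intervals that do across the repetitions, not a probability attached to any single computed interval.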

Interval Estimation Using t-Distribution

Suppose ${\displaystyle X_{1},\ldots ,X_{n}}$ is an independent sample from a normally distributed population with mean ${\displaystyle \mu }$ and variance ${\displaystyle \sigma ^{2}}$. Let

${\displaystyle {\begin{aligned}{\bar {X}}&={\frac {X_{1}+\cdots +X_{n}}{n}}\\S^{2}&={\frac {1}{n-1}}\sum _{i=1}^{n}(X_{i}-{\bar {X}})^{2}\end{aligned}}}$

be the sample mean and sample variance.

Then

${\displaystyle T={\frac {{\bar {X}}-\mu }{S/{\sqrt {n}}}}}$

has a t-distribution with ${\displaystyle n-1}$ degrees of freedom. Note that the distribution of ${\displaystyle T}$ does not depend on the values of the unobservable parameters ${\displaystyle \mu }$ and ${\displaystyle \sigma ^{2}}$.
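The claim that the distribution of T is free of the unknown parameters can be illustrated by simulation. The following sketch draws many normal samples under two different, arbitrarily chosen parameter settings and compares the empirical 95th percentile of T in each case. The helper names `simulate_t` and `empirical_quantile` are hypothetical, and the sample size of 10 is an assumption made for the example.

```python
import math
import random
from statistics import fmean


def simulate_t(mu, sigma, n=10, trials=20000, seed=1):
    """Return sorted simulated values of T = (Xbar - mu) / (S / sqrt(n))."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # Sample standard deviation with the n - 1 denominator.
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        values.append((xbar - mu) / (s / math.sqrt(n)))
    return sorted(values)


def empirical_quantile(sorted_values, p):
    """Simple empirical quantile: the order statistic at position p."""
    return sorted_values[int(p * (len(sorted_values) - 1))]


# Two very different populations, simulated with different seeds.
for mu, sigma, seed in [(0.0, 1.0, 1), (50.0, 12.0, 2)]:
    t_values = simulate_t(mu, sigma, seed=seed)
    print(mu, sigma, round(empirical_quantile(t_values, 0.95), 3))
```

Up to simulation noise, both settings should give roughly the same percentile, close to the tabulated 95th percentile of the t-distribution with 9 degrees of freedom (about 1.833).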

Suppose we want to calculate a 90% confidence interval for the mean. Then we have:

${\displaystyle \Pr \left({\bar {X}}-{\frac {cS}{\sqrt {n}}}<\mu <{\bar {X}}+{\frac {cS}{\sqrt {n}}}\right)=0.9}$

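where ${\displaystyle c}$ is the 95th percentile of the t-distribution with ${\displaystyle n-1}$ degrees of freedom. The 95th percentile is used because a central 90% interval leaves 5% of the probability in each tail.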

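Therefore, after observing the sample mean ${\displaystyle {\bar {x}}}$ and sample standard deviation ${\displaystyle s}$, we have the 90% confidence interval for ${\displaystyle \mu }$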

${\displaystyle [{\bar {x}}-{\frac {cs}{\sqrt {n}}},{\bar {x}}+{\frac {cs}{\sqrt {n}}}]}$
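As a worked illustration, the interval can be computed with the Python standard library alone. The standard library has no t-distribution, so this sketch obtains the percentile c by integrating the t density numerically (Simpson's rule) and inverting the result by bisection; a statistics package would normally supply this value directly. The function names and the data values are hypothetical, chosen only for the example.

```python
import math
from statistics import mean, stdev


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def t_cdf(t, df, steps=2000):
    """CDF via Simpson's rule on [0, |t|], using symmetry about zero."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def t_quantile(p, df):
    """Percentile by bisection; this sketch assumes 0.5 <= p < 1."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # expand until the target is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(data, level=0.90):
    """Two-sided confidence interval for the mean of normal data."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation, n - 1 denominator
    c = t_quantile(0.5 + level / 2, n - 1)  # 95th percentile when level=0.90
    half_width = c * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical measurements, for illustration only.
data = [4.9, 5.1, 5.0, 5.3, 4.7, 5.2, 5.0, 4.8, 5.1, 4.9]
print(t_interval(data))
```

With ten observations the degrees of freedom are 9, so c is approximately 1.833, and the resulting interval is symmetric about the sample mean.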