Sample Distribution

From UBC Wiki
This article is part of the EconHelp Tutoring Wiki


In statistics, a sampling distribution (or finite-sample distribution) is the distribution of a given statistic based on a random sample of size n. It may be regarded as the distribution of the statistic over all possible samples of a given size drawn from the same population. The sampling distribution depends on the underlying distribution of the population, the statistic being considered, and the sample size used.
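
To make the phrase "all possible samples" concrete, the sketch below enumerates every sample of size n from a tiny population and tabulates the exact distribution of a statistic. The population `[1, 2, 3, 4]`, the choice of n = 2, sampling with replacement, and the helper names `exact_sampling_distribution` and `mean` are all assumptions made for illustration; they do not come from the article.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def exact_sampling_distribution(population, n, statistic):
    """Return {statistic value: probability} over all ordered samples of size n.

    Assumes sampling with replacement, so every ordered sample is equally likely.
    """
    samples = list(product(population, repeat=n))
    counts = Counter(statistic(sample) for sample in samples)
    total = len(samples)
    return {value: Fraction(count, total) for value, count in sorted(counts.items())}


def mean(sample):
    """Exact arithmetic mean as a Fraction (avoids floating-point keys)."""
    return Fraction(sum(sample), len(sample))


# Hypothetical population chosen only for illustration.
distribution = exact_sampling_distribution([1, 2, 3, 4], 2, mean)
for value, probability in distribution.items():
    print(f"mean = {value}: probability {probability}")
```

Exhaustive enumeration is only practical for very small populations and sample sizes, since the number of samples grows as the population size raised to the power n.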

For example, consider a normal population with mean μ and variance σ². Suppose we repeatedly take samples of a given size n from this population and calculate the arithmetic mean of each sample; this statistic is called the sample mean. Each sample has its own average value, and the distribution of these averages is called the “sampling distribution of the sample mean”. Because the underlying population is normal, this distribution is also normal, with mean μ and variance σ²/n.
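
The repeated-sampling idea can be sketched as a small simulation. For a normal population, the sample means should centre on μ with a standard deviation close to σ/√n. The parameter values (μ = 10, σ = 2, n = 25, 5000 repetitions), the fixed seed, and the function name `simulate_sample_means` are illustrative assumptions, not part of the article.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from N(mu, sigma^2); return their means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Assumed values, chosen only for illustration.
means = simulate_sample_means(mu=10.0, sigma=2.0, n=25, num_samples=5000)

print("mean of sample means:", statistics.fmean(means))    # expected near mu = 10
print("std. dev. of sample means:", statistics.pstdev(means))  # expected near 2 / sqrt(25) = 0.4
```

A histogram of `means` would show the familiar bell shape; the simulated figures only approximate the theoretical values and vary with the seed.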

This is an example of a simple statistic taken from one of the simplest statistical populations. For other statistics and other populations, the formulas are frequently more complicated, and often they do not exist in closed form. In such cases the sampling distribution may be approximated through Monte Carlo simulation, the bootstrap method, or asymptotic distribution theory.
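
As one illustration of the bootstrap approach mentioned above, the sketch below estimates the standard error of a statistic by resampling the observed data with replacement. The data set (200 draws from an exponential distribution), the choice of the median as the statistic, the number of resamples, and the name `bootstrap_standard_error` are all assumptions for the example.

```python
import random
import statistics


def bootstrap_standard_error(data, statistic, num_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by resampling `data` with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(num_resamples):
        resample = rng.choices(data, k=n)  # n draws with replacement
        estimates.append(statistic(resample))
    # The spread of the resampled statistics approximates the standard error.
    return statistics.stdev(estimates)


# Hypothetical observed sample from a non-normal (exponential) population.
data_rng = random.Random(1)
data = [data_rng.expovariate(1.0) for _ in range(200)]

se_median = bootstrap_standard_error(data, statistics.median)
print("bootstrap standard error of the median:", se_median)
```

The bootstrap treats the observed sample as a stand-in for the population, so its accuracy depends on the sample being reasonably large and representative.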

The standard deviation of the sampling distribution of a statistic is referred to as the standard error of that statistic. When the statistic is the sample mean, the standard error is:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

where σ is the standard deviation of the population distribution of that quantity and n is the sample size (the number of items in the sample).

An important implication of this formula is that the sample size must be quadrupled (4×) to halve (1/2) the standard error, because the standard error falls only with the square root of n. When designing statistical studies where cost is a factor, this can inform the cost-benefit tradeoff.
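
The quadrupling rule can be checked with a few lines of arithmetic. The sketch below computes σ/√n for the sample mean and inverts the relationship to find the sample size needed for a target standard error. The function names and the numbers used (σ = 2, n = 100 and 400) are hypothetical.

```python
import math


def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def sample_size_for_standard_error(sigma, target_se):
    """Smallest n with sigma / sqrt(n) <= target_se, i.e. n >= (sigma / target_se)^2."""
    return math.ceil((sigma / target_se) ** 2)


sigma = 2.0  # assumed population standard deviation
se_100 = standard_error_of_mean(sigma, 100)
se_400 = standard_error_of_mean(sigma, 400)
print(se_100, se_400)  # the second value is half the first

# Halving the target standard error multiplies the required sample size by four.
print(sample_size_for_standard_error(sigma, 0.5))
print(sample_size_for_standard_error(sigma, 0.25))
```

In a costing exercise, the second function gives a rough sense of how quickly the required sample size grows as the desired precision tightens.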

Alternatively, consider the sample median from the same population. It has a different sampling distribution, which is generally not normal (but may be close to normal under certain circumstances).
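
To see that the sample median has its own sampling distribution, the sketch below applies both statistics to the same simulated samples from a normal population and compares their spread. The parameter values, the seed, and the helper name `simulate_statistic` are assumptions for illustration.

```python
import random
import statistics


def simulate_statistic(statistic, mu, sigma, n, num_samples, seed=0):
    """Apply `statistic` to num_samples samples of size n drawn from N(mu, sigma^2)."""
    rng = random.Random(seed)
    values = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        values.append(statistic(sample))
    return values


# Assumed values; the same seed means both statistics see identical samples.
sample_means = simulate_statistic(statistics.fmean, 10.0, 2.0, 25, 5000)
sample_medians = simulate_statistic(statistics.median, 10.0, 2.0, 25, 5000)

se_mean = statistics.pstdev(sample_means)
se_median = statistics.pstdev(sample_medians)
print("standard error of the mean:  ", se_mean)
print("standard error of the median:", se_median)
```

Under these assumptions the median typically shows a somewhat larger spread than the mean, although both centre on μ; the exact figures depend on the chosen parameters and seed.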