1.7 Variance and Standard Deviation

From UBC Wiki
Revision as of 09:55, 30 May 2012 by EdKroc (Talk | contribs)


Another important quantity related to a given random variable is its variance. The variance is a numerical description of the spread, or the dispersion, of the random variable. That is, the variance of a random variable X is a measure of how spread out the values of X are, given how likely each value is to be observed.

Variance and Standard Deviation of a Discrete Random Variable
The variance, Var(X), of a discrete random variable X is
\text{Var}(X) = \sum_{k=1}^{N} \Big(x_k - \mathbb{E}(X)\Big)^2\textrm{Pr}(X=x_k)

where x_1, ..., x_N are the possible values of X and N is the total number of such values.

The standard deviation, σ, is the positive square root of the variance:

\sigma(X) = \sqrt{\text{Var}(X)}
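The two definitions above translate almost directly into code. The following is a minimal sketch, assuming the PMF is stored as a Python dictionary mapping each possible value x_k to Pr(X = x_k); the function names and the coin example are hypothetical and chosen only for illustration.

```python
import math

def expectation(pmf):
    """E(X): sum of x_k * Pr(X = x_k) over all possible values x_k."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X): sum of (x_k - E(X))^2 * Pr(X = x_k) over all x_k."""
    mean = expectation(pmf)
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

def standard_deviation(pmf):
    """sigma(X): the positive square root of Var(X)."""
    return math.sqrt(variance(pmf))

# Hypothetical example: a fair coin recorded as 0 (tails) or 1 (heads).
coin = {0: 0.5, 1: 0.5}
print(variance(coin))            # 0.25
print(standard_deviation(coin))  # 0.5
```

Each term of the sum is a squared deviation times a probability, so the returned variance cannot be negative and the square root is always defined.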

Observe that the variance of a random variable is always nonnegative (since probabilities are nonnegative, and the square of a number is also nonnegative).

Observe also that, much like the expectation of a random variable X, the variance is a weighted average: it averages the squared deviations of the possible values of X from E(X), weighting each squared deviation by the probability of the corresponding value. More precisely, notice that

\text{Var}(X) = \mathbb{E}\left(\left[X - \mathbb{E}(X)\right]^2\right).
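To see why this holds, one can use the fact that the expectation of a function g(X) of a discrete random variable is the sum of g(x_k) weighted by Pr(X = x_k). Here g is a symbol introduced only for this sketch, with g(x) = (x - E(X))^2:

\begin{align}
\mathbb{E}\left(\left[X - \mathbb{E}(X)\right]^2\right)
&= \sum_{k=1}^{N} g(x_k)\,\textrm{Pr}(X=x_k) \\
&= \sum_{k=1}^{N} \Big(x_k - \mathbb{E}(X)\Big)^2\textrm{Pr}(X=x_k) \\
&= \text{Var}(X)
\end{align}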

Example: Test Scores

Using the test scores example of the previous sections, calculate the variance and standard deviation of the random variable X associated with randomly selecting a single exam.

Solution

The variance of the random variable X is given by

\begin{align}
\text{Var}(X)
&= \sum_{k=1}^{N} (x_k - \mathbb{E}(X))^2 \textrm{Pr}(X=x_k) \\
&= (30-64)^2 \frac{3}{10} + (60 - 64)^2\frac{2}{10} + (80 - 64)^2 \frac{3}{10} + (90-64)^2 \frac{1}{10} + (100-64)^2 \frac{1}{10} \\
&= 624
\end{align}

The standard deviation of X is then

\sigma(X) = \sqrt{624}\approx 24.979992
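The arithmetic in this solution can be checked with a short script. This is only a sketch: the dictionary pmf is read off the terms of the sum in the solution, and exact fractions are used so that no rounding enters before the final square root.

```python
from fractions import Fraction
import math

# PMF of the test scores, taken from the terms of the sum in the solution.
pmf = {
    30: Fraction(3, 10),
    60: Fraction(2, 10),
    80: Fraction(3, 10),
    90: Fraction(1, 10),
    100: Fraction(1, 10),
}

# Sanity check: the probabilities must sum to 1.
assert sum(pmf.values()) == 1

mean = sum(x * p for x, p in pmf.items())
var = sum((x - mean) ** 2 * p for x, p in pmf.items())
sigma = math.sqrt(var)  # the exact fraction is converted to a float here

print(mean)             # 64
print(var)              # 624
print(round(sigma, 6))  # 24.979992
```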

Interpretation of the Standard Deviation

For most "nice" random variables, i.e. ones that are not too wildly distributed, the standard deviation has a convenient informal interpretation. Consider the intervals

S_m = \left[\mathbb{E}(X) - m\sigma(X),\ \mathbb{E}(X) + m\sigma(X)\right],

for some positive integer m. As we increase the value of m, these intervals widen, so they contain more and more of the possible values of the random variable X.

A good rule of thumb is that for "nicely distributed" random variables, nearly all of the values that X is likely to take will be contained in the interval S_3. Another way to say this is that, for discrete random variables, most of the PMF will live on the interval S_3. We will see in the next chapter that a similar interpretation holds for continuous random variables.
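As an illustration, the sketch below computes how much probability the test scores PMF places on S_m for m = 1, 2, 3. The helper name mass_in_interval is hypothetical, and the PMF is the one used in the test scores example.

```python
import math

# Test scores PMF from the example above.
pmf = {30: 0.3, 60: 0.2, 80: 0.3, 90: 0.1, 100: 0.1}

mean = sum(x * p for x, p in pmf.items())
sigma = math.sqrt(sum((x - mean) ** 2 * p for x, p in pmf.items()))

def mass_in_interval(pmf, mean, sigma, m):
    """Total probability of the values lying in S_m = [mean - m*sigma, mean + m*sigma]."""
    lo, hi = mean - m * sigma, mean + m * sigma
    return sum(p for x, p in pmf.items() if lo <= x <= hi)

for m in (1, 2, 3):
    # Rounding only hides floating-point noise in the sums.
    print(m, round(mass_in_interval(pmf, mean, sigma, m), 10))
# 1 0.5
# 2 1.0
# 3 1.0
```

For this particular PMF, S_1 contains only the scores 60 and 80, while S_2 already contains every possible score, so S_3 certainly does as well.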
