https://wiki.ubc.ca/api.php?action=feedcontributions&user=EdKroc&feedformat=atomUBC Wiki - User contributions [en]2024-03-29T13:51:45ZUser contributionsMediaWiki 1.39.6https://wiki.ubc.ca/index.php?title=1.7_Variance_and_Standard_Deviation&diff=1720891.7 Variance and Standard Deviation2012-05-30T16:55:35Z<p>EdKroc: </p>
<hr />
<div>Another important quantity related to a given random variable is its variance. The '''variance''' is a numerical description of the spread, or the ''dispersion'', of the random variable. That is, the variance of a random variable ''X'' is a measure of how spread out the values of ''X'' are, given how likely each value is to be observed.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Variance and Standard Deviation of a Discrete Random Variable<br />
|-<br />
| The variance, Var(''X''), of a discrete random variable ''X'' is <br />
<br />
<center><math>\text{Var}(X) = \sum_{k=1}^{N} \Big(x_k - \mathbb{E}(X)\Big)^2\textrm{Pr}(X=x_k)</math></center><br />
<br />
where ''N'' is the total number of possible values of ''X''. <br />
<br />
The '''standard deviation''', ''σ'', is the positive square root of the variance: <br />
<br />
<center><math>\sigma(X) = \sqrt{\text{Var}(X)} </math></center><br />
<br />
|}<br />
<br />
Observe that the variance of a random variable is always nonnegative (since probabilities are nonnegative, and the square of a real number is also nonnegative). <br />
<br />
Observe also that, much like the expectation of a random variable ''X'', the variance is a weighted average: it averages the squared deviations of the possible values of ''X'' from the expectation, weighted by their probabilities. More precisely, notice that <br />
<br />
<math>\text{Var}(X) = \mathbb{E}\left(\left[X - \mathbb{E}(X)\right]^2\right).</math><br />
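This weighted-average form translates directly into a short computation. The following Python sketch (illustrative only; the function name is not part of the text) computes the expectation, variance, and standard deviation of a discrete random variable from its PMF:

```python
import math

def mean_var_sd(pmf):
    """Return E(X), Var(X), and sigma(X) for a discrete random variable.

    pmf maps each possible value x_k to Pr(X = x_k); the probabilities
    are assumed to sum to 1.
    """
    mean = sum(x * p for x, p in pmf.items())
    # Var(X) = sum over k of (x_k - E(X))^2 * Pr(X = x_k)
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var, math.sqrt(var)

# Example: a fair coin flip coded as 0 (tails) or 1 (heads)
mean, var, sd = mean_var_sd({0: 0.5, 1: 0.5})
print(mean, var, sd)  # 0.5 0.25 0.5
```

For the fair coin, the spread of the two equally likely values 0 and 1 about their mean 1/2 gives a variance of 1/4 and a standard deviation of 1/2.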
<br />
==Example: Test Scores==<br />
<br />
Using the test scores example of the previous sections, calculate the variance and standard deviation of the random variable ''X'' associated to randomly selecting a single exam.<br />
<br />
==Solution==<br />
<br />
The variance of the random variable ''X'' is given by<br />
<br />
<math>\begin{align}<br />
\text{Var}(X)<br />
&= \sum_{k=1}^{N} (x_k - \mathbb{E}(X))^2 \textrm{Pr}(X=x_k) \\<br />
&= (30-64)^2 \frac{3}{10} + (60 - 64)^2\frac{2}{10} + (80 - 64)^2 \frac{3}{10} + (90-64)^2 \frac{1}{10} + (100-64)^2 \frac{1}{10} \\<br />
&= 624<br />
\end{align}</math><br />
<br />
The standard deviation of ''X'' is then<br />
<br />
<math>\sigma(X) = \sqrt{624}\approx 24.979992</math><br />
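These calculations are easy to verify numerically. The sketch below (Python, for illustration) recomputes the variance and standard deviation from the PMF of the test scores example:

```python
import math

# PMF of the test-scores random variable X from the worked example
pmf = {30: 3/10, 60: 2/10, 80: 3/10, 90: 1/10, 100: 1/10}

mean = sum(x * p for x, p in pmf.items())               # E(X) = 64
var = sum((x - mean) ** 2 * p for x, p in pmf.items())  # Var(X) = 624
sd = math.sqrt(var)                                     # sigma(X) is roughly 24.98
print(mean, var, sd)
```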
<br />
==Interpretation of the Standard Deviation==<br />
<br />
For most "nice" random variables, i.e. ones that are not too wildly distributed, the standard deviation has a convenient informal interpretation. Consider the intervals <br />
<br />
<center><math>S_m = \left[\mathbb{E}(X) - m\sigma(X),\ \mathbb{E}(X) + m\sigma(X)\right],</math></center><br />
<br />
for some positive integer ''m''. As we increase the value of ''m'', these intervals will contain more of the possible values of the random variable ''X''. <br />
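To make this concrete, the following Python sketch (illustrative; it reuses the test scores PMF from the earlier example as an assumed running example) computes how much probability each interval ''S''<sub>''m''</sub> captures:

```python
import math

# Test-scores PMF from the earlier example (an assumed running example)
pmf = {30: 0.3, 60: 0.2, 80: 0.3, 90: 0.1, 100: 0.1}
mean = sum(x * p for x, p in pmf.items())
sd = math.sqrt(sum((x - mean) ** 2 * p for x, p in pmf.items()))

for m in (1, 2, 3):
    lo, hi = mean - m * sd, mean + m * sd
    # Total probability of the values of X that fall inside S_m
    mass = sum(p for x, p in pmf.items() if lo <= x <= hi)
    print(f"S_{m} = [{lo:.2f}, {hi:.2f}] captures probability {mass:.2f}")
```

Here ''S''<sub>1</sub> captures only half of the probability, while ''S''<sub>2</sub> and ''S''<sub>3</sub> already contain all of it, as the rule of thumb suggests.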
<br />
A good rule of thumb is that for "nicely distributed" random variables, nearly all of the probability will be concentrated on the interval ''S''<sub>3</sub>, that is, within three standard deviations of the expectation. Another way to say this is that, for discrete random variables, most of the PMF will live on the interval ''S''<sub>3</sub>. We will see in the next chapter that a similar interpretation holds for continuous random variables.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.5_Some_Common_Discrete_Distributions&diff=1720841.5 Some Common Discrete Distributions2012-05-30T16:52:15Z<p>EdKroc: </p>
<hr />
<div>A random variable is a theoretical representation of a physical or experimental process we wish to study. Formally, it is a function defined over a ''sample space'' of possible outcomes. For our simple coin tossing experiment, where we flip a fair coin once and observe the outcome, our sample space consists of the two outcomes, H or T. When tossing two fair coins sequentially, our sample space consists of the four outcomes HH, HT, TH or TT. <br />
<br />
Let us fix a sample space of ''n'' tosses of a fair coin. Experimentally, we may be interested in studying the number of "heads" observed after tossing the coin ''n'' times. Or we could be interested in studying the number of tosses needed to first observe "heads". Or we could be interested in studying how likely a certain sequence of "heads" and "tails" is to be observed. Each of these experiments is defined on the same sample space (the events generated by ''n'' tosses of a fair coin), yet each strives to quantify something different. Consequently, each experiment should be associated with a different random variable.<br />
<br />
<br />
==The Binomial Distribution==<br />
<br />
Let ''X<sub>n</sub>'' denote the random variable that counts the number of times we observe "heads" when flipping a fair coin ''n'' times. Clearly, ''X<sub>n</sub>'' can take on any integer value from 0 to ''n'', corresponding to the experimental outcome of observing 0 to ''n'' "heads". How likely is any particular outcome of this random variable? Notice that we do not care about the order of the observations here, so that if ''n'' = 3, the outcome THH is equivalent to the outcomes HTH and HHT. Each of these outcomes contains two "heads".<br />
<br />
The likelihood of any particular outcome is what is represented by the probability mass function (PMF) of the random variable. Suppose ''n'' = 2. Then we see that the PMF of ''X<sub>2</sub>'' is given by:<br />
<br />
* Pr''(X<sub>2</sub> = 0)'' = 1/4<br />
* Pr''(X<sub>2</sub> = 1)'' = 1/2<br />
* Pr''(X<sub>2</sub> = 2)'' = 1/4<br />
<br />
We say that ''X<sub>2</sub>'' is a '''binomial''' random variable with parameters 2 (the number of times we flip the fair coin) and 1/2 (the probability that we observe heads after a single flip of the coin). We can write ''X<sub>2</sub> ~'' Bin(2, 1/2). <br />
<br />
Just as we did with Bernoulli random variables, we can think of our coin tossing experiment a bit more abstractly. Specifically, we can think of observing "heads" as a success and observing "tails" as a failure. This abstraction will help us generalize our coin tossing procedure to more general experiments.<br />
<br />
If ''X'' is a binomial random variable associated to ''n'' independent trials, each with a success probability ''p'', then the probability mass function of ''X'' is:<br />
<br />
<center><math>\textrm{Pr}( X = k ) = \frac {n!}{k!(n-k)!}\cdot p^k(1-p)^{n-k},</math></center><br />
<br />
where ''k'' is any integer from 0 to ''n''. Recall that the ''factorial'' notation ''n''! denotes the product of the first ''n'' positive integers: ''n''! = 1·2·3···(''n''-1)·''n'', and that we observe the convention 0! = 1.<br />
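The binomial PMF is straightforward to evaluate from this formula. The Python sketch below (the function name is illustrative) computes Pr(''X'' = ''k'') and reproduces the PMF of ''X<sub>2</sub>'' ~ Bin(2, 1/2) listed above:

```python
from math import factorial

def binom_pmf(n, k, p):
    """Pr(X = k) for X ~ Bin(n, p): n!/(k!(n-k)!) * p^k * (1-p)^(n-k)."""
    coeff = factorial(n) // (factorial(k) * factorial(n - k))
    return coeff * p**k * (1 - p)**(n - k)

# Reproduce the PMF of X_2 ~ Bin(2, 1/2) listed above
print([binom_pmf(2, k, 0.5) for k in range(3)])  # [0.25, 0.5, 0.25]
```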
<br />
For our coin tossing experiment, the probability of success - that is, the probability of observing "heads" - was the same as the probability of failure, observing "tails". In general, we may be interested in processes that have different probabilities of success and failure.<br />
<br />
For example, suppose that we know that 5% of all light bulbs produced by a particular manufacturer are defective. If we buy a package of 6 light bulbs and want to calculate the probability that at least one is defective, we can do so by identifying this experiment with a binomial random variable. Here, we can think of observing a defective bulb as a "success" and observing a functional bulb as a "failure". Then our experiment is given by the random variable ''X<sub>6</sub>'' ~ Bin(6, 1/20), since we will observe 6 bulbs in total and each has a probability of 5/100 = 1/20 of being defective.<br />
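To finish the light bulb calculation: "at least one defective" is the complement of "no defectives", so Pr(''X<sub>6</sub>'' ≥ 1) = 1 − Pr(''X<sub>6</sub>'' = 0) = 1 − (0.95)<sup>6</sup> ≈ 0.265. A short Python sketch (illustrative; the function name is not from the text) confirms this:

```python
from math import factorial

def binom_pmf(n, k, p):
    """Pr(X = k) for X ~ Bin(n, p)."""
    coeff = factorial(n) // (factorial(k) * factorial(n - k))
    return coeff * p**k * (1 - p)**(n - k)

# X_6 ~ Bin(6, 1/20): number of defective bulbs in a package of 6
p_none = binom_pmf(6, 0, 0.05)   # Pr(no defective bulbs) = (0.95)^6
p_at_least_one = 1 - p_none      # complement rule
print(round(p_at_least_one, 4))  # 0.2649
```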
<br />
In general, we can think of observing ''n'' independent experimental trials and counting the number of "successes" that we witness. The probability distribution we associate with this setup is the '''binomial''' random variable with parameters ''n'' and ''p'', where ''p'' is the probability of "success." We can denote this distributional relationship to a random variable ''X'' by ''X'' ~ Bin(''n'', ''p''). <br />
<br />
==The Geometric Distribution==<br />
<br />
Now consider a slightly different experiment where we wish to flip our fair coin repeatedly until we first observe "heads". Since we can first observe heads on the first flip, the second flip, the third flip, or on any subsequent flip, we see that the possible values our random variable can take are 1, 2, 3,.... <br />
<br />
Of course, we can consider a more abstract experiment where we observe a sequence of trials until we first observe a success, where the probability of success is ''p''. If we let ''X'' denote such a random variable, then we say that ''X'' is a '''geometric''' random variable with parameter ''p''. We can denote this particular random variable by ''X'' ~ Geo(''p'').<br />
<br />
Letting S denote the outcome of "success" and F denote the outcome of "failure", we can summarize the possible outcomes of a geometric experiment and their likelihoods (the probability mass function) in the following table. Here, we write ''p'' for the probability of success and ''q'' for the probability of failure.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Experimental Outcome<br />
! Value of the Random Variable, ''X = x''<br />
! Probability <br />
|-<br />
| S<br />
| ''x'' = 1<br />
| ''p''<br />
|-<br />
| FS<br />
| ''x'' = 2<br />
| ''q·p''<br />
|-<br />
| FFS<br />
| ''x'' = 3<br />
| ''q<sup>2</sup>·p''<br />
|-<br />
| FFFS<br />
| ''x'' = 4<br />
| ''q<sup>3</sup>·p''<br />
|-<br />
| FFFFS<br />
| ''x'' = 5<br />
| ''q<sup>4</sup>·p''<br />
|-<br />
| ...<br />
| ...<br />
| ...<br />
<br />
|}<br />
<br />
When flipping a fair coin, we see that ''X'' ~ Geo(1/2), so that our PMF takes the particularly simple form Pr(''X = k'') = (1/2)<sup>''k''</sup> for any positive integer ''k''.<br />
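The pattern in the table above gives the general geometric PMF Pr(''X = k'') = ''q''<sup>''k''−1</sup>·''p''. The following Python sketch (function name illustrative) implements it and recovers the fair-coin case:

```python
def geo_pmf(k, p):
    """Pr(X = k) for X ~ Geo(p): k - 1 failures, then one success."""
    q = 1 - p
    return q ** (k - 1) * p

# Fair-coin case X ~ Geo(1/2): Pr(X = k) = (1/2)^k
print([geo_pmf(k, 0.5) for k in (1, 2, 3)])  # [0.5, 0.25, 0.125]
```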
<br />
==The Discrete Uniform Distribution==<br />
<br />
Now consider a coin tossing experiment of flipping a fair coin ''n'' times and observing the sequence of "heads" and "tails". Because each outcome of a single flip of the coin is equally likely, and because the outcome of a single flip does not affect the outcome of another flip, we see that the likelihood of observing any particular sequence of "heads" and "tails" will always be the same. Notice that for ''n'' = 2 or 6, we have already encountered this random variable (see Section 1.01 and Sections 1.02 - 1.04 respectively).<br />
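For small ''n'', one can enumerate the sample space directly and see the uniformity. The Python sketch below (illustrative) lists all sequences of ''n'' = 3 flips, each occurring with probability (1/2)<sup>3</sup> = 1/8:

```python
from itertools import product

n = 3
# All possible sequences of n flips; each is equally likely
sequences = list(product("HT", repeat=n))
prob = 1 / 2**n  # probability of any single sequence
print(len(sequences), prob)  # 8 0.125
```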
<br />
We say that a random variable ''X'' has a '''discrete uniform''' distribution on ''n'' points if ''X'' can assume any one of ''n'' values, each with equal probability. Evidently then, if ''X'' takes integer values from 1 to ''n'', we find that the PMF of ''X'' must be Pr(''X = k'') = 1/n, for any integer ''k'' between 1 and ''n''.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.5_Some_Common_Discrete_Distributions&diff=1720831.5 Some Common Discrete Distributions2012-05-30T16:52:01Z<p>EdKroc: </p>
<hr />
<div>A random variable is a theoretical representation of a physical or experimental process we wish to study. Formally, it is a function defined over a ''sample space'' of possible outcomes. For our simple coin tossing experiment, where we flip a fair coin once and observe the outcome, our sample space consists of the two outcomes, H or T. When tossing two fair coins sequentially, our sample space consists of the four outcomes HH, HT, TH or TT. <br />
<br />
Let us fix a sample space of ''n'' tosses of a fair coin. Experimentally, we may be interested in studying the number of "heads" observed after tossing the coin ''n'' times. Or we could be interested in studying the number of tosses needed to first observe "heads". Or we could be interested in studying how likely a certain sequence of "heads" and "tails" is to be observed. Each of these experiments are defined on the same sample space (the events generated by ''n'' tosses of a fair coin), yet each strive to quantify different things. Consequently, each experiment should be associated with a different random variable.<br />
<br />
<br />
==The Binomial Distribution==<br />
<br />
Let ''X<sub>n</sub>'' denote the random variable that counts the number of times we observe "heads" when flipping a fair coin ''n'' times. Clearly, ''X'' can take on any integer value from 0 to n, corresponding to the experimental outcome of observing 0 to n "heads". How likely is any particular outcome of this random variable? Notice that we do not care about the order of the observations here, so that if ''n'' = 3, the outcome THH is equivalent to the outcomes HTH and HHT. Each of these outcomes contains two "heads".<br />
<br />
The likelihood of any particular outcome is what is represented by the probability mass function (PMF) of the random variable. Suppose ''n'' = 2. Then we see that the PMF of ''X<sub>2</sub>'' is given by:<br />
<br />
* Pr''(X<sub>2</sub> = 0)'' = 1/4<br />
* Pr''(X<sub>2</sub> = 1)'' = 1/2<br />
* Pr''(X<sub>2</sub> = 2)'' = 1/4<br />
<br />
We say that ''X<sub>2</sub>'' is a '''binomial''' random variable with parameters 2 (the number of times we flip the fair coin) and 1/2 (the probability that we observe heads after a single flip of the coin). We can write ''X<sub>2</sub> ~'' Bin(2, 1/2). <br />
<br />
Just as we did with Bernoulli random variables, we can think of our coin tossing experiment a bit more abstractly. Specifically, we can think of observing "heads" as a success and observing "tails" as a failure. This abstraction will help us generalize our coin tossing procedure to more general experiments.<br />
<br />
If ''X'' is a binomial random variable associated to ''n'' independent trials, each with a success probability ''p'', then the probability mass function of ''X'' is:<br />
<br />
<center><math>\textrm{Pr}( X = k ) = \frac {n!}{k!(n-k)!}\cdotp^k(1-p)^{n-k},</math></center><br />
<br />
where ''k'' is any integer from 0 to ''n''. Recall that the ''factorial'' notation ''n''! denotes the product of the first ''n'' positive integers: ''n''! = 1·2·3···(''n''-1)·''n'', and that we observe the convention 0! = 1.<br />
<br />
For our coin tossing experiment, the probability of success - that is, the probability of observing "heads" - was the same as the probability of failure, observing "tails". In general, we may be interested in processes that have different probabilities of success and failure.<br />
<br />
For example, suppose that we know that 5% of all light bulbs produced by a particular manufacturer are defective. If we buy a package of 6 light bulbs and want to calculate the probability that at least one is defective, we can do so by identifying this experiment with a binomial random variable. Here, we can think of observing a defective bulb as a "success" and observing a functional bulb as a "failure". Then our experiment is given by the random variable ''X<sub>6</sub>'' ~ Bin(6, 1/20), since we will observe 6 bulbs in total and each has a probability of 5/100 = 1/20 of being defective.<br />
<br />
In general, we can think of observing ''n'' independent experimental trials and counting the number of "successes" that we witness. The probability distribution we associate with this setup is the '''binomial''' random variable with parameters ''n'' and ''p'', where ''p'' is the probability of "success." We can denote this distributional relationship to a random variable ''X'' by ''X'' ~ Bin(''n'', ''p''). <br />
<br />
==The Geometric Distribution==<br />
<br />
Now consider a slightly different experiment where we wish to flip our fair coin repeatedly until we first observe "heads". Since we can first observe heads on the first flip, the second flip, the third flip, or on any subsequent flip, we see that the possible values our random variable can take are 1, 2, 3,.... <br />
<br />
Of course, we can consider a more abstract experiment where we observe a sequence of trials until we first observe a success, where the probability of success is ''p''. If we let ''X'' denote such a random variable, then we say that ''X'' is a '''geometric''' random variable with parameter ''p''. We can denote this particular random variable by ''X'' ~ Geo(''p'').<br />
<br />
Letting S denote the outcome of "success" and F denote the outcome of "failure", we can summarize the possible outcomes of a geometric experiment and their likelihoods (the probability mass function) in the following table. Here, we write ''p'' for the probability of success and ''q'' for the probability of failure.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Experimental Outcome<br />
! Value of the Random Variable, ''X = x''<br />
! Probability <br />
|-<br />
| S<br />
| ''x'' = 1<br />
| ''p''<br />
|-<br />
| FS<br />
| ''x'' = 2<br />
| ''q·p''<br />
|-<br />
| FFS<br />
| ''x'' = 3<br />
| ''q<sup>2</sup>·p''<br />
|-<br />
| FFFS<br />
| ''x'' = 4<br />
| ''q<sup>3</sup>·p''<br />
|-<br />
| FFFFS<br />
| ''x'' = 5<br />
| ''q<sup>4</sup>·p''<br />
|-<br />
| ...<br />
| ...<br />
| ...<br />
<br />
|}<br />
<br />
When flipping a fair coin, we see that ''X'' ~ Geo(1/2), so that our PDF takes the particularly simple form Pr(''X = k'') = (1/2)<sup>''k''</sup> for any positive integer ''k''.<br />
<br />
==The Discrete Uniform Distribution==<br />
<br />
Now consider a coin tossing experiment of flipping a fair coin ''n'' times and observing the sequence of "heads" and "tails". Because each outcome of a single flip of the coin is equally likely, and because the outcome of a single flip does not affect the outcome of another flip, we see that the likelihood of observing any particular sequence of "heads" and "tails" will always be the same. Notice that for ''n'' = 2 or 6, we have already encountered this random variable (see Section 1.01 and Sections 1.02 - 1.04 respectively).<br />
<br />
We say that a random variable ''X'' has a '''discrete uniform''' distribution on ''n'' points if ''X'' can assume any one of ''n'' values, each with equal probability. Evidently then, if ''X'' takes integer values from 1 to ''n'', we find that the PMF of ''X'' must be Pr(''X = k'') = 1/n, for any integer ''k'' between 1 and ''n''.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.5_Some_Common_Discrete_Distributions&diff=1720821.5 Some Common Discrete Distributions2012-05-30T16:51:44Z<p>EdKroc: </p>
<hr />
<div>A random variable is a theoretical representation of a physical or experimental process we wish to study. Formally, it is a function defined over a ''sample space'' of possible outcomes. For our simple coin tossing experiment, where we flip a fair coin once and observe the outcome, our sample space consists of the two outcomes, H or T. When tossing two fair coins sequentially, our sample space consists of the four outcomes HH, HT, TH or TT. <br />
<br />
Let us fix a sample space of ''n'' tosses of a fair coin. Experimentally, we may be interested in studying the number of "heads" observed after tossing the coin ''n'' times. Or we could be interested in studying the number of tosses needed to first observe "heads". Or we could be interested in studying how likely a certain sequence of "heads" and "tails" is to be observed. Each of these experiments are defined on the same sample space (the events generated by ''n'' tosses of a fair coin), yet each strive to quantify different things. Consequently, each experiment should be associated with a different random variable.<br />
<br />
<br />
==The Binomial Distribution==<br />
<br />
Let ''X<sub>n</sub>'' denote the random variable that counts the number of times we observe "heads" when flipping a fair coin ''n'' times. Clearly, ''X'' can take on any integer value from 0 to n, corresponding to the experimental outcome of observing 0 to n "heads". How likely is any particular outcome of this random variable? Notice that we do not care about the order of the observations here, so that if ''n'' = 3, the outcome THH is equivalent to the outcomes HTH and HHT. Each of these outcomes contains two "heads".<br />
<br />
The likelihood of any particular outcome is what is represented by the probability mass function (PMF) of the random variable. Suppose ''n'' = 2. Then we see that the PMF of ''X<sub>2</sub>'' is given by:<br />
<br />
* Pr''(X<sub>2</sub> = 0)'' = 1/4<br />
* Pr''(X<sub>2</sub> = 1)'' = 1/2<br />
* Pr''(X<sub>2</sub> = 2)'' = 1/4<br />
<br />
We say that ''X<sub>2</sub>'' is a '''binomial''' random variable with parameters 2 (the number of times we flip the fair coin) and 1/2 (the probability that we observe heads after a single flip of the coin). We can write ''X<sub>2</sub> ~'' Bin(2, 1/2). <br />
<br />
Just as we did with Bernoulli random variables, we can think of our coin tossing experiment a bit more abstractly. Specifically, we can think of observing "heads" as a success and observing "tails" as a failure. This abstraction will help us generalize our coin tossing procedure to more general experiments.<br />
<br />
If ''X'' is a binomial random variable associated to ''n'' independent trials, each with a success probability ''p'', then the probability mass function of ''X'' is:<br />
<br />
<center><math>\textrm{Pr}( X = k ) = \frac {n!}{k!(n-k)!}p^k(1-p)^{n-k},</math></center><br />
<br />
where ''k'' is any integer from 0 to ''n''. Recall that the ''factorial'' notation ''n''! denotes the product of the first ''n'' positive integers: ''n''! = 1·2·3···(''n''-1)·''n'', and that we observe the convention 0! = 1.<br />
<br />
For our coin tossing experiment, the probability of success - that is, the probability of observing "heads" - was the same as the probability of failure, observing "tails". In general, we may be interested in processes that have different probabilities of success and failure.<br />
<br />
For example, suppose that we know that 5% of all light bulbs produced by a particular manufacturer are defective. If we buy a package of 6 light bulbs and want to calculate the probability that at least one is defective, we can do so by identifying this experiment with a binomial random variable. Here, we can think of observing a defective bulb as a "success" and observing a functional bulb as a "failure". Then our experiment is given by the random variable ''X<sub>6</sub>'' ~ Bin(6, 1/20), since we will observe 6 bulbs in total and each has a probability of 5/100 = 1/20 of being defective.<br />
<br />
In general, we can think of observing ''n'' independent experimental trials and counting the number of "successes" that we witness. The probability distribution we associate with this setup is the '''binomial''' random variable with parameters ''n'' and ''p'', where ''p'' is the probability of "success." We can denote this distributional relationship to a random variable ''X'' by ''X'' ~ Bin(''n'', ''p''). <br />
<br />
==The Geometric Distribution==<br />
<br />
Now consider a slightly different experiment where we wish to flip our fair coin repeatedly until we first observe "heads". Since we can first observe heads on the first flip, the second flip, the third flip, or on any subsequent flip, we see that the possible values our random variable can take are 1, 2, 3,.... <br />
<br />
Of course, we can consider a more abstract experiment where we observe a sequence of trials until we first observe a success, where the probability of success is ''p''. If we let ''X'' denote such a random variable, then we say that ''X'' is a '''geometric''' random variable with parameter ''p''. We can denote this particular random variable by ''X'' ~ Geo(''p'').<br />
<br />
Letting S denote the outcome of "success" and F denote the outcome of "failure", we can summarize the possible outcomes of a geometric experiment and their likelihoods (the probability mass function) in the following table. Here, we write ''p'' for the probability of success and ''q'' for the probability of failure.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Experimental Outcome<br />
! Value of the Random Variable, ''X = x''<br />
! Probability <br />
|-<br />
| S<br />
| ''x'' = 1<br />
| ''p''<br />
|-<br />
| FS<br />
| ''x'' = 2<br />
| ''q·p''<br />
|-<br />
| FFS<br />
| ''x'' = 3<br />
| ''q<sup>2</sup>·p''<br />
|-<br />
| FFFS<br />
| ''x'' = 4<br />
| ''q<sup>3</sup>·p''<br />
|-<br />
| FFFFS<br />
| ''x'' = 5<br />
| ''q<sup>4</sup>·p''<br />
|-<br />
| ...<br />
| ...<br />
| ...<br />
<br />
|}<br />
<br />
When flipping a fair coin, we see that ''X'' ~ Geo(1/2), so that our PDF takes the particularly simple form Pr(''X = k'') = (1/2)<sup>''k''</sup> for any positive integer ''k''.<br />
<br />
==The Discrete Uniform Distribution==<br />
<br />
Now consider a coin tossing experiment of flipping a fair coin ''n'' times and observing the sequence of "heads" and "tails". Because each outcome of a single flip of the coin is equally likely, and because the outcome of a single flip does not affect the outcome of another flip, we see that the likelihood of observing any particular sequence of "heads" and "tails" will always be the same. Notice that for ''n'' = 2 or 6, we have already encountered this random variable (see Section 1.01 and Sections 1.02 - 1.04 respectively).<br />
<br />
We say that a random variable ''X'' has a '''discrete uniform''' distribution on ''n'' points if ''X'' can assume any one of ''n'' values, each with equal probability. Evidently then, if ''X'' takes integer values from 1 to ''n'', we find that the PMF of ''X'' must be Pr(''X = k'') = 1/n, for any integer ''k'' between 1 and ''n''.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.5_Some_Common_Discrete_Distributions&diff=1720801.5 Some Common Discrete Distributions2012-05-30T16:48:54Z<p>EdKroc: </p>
<hr />
<div>A random variable is a theoretical representation of a physical or experimental process we wish to study. Formally, it is a function defined over a ''sample space'' of possible outcomes. For our simple coin tossing experiment, where we flip a fair coin once and observe the outcome, our sample space consists of the two outcomes, H or T. When tossing two fair coins sequentially, our sample space consists of the four outcomes HH, HT, TH or TT. <br />
<br />
Let us fix a sample space of ''n'' tosses of a fair coin. Experimentally, we may be interested in studying the number of "heads" observed after tossing the coin ''n'' times. Or we could be interested in studying the number of tosses needed to first observe "heads". Or we could be interested in studying how likely a certain sequence of "heads" and "tails" is to be observed. Each of these experiments are defined on the same sample space (the events generated by ''n'' tosses of a fair coin), yet each strive to quantify different things. Consequently, each experiment should be associated with a different random variable.<br />
<br />
<br />
==The Binomial Distribution==<br />
<br />
Let ''X<sub>n</sub>'' denote the random variable that counts the number of times we observe "heads" when flipping a fair coin ''n'' times. Clearly, ''X'' can take on any integer value from 0 to n, corresponding to the experimental outcome of observing 0 to n "heads". How likely is any particular outcome of this random variable? Notice that we do not care about the order of the observations here, so that if ''n'' = 3, the outcome THH is equivalent to the outcomes HTH and HHT. Each of these outcomes contains two "heads".<br />
<br />
The likelihood of any particular outcome is what is represented by the probability mass function (PMF) of the random variable. Suppose ''n'' = 2. Then we see that the PMF of ''X<sub>2</sub>'' is given by:<br />
<br />
* Pr''(X<sub>2</sub> = 0)'' = 1/4<br />
* Pr''(X<sub>2</sub> = 1)'' = 1/2<br />
* Pr''(X<sub>2</sub> = 2)'' = 1/4<br />
<br />
We say that ''X<sub>2</sub>'' is a '''binomial''' random variable with parameters 2 (the number of times we flip the fair coin) and 1/2 (the probability that we observe heads after a single flip of the coin). We can write ''X<sub>2</sub> ~'' Bin(2, 1/2). <br />
<br />
Just as we did with Bernoulli random variables, we can think of our coin tossing experiment a bit more abstractly. Specifically, we can think of observing "heads" as a success and observing "tails" as a failure. This abstraction will help us generalize our coin tossing procedure to more general experiments.<br />
<br />
If ''X'' is a binomial random variable associated to ''n'' independent trials, each with a success probability ''p'', then the probability mass function of ''X'' is:<br />
<br />
<center><math>\textrm{Pr}( X = k ) = \frac {n!}{k!(n-k)!}\ ·p^k(1-p)^{n-k},</math></center><br />
<br />
where ''k'' is any integer from 0 to ''n''. Recall that the ''factorial'' notation ''n''! denotes the product of the first ''n'' positive integers: ''n''! = 1·2·3···(''n''-1)·''n'', and that we observe the convention 0! = 1.<br />
<br />
For our coin tossing experiment, the probability of success - that is, the probability of observing "heads" - was the same as the probability of failure, observing "tails". In general, we may be interested in processes that have different probabilities of success and failure.<br />
<br />
For example, suppose that we know that 5% of all light bulbs produced by a particular manufacturer are defective. If we buy a package of 6 light bulbs and want to calculate the probability that at least one is defective, we can do so by identifying this experiment with a binomial random variable. Here, we can think of observing a defective bulb as a "success" and observing a functional bulb as a "failure". Then our experiment is given by the random variable ''X<sub>6</sub>'' ~ Bin(6, 1/20), since we will observe 6 bulbs in total and each has a probability of 5/100 = 1/20 of being defective.<br />
<br />
In general, we can think of observing ''n'' independent experimental trials and counting the number of "successes" that we witness. The probability distribution we associate with this setup is the '''binomial''' random variable with parameters ''n'' and ''p'', where ''p'' is the probability of "success." We can denote this distributional relationship to a random variable ''X'' by ''X'' ~ Bin(''n'', ''p''). <br />
<br />
==The Geometric Distribution==<br />
<br />
Now consider a slightly different experiment where we wish to flip our fair coin repeatedly until we first observe "heads". Since we can first observe heads on the first flip, the second flip, the third flip, or on any subsequent flip, we see that the possible values our random variable can take are 1, 2, 3,.... <br />
<br />
Of course, we can consider a more abstract experiment where we observe a sequence of trials until we first observe a success, where the probability of success is ''p''. If we let ''X'' denote such a random variable, then we say that ''X'' is a '''geometric''' random variable with parameter ''p''. We can denote this particular random variable by ''X'' ~ Geo(''p'').<br />
<br />
Letting S denote the outcome of "success" and F denote the outcome of "failure", we can summarize the possible outcomes of a geometric experiment and their likelihoods (the probability mass function) in the following table. Here, we write ''p'' for the probability of success and ''q'' for the probability of failure.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Experimental Outcome<br />
! Value of the Random Variable, ''X = x''<br />
! Probability <br />
|-<br />
| S<br />
| ''x'' = 1<br />
| ''p''<br />
|-<br />
| FS<br />
| ''x'' = 2<br />
| ''q·p''<br />
|-<br />
| FFS<br />
| ''x'' = 3<br />
| ''q<sup>2</sup>·p''<br />
|-<br />
| FFFS<br />
| ''x'' = 4<br />
| ''q<sup>3</sup>·p''<br />
|-<br />
| FFFFS<br />
| ''x'' = 5<br />
| ''q<sup>4</sup>·p''<br />
|-<br />
| ...<br />
| ...<br />
| ...<br />
<br />
|}<br />
<br />
When flipping a fair coin, we see that ''X'' ~ Geo(1/2), so that our PMF takes the particularly simple form Pr(''X = k'') = (1/2)<sup>''k''</sup> for any positive integer ''k''.<br />
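<br />
For a concrete check, the table's pattern Pr(''X = k'') = ''q''<sup>''k''−1</sup>·''p'' can be coded up directly. This short Python sketch (the function name <code>geom_pmf</code> is our own) also confirms numerically that the probabilities sum to 1:<br />

```python
def geom_pmf(p, k):
    """Pr(X = k) for X ~ Geo(p): first success occurs on trial k."""
    q = 1 - p                      # probability of failure
    return q ** (k - 1) * p

# Fair coin (p = 1/2): Pr(X = 3) = (1/2)^3, matching the text.
assert geom_pmf(0.5, 3) == (1 / 2) ** 3

# The probabilities over k = 1, 2, ... sum (numerically) to 1.
total = sum(geom_pmf(0.5, k) for k in range(1, 60))
```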
<br />
==The Discrete Uniform Distribution==<br />
<br />
Now consider a coin tossing experiment of flipping a fair coin ''n'' times and observing the sequence of "heads" and "tails". Because each outcome of a single flip of the coin is equally likely, and because the outcome of a single flip does not affect the outcome of another flip, we see that the likelihood of observing any particular sequence of "heads" and "tails" will always be the same. Notice that for ''n'' = 2 or 6, we have already encountered this random variable (see Section 1.01 and Sections 1.02 - 1.04 respectively).<br />
<br />
We say that a random variable ''X'' has a '''discrete uniform''' distribution on ''n'' points if ''X'' can assume any one of ''n'' values, each with equal probability. Evidently then, if ''X'' takes integer values from 1 to ''n'', we find that the PMF of ''X'' must be Pr(''X = k'') = 1/''n'', for any integer ''k'' between 1 and ''n''.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.1_Random_Variables&diff=1720711.1 Random Variables2012-05-30T16:42:04Z<p>EdKroc: </p>
<hr />
<div>In many areas of science we are interested in quantifying the '''probability''' that a certain outcome of an experiment occurs. We can use a '''random variable''' to identify numerical events that are of interest in an experiment. In this way, a random variable is a theoretical representation of the physical or experimental process we wish to study. More precisely, a random variable is a quantity without a fixed value, but which can assume different values depending on how likely these values are to be observed; these likelihoods are probabilities.<br />
<br />
To quantify the probability that a particular value, or set of values (called an '''event'''), occurs, we use a number between 0 and 1. A probability of 0 implies that the event ''cannot'' occur, whereas a probability of 1 implies that the event ''must'' occur. Any value in the interval (0, 1) means that the event will only occur some of the time. Equivalently, if an event occurs with probability ''p'', then this means there is a 100''p''% chance of observing this event.<br />
<br />
Conventionally, we denote random variables by capital letters, and particular values that they can assume by lowercase letters. So we can say that ''X'' is a random variable that can assume certain particular values ''x'' with certain probabilities. <br />
<br />
We use the notation Pr(''X'' = ''x'') to denote the probability that the random variable ''X'' assumes the particular value ''x''. The range of values ''x'' for which this expression makes sense is of course dependent on the possible values of the random variable ''X''. We distinguish between two key cases.<br />
<br />
If ''X'' can assume only finitely many or countably many values, then we say that ''X'' is a '''discrete random variable'''. Saying that ''X'' can assume only ''finitely many or countably many'' values means that we should be able to ''list'' the possible values for the random variable ''X''. If this list is finite, we can say that ''X'' may take any value from the list ''x<sub>1</sub>'', ''x<sub>2</sub>'',..., ''x<sub>n</sub>'', for some positive integer ''n''. If the list is (countably) infinite, we can list the possible values for ''X'' as ''x<sub>1</sub>'', ''x<sub>2</sub>'',.... This is then a list without end (for example, the list of all positive integers).<br />
<br />
We summarize the basic notions of a discrete random variable:<br />
<br />
# A discrete random variable ''X'' is a quantity that can assume any value ''x'' from a discrete list of values with a certain probability.<br />
# The probability that the discrete random variable ''X'' assumes the particular value ''x'' is denoted by Pr(''X'' = ''x''). This collection of probabilities, along with all possible values ''x'', is the '''probability distribution''' of the random variable ''X''.<br />
# A discrete list of values is any collection of values that is finite or countably infinite (i.e. can be written in a list).<br />
<br />
This terminology is in contrast to a '''continuous random variable''', where the values the random variable can assume are given by a continuum of values. For example, we could define a random variable that can take any value in the interval [1,2]. The values ''X'' can assume are then any real number in [1,2]. We will discuss continuous random variables in detail in the second chapter. For now, we deal strictly with discrete random variables.<br />
<br />
We state a few facts that should be intuitively obvious for probabilities in general. Namely, the chance of some particular event occurring should always be nonnegative and no greater than 100%. Also, the chance that ''something'' happens should be certain. From these facts, we can conclude that the chance of witnessing a particular event should be 100% less the chance of seeing ''anything but'' that particular event.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Discrete Probability Rules<br />
|-<br />
|<br />
1. Probabilities are numbers between 0 and 1 inclusive: 0 ≤ Pr(''X'' = ''x<sub>k</sub>'') ≤ 1 for all ''k''<br />
<br />
2. The sum of all probabilities for a given experiment (random variable) is equal to one: <br />
<center><math>\sum_k \text{Pr}(X = x_k) = 1\!</math></center><br />
<br />
3. The probability of an event is 1 minus the probability that any other event occurs: <br />
<center><math>\text{Pr}(X = x_n) = 1 - \sum_{k\neq n}\text{Pr}(X = x_k)</math></center><br />
|}<br />
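<br />
These three rules are easy to verify mechanically for any concrete discrete distribution. As an illustrative sketch (the fair six-sided die here is a hypothetical example of our own, not one from the text):<br />

```python
# A hypothetical discrete distribution: a fair six-sided die.
pmf = {k: 1 / 6 for k in range(1, 7)}

# Rule 1: every probability lies in [0, 1].
assert all(0 <= pr <= 1 for pr in pmf.values())

# Rule 2: the probabilities sum to one.
total = sum(pmf.values())

# Rule 3: Pr(X = 6) equals 1 minus the probability of every other value.
pr_six = 1 - sum(pr for k, pr in pmf.items() if k != 6)
```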
<br />
<br />
==Example: Tossing a Fair Coin Once==<br />
<br />
If we toss a coin into the air, there are only two possible outcomes: it will land as either "heads" (H) or "tails" (T). If the tossed coin is a "fair" coin, it is equally likely that the coin will land as tails or as heads. In other words, there is a 50% chance (1/2 probability) that the coin will land heads, and a 50% chance (1/2 probability) that the coin will land tails. Notice that the sum of these probabilities is 1 and that each probability is a number in the interval [0,1].<br />
<br />
We can define the random variable ''X'' to represent this coin tossing experiment. That is, we define ''X'' to be the discrete random variable that takes the value 0 with probability 1/2 and takes the value 1 with probability 1/2. Notice that with this notation, the experimental event that "we toss a fair coin and observe heads" is the same as the theoretical event that "the random variable ''X'' is observed to take the value 0"; i.e. we identify the number 0 with the outcome of "heads", and identify the number 1 with the outcome of "tails". We say that ''X'' is a '''Bernoulli random variable''' with parameter 1/2 and can write ''X'' ~ Ber(1/2). <br />
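<br />
A Bernoulli random variable is also easy to simulate. The sketch below (our own illustration; the seed value is arbitrary) draws many observations of ''X'' ~ Ber(1/2) and checks that the value 0 ("heads") appears about half the time:<br />

```python
import random

random.seed(0)  # fixed seed for reproducibility (the choice is arbitrary)

# X ~ Ber(1/2): 0 encodes "heads" and 1 encodes "tails", as in the text.
flips = [random.randint(0, 1) for _ in range(10_000)]
freq_heads = flips.count(0) / len(flips)   # empirical frequency, near 1/2
```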
<br />
==Example: Tossing a Fair Coin Twice==<br />
<br />
Similarly, if we toss a fair coin two times, there are four possible outcomes. Each outcome is a sequence of heads (H) or tails (T):<br />
<br />
* HH<br />
* HT<br />
* TH<br />
* TT<br />
<br />
Because the coin is fair, each outcome is equally likely to occur. There are 4 possible outcomes, so we assign each outcome a probability of 1/4. <br />
<br />
Equivalently, we notice that for any of the four possible events to occur, we must observe two distinct events from two separate flips of a fair coin. So for example, to observe the sequence HH, we must flip a fair coin once and observe H, then flip a fair coin again and observe H once again. (We say that these two events are '''independent''' since the outcome of one event has no effect on the outcome of the other.) Since the probability of observing H after a flip of a fair coin is 1/2, we see that the probability of observing the sequence HH should be (1/2)×(1/2) = 1/4. <br />
<br />
Observe that again, all of our probabilities sum to 1, and each probability is a number in the interval [0, 1]. Just as before, we can identify each outcome of our experiment with a numerical value. Let us make the following assignments:<br />
<br />
* HH -> 0<br />
* HT -> 1<br />
* TH -> 2<br />
* TT -> 3<br />
<br />
This assignment defines a numerical discrete random variable ''Y'' that represents our coin tossing experiment. We see that ''Y'' takes the value 0 with probability 1/4, 1 with probability 1/4, 2 with probability 1/4, and 3 with probability 1/4. Using our general notation to describe this probability distribution, we can summarize by writing<br />
<br />
<math> \text{Pr}(Y = k) = 1/4,\text{ for } k = 0,1,2,3. </math><br />
<br />
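This distribution can also be built by brute-force enumeration. In the sketch below (our own illustration; it uses exact fractions to avoid rounding), each of the four sequences receives probability (1/2)×(1/2), reproducing the uniform distribution on four points:<br />

```python
from itertools import product
from fractions import Fraction

# Enumerate the four equally likely two-flip sequences: HH, HT, TH, TT.
outcomes = ["".join(seq) for seq in product("HT", repeat=2)]
coding = {"HH": 0, "HT": 1, "TH": 2, "TT": 3}   # the text's assignment

# Each flip contributes a factor of 1/2, so every sequence has probability 1/4.
pmf = {coding[s]: Fraction(1, 2) * Fraction(1, 2) for s in outcomes}
```
<br />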
Notice that with this notation, the experimental event that "we toss two fair coins and observe first tails, then heads" is the same as the theoretical event that "the random variable ''Y'' is observed to take the value 2". We say that ''Y'' is a '''uniform discrete random variable''' with parameter 4 since ''Y'' takes each of its four possible values with equal, or uniform, probability. To denote this distributional relationship, we can write ''Y'' ~ Uniform(4).</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.3_Some_Common_Continuous_Distributions&diff=1720662.3 Some Common Continuous Distributions2012-05-30T16:36:37Z<p>EdKroc: </p>
<hr />
<div>Let us consider some common continuous random variables that often arise in practice. We should stress that this is indeed a very small sample of common continuous distributions.<br />
<br />
==The Beta Distribution==<br />
<br />
Suppose the proportion ''p'' of restaurants that make a profit in their first year of operation is given by a certain ''beta'' random variable ''X'', with probability density function:<br />
<br />
<math>f(p) = <br />
\begin{cases}<br />
12p(1 -p )^2 & \text{if } 0 \le p \le 1,\\<br />
0 & \text{elsewhere}.<br />
\end{cases}<br />
</math><br />
<br />
What is the probability that more than half of the restaurants will make a profit during their first year of operation? To answer this question, we calculate the probability as an area under the PDF curve as follows:<br />
<br />
<math> \begin{align}<br />
\mathrm{Pr}(0.5 \le X \le 1) &= \int_{0.5}^{1} f(p) dp \\<br />
&=\int_{0.5}^{1} 12p(1 -p )^2 dp \\<br />
&= \int_{0.5}^{1} \left(12p - 24p^2 + 12p^3\right) dp \\<br />
&= 6p^2 - 8p^3 + 3p^4 \Big|_{0.5}^1 \\<br />
&= (6 - 8 +3) - (1.5 - 1 + 0.1875) \\<br />
&= 0.3125<br />
\end{align} </math><br />
<br />
Therefore, Pr(0.5 ≤ ''X'' ≤ 1) = 0.3125.<br />
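<br />
Since the antiderivative above was computed by hand, an independent numerical check is reassuring. The sketch below (our own; the step count is an arbitrary choice) approximates the same integral with a midpoint Riemann sum:<br />

```python
def f(p):
    """PDF of the beta example: 12p(1-p)^2 on [0, 1]."""
    return 12 * p * (1 - p) ** 2

# Midpoint Riemann sum approximating the integral of f from 0.5 to 1.
n = 100_000
a, b = 0.5, 1.0
h = (b - a) / n
approx = sum(f(a + (i + 0.5) * h) for i in range(n)) * h   # ~ 0.3125
```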
<br />
The example above is a particular case of a beta random variable. In general, a '''beta random variable''' has the generic PDF:<br />
<br />
<math>f(x) = \begin{cases}<br />
kx^{a-1}(1-x)^{b-1} & \text{if } 0 \le x \le 1,\\<br />
0 & \text{elsewhere}<br />
\end{cases}</math><br />
<br />
where the constants ''a'' and ''b'' are greater than zero, and the constant ''k'' is chosen so that the density ''f'' integrates to 1. <br />
<br />
We see that our previous example was a beta random variable given by the above density with ''a'' = 2 and ''b'' = 3. Let us find the associated cumulative distribution function ''F''(''p'') for this random variable. We compute:<br />
<br />
<math> \begin{align}<br />
F(p) &= \int_{-\infty}^{p} f(t) dt \\<br />
&= \int_0^p 12 t (1 - t)^2 dt \\<br />
&= 12 \int_0^p (t - 2t^2 + t^3) dt \\<br />
&= 12\Big( \frac{1}{2} t^2 - \frac{2}{3}t^3 + \frac{1}{4} t^4 \Big) \Big|_0^p \\<br />
&= p^2 ( 6 - 8p + 3p^2),<br />
\end{align} </math><br />
<br />
valid for 0 ≤ ''p'' ≤ 1.<br />
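<br />
The CDF just derived can be sanity-checked numerically: a CDF must run from 0 to 1 over the support, and the difference ''F''(1) − ''F''(0.5) must reproduce the probability 0.3125 found earlier. A small sketch (the helper name is our own):<br />

```python
def beta_cdf(p):
    """CDF F(p) = p^2 (6 - 8p + 3p^2) derived above, valid on [0, 1]."""
    return p**2 * (6 - 8 * p + 3 * p**2)

# A CDF starts at 0 and ends at 1 over the support [0, 1].
f0, f1 = beta_cdf(0.0), beta_cdf(1.0)

# F(1) - F(0.5) reproduces the probability computed from the PDF.
pr = f1 - beta_cdf(0.5)
```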
<br />
==The Exponential Distribution==<br />
<br />
The lifespan of a lightbulb can be modeled by a continuous random variable since lifespan - i.e. ''time'' - is a continuous quantity. A reasonable distribution for this random variable is what is known as an ''exponential distribution''.<br />
<br />
A random variable ''Y'' has an '''exponential distribution with parameter β''' > 0 if its PDF is given by <br />
<br />
<math>f(y) = \begin{cases}<br />
\frac 1{\beta}e^{-y/\beta} & \text{if } 0\leq y <\infty\\<br />
0 & \text{elsewhere}<br />
\end{cases}</math><br />
<br />
Suppose that the lifespan (in months) of lightbulbs manufactured at a certain facility can be modeled by an exponential random variable ''Y'' with parameter β = 4. What is the probability that a particular lightbulb lasts at least a year? Again, we can calculate this probability by evaluating an integral. Since there are 12 months in one year, we calculate<br />
<br />
<math> \begin{align}<br />
\mathrm{Pr}(Y \geq 12) &= \int_{12}^{\infty} f(y) dy \\<br />
&=\int_{12}^{\infty} \frac 14 e^{-y/4} dy \\<br />
&= -e^{-y/4} \Big|_{12}^{\infty} \\<br />
&= 0 - (-e^{-3}) \\<br />
&\approx 0.04979<br />
\end{align} </math><br />
<br />
Thus we can see that it is highly likely we would need to replace a lightbulb produced from this facility within one year of manufacture.<br />
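<br />
Integrating the exponential PDF from ''y'' to ∞ in the same way gives the general survival probability Pr(''Y'' ≥ ''y'') = ''e''<sup>−''y''/β</sup>, which the following sketch (our own helper names) evaluates for the lightbulb example:<br />

```python
from math import exp

beta = 4   # mean lifespan in months (the text's parameter)

def survival(y):
    """Pr(Y >= y) for the exponential distribution: e^(-y/beta)."""
    return exp(-y / beta)

pr_year = survival(12)   # probability a bulb lasts at least 12 months
```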
<br />
==The Continuous Uniform Distribution==<br />
<br />
Our third example of a common continuous random variable is one that we have already encountered. Consider the experiment of randomly choosing a real number from the interval [a,b]. Letting ''X'' denote this random outcome, we say that ''X'' has a ''continuous uniform'' distribution on [a,b] if the probability that we choose a value in some subinterval of [a,b] is given by the relative size of that subinterval in [a,b]. More explicitly, we have the following:<br />
<br />
A random variable ''X'' has a '''continuous uniform''' distribution on [a,b] if its PDF is constant on [a,b]; i.e. its PDF is given by <br />
<br />
<math>f(x) = \begin{cases}<br />
\frac 1{b-a} & \text{if } a\leq x \leq b\\<br />
0 & \text{elsewhere}<br />
\end{cases}</math><br />
<br />
The continuous uniform distribution has a particularly simple representation, just as its discrete counterpart does. Nevertheless, this random variable has great practical and theoretical utility. We will explore this distribution in more detail in the following example and in the exercises.<br />
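<br />
Because the PDF is constant, probabilities for the continuous uniform distribution reduce to ratios of lengths: Pr(''c'' ≤ ''X'' ≤ ''d'') = (''d'' − ''c'')/(''b'' − ''a'') whenever [''c'', ''d''] lies inside [''a'', ''b'']. A minimal sketch (the helper names and the interval [0, 4] are our own made-up example):<br />

```python
def uniform_pdf(x, a, b):
    """PDF of the continuous uniform distribution on [a, b]."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_prob(c, d, a, b):
    """Pr(c <= X <= d) for X ~ Uniform[a, b], assuming a <= c <= d <= b."""
    return (d - c) / (b - a)   # area of a rectangle of height 1/(b-a)

p = uniform_prob(2, 3, 0, 4)   # e.g. Pr(2 <= X <= 3) = 1/4 on [0, 4]
```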
<br />
===A Geometric Problem===<br />
<br />
Consider the square in the ''xy''-plane bounded by the lines ''x'' = 0, ''x'' = 1, ''y'' = 0 and ''y'' = 1. Now consider a vertical line with equation ''x'' = ''b'', where 0 ≤ ''b'' ≤ 1 is fixed. Note that this line will intersect the unit square just defined.<br />
<br />
Suppose we select a point inside this square, uniformly at random. If we let ''X'' be the ''x''-coordinate of this random point, what is the probability that ''X'' is in the interval [0 , ''b'']?<br />
<br />
An illustration of our problem is given in the figure below. Graphically, we are trying to find the probability that a randomly selected point inside the square lies to the ''left'' of the red line. <br />
<br />
[[File:MATH105ProbabilitySquareExample.jpg|400px]]<br />
<br />
The region to the left of the red line is a rectangle with area equal to ''b''. Since the square has total area 1, the probability that our random point lies inside this rectangle is exactly the area of that rectangle: the larger the rectangle, the larger the probability that the point falls inside it. <br />
<br />
* If the probability that the point is between 0 and ''b'' were equal to 0.5, then the red line would have to divide the square into two equal halves: so ''b'' = 0.5.<br />
* If the probability that the point is between 0 and ''b'' were equal to 0.25, then the red line would have to divide the square at 1/4: so ''b'' = 0.25.<br />
* If the probability that the point is between 0 and ''b'' were equal to 1, then the red line would have to lie on the rightmost edge of the square itself: so ''b'' = 1.<br />
<br />
In general, we see that we should have Pr(0 ≤ ''X'' ≤ ''b'') = ''b''. <br />
<br />
Notice that this result matches with the definition of our random variable ''X''. Since we want to select a random point ''uniformly at random'' from the unit square, the random variable ''X'' giving the ''x''-coordinate of this random point should be a ''continuous uniform'' random variable on the interval [0,1]. Thus, the PDF of ''X'' is simply <math>f(x) = 1\!</math>, where <math>0\leq x\leq 1\!</math>.<br />
<br />
<br />
Therefore, <math>\textrm{Pr}(0\leq X\leq b) = \int_0^b dx = b\!</math>, which agrees with the answer we derived using purely geometric considerations.</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.7_Chapter_2_Summary&diff=1720002.7 Chapter 2 Summary2012-05-30T05:41:57Z<p>EdKroc: </p>
<hr />
<div>Chapter 2 defined continuous random variables and investigated some of their properties. We saw several examples of commonly used continuous distributions, including the famous normal distribution.<br />
<br />
==Relationship Between CDF and PDF==<br />
<br />
One of the key features of a random variable is its associated probability distribution, which gives the probabilities that we can observe a certain event, or set of values, under the given random variable. This distribution can take the form of either a cumulative distribution function (CDF) or a probability density function (PDF) for continuous random variables. These two functions are related by the Fundamental Theorem of Calculus: <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="120"<br />
|- style="background-color:#f0f0f0;"<br />
|- <br />
|<math>F(x) = \int_{-\infty}^{x} f(t) dt</math><br />
|}<br />
<br />
The integrand is the PDF of our continuous random variable, and the corresponding integral is the CDF.<br />
<br />
==Calculating Probabilities==<br />
<br />
These two functions give the probabilities associated with observing certain events under a random variable ''X'' in question. The CDF has a direct probabilistic interpretation, given by<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="120"<br />
|- style="background-color:#f0f0f0;"<br />
|- <br />
|<math>F(x) = \text{Pr}(X \le x)</math><br />
|}<br />
<br />
Using the relationship between the CDF and the PDF, probabilities for events associated to continuous random variables can be computed in two equivalent ways. Suppose we wish to calculate the probability that a continuous random variable ''X'' is between two values ''a'' and ''b''. We could use the PDF and integrate to find this probability.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="120"<br />
|- style="background-color:#f0f0f0;"<br />
|- <br />
|<math>\text{Pr}(a \le X \le b) = \int_{a}^{b} f(x) dx</math><br />
|}<br />
<br />
Alternatively, if we wish to use the CDF, ''F''(''x''), we can evaluate the difference ''F''(''b'') - ''F''(''a'') to find this probability.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="120"<br />
|- style="background-color:#f0f0f0;"<br />
|- <br />
|<math>\text{Pr}(a \le X \le b) = F(b) - F(a)</math><br />
|}<br />
<br />
Of course we know that both approaches yield the same result. This fact is precisely the statement of the Fundamental Theorem of Calculus.<br />
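<br />
As a worked illustration of the two equivalent approaches, take the exponential distribution of Section 2.3 with β = 4 (the endpoints ''a'' = 1 and ''b'' = 3 are our own arbitrary choice; the exponential CDF ''F''(''x'') = 1 − ''e''<sup>−''x''/β</sup> follows by integrating its PDF). Both computations agree to high accuracy:<br />

```python
from math import exp

# Exponential distribution with beta = 4; hypothetical endpoints a = 1, b = 3.
beta, a, b = 4, 1, 3

def cdf(x):
    """F(x) = 1 - e^(-x/beta) for the exponential distribution."""
    return 1 - exp(-x / beta)

# Way 1: difference of the CDF.
pr_cdf = cdf(b) - cdf(a)

# Way 2: midpoint Riemann sum of the PDF f(x) = (1/beta) e^(-x/beta).
n = 200_000
h = (b - a) / n
pr_pdf = sum(exp(-(a + (i + 0.5) * h) / beta) / beta for i in range(n)) * h
```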
<br />
==Expected Value, Variance, and Standard Deviation==<br />
<br />
Just as with discrete random variables, the expectation represents the "center" of a random variable, an expected value of an experiment, or the average value of the outcomes of an experiment repeated many times. The variance and standard deviation of a random variable are numerical measures of the spread, or dispersion, of its PDF. Given the PDF ''f''(''x'') of a continuous random variable ''X'', we can calculate these quantities.<br />
<br />
We collect the formulas for the expected value, variance, and standard deviation of a continuous random variable ''X'' with PDF ''f''(''x'') in the following table.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="120"<br />
|- style="background-color:#f0f0f0;"<br />
|- <br />
|<center><math>\mathbb{E}(X) = \int_{-\infty}^{\infty}xf(x)dx</math></center><br />
|-<br />
|<center><math>\text{Var}(X) = \int_{-\infty}^{\infty}(x-\mathbb{E}(X))^2f(x)dx</math></center><br />
|-<br />
|<center><math>\sigma(X) = \sqrt{\text{Var}(X)}</math></center><br />
|}</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.1_Random_Variables&diff=1719981.1 Random Variables2012-05-30T05:30:54Z<p>EdKroc: </p>
<hr />
<div>In many areas of science we are interested in quantifying the '''probability''' that a certain outcome of an experiment occurs. We can use a '''random variable''' to identify numerical events that are of interest in an experiment. In this way, a random variable is a theoretical representation of the physical or experimental process we wish to study. More precisely, a random variable is a quantity without a fixed value, but which can assume different values depending on how likely these values are to be observed; these likelihoods are probabilities.<br />
<br />
To quantify the probability that a particular value, or set of values (called an '''event'''), occurs, we use a number between 0 and 1. A probability of 0 implies that the event ''cannot'' occur, whereas a probability of 1 implies that the event ''must'' occur. Any value in the interval (0, 1) means that the event will only occur some of the time. Equivalently, if an event occurs with probability ''p'', then this means there is a ''p''(100)% chance of observing this event.<br />
<br />
Conventionally, we denote random variables by capital letters, and particular values that they can assume by lowercase letters. So we can say that ''X'' is a random variable that can assume certain particular values ''x'' with certain probabilities. <br />
<br />
We use the notation Pr(''X'' = ''x'') to denote the probability that the random variable ''X'' assumes the particular value ''x''. The range of values ''x'' for which this expression makes sense is of course dependent on the possible values of the random variable ''X''. We distinguish between two key cases.<br />
<br />
If ''X'' can assume only finitely many or countably many values, then we say that ''X'' is a '''discrete random variable'''. Saying that ''X'' can assume only ''finitely many or countably many'' values means that we should be able to ''list'' the possible values for the random variable ''X''. If this list is finite, we can say that ''X'' may take any value from the list ''x<sub>1</sub>'', ''x<sub>2</sub>'',..., ''x<sub>n</sub>'', for some positive integer ''n''. If the list is (countably) infinite, we can list the possible values for ''X'' as ''x<sub>1</sub>'', ''x<sub>2</sub>'',.... This is then a list without end (for example, the list of all positive integers).<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Discrete Random Variables<br />
|-<br />
| 1. A discrete random variable ''X'' is a quantity that can assume any <br />
<br />
value ''x'' from a discrete list of values with a certain probability.<br />
<br />
2. The probability that the random variable ''X'' assumes the particular <br />
<br />
value ''x'' is denoted by Pr(''X'' = ''x''). This collection of probabilities, <br />
<br />
along with all possible values ''x'', is the '''probability distribution''' <br />
<br />
of the random variable ''X''.<br />
<br />
3. A discrete list of values is any collection of values that is finite <br />
<br />
or countably infinite (i.e. can be written in a list).<br />
<br />
|}<br />
<br />
This terminology is in contrast to a '''continuous random variable''', where the values the random variable can assume are given by a continuum of values. For example, we could define a random variable that can take any value in the interval [1,2]. The values ''X'' can assume are then any real number in [1,2]. We will discuss continuous random variables in detail in the second chapter. For now, we deal strictly with discrete random variables.<br />
<br />
We state a few facts that should be intuitively obvious for probabilities in general. Namely, the chance of some particular event occurring should always be nonnegative and no greater than 100%. Also, the chance that ''something'' happens should be certain. From these facts, we can conclude that the chance of witnessing a particular event should be 100% minus the chance of seeing ''anything but'' that event.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Discrete Probability Rules<br />
|-<br />
|<br />
1. Probabilities are numbers between 0 and 1 inclusive: 0 ≤ Pr(''X'' = ''x<sub>k</sub>'') ≤ 1 for all ''k''<br />
<br />
2. The sum of all probabilities for a given experiment (random variable) is equal to one: <br />
<center><math>\sum_k \text{Pr}(X = x_k) = 1\!</math></center><br />
<br />
3. The probability of an event is 1 minus the sum of the probabilities of all other events: <br />
<center><math>\text{Pr}(X = x_n) = 1 - \sum_{k\neq n}\text{Pr}(X = x_k)</math></center><br />
|}<br />
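These three rules can be checked mechanically for any concrete distribution. The toy distribution below is a hypothetical example introduced for illustration, not one from the text:

```python
# A toy probability distribution, used to check the three rules above.
dist = {0: 0.1, 1: 0.25, 2: 0.4, 3: 0.25}

# Rule 1: each probability lies in [0, 1].
assert all(0 <= p <= 1 for p in dist.values())

# Rule 2: the probabilities sum to 1.
assert abs(sum(dist.values()) - 1) < 1e-12

# Rule 3: Pr(X = 2) equals 1 minus the probabilities of every other value.
pr_2 = 1 - sum(p for x, p in dist.items() if x != 2)
print(pr_2)  # 0.4, up to floating-point rounding
```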
<br />
<br />
==Example: Tossing a Fair Coin Once==<br />
<br />
If we toss a coin into the air, there are only two possible outcomes: it will land as either "heads" (H) or "tails" (T). If the tossed coin is a "fair" coin, it is equally likely that the coin will land as tails or as heads. In other words, there is a 50% chance (1/2 probability) that the coin will land heads, and a 50% chance (1/2 probability) that the coin will land tails. Notice that the sum of these probabilities is 1 and that each probability is a number in the interval [0,1].<br />
<br />
We can define the random variable ''X'' to represent this coin tossing experiment. That is, we define ''X'' to be the discrete random variable that takes the value 0 with probability 1/2 and takes the value 1 with probability 1/2. Notice that with this notation, the experimental event that "we toss a fair coin and observe heads" is the same as the theoretical event that "the random variable ''X'' is observed to take the value 0"; i.e. we identify the number 0 with the outcome of "heads", and identify the number 1 with the outcome of "tails". We say that ''X'' is a '''Bernoulli random variable''' with parameter 1/2 and can write ''X'' ~ Ber(1/2). <br />
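A quick simulation makes the Bernoulli model tangible. This sketch (assuming the same 0-for-heads, 1-for-tails coding as above) samples ''X'' repeatedly and checks that heads appears about half the time:

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

# Sample the Bernoulli(1/2) variable X: 0 for heads, 1 for tails,
# matching the coding used in the text.
flips = [random.randint(0, 1) for _ in range(10_000)]
freq_heads = flips.count(0) / len(flips)
print(freq_heads)  # near 1/2
```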
<br />
==Example: Tossing a Fair Coin Twice==<br />
<br />
Similarly, if we toss a fair coin two times, there are four possible outcomes. Each outcome is a sequence of heads (H) or tails (T):<br />
<br />
* HH<br />
* HT<br />
* TH<br />
* TT<br />
<br />
Because the coin is fair, each outcome is equally likely to occur. There are 4 possible outcomes, so we assign each outcome a probability of 1/4. <br />
<br />
Equivalently, we notice that for any of the four possible events to occur, we must observe two distinct events from two separate flips of a fair coin. So for example, to observe the sequence HH, we must flip a fair coin once and observe H, then flip a fair coin again and observe H once again. (We say that these two events are '''independent''' since the outcome of one event has no effect on the outcome of the other.) Since the probability of observing H after a flip of a fair coin is 1/2, we see that the probability of observing the sequence HH should be (1/2)×(1/2) = 1/4. <br />
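The enumeration argument above can be written out directly: listing the four sequences confirms there are exactly four equally likely outcomes, and independence gives Pr(HH) as a product of the two per-flip probabilities.

```python
from itertools import product

# Enumerate every two-flip sequence; there are exactly four.
sequences = ["".join(s) for s in product("HT", repeat=2)]
print(sequences)  # ['HH', 'HT', 'TH', 'TT']

# Independence: Pr(HH) = Pr(H) * Pr(H).
p_h = 0.5
print(p_h * p_h)  # 0.25
```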
<br />
Observe that again, all of our probabilities sum to 1, and each probability is a number in the interval [0, 1]. Just as before, we can identify each outcome of our experiment with a numerical value. Let us make the following assignments:<br />
<br />
* HH -> 0<br />
* HT -> 1<br />
* TH -> 2<br />
* TT -> 3<br />
<br />
This assignment defines a numerical discrete random variable ''Y'' that represents our coin tossing experiment. We see that ''Y'' takes the value 0 with probability 1/4, 1 with probability 1/4, 2 with probability 1/4, and 3 with probability 1/4. Using our general notation to describe this probability distribution, we can summarize by writing<br />
<br />
<math> \text{Pr}(Y = k) = 1/4,\text{ for } k = 0,1,2,3. </math><br />
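This uniform distribution can be checked by simulation. The sketch below (a Monte Carlo check, not part of the text's derivation) runs the two-flip experiment many times, maps each sequence to its value of ''Y'', and verifies that each value occurs with relative frequency near 1/4:

```python
import random

random.seed(0)  # fixed seed for reproducibility

# Map each two-flip outcome to its value of Y, as assigned above.
outcome_to_y = {"HH": 0, "HT": 1, "TH": 2, "TT": 3}

counts = {y: 0 for y in range(4)}
trials = 40_000
for _ in range(trials):
    seq = random.choice("HT") + random.choice("HT")
    counts[outcome_to_y[seq]] += 1

# Each value of Y should occur with relative frequency near 1/4.
print({y: round(c / trials, 3) for y, c in counts.items()})
```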
<br />
Notice that with this notation, the experimental event that "we toss two fair coins and observe first tails, then heads" is the same as the theoretical event that "the random variable ''Y'' is observed to take the value 2". We say that ''Y'' is a '''uniform discrete random variable''' with parameter 4 since ''Y'' takes each of its four possible values with equal, or uniform, probability. To denote this distributional relationship, we can write ''Y'' ~ Uniform(4).</div>EdKrochttps://wiki.ubc.ca/index.php?title=UBC_Wiki:Books/Mprobably&diff=171997UBC Wiki:Books/Mprobably2012-05-30T05:23:20Z<p>EdKroc: </p>
<hr />
<div>{{saved_book}}<br />
<br />
== Probability Appendix ==<br />
;Discrete Random Variables<br />
:[[1.1 Random Variables]]<br />
:[[1.2 Probability Basics]]<br />
:[[1.3 The Probability Mass Function]]<br />
:[[1.4 The Cumulative Distribution Function]]<br />
:[[1.5 Some Common Discrete Distributions]]<br />
:[[1.6 Expected Value]]<br />
:[[1.7 Variance and Standard Deviation]]<br />
:[[1.8 Chapter 1 Summary]]<br />
;Continuous Random Variables<br />
:[[2.1 The Cumulative Distribution Function (Continuous Case)]]<br />
:[[2.2 The Probability Density Function]]<br />
:[[2.3 Some Common Continuous Distributions]]<br />
:[[2.4 The Normal Distribution]]<br />
:[[2.5 Expected Value, Variance, and Standard Deviation]]<br />
:[[2.6 A Sample Problem]]<br />
:[[2.7 Chapter 2 Summary]]<br />
<br />
[[Category:Books|Books/Mprobably]]</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.5_Expected_Value,_Variance,_and_Standard_Deviation&diff=1719922.5 Expected Value, Variance, and Standard Deviation2012-05-30T05:19:26Z<p>EdKroc: </p>
<hr />
<div>Analogous to the discrete case, we can define the expected value, variance, and standard deviation of a continuous random variable. These quantities have the same interpretation as in the discrete setting. The expectation of a random variable is a measure of the center of the distribution, its mean value. The variance and standard deviation are measures of the horizontal spread or dispersion of the random variable.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Expected Value of a Continuous Random Variable<br />
|-<br />
| The '''expected value''' (also called the '''expectation''' or '''mean''') of a continuous random variable ''X'', with probability density function ''f''(''x''), is the number given by<br />
<br />
<center><math>\mathbb{E}(X) = \int_{-\infty}^{\infty} x f(x) dx</math>.</center><br />
<br />
|}<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Variance and Standard Deviation of a Continuous Random Variable<br />
|-<br />
| The '''variance''' of a continuous random variable ''X'', with probability density function ''f''(''x''), is:<br />
<br />
<center><math>\text{Var}(X) = \int_{-\infty}^{\infty} \big(x - \mathbb{E}(X)\big)^2 f(x) dx </math>.</center><br />
<br />
As in the discrete case, the '''standard deviation''', σ, is the positive square root of the variance: <br />
<br />
<center><math>\sigma(X) = \sqrt{\text{Var}(X)} </math>.</center><br />
<br />
|}<br />
<br />
==Simple Example==<br />
<br />
A random variable ''X'' is given by the following PDF. Check that this is a valid PDF and calculate the standard deviation of ''X''.<br />
<br />
<math>f(x) = \begin{cases}<br />
2 (1 - x) & \text{if } 0 \le x \le 1,\\<br />
0 & \text{otherwise} <br />
\end{cases}<br />
</math><br />
<br />
===Solution===<br />
<br />
====Part 1====<br />
<br />
To verify that ''f''(''x'') is a valid PDF, we must check that it is everywhere nonnegative and that it integrates to 1.<br />
<br />
We see that 2(1-x) = 2 - 2x ≥ 0 precisely when x ≤ 1, so in particular 2(1-x) ≥ 0 on the interval [0, 1]. Since ''f''(''x'') = 0 outside this interval, ''f''(''x'') is everywhere nonnegative.<br />
<br />
To check that ''f''(''x'') has unit area under its graph, we calculate<br />
<br />
<math>\begin{align}<br />
\int_{-\infty}^{\infty} f(x) dx = 2 \int_{0}^{1} (1 - x) dx =2 \Big( x - \frac{x^2}{2} \Big) \Big|_0^1=1 <br />
\end{align}</math><br />
<br />
So ''f''(''x'') is indeed a valid PDF.<br />
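The unit-area calculation can also be checked numerically. The sketch below is illustrative code that approximates the integral of this particular PDF with a simple midpoint rule:

```python
# The PDF from the example: 2(1 - x) on [0, 1], and 0 elsewhere.
def f(x):
    return 2 * (1 - x) if 0 <= x <= 1 else 0.0

# Midpoint-rule approximation of the integral of g over [a, b].
def integrate(g, a, b, n=10_000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

area = integrate(f, 0, 1)
print(area)  # very close to 1
```

Since ''f'' is linear on [0, 1], the midpoint rule here is exact up to floating-point rounding.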
<br />
====Part 2====<br />
<br />
To calculate the standard deviation of ''X'', we must first find its variance. Calculating the variance of ''X'' requires its expected value:<br />
<br />
<math>\begin{align}<br />
\mathbb{E}(X) &= \int_{-\infty}^{\infty} x f(x) dx \\<br />
&= \int_{0}^{1} x \Big[ 2 (1 - x) \Big] dx \\<br />
&= 2 \int_{0}^{1} \Big( x - x^2 \Big) dx \\<br />
&= 2 \Big( \frac{x^2}{2} - \frac{x^3}{3} \Big) \Big|_0^1 \\<br />
&= 1/3<br />
\end{align}</math><br />
<br />
Using this value, we compute the variance of ''X'' as follows<br />
<br />
<math>\begin{align}<br />
\text{Var}(X) &= \int_{-\infty}^{\infty} \big(x - \mathbb{E}(X)\big)^2 f(x) dx \\<br />
&= \int_0^1 \big( x - 1/3\big)^2\cdot 2(1-x) dx \\<br />
&= 2 \int_0^1\big( x^2 -\frac{2}{3} x + \frac{1}{9} \big) (1-x) dx \\<br />
&= 2 \int_0^1\big( -x^3 + \frac{5}{3}x^2 -\frac{7}{9} x +\frac{1}{9} \big) dx \\<br />
&= 2 \big( -\frac{1}{4}x^4 + \frac{5}{9}x^3 -\frac{7}{18} x^2 +\frac{1}{9}x \big)\Big|_0^1 \\<br />
&= 2 \big( -\frac{1}{4} + \frac{5}{9} -\frac{7}{18} +\frac{1}{9} \big) \\<br />
&= \frac{1}{18}<br />
\end{align}</math><br />
<br />
Therefore, the standard deviation of ''X'' is<br />
<br />
<math>\begin{align}<br />
\sigma &= \sqrt{\text{Var}(X)}\\<br />
&= \frac 1{3\sqrt{2}}<br />
\end{align}</math><br />
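One way to sanity-check these values is by simulation. The CDF of ''X'' is F(x) = 2x − x² on [0, 1], which inverts to x = 1 − √(1 − u); this inverse formula is derived here for illustration, not taken from the text. The sketch below samples ''X'' by inverse transform and compares the sample mean and standard deviation to 1/3 and 1/(3√2) ≈ 0.2357:

```python
import math
import random
import statistics

random.seed(1)

# Inverse-CDF sampling: F(x) = 2x - x^2 on [0, 1] inverts to x = 1 - sqrt(1 - u).
def sample_x():
    u = random.random()
    return 1 - math.sqrt(1 - u)

xs = [sample_x() for _ in range(200_000)]

print(statistics.mean(xs))   # near E(X) = 1/3
print(statistics.stdev(xs))  # near 1/(3*sqrt(2)) ≈ 0.2357
```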
<br />
===An Alternative Formula for Variance===<br />
<br />
There is an alternative formula for the variance of a random variable that is less tedious than the above definition. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Alternate Formula for the Variance of a Continuous Random Variable<br />
|-<br />
| The '''variance''' of a continuous random variable ''X'' with PDF ''f''(''x'') is the number given by<br />
<br />
<center><math>\text{Var}(X) = \mathbb{E}(X^2) - [\mathbb{E}(X)]^2</math>.</center><br />
<br />
|}<br />
<br />
The derivation of this formula is a simple manipulation and has been relegated to the exercises. We should note that a completely analogous formula holds for the variance of a discrete random variable, with the integral signs replaced by sums.<br />
<br />
==Simple Example Revisited==<br />
<br />
We can use this alternate formula for variance to find the standard deviation of the random variable ''X'' defined above.<br />
<br />
Remembering that the expectation of ''X'' was found to be 1/3, we compute the variance of ''X'' as follows:<br />
<br />
<math>\begin{align}<br />
\text{Var}(X) &= \mathbb{E}(X^2) - [\mathbb{E}(X)]^2\\<br />
&= \int_{-\infty}^{\infty} x^2 f(x) dx - \left(\frac 13\right)^2 \\<br />
&= 2 \int_0^1 x^2 (1-x) dx - \frac{1}{9}\\<br />
&= 2 \int_0^1\big( x^2 - x^3 \big) dx- \frac{1}{9} \\<br />
&= 2 \big( \frac{1}{3}x^3 - \frac{1}{4}x^4 \big) \big|_0^1 - \frac{1}{9} \\<br />
&= 2 \big( \frac{1}{3} - \frac{1}{4} \big) - \frac{1}{9} \\<br />
&= \frac{1}{18}<br />
\end{align}</math><br />
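Because every integrand in this computation is a polynomial on [0, 1], the alternate formula can also be evaluated exactly with rational arithmetic. The sketch below is illustrative code (the coefficient-list representation of polynomials is our own choice) that recovers E(X) = 1/3, E(X²) = 1/6, and Var(X) = 1/18:

```python
from fractions import Fraction

# Polynomials as coefficient lists [c0, c1, ...] meaning c0 + c1*x + c2*x^2 + ...
def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integral_0_1(p):
    # Term by term: the integral of c_k * x^k over [0, 1] is c_k / (k + 1).
    return sum(c / Fraction(k + 1) for k, c in enumerate(p))

f = [Fraction(2), Fraction(-2)]  # the PDF 2 - 2x on [0, 1]
x = [Fraction(0), Fraction(1)]   # the polynomial x

EX = integral_0_1(poly_mul(x, f))                # E(X)
EX2 = integral_0_1(poly_mul(poly_mul(x, x), f))  # E(X^2)
var = EX2 - EX * EX                              # alternate variance formula

print(EX, EX2, var)  # 1/3 1/6 1/18
```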
<br />
In the exercises, you will compute the expectations, variances and standard deviations of many of the random variables we have introduced in this chapter, as well as those of many new ones.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.6_Expected_Value&diff=1719891.6 Expected Value2012-05-30T05:12:07Z<p>EdKroc: </p>
<hr />
<div>For an experiment or general random process, the outcomes are never fixed. We may replicate the experiment and generally expect to observe many different outcomes. Of course, in most reasonable circumstances we will expect these observed differences in the outcomes to collect with some level of concentration about some central value. One central value of fundamental importance is the ''expected value''.<br />
<br />
The '''expected value''' or '''expectation''' (also called the '''mean''') of a random variable ''X'' is the weighted average of the possible values of ''X'', weighted by their corresponding probabilities. Informally, the expectation of a random variable ''X'' is the average value that we would expect to see after repeated observation of the random process. Put another way, the expectation is the long-term average of the realized values of a random variable after repeated observation of the random variable.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="500"<br />
|- style="background-color:#f0f0f0;"<br />
! Expected Value of a Discrete Random Variable<br />
|-<br />
| The expected value, <math>\mathbb{E}(X)\!</math>, of a discrete random variable ''X'' is the weighted average of the possible values of ''X'' where each possible value of ''X'' is weighted by its corresponding probability:<br />
<br />
<center><math>\mathbb{E}(X) = \sum_{k=1}^{N} x_k \textrm{Pr}( X = x_k )</math></center><br />
<br />
where ''N'' is the total number of possible values of ''X''.<br />
|} <br />
<br />
* Do not confuse the ''expected'' value with the ''average'' value of a set of observations: they are two different but related quantities. The average value of a random variable ''X'' would be just the ordinary average of the possible values of ''X''; that is, no possible value of ''X'' receives any special weight. Naturally, this ordinary average is given by <math>\frac 1N\sum_{k=1}^N x_k\!</math>. The expected value of ''X'' is a ''weighted'' average, where certain values get more or less weight depending on how likely or not they are to be observed. A true average value is calculated only when all weights (so all probabilities) are the same. <br />
* The definition of expected value requires numerical values for the ''x<sub>k</sub>''. So if the outcome for an experiment is something qualitative, such as "heads" or "tails", we could calculate the expected value if we assign heads and tails numerical values (0 and 1, for example). <br />
<br />
==Example: Test Scores==<br />
<br />
Recall the test score example from Sections 1.03 and 1.04. We supposed that in a class of 10 people the grades on a test are given by 30, 30, 30, 60, 60, 80, 80, 80, 90, 100. A test is drawn from the collection at random and the score ''X'' is observed. What is the expected value of the random variable ''X''?<br />
<br />
The expected value of the random variable is given by the weighted average of its values:<br />
<br />
<math>\begin{align}<br />
\mathbb{E}(X)<br />
&= \sum_{k=1}^{N} x_k \textrm{Pr}(X = x_k) \\<br />
&= 30 \frac{3}{10} + 60 \frac{2}{10} + 80 \frac{3}{10} + 90 \frac{1}{10} + 100 \frac{1}{10} \\<br />
&= 9 + 12 + 24 + 9 + 10 \\<br />
&= 64<br />
\end{align}</math><br />
<br />
Notice that 64 is not actually a possible value for the random variable ''X''. Nevertheless, this expectation makes sense if we remember that what we have really calculated is the long-term average of repeatedly drawing a test score from this collection. If we drew a test score at random from this collection 100 times (remembering to replace the selected test each time so that we never alter our collection of tests) and then averaged all the observed outcomes, this average value would be very near the expected value of 64.<br />
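A short simulation makes this long-run behaviour concrete. The sketch below is illustrative code: the exact weighted average of the ten scores is 64, and the mean of many random draws with replacement lands close to it:

```python
import random
import statistics

random.seed(42)

scores = [30, 30, 30, 60, 60, 80, 80, 80, 90, 100]

# Exact expectation: every test is equally likely to be drawn, so the
# weighted average reduces to the ordinary average of the ten scores.
expected = sum(scores) / len(scores)
print(expected)  # 64.0

# Long-run average of many draws with replacement approaches E(X).
draws = [random.choice(scores) for _ in range(100_000)]
print(statistics.mean(draws))  # near 64
```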
<br />
==Expectation as a Measure of the Center of a Distribution==<br />
<br />
Another informal way to think of the expectation of a random variable is to notice that it gives a measure of the center of the associated distribution. For our test score example, the PMF of the randomly selected test score ''X'' is shown below. <br />
<br />
[[File:MATH105GradeDistribPDF.png|300px]]<br />
<br />
Notice that the expected value of our randomly selected test score, <math>\mathbb{E}(X) = 64\!</math>, lies near the "center" of the PMF. There are many different ways to quantify the "center" of a distribution (for example, computing the 50th percentile of the possible outcomes), but for our purposes we will concentrate our attention on the expected value.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.1_Random_Variables&diff=1719881.1 Random Variables2012-05-30T05:10:44Z<p>EdKroc: </p>
<hr />
<div>In many areas of science we are interested in quantifying the '''probability''' that a certain outcome of an experiment occurs. We can use a '''random variable''' to identify numerical events that are of interest in an experiment. In this way, a random variable is a theoretical representation of the physical or experimental process we wish to study. More precisely, a random variable is a quantity without a fixed value, but which can assume different values depending on how likely these values are to be observed; these likelihoods are probabilities.<br />
<br />
To quantify the probability that a particular value, or set of values (called an '''event'''), occurs, we use a number between 0 and 1. A probability of 0 implies that the event ''cannot'' occur, whereas a probability of 1 implies that the event ''must'' occur. Any value in the interval (0, 1) means that the event will only occur some of the time. Equivalently, if an event occurs with probability ''p'', then there is a (100''p'')% chance of observing this event.<br />
<br />
Conventionally, we denote random variables by capital letters, and particular values that they can assume by lowercase letters. So we can say that ''X'' is a random variable that can assume certain particular values ''x'' with certain probabilities. <br />
<br />
We use the notation Pr(''X'' = ''x'') to denote the probability that the random variable ''X'' assumes the particular value ''x''. The range of values ''x'' for which this expression makes sense is of course dependent on the possible values of the random variable ''X''. We distinguish between two key cases.<br />
<br />
If ''X'' can assume only finitely many or countably many values, then we say that ''X'' is a '''discrete random variable'''. Saying that ''X'' can assume only ''finitely many or countably many'' values means that we should be able to ''list'' the possible values for the random variable ''X''. If this list is finite, we can say that ''X'' may take any value from the list ''x<sub>1</sub>'', ''x<sub>2</sub>'',..., ''x<sub>n</sub>'', for some positive integer ''n''. If the list is (countably) infinite, we can list the possible values for ''X'' as ''x<sub>1</sub>'', ''x<sub>2</sub>'',.... This is then a list without end (for example, the list of all positive integers).<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="500"<br />
|- style="background-color:#f0f0f0;"<br />
! Discrete Random Variables<br />
|-<br />
|<br />
# A discrete random variable ''X'' is a quantity that can assume any value ''x'' from a discrete list of values with a certain probability.<br />
# The probability that the random variable ''X'' assumes the particular value ''x'' is denoted by Pr(''X'' = ''x''). This collection of probabilities, along with all possible values ''x'', is the '''probability distribution''' of the random variable ''X''.<br />
# A discrete list of values is any collection of values that is finite or countably infinite (i.e. can be written in a list).<br />
|}<br />
<br />
This terminology is in contrast to a '''continuous random variable''', where the values the random variable can assume are given by a continuum of values. For example, we could define a random variable that can take any value in the interval [1,2]. The values ''X'' can assume are then any real number in [1,2]. We will discuss continuous random variables in detail in the second chapter. For now, we deal strictly with discrete random variables.<br />
<br />
We state a few facts that should be intuitively obvious for probabilities in general. Namely, the chance of some particular event occurring should always be nonnegative and no greater than 100%. Also, the chance that ''something'' happens should be certain. From these facts, we can conclude that the chance of witnessing a particular event should be 100% less the chance of seeing ''anything but'' that particular event.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Discrete Probability Rules<br />
|-<br />
|<br />
1. Probabilities are numbers between 0 and 1 inclusive: 0 ≤ Pr(''X'' = ''x<sub>k</sub>'') ≤ 1 for all ''k''<br />
<br />
2. The sum of all probabilities for a given experiment (random variable) is equal to one: <br />
<center><math>\sum_k \text{Pr}(X = x_k) = 1\!</math></center><br />
<br />
3. The probability of an event is 1 minus the sum of the probabilities of all other events: <br />
<center><math>\text{Pr}(X = x_n) = 1 - \sum_{k\neq n}\text{Pr}(X = x_k)</math></center><br />
|}<br />
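These three rules can be checked mechanically for any finite probability distribution. The sketch below uses a small hypothetical PMF, chosen only for illustration:

```python
# A hypothetical PMF for a discrete random variable X (values -> probabilities).
pmf = {0: 0.5, 1: 0.3, 2: 0.2}

# Rule 1: each probability lies in [0, 1].
assert all(0 <= p <= 1 for p in pmf.values())

# Rule 2: the probabilities sum to 1.
assert abs(sum(pmf.values()) - 1) < 1e-12

# Rule 3: Pr(X = 2) equals 1 minus the sum of all the other probabilities.
complement = 1 - sum(p for value, p in pmf.items() if value != 2)
print(complement)  # equals pmf[2] up to floating-point rounding
```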
<br />
<br />
==Example: Tossing a Fair Coin Once==<br />
<br />
If we toss a coin into the air, there are only two possible outcomes: it will land as either "heads" (H) or "tails" (T). If the tossed coin is a "fair" coin, it is equally likely that the coin will land as tails or as heads. In other words, there is a 50% chance (1/2 probability) that the coin will land heads, and a 50% chance (1/2 probability) that the coin will land tails. Notice that the sum of these probabilities is 1 and that each probability is a number in the interval [0,1].<br />
<br />
We can define the random variable ''X'' to represent this coin tossing experiment. That is, we define ''X'' to be the discrete random variable that takes the value 0 with probability 1/2 and takes the value 1 with probability 1/2. Notice that with this notation, the experimental event that "we toss a fair coin and observe heads" is the same as the theoretical event that "the random variable ''X'' is observed to take the value 0"; i.e. we identify the number 0 with the outcome of "heads", and identify the number 1 with the outcome of "tails". We say that ''X'' is a '''Bernoulli random variable''' with parameter 1/2 and can write ''X'' ~ Ber(1/2). <br />
<br />
==Example: Tossing a Fair Coin Twice==<br />
<br />
Similarly, if we toss a fair coin two times, there are four possible outcomes. Each outcome is a sequence of heads (H) or tails (T):<br />
<br />
* HH<br />
* HT<br />
* TH<br />
* TT<br />
<br />
Because the coin is fair, each outcome is equally likely to occur. There are 4 possible outcomes, so we assign each outcome a probability of 1/4. <br />
<br />
Equivalently, we notice that each of the four possible outcomes is produced by the results of two separate flips of a fair coin. For example, to observe the sequence HH, we must flip a fair coin once and observe H, then flip a fair coin again and observe H once again. (We say that these two events are '''independent''' since the outcome of one has no effect on the outcome of the other.) Since the probability of observing H on a single flip of a fair coin is 1/2, the probability of observing the sequence HH should be (1/2)×(1/2) = 1/4. <br />
<br />
Observe that again, all of our probabilities sum to 1, and each probability is a number on the interval [0, 1]. Just as before, we can identify each outcome of our experiment with a numerical value. Let us make the following assignments:<br />
<br />
* HH -> 0<br />
* HT -> 1<br />
* TH -> 2<br />
* TT -> 3<br />
<br />
This assignment defines a numerical discrete random variable ''Y'' that represents our coin tossing experiment. We see that ''Y'' takes the value 0 with probability 1/4, 1 with probability 1/4, 2 with probability 1/4, and 3 with probability 1/4. Using our general notation to describe this probability distribution, we can summarize by writing<br />
<br />
<math> \text{Pr}(Y = k) = 1/4,\text{ for } k = 0,1,2,3. </math><br />
<br />
Notice that with this notation, the experimental event that "we toss two fair coins and observe first tails, then heads" is the same as the theoretical event that "the random variable ''Y'' is observed to take the value 2". We say that ''Y'' is a '''uniform discrete random variable''' with parameter 4 since ''Y'' takes each of its four possible values with equal, or uniform, probability. To denote this distributional relationship, we can write ''Y'' ~ Uniform(4).</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.5_Expected_Value,_Variance,_and_Standard_Deviation&diff=1719872.5 Expected Value, Variance, and Standard Deviation2012-05-30T05:10:15Z<p>EdKroc: </p>
<hr />
<div>Analogous to the discrete case, we can define the expected value, variance, and standard deviation of a continuous random variable. These quantities have the same interpretation as in the discrete setting. The expectation of a random variable is a measure of the center of the distribution, its mean value. The variance and standard deviation are measures of the horizontal spread or dispersion of the random variable.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="500"<br />
|- style="background-color:#f0f0f0;"<br />
! Expected Value, Variance, and Standard Deviation of a Continuous Random Variable<br />
|-<br />
| The '''expected value''' (also called the '''expectation''' or '''mean''') of a continuous random variable ''X'', with probability density function ''f''(''x''), is the number given by<br />
<br />
<center><math>\mathbb{E}(X) = \int_{-\infty}^{\infty} x f(x) dx</math>.</center><br />
<br />
The '''variance''' of ''X'' is:<br />
<br />
<center><math>\text{Var}(X) = \int_{-\infty}^{\infty} \big(x - \mathbb{E}(X)\big)^2 f(x) dx </math>.</center><br />
<br />
As in the discrete case, the '''standard deviation''', σ, is the positive square root of the variance: <br />
<br />
<center><math>\sigma(X) = \sqrt{\text{Var}(X)} </math>.</center><br />
<br />
|}<br />
<br />
==Simple Example==<br />
<br />
A random variable ''X'' is given by the following PDF. Check that this is a valid PDF and calculate the standard deviation of ''X''.<br />
<br />
<math>f(x) = \begin{cases}<br />
2 (1 - x) & \text{if } 0 \le x \le 1,\\<br />
0 & \text{otherwise} <br />
\end{cases}<br />
</math><br />
<br />
===Solution===<br />
<br />
====Part 1====<br />
<br />
To verify that ''f''(''x'') is a valid PDF, we must check that it is everywhere nonnegative and that it integrates to 1.<br />
<br />
We see that 2(1-x) = 2 - 2x ≥ 0 precisely when x ≤ 1, so in particular 2(1-x) ≥ 0 on the interval [0, 1]. Since ''f''(''x'') = 0 outside this interval, ''f''(''x'') is everywhere nonnegative.<br />
<br />
To check that ''f''(''x'') has unit area under its graph, we calculate<br />
<br />
<math>\begin{align}<br />
\int_{-\infty}^{\infty} f(x) dx = 2 \int_{0}^{1} (1 - x) dx =2 \Big( x - \frac{x^2}{2} \Big) \Big|_0^1=1 <br />
\end{align}</math><br />
<br />
So ''f''(''x'') is indeed a valid PDF.<br />
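As a quick numerical sanity check (our own addition, not part of the original derivation), the unit-area calculation can be reproduced in Python with a midpoint-rule approximation:<br />

```python
from math import isclose

# PDF from the example: f(x) = 2(1 - x) on [0, 1], zero elsewhere
def f(x):
    return 2.0 * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0

# Midpoint-rule approximation of the integral of f over its support [0, 1]
n = 100_000
h = 1.0 / n
area = sum(f((i + 0.5) * h) * h for i in range(n))

assert isclose(area, 1.0, rel_tol=1e-9)  # total probability is 1
```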
<br />
====Part 2====<br />
<br />
To calculate the standard deviation of ''X'', we must first find its variance. Calculating the variance of ''X'' requires its expected value:<br />
<br />
<math>\begin{align}<br />
\mathbb{E}(X) &= \int_{-\infty}^{\infty} x f(x) dx \\<br />
&= \int_{0}^{1} x \Big[ 2 (1 - x) \Big] dx \\<br />
&= 2 \int_{0}^{1} \Big( x - x^2 \Big) dx \\<br />
&= 2 \Big( \frac{x^2}{2} - \frac{x^3}{3} \Big) \Big|_0^1 \\<br />
&= 1/3<br />
\end{align}</math><br />
<br />
Using this value, we compute the variance of ''X'' as follows<br />
<br />
<math>\begin{align}<br />
\text{Var}(X) &= \int_{-\infty}^{\infty} \big(x - \mathbb{E}(X)\big)^2 f(x) dx \\<br />
&= \int_0^1 \big( x - 1/3\big)^2\cdot 2(1-x) dx \\<br />
&= 2 \int_0^1\big( x^2 -\frac{2}{3} x + \frac{1}{9} \big) (1-x) dx \\<br />
&= 2 \int_0^1\big( -x^3 + \frac{5}{3}x^2 -\frac{7}{9} x +\frac{1}{9} \big) dx \\<br />
&= 2 \big( -\frac{1}{4}x^4 + \frac{5}{9}x^3 -\frac{7}{18} x^2 +\frac{1}{9}x \big)\Big|_0^1 \\<br />
&= 2 \big( -\frac{1}{4} + \frac{5}{9} -\frac{7}{18} +\frac{1}{9} \big) \\<br />
&= \frac{1}{18}<br />
\end{align}</math><br />
<br />
Therefore, the standard deviation of ''X'' is<br />
<br />
<math>\begin{align}<br />
\sigma &= \sqrt{\text{Var}(X)}\\<br />
&= \frac 1{3\sqrt{2}}<br />
\end{align}</math><br />
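These values can also be confirmed numerically. The sketch below (our own check, relying on the midpoint rule being accurate on this smooth polynomial integrand) approximates E(''X''), Var(''X''), and σ:<br />

```python
from math import isclose, sqrt

def f(x):
    return 2.0 * (1.0 - x)  # PDF on its support [0, 1]

# Midpoint-rule approximations of E(X) and Var(X) over [0, 1]
n = 200_000
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]

mean = sum(x * f(x) * h for x in xs)
var = sum((x - mean) ** 2 * f(x) * h for x in xs)
sd = sqrt(var)

assert isclose(mean, 1 / 3, rel_tol=1e-6)            # E(X) = 1/3
assert isclose(var, 1 / 18, rel_tol=1e-6)            # Var(X) = 1/18
assert isclose(sd, 1 / (3 * sqrt(2)), rel_tol=1e-6)  # sigma = 1/(3*sqrt(2))
```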
<br />
===An Alternative Formula for Variance===<br />
<br />
There is an alternative formula for the variance of a random variable that is often less tedious to apply than the defining integral above. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Alternate Formula for the Variance of a Continuous Random Variable<br />
|-<br />
| The '''variance''' of a continuous random variable ''X'' with PDF ''f''(''x'') is the number given by<br />
<br />
<center><math>\text{Var}(X) = \mathbb{E}(X^2) - [\mathbb{E}(X)]^2</math>.</center><br />
<br />
|}<br />
<br />
The derivation of this formula is a simple manipulation and has been relegated to the exercises. We should note that a completely analogous formula holds for the variance of a discrete random variable, with the integral signs replaced by sums.<br />
<br />
==Simple Example Revisited==<br />
<br />
We can use this alternate formula for variance to find the standard deviation of the random variable ''X'' defined above.<br />
<br />
Remembering that the expectation of ''X'' was found to be 1/3, we compute the variance of ''X'' as follows:<br />
<br />
<math>\begin{align}<br />
\text{Var}(X) &= \mathbb{E}(X^2) - [\mathbb{E}(X)]^2\\<br />
&= \int_{-\infty}^{\infty} x^2 f(x) dx - \left(\frac 13\right)^2 \\<br />
&= 2 \int_0^1 x^2 (1-x) dx - \frac{1}{9}\\<br />
&= 2 \int_0^1\big( x^2 - x^3 \big) dx- \frac{1}{9} \\<br />
&= 2 \big( \frac{1}{3}x^3 - \frac{1}{4}x^4 \big) \big|_0^1 - \frac{1}{9} \\<br />
&= 2 \big( \frac{1}{3} - \frac{1}{4} \big) - \frac{1}{9} \\<br />
&= \frac{1}{18}<br />
\end{align}</math><br />
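Because both integrals here are polynomials, the agreement between the shortcut formula and the defining integral can also be verified exactly with rational arithmetic (a check of our own, using the closed-form antiderivatives):<br />

```python
from fractions import Fraction as Fr

# Closed-form integrals for f(x) = 2(1 - x) on [0, 1]:
#   E(X)  = integral of x * 2(1-x)   = 2(1/2 - 1/3)
#   E(X^2)= integral of x^2 * 2(1-x) = 2(1/3 - 1/4)
EX = 2 * (Fr(1, 2) - Fr(1, 3))
EX2 = 2 * (Fr(1, 3) - Fr(1, 4))

var_shortcut = EX2 - EX ** 2
assert EX == Fr(1, 3)
assert var_shortcut == Fr(1, 18)  # matches the direct computation
```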
<br />
In the exercises, you will compute the expectations, variances and standard deviations of many of the random variables we have introduced in this chapter, as well as those of many new ones.</div>EdKrochttps://wiki.ubc.ca/index.php?title=UBC_Wiki:Books/Mprobably&diff=171986UBC Wiki:Books/Mprobably2012-05-30T05:06:34Z<p>EdKroc: Created page with "{{saved_book}} == Probability Appendix == ;Discrete Random Variables :1.1 Random Variables :1.2 Probability Basics :1.3 The Probability Mass Function :[[1.4 The C..."</p>
<hr />
<div>{{saved_book}}<br />
<br />
== Probability Appendix ==<br />
;Discrete Random Variables<br />
:[[1.1 Random Variables]]<br />
:[[1.2 Probability Basics]]<br />
:[[1.3 The Probability Mass Function]]<br />
:[[1.4 The Cumulative Distribution Function]]<br />
:[[1.5 Some Common Discrete Distributions]]<br />
:[[1.6 Expected Value]]<br />
:[[1.7 Variance and Standard Deviation]]<br />
:[[1.8 Chapter 1 Summary]]<br />
;Continuous Random Variables<br />
:[[2.1 The Cumulative Distribution Function (Continuous Case)]]<br />
:[[2.2 The Probability Density Function]]<br />
:[[2.3 Some Common Continuous Distributions]]<br />
:[[2.4 The Normal Distribution]]<br />
:[[2.5 Expected Value, Variance, and Standard Deviation]]<br />
:[[2.6 A Sample Problem]]<br />
:[[2.7 Chapter 2 Summary]]<br />
<br />
[[Category:Books|Books/Mprobably]]</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.7_Chapter_2_Summary&diff=1719852.7 Chapter 2 Summary2012-05-30T05:05:26Z<p>EdKroc: Created page with "Chapter 2 defined continuous random variables and investigated some of their properties. We saw several examples of commonly used continuous distributions, including the famou..."</p>
<hr />
<div>Chapter 2 defined continuous random variables and investigated some of their properties. We saw several examples of commonly used continuous distributions, including the famous normal distribution.<br />
<br />
==Relationship Between CDF and PDF==<br />
<br />
One of the key features of a random variable is its associated probability distribution, which gives the probabilities that we can observe a certain event, or set of values, under the given random variable. This distribution can take the form of either a cumulative distribution function (CDF) or a probability density function (PDF) for continuous random variables. These two functions are related by the Fundamental Theorem of Calculus: <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="120"<br />
|- style="background-color:#f0f0f0;"<br />
|- <br />
|<math>F(x) = \int_{-\infty}^{x} f(t) dt</math><br />
|}<br />
<br />
The integrand is the PDF of our continuous random variable, and the corresponding integral is the CDF.<br />
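For a concrete illustration (ours, not from the text), take the density f(x) = 2(1 − x) on [0, 1] from Section 2.5; integrating it gives the CDF F(x) = 2x − x² on [0, 1]. A numerical integration of f recovers F:<br />

```python
from math import isclose

def f(t):
    # Triangular PDF: 2(1 - t) on [0, 1], zero elsewhere
    return 2.0 * (1.0 - t) if 0.0 <= t <= 1.0 else 0.0

def F_numeric(x, n=10_000):
    """F(x) = integral of f from -infinity to x (midpoint rule on [0, x])."""
    if x <= 0.0:
        return 0.0
    upper = min(x, 1.0)
    h = upper / n
    return sum(f((i + 0.5) * h) * h for i in range(n))

for x in (0.25, 0.5, 0.9):
    assert isclose(F_numeric(x), 2 * x - x ** 2, rel_tol=1e-6)
```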
<br />
==Calculating Probabilities==<br />
<br />
These two functions give the probabilities associated with observing certain events under a random variable ''X'' in question. The CDF has a direct probabilistic interpretation, given by<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="120"<br />
|- style="background-color:#f0f0f0;"<br />
|- <br />
|<math>F(x) = \text{Pr}(X \le x)</math><br />
|}<br />
<br />
Using the relationship between the CDF and the PDF, probabilities for events associated to continuous random variables can be computed in two equivalent ways. Suppose we wish to calculate the probability that a continuous random variable ''X'' is between two values ''a'' and ''b''. We could use the PDF and integrate to find this probability.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="120"<br />
|- style="background-color:#f0f0f0;"<br />
|- <br />
|<math>\text{Pr}(a \le X \le b) = \int_{a}^{b} f(x) dx</math><br />
|}<br />
<br />
Alternatively, if we wish to use the CDF, ''F''(''x''), we can evaluate the difference ''F''(''b'') - ''F''(''a'') to find this probability.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="120"<br />
|- style="background-color:#f0f0f0;"<br />
|- <br />
|<math>\text{Pr}(a \le X \le b) = F(b) - F(a)</math><br />
|}<br />
<br />
Of course we know that both approaches yield the same result. This fact is precisely the statement of the Fundamental Theorem of Calculus.<br />
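Continuing with the density f(x) = 2(1 − x) on [0, 1] and its CDF F(x) = 2x − x² (our running example), the two routes can be compared directly:<br />

```python
from math import isclose

a, b = 0.2, 0.6

# Route 1: integrate the PDF over [a, b] (midpoint rule)
n = 100_000
h = (b - a) / n
p_pdf = sum(2.0 * (1.0 - (a + (i + 0.5) * h)) * h for i in range(n))

# Route 2: difference of CDF values, F(b) - F(a)
F = lambda x: 2.0 * x - x ** 2
p_cdf = F(b) - F(a)

assert isclose(p_pdf, p_cdf, rel_tol=1e-9)
assert isclose(p_cdf, 0.48, rel_tol=1e-9)
```

Both routes agree, as the Fundamental Theorem of Calculus guarantees.<br />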
<br />
==Expected Value, Variance and Standard Deviation==<br />
<br />
Just as with discrete random variables, the expectation represents the "center" of a random variable: the expected value of an experiment, or the average value of the outcomes of an experiment repeated many times. The variance and standard deviation of a random variable are numerical measures of the spread, or dispersion, of the PDF of ''X''. Given the PDF ''f''(''x'') of a continuous random variable ''X'', we can calculate these quantities as follows.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="120"<br />
|- style="background-color:#f0f0f0;"<br />
|- <br />
|<center><math>\mathbb{E}(X) = \int_{-\infty}^{\infty}xf(x)dx</math></center><br />
|-<br />
|<center><math>\text{Var}(X) = \int_{-\infty}^{\infty}(x-\mathbb{E}(X))^2f(x)dx</math></center><br />
|-<br />
|<center><math>\sigma(X) = \sqrt{\text{Var}(X)}</math></center><br />
|}</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.6_A_Sample_Problem&diff=1719842.6 A Sample Problem2012-05-30T05:04:41Z<p>EdKroc: Created page with "The length of time ''X'', needed by students in a particular course to complete a 1 hour exam is a random variable with PDF given by <math>f(x) = \begin{cases} k(x^2 + x) & \..."</p>
<hr />
<div>The length of time ''X'' needed by students in a particular course to complete a 1-hour exam is a random variable with PDF given by<br />
<br />
<math>f(x) = \begin{cases}<br />
k(x^2 + x) & \text{if } 0 \le x \le 1,\\<br />
0 & \text{elsewhere} <br />
\end{cases}<br />
</math><br />
<br />
For the random variable ''X'', <br />
<br />
# Find the value ''k'' that makes ''f''(''x'') a probability density function (PDF)<br />
# Find the cumulative distribution function (CDF)<br />
# Graph the PDF and the CDF<br />
# Find the probability that a randomly selected student will finish the exam in less than half an hour<br />
# Find the mean time needed to complete the 1-hour exam<br />
# Find the variance and standard deviation of ''X''<br />
<br />
==Solution==<br />
<br />
===Part 1===<br />
<br />
The given PDF must integrate to 1. Thus, we calculate<br />
<br />
<math><br />
\begin{align}<br />
1 &= \int_{-\infty}^{\infty} f(x) dx\\<br />
&= k \int_{0}^{1}( x^2 + x )dx \\<br />
&= k \Big(\frac{x^3}{3}+\frac{x^2}{2}\Big)\Big|_0^1\\<br />
&= k\left(\frac 56\right)<br />
\end{align}<br />
</math><br />
<br />
Therefore, ''k'' = 6/5. Notice also that the PDF is nonnegative everywhere.<br />
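The normalization can be double-checked with exact rational arithmetic (a sketch of our own, using the antiderivative already computed above):<br />

```python
from fractions import Fraction as Fr

# Integral of (x^2 + x) over [0, 1] is 1/3 + 1/2 = 5/6,
# so k = 1 / (5/6) = 6/5 makes the density integrate to 1.
integral = Fr(1, 3) + Fr(1, 2)
k = 1 / integral

assert integral == Fr(5, 6)
assert k == Fr(6, 5)
```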
<br />
===Part 2===<br />
<br />
The CDF, ''F''(''x''), is the area function of the PDF, obtained by integrating the PDF from negative infinity to an arbitrary value ''x''. <br />
<br />
If ''x'' is in the interval (-∞, 0), then<br />
<br />
<math><br />
\begin{align}<br />
F(x) &= \int_{-\infty}^{x} f(t) dt \\<br />
&= \int_{-\infty}^{x} 0 dt \\<br />
&=0<br />
\end{align}<br />
</math><br />
<br />
If ''x'' is in the interval [0, 1], then<br />
<br />
<math><br />
\begin{align}<br />
F(x) &= \int_{-\infty}^{x} f(t) dt \\<br />
&= \int_{-\infty}^{0} f(t) dt + \int_{0}^{x} f(t) dt \\<br />
&= 0 + \frac 65 \Big(\frac{x^3}{3}+\frac{x^2}{2}\Big) \\<br />
&= \frac 65 \Big(\frac{x^3}{3}+\frac{x^2}{2}\Big) \\<br />
\end{align}<br />
</math><br />
<br />
If ''x'' is in the interval (1, ∞), then<br />
<br />
<math><br />
\begin{align}<br />
F(x) &= \int_{-\infty}^{x} f(t) dt \\<br />
&= \int_{-\infty}^{0} f(t) dt + \int_{0}^{1} f(t) dt + \int_{1}^{x} f(t) dt \\<br />
&= 0 + \frac 65 \Big(\frac{x^3}{3}+\frac{x^2}{2}\Big)\Big|_0^1 + 0 \\<br />
&= \frac{6}{5}\cdot\frac{5}{6}\\<br />
&= 1<br />
\end{align}<br />
</math><br />
<br />
The CDF is therefore given by<br />
<br />
<math><br />
F(x) =<br />
\begin{cases}<br />
0 & \text{if } x < 0,\\<br />
\frac 65 \Big(\frac{x^3}{3}+\frac{x^2}{2}\Big) & \text{if } 0 \le x \le 1,\\<br />
1 & \text{if } x > 1.<br />
\end{cases}<br />
</math><br />
<br />
===Part 3===<br />
<br />
The PDF and CDF of ''X'' are shown below. <br />
<br />
[[File:MATH105PDFCDF.jpg|300px]]<br />
<br />
===Part 4===<br />
<br />
The probability that a student will complete the exam in less than half an hour is Pr(''X'' < 0.5). Since ''X'' is a continuous random variable, Pr(''X'' = 0.5) = 0, so this is equivalent to calculating Pr(''X'' ≤ 0.5). This is precisely ''F''(0.5):<br />
<br />
<math><br />
F(0.5) = \frac 65 \Big(\frac{0.5^3}{3}+\frac{0.5^2}{2}\Big) = \frac 65 \Big(\frac{1}{24} + \frac{1}{8}\Big) = \frac{6}{5} \cdot \frac{1}{6} =\frac{1}{5}<br />
</math><br />
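Encoding the piecewise CDF found in Part 2 makes this easy to check (our own snippet; the constant 1.2 is k = 6/5):<br />

```python
from math import isclose

def F(x):
    # Piecewise CDF from Part 2, with k = 6/5 = 1.2
    if x < 0.0:
        return 0.0
    if x <= 1.0:
        return 1.2 * (x ** 3 / 3 + x ** 2 / 2)
    return 1.0

assert F(-1.0) == 0.0
assert isclose(F(1.0), 1.0)   # the CDF reaches 1 at the right endpoint
assert isclose(F(0.5), 0.2)   # Pr(X < 1/2) = 1/5
```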
<br />
===Part 5===<br />
<br />
The mean time to complete a 1 hour exam is the expected value of the random variable ''X''. Consequently, we calculate<br />
<br />
<math>\begin{align}<br />
\mathbb{E}(X) &= \int_{-\infty}^{\infty}xf(x) dx\\<br />
&= \frac 65 \int_0^1 x(x^2+x) dx\\<br />
&= \frac 65 \int_0^1 \big( x^3 + x^2 \big) dx\\<br />
&= \frac 65 \left(\frac {x^4}{4} + \frac {x^3}{3}\right)\Big|_0^1\\<br />
&= \frac 65 \left(\frac 14 + \frac 13\right)\\<br />
&= \frac {7}{10}<br />
\end{align}</math><br />
<br />
===Part 6===<br />
<br />
To find the variance of ''X'', we use our alternate formula to calculate<br />
<br />
<math>\begin{align}<br />
\text{Var}(X) &= \mathbb{E}(X^2) - [\mathbb{E}(X)]^2\\<br />
&= \int_{-\infty}^{\infty} x^2f(x) dx - \left(\frac {7}{10}\right)^2\\<br />
&= \frac 65 \int_0^1 x^2(x^2+x) dx - \frac {49}{100}\\<br />
&= \frac 65 \int_0^1 \big( x^4 + x^3 \big) dx - \frac {49}{100}\\<br />
&= \frac 65 \left(\frac {x^5}{5} + \frac {x^4}{4}\right)\Big|_0^1 - \frac {49}{100}\\<br />
&= \frac 65 \left(\frac 15 + \frac 14\right) - \frac {49}{100}\\<br />
&= \frac {54}{100} - \frac {49}{100}\\<br />
&= \frac {1}{20}<br />
\end{align}</math><br />
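A numerical check of Parts 5 and 6 (our own, via the midpoint rule) confirms the mean and variance:<br />

```python
from math import isclose, sqrt

def f(x):
    return 1.2 * (x ** 2 + x)  # PDF with k = 6/5, support [0, 1]

n = 200_000
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]

mean = sum(x * f(x) * h for x in xs)
ex2 = sum(x ** 2 * f(x) * h for x in xs)
var = ex2 - mean ** 2          # shortcut formula: E(X^2) - E(X)^2

assert isclose(mean, 0.7, rel_tol=1e-6)   # E(X) = 7/10
assert isclose(var, 0.05, rel_tol=1e-6)   # Var(X) = 1/20
assert isclose(sqrt(var), 1 / (2 * sqrt(5)), rel_tol=1e-6)
```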
<br />
Finally, we see that the standard deviation of ''X'' is <br />
<br />
<math><br />
\sigma(X) = \sqrt{\frac 1{20}} = \frac 1{2\sqrt{5}}<br />
</math></div>EdKrochttps://wiki.ubc.ca/index.php?title=2.5_Expected_Value,_Variance,_and_Standard_Deviation&diff=1719832.5 Expected Value, Variance, and Standard Deviation2012-05-30T05:03:58Z<p>EdKroc: Created page with "Analogous to the discrete case, we can define the expected value, variance, and standard deviation of a continuous random variable. These quantities have the same interpretati..."</p>
<hr />
<div>Analogous to the discrete case, we can define the expected value, variance, and standard deviation of a continuous random variable. These quantities have the same interpretation as in the discrete setting. The expectation of a random variable is a measure of the center of the distribution, its mean value. The variance and standard deviation are measures of the horizontal spread or dispersion of the random variable.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="600"<br />
|- style="background-color:#f0f0f0;"<br />
! Expected Value, Variance, and Standard Deviation of a Continuous Random Variable<br />
|-<br />
| The '''expected value''' (also called the '''expectation''' or '''mean''') of a continuous random variable ''X'', with probability density function ''f''(''x''), is the number given by<br />
<br />
<center><math>\mathbb{E}(X) = \int_{-\infty}^{\infty} x f(x) dx</math>.</center><br />
<br />
The '''variance''' of ''X'' is:<br />
<br />
<center><math>\text{Var}(X) = \int_{-\infty}^{\infty} \big(x - \mathbb{E}(X)\big)^2 f(x) dx </math>.</center><br />
<br />
As in the discrete case, the '''standard deviation''', σ, is the positive square root of the variance: <br />
<br />
<center><math>\sigma(X) = \sqrt{\text{Var}(X)} </math>.</center><br />
<br />
|}<br />
<br />
==Simple Example==<br />
<br />
A random variable ''X'' is given by the following PDF. Check that this is a valid PDF and calculate the standard deviation of ''X''.<br />
<br />
<math>f(x) = \begin{cases}<br />
2 (1 - x) & \text{if } 0 \le x \le 1,\\<br />
0 & \text{otherwise} <br />
\end{cases}<br />
</math><br />
<br />
===Solution===<br />
<br />
====Part 1====<br />
<br />
To verify that ''f''(''x'') is a valid PDF, we must check that it is everywhere nonnegative and that it integrates to 1.<br />
<br />
We see that 2(1 - x) = 2 - 2x ≥ 0 precisely when x ≤ 1, so ''f''(''x'') is nonnegative on [0, 1]; since ''f''(''x'') = 0 outside this interval, it is everywhere nonnegative.<br />
<br />
To check that ''f''(''x'') has unit area under its graph, we calculate<br />
<br />
<math>\begin{align}<br />
\int_{-\infty}^{\infty} f(x) dx = 2 \int_{0}^{1} (1 - x) dx =2 \Big( x - \frac{x^2}{2} \Big) \Big|_0^1=1 <br />
\end{align}</math><br />
<br />
So ''f''(''x'') is indeed a valid PDF.<br />
<br />
====Part 2====<br />
<br />
To calculate the standard deviation of ''X'', we must first find its variance. Calculating the variance of ''X'' requires its expected value:<br />
<br />
<math>\begin{align}<br />
\mathbb{E}(X) &= \int_{-\infty}^{\infty} x f(x) dx \\<br />
&= \int_{0}^{1} x \Big[ 2 (1 - x) \Big] dx \\<br />
&= 2 \int_{0}^{1} \Big( x - x^2 \Big) dx \\<br />
&= 2 \Big( \frac{x^2}{2} - \frac{x^3}{3} \Big) \Big|_0^1 \\<br />
&= 1/3<br />
\end{align}</math><br />
<br />
Using this value, we compute the variance of ''X'' as follows<br />
<br />
<math>\begin{align}<br />
\text{Var}(X) &= \int_{-\infty}^{\infty} \big(x - \mathbb{E}(X)\big)^2 f(x) dx \\<br />
&= \int_0^1 \big( x - 1/3\big)^2\cdot 2(1-x) dx \\<br />
&= 2 \int_0^1\big( x^2 -\frac{2}{3} x + \frac{1}{9} \big) (1-x) dx \\<br />
&= 2 \int_0^1\big( -x^3 + \frac{5}{3}x^2 -\frac{7}{9} x +\frac{1}{9} \big) dx \\<br />
&= 2 \big( -\frac{1}{4}x^4 + \frac{5}{9}x^3 -\frac{7}{18} x^2 +\frac{1}{9}x \big)\Big|_0^1 \\<br />
&= 2 \big( -\frac{1}{4} + \frac{5}{9} -\frac{7}{18} +\frac{1}{9} \big) \\<br />
&= \frac{1}{18}<br />
\end{align}</math><br />
<br />
Therefore, the standard deviation of ''X'' is<br />
<br />
<math>\begin{align}<br />
\sigma &= \sqrt{\text{Var}(X)}\\<br />
&= \frac 1{3\sqrt{2}}<br />
\end{align}</math><br />
<br />
===An Alternative Formula for Variance===<br />
<br />
There is an alternative formula for the variance of a random variable that is often less tedious to apply than the defining integral above. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Alternate Formula for the Variance of a Continuous Random Variable<br />
|-<br />
| The '''variance''' of a continuous random variable ''X'' with PDF ''f''(''x'') is the number given by<br />
<br />
<center><math>\text{Var}(X) = \mathbb{E}(X^2) - [\mathbb{E}(X)]^2</math>.</center><br />
<br />
|}<br />
<br />
The derivation of this formula is a simple manipulation and has been relegated to the exercises. We should note that a completely analogous formula holds for the variance of a discrete random variable, with the integral signs replaced by sums.<br />
<br />
==Simple Example Revisited==<br />
<br />
We can use this alternate formula for variance to find the standard deviation of the random variable ''X'' defined above.<br />
<br />
Remembering that the expectation of ''X'' was found to be 1/3, we compute the variance of ''X'' as follows:<br />
<br />
<math>\begin{align}<br />
\text{Var}(X) &= \mathbb{E}(X^2) - [\mathbb{E}(X)]^2\\<br />
&= \int_{-\infty}^{\infty} x^2 f(x) dx - \left(\frac 13\right)^2 \\<br />
&= 2 \int_0^1 x^2 (1-x) dx - \frac{1}{9}\\<br />
&= 2 \int_0^1\big( x^2 - x^3 \big) dx- \frac{1}{9} \\<br />
&= 2 \big( \frac{1}{3}x^3 - \frac{1}{4}x^4 \big) \big|_0^1 - \frac{1}{9} \\<br />
&= 2 \big( \frac{1}{3} - \frac{1}{4} \big) - \frac{1}{9} \\<br />
&= \frac{1}{18}<br />
\end{align}</math><br />
<br />
In the exercises, you will compute the expectations, variances and standard deviations of many of the random variables we have introduced in this chapter, as well as those of many new ones.</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.4_The_Normal_Distribution&diff=1719822.4 The Normal Distribution2012-05-30T05:03:15Z<p>EdKroc: </p>
<hr />
<div>The most important probability distribution in all of science and mathematics is the ''normal'' distribution.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="650"<br />
|- style="background-color:#f0f0f0;"<br />
! The Normal Distribution<br />
|-<br />
|The random variable ''X'' has a '''normal distribution with mean parameter ''μ'' and variance parameter ''σ''<sup>2</sup>''' > 0 if and only if its PDF is given by <br />
<br />
<center><math>f(x) = \frac 1{\sqrt{2\pi\sigma^2}}e^{-\frac {(x-\mu)^2}{2\sigma^2}},\ -\infty < x < \infty.</math></center><br />
<br />
To express this distributional relationship on ''X'', we commonly write ''X'' ~ Normal(''μ'',''σ''<sup>2</sup>).<br />
|}<br />
<br />
This PDF is the classic "bell curve" shape associated with so many experiments and natural phenomena. The parameter μ gives the mean of the distribution (the center of the bell curve) while the σ<sup>2</sup> parameter gives the variance (the horizontal spread of the bell curve). The first of these facts is a simple exercise in integration (see the exercises), while the second requires a bit more ingenuity. <br />
<br />
Recall that the standard deviation of a random variable is defined to be the positive square root of its variance. Thus, a normal random variable has standard deviation σ.<br />
<br />
This random variable enjoys many analytical properties that make it a desirable object to work with theoretically. For example, the normal density is symmetric about its mean μ. This means that, among other things, exactly half of the area under the PDF lies to the right of the mean, and the other half of the area lies to the left of the mean. More generally, we have the following important fact.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Symmetry of Probabilities for a Normal Distribution<br />
|-<br />
|If ''X'' has a normal distribution with mean ''μ'' and variance ''σ''<sup>2</sup>, and if ''x'' is any real number, then<br />
<br />
<center><math>\text{Pr}(X\leq \mu - x) = \text{Pr}(X\geq \mu + x).</math></center><br />
|}<br />
<br />
However, the PDF of a normal distribution is not convenient for calculating probabilities directly. In fact, it can be shown that no closed form in terms of elementary functions exists for the cumulative distribution function of a normal random variable. Thus, we must rely on tables of values to calculate probabilities for events associated to a normal random variable. (The values in these tables are calculated using numerical techniques of integration.)<br />
<br />
A particularly useful version of the normal distribution is the ''standard normal'' distribution, where the mean parameter is 0 and the variance parameter is 1.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="600"<br />
|- style="background-color:#f0f0f0;"<br />
! The Standard Normal Distribution<br />
|-<br />
|The random variable ''Z'' has a '''standard normal distribution''' if its distribution is normal with mean 0 and variance 1. The PDF of ''Z'' is given by <br />
<br />
<center><math>f(z) = \frac 1{\sqrt{2\pi}}e^{-\frac {z^2}{2}},\ -\infty < z < \infty.</math></center><br />
|}<br />
<br />
For a particular value ''x'' of a normal random variable ''X'', the distance from ''x'' to the mean ''μ'' of ''X'' expressed in units of standard deviation ''σ'' is<br />
<br />
<math>z = \frac {x-\mu}{\sigma}.</math><br />
<br />
Since we have subtracted off the mean (the center of the distribution) and factored out the standard deviation (the horizontal spread), this new value ''z'' is not only a rescaled version of ''x'', but is also a realization of a ''standard'' normal random variable. <br />
<br />
In this way, we can standardize any value from a generic normal distribution, transforming it into one from a standard normal distribution. Thus we reduce the problem of calculating probabilities for an event from a normal random variable to calculating probabilities for an event from a ''standard'' normal random variable.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Standardizing a Normal Random Variable<br />
|-<br />
|Let ''X'' have a normal distribution with mean ''μ'' and variance ''σ''<sup>2</sup>. Then the new random variable <br />
<br />
<center><math>Z = \frac {X - \mu}{\sigma}</math></center><br />
<br />
has a standard normal distribution.<br />
|}<br />
<br />
==Calculating Probabilities of Events Using a Standard Normal Distribution==<br />
<br />
Suppose that the test scores for first-year integral calculus final exams are normally distributed with mean 70 and standard deviation 14. Given that Pr(''Z'' ≤ 0.36) = 0.64 and Pr(''Z'' ≤ 1.43) = 0.92 for a standard normal random variable ''Z'', what percentage of final exam scores lie between 75 and 90?<br />
<br />
If we let ''X'' denote the score of a randomly selected final exam, then we know that ''X'' has a normal distribution with parameters μ = 70 and σ = 14. To find the percentage of final exam scores that lie between 75 and 90, we need to use the information about the probabilities of a ''standard'' normal random variable. Thus we must standardize ''X'' using the technique above. <br />
<br />
For our particular problem, we wish to compute<br />
<br />
<math> \text{Pr}(75 \leq X \leq 90).</math><br />
<br />
We proceed by standardizing the random variable ''X'' as well as the particular ''x'' values of interest. Thus, since ''X'' has mean 70 and standard deviation 14, we write<br />
<br />
<math> \text{Pr}(75 \leq X \leq 90) = \text{Pr}\left(\frac {75 - 70}{14} \leq \frac {X - 70}{14} \leq \frac {90 - 70}{14}\right).</math><br />
<br />
Now we have standardized our normal random variable so that <br />
<br />
<math>\frac {X - 70}{14} = Z,</math><br />
<br />
where ''Z'' ~ Normal(0,1).<br />
<br />
Simplifying the numerical expressions from above, we deduce that we must calculate<br />
<br />
<math> \text{Pr}(0.36 \leq Z \leq 1.43).</math><br />
<br />
Now we can use the information we were given, namely that Pr(''Z'' ≤ 0.36) = 0.64 and Pr(''Z'' ≤ 1.43) = 0.92. Using these values, we find <br />
<br />
<math>\begin{align}<br />
\text{Pr}(75 \leq X \leq 90) &= \text{Pr}(0.36 \leq Z \leq 1.43)\\<br />
&= \text{Pr}(Z\leq 1.43) - \text{Pr}(Z\leq 0.36)\\<br />
&= 0.92 - 0.64\\<br />
&= 0.28.<br />
\end{align}</math><br />
<br />
Therefore the percentage of first-year integral calculus final exam scores between 75 and 90 is 28%.<br />
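Since Python's standard library exposes the error function, the standard normal CDF — and hence this probability — can be evaluated without tables (our own check, using the standard identity Φ(z) = (1 + erf(z/√2))/2):<br />

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 70.0, 14.0
p = phi((90.0 - mu) / sigma) - phi((75.0 - mu) / sigma)

assert round(p, 2) == 0.28  # matches the table-based answer
```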
<br />
Now suppose we wish to find the percentage of final exam scores larger than 90, as well as the percentage of final exam scores less than 65. To find the percentage of final exam scores larger than 90, we use our knowledge about probabilities of complementary events:<br />
<br />
<math>\begin{align}<br />
\text{Pr}(X > 90) &= 1 - \text{Pr}(X \leq 90)\\<br />
&= 1 - \text{Pr}(Z\leq 1.43)\\<br />
&= 1 - 0.92\\<br />
&= 0.08.<br />
\end{align}</math><br />
<br />
Thus, we find that 8% of exam scores are larger than 90.<br />
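This tail probability can likewise be confirmed numerically with the error function from Python's standard library (our own check, again via Φ(z) = (1 + erf(z/√2))/2):<br />

```python
from math import erf, sqrt

def phi(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 70.0, 14.0
p_above_90 = 1.0 - phi((90.0 - mu) / sigma)

assert round(p_above_90, 2) == 0.08
```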
<br />
To find the percentage of final exam scores less than 65, we must exploit the symmetry of the normal distribution. Recall that our normal random variable ''X'' has mean 70. We are given information about the probability of a standard normal random variable assuming a value less than 0.36, which we have already seen corresponds to the probability of our normal random variable ''X'' assuming a value less than 75. Now notice that the ''x'' value 65 is the reflection of 75 through the mean of ''X''. That is, both scores 65 and 75 are exactly 5 units from the mean of our random variable ''X''. Thus we can take advantage of the symmetry property of the normal distribution.<br />
<br />
Using the symmetry identity from earlier in this section, we find that<br />
<br />
<math>\begin{align}<br />
\text{Pr}(X < 65) &= \text{Pr}(X < 70 - 5)\\<br />
&= \text{Pr}(X > 70 + 5)\\<br />
&= 1 - \text{Pr}(X\leq 75)\\<br />
&= 1 - \text{Pr}(Z\leq 0.36)\\<br />
&= 1 - 0.64\\<br />
&= 0.36.<br />
\end{align}</math><br />
<br />
Thus, we find that 36% of exam scores are smaller than 65.</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.4_The_Normal_Distribution&diff=1719812.4 The Normal Distribution2012-05-30T05:02:49Z<p>EdKroc: </p>
<hr />
<div>The most important probability distribution in all of science and mathematics is the ''normal'' distribution.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="600"<br />
|- style="background-color:#f0f0f0;"<br />
! The Normal Distribution<br />
|-<br />
|The random variable ''X'' has a '''normal distribution with mean parameter ''μ'' and variance parameter ''σ''<sup>2</sup>''' > 0 if and only if its PDF is given by <br />
<br />
<center><math>f(x) = \frac 1{\sqrt{2\pi\sigma^2}}e^{-\frac {(x-\mu)^2}{2\sigma^2}},\ -\infty < x < \infty.</math></center><br />
<br />
To express this distributional relationship on ''X'', we commonly write ''X'' ~ Normal(''μ'',''σ''<sup>2</sup>).<br />
|}<br />
<br />
This PDF is the classic "bell curve" shape associated with so many experiments and natural phenomena. The parameter μ gives the mean of the distribution (the center of the bell curve) while the σ<sup>2</sup> parameter gives the variance (the horizontal spread of the bell curve). The first of these facts is a simple exercise in integration (see the exercises), while the second requires a bit more ingenuity. <br />
<br />
Recall that the standard deviation of a random variable is defined to be the positive square root of its variance. Thus, a normal random variable has standard deviation σ.<br />
<br />
This random variable enjoys many analytical properties that make it a desirable object to work with theoretically. For example, the normal density is symmetric about its mean μ. This means that, among other things, exactly half of the area under the PDF lies to the right of the mean, and the other half of the area lies to the left of the mean. More generally, we have the following important fact.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Symmetry of Probabilities for a Normal Distribution<br />
|-<br />
|If ''X'' has a normal distribution with mean ''μ'' and variance ''σ''<sup>2</sup>, and if ''x'' is any real number, then<br />
<br />
<center><math>\text{Pr}(X\leq \mu - x) = \text{Pr}(X\geq \mu + x).</math></center><br />
|}<br />
<br />
However, the PDF of a normal distribution is not convenient for calculating probabilities directly. In fact, it can be shown that no closed form in terms of elementary functions exists for the cumulative distribution function of a normal random variable. Thus, we must rely on tables of values to calculate probabilities for events associated to a normal random variable. (The values in these tables are calculated using numerical techniques of integration.)<br />
<br />
A particularly useful version of the normal distribution is the ''standard normal'' distribution, where the mean parameter is 0 and the variance parameter is 1.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="600"<br />
|- style="background-color:#f0f0f0;"<br />
! The Standard Normal Distribution<br />
|-<br />
|The random variable ''Z'' has a '''standard normal distribution''' if its distribution is normal with mean 0 and variance 1. The PDF of ''Z'' is given by <br />
<br />
<center><math>f(z) = \frac 1{\sqrt{2\pi}}e^{-\frac {z^2}{2}},\ -\infty < z < \infty.</math></center><br />
|}<br />
<br />
For a particular value ''x'' of a normal random variable ''X'', the distance from ''x'' to the mean ''μ'' of ''X'' expressed in units of standard deviation ''σ'' is<br />
<br />
<math>z = \frac {x-\mu}{\sigma}.</math><br />
<br />
Since we have subtracted off the mean (the center of the distribution) and factored out the standard deviation (the horizontal spread), this new value ''z'' is not only a rescaled version of ''x'', but is also a realization of a ''standard'' normal random variable. <br />
<br />
In this way, we can standardize any value from a generic normal distribution, transforming it into one from a standard normal distribution. Thus we reduce the problem of calculating probabilities for an event from a normal random variable to calculating probabilities for an event from a ''standard'' normal random variable.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Standardizing a Normal Random Variable<br />
|-<br />
|Let ''X'' have a normal distribution with mean ''μ'' and variance ''σ''<sup>2</sup>. Then the new random variable <br />
<br />
<center><math>Z = \frac {X - \mu}{\sigma}</math></center><br />
<br />
has a standard normal distribution.<br />
|}<br />
<br />
==Calculating Probabilities of Events Using a Standard Normal Distribution==<br />
<br />
Suppose that the test scores for first-year integral calculus final exams are normally distributed with mean 70 and standard deviation 14. Given that Pr(''Z'' ≤ 0.36) = 0.64 and Pr(''Z'' ≤ 1.43) = 0.92 for a standard normal random variable ''Z'', what percentage of final exam scores lie between 75 and 90?<br />
<br />
If we let ''X'' denote the score of a randomly selected final exam, then we know that ''X'' has a normal distribution with parameters μ = 70 and σ = 14. To find the percentage of final exam scores that lie between 75 and 90, we need to use the information about the probabilities of a ''standard'' normal random variable. Thus we must standardize ''X'' using the technique above. <br />
<br />
For our particular problem, we wish to compute<br />
<br />
<math> \text{Pr}(75 \leq X \leq 90).</math><br />
<br />
We proceed by standardizing the random variable ''X'' as well as the particular ''x'' values of interest. Thus, since ''X'' has mean 70 and standard deviation 14, we write<br />
<br />
<math> \text{Pr}(75 \leq X \leq 90) = \text{Pr}\left(\frac {75 - 70}{14} \leq \frac {X - 70}{14} \leq \frac {90 - 70}{14}\right).</math><br />
<br />
Now we have standardized our normal random variable so that <br />
<br />
<math>\frac {X - 70}{14} = Z,</math><br />
<br />
where ''Z'' ~ Normal(0,1).<br />
<br />
Simplifying the numerical expressions from above, we deduce that we must calculate<br />
<br />
<math> \text{Pr}(0.36 \leq Z \leq 1.43).</math><br />
<br />
Now we can use the information we were given, namely that Pr(''Z'' ≤ 0.36) = 0.64 and Pr(''Z'' ≤ 1.43) = 0.92. Using these values, we find <br />
<br />
<math>\begin{align}<br />
\text{Pr}(75 \leq X \leq 90) &= \text{Pr}(0.36 \leq Z \leq 1.43)\\<br />
&= \text{Pr}(Z\leq 1.43) - \text{Pr}(Z\leq 0.36)\\<br />
&= 0.92 - 0.64\\<br />
&= 0.28.<br />
\end{align}</math><br />
<br />
Therefore the percentage of first-year integral calculus final exam scores between 75 and 90 is 28%.<br />
<br />
Now suppose we wish to find the percentage of final exam scores larger than 90, as well as the percentage of final exam scores less than 65. To find the percentage of final exam scores larger than 90, we use our knowledge about probabilities of disjoint events:<br />
<br />
<math>\begin{align}<br />
\text{Pr}(X > 90) &= 1 - \text{Pr}(X \leq 90)\\<br />
&= 1 - \text{Pr}(Z\leq 1.43)\\<br />
&= 1 - 0.92\\<br />
&= 0.08.<br />
\end{align}</math><br />
<br />
Thus, we find that 8% of exam scores are larger than 90.<br />
<br />
To find the percentage of final exam scores less than 65, we must exploit the symmetry of the normal distribution. Recall that our normal random variable ''X'' has mean 70. We are given information about the probability of a standard normal random variable assuming a value less than 0.36, which we have already seen corresponds to the probability of our normal random variable ''X'' assuming a value less than 75. Now notice that the ''x'' value 65 is the reflection of 75 through the mean of ''X''. That is, both scores 65 and 75 are exactly 5 units from the mean of our random variable ''X''. Thus we can take advantage of the symmetry property of the normal distribution.<br />
<br />
Using the symmetry identity from earlier in this section, we find that<br />
<br />
<math>\begin{align}<br />
\text{Pr}(X < 65) &= \text{Pr}(X < 70 - 5)\\<br />
&= \text{Pr}(X > 70 + 5)\\<br />
&= 1 - \text{Pr}(X\leq 75)\\<br />
&= 1 - \text{Pr}(Z\leq 0.36)\\<br />
&= 1 - 0.64\\<br />
&= 0.36.<br />
\end{align}</math><br />
<br />
Thus, we find that 36% of exam scores are smaller than 65.</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.4_The_Normal_Distribution&diff=1719802.4 The Normal Distribution2012-05-30T05:01:55Z<p>EdKroc: Created page with "The most important probability distribution in all of science and mathematics is the ''normal'' distribution. {| border="1" cellspacing="0" cellpadding="4" align="center" |- ..."</p>
<hr />
<div>The most important probability distribution in all of science and mathematics is the ''normal'' distribution.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! The Normal Distribution<br />
|-<br />
|The random variable ''X'' has a '''normal distribution with mean parameter ''μ'' and variance parameter ''σ''<sup>2</sup>''' > 0 if and only if its PDF is given by <br />
<br />
<center><math>f(x) = \frac 1{\sqrt{2\pi\sigma^2}}e^{-\frac {(x-\mu)^2}{2\sigma^2}},\ -\infty < x < \infty.</math></center><br />
<br />
To express this distributional relationship on ''X'', we commonly write ''X'' ~ Normal(''μ'',''σ''<sup>2</sup>).<br />
|}<br />
<br />
This PDF is the classic "bell curve" shape associated with so many experiments and natural phenomena. The parameter μ gives the mean of the distribution (the center of the bell curve) while the σ<sup>2</sup> parameter gives the variance (the horizontal spread of the bell curve). The first of these facts is a simple exercise in integration (see the exercises), while the second requires a bit more ingenuity. <br />
<br />
Recall that the standard deviation of a random variable is defined to be the positive square root of its variance. Thus, a normal random variable has standard deviation σ.<br />
<br />
This random variable enjoys many analytical properties that make it a desirable object to work with theoretically. For example, the normal density is symmetric about its mean μ. This means that, among other things, exactly half of the area under the PDF lies to the right of the mean, and the other half of the area lies to the left of the mean. More generally, we have the following important fact.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Symmetry of Probabilities for a Normal Distribution<br />
|-<br />
|If ''X'' has a normal distribution with mean ''μ'' and variance ''σ''<sup>2</sup>, and if ''x'' is any real number, then<br />
<br />
<center><math>\text{Pr}(X\leq \mu - x) = \text{Pr}(X\geq \mu + x).</math></center><br />
|}<br />
<br />
However, the PDF of a normal distribution is not convenient for calculating probabilities directly. In fact, it can be shown that no closed form exists for the cumulative distribution function of a normal random variable. Thus, we must rely on tables of values to calculate probabilities for events associated to a normal random variable. (The values in these tables are calculated using numerical techniques of integration.)<br />
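To make the last remark concrete, here is a minimal sketch (in Python, and not the production algorithm behind any particular published table) of how such table values can be produced by numerical integration, using the composite Simpson's rule on the standard normal PDF defined below; the function names are our own.<br />
<br />
```python
import math

def standard_normal_pdf(z):
    """PDF of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def standard_normal_cdf(z, n=2000):
    """Approximate Pr(Z <= z) by composite Simpson's rule.

    The lower limit of integration, -infinity, is truncated at -10,
    where the standard normal PDF is already vanishingly small.
    n must be even for Simpson's rule.
    """
    a, b = -10.0, z
    h = (b - a) / n
    total = standard_normal_pdf(a) + standard_normal_pdf(b)
    for k in range(1, n):
        weight = 4 if k % 2 == 1 else 2
        total += weight * standard_normal_pdf(a + k * h)
    return total * h / 3

# Two table values used in the worked example later in this section:
print(round(standard_normal_cdf(0.36), 2))  # 0.64
print(round(standard_normal_cdf(1.43), 2))  # 0.92
```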
<br />
A particularly useful version of the normal distribution is the ''standard normal'' distribution, where the mean parameter is 0 and the variance parameter is 1.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! The Standard Normal Distribution<br />
|-<br />
|The random variable ''Z'' has a '''standard normal distribution''' if its distribution is normal with mean 0 and variance 1. The PDF of ''Z'' is given by <br />
<br />
<center><math>f(z) = \frac 1{\sqrt{2\pi}}e^{-\frac {z^2}{2}},\ -\infty < z < \infty.</math></center><br />
|}<br />
<br />
For a particular value ''x'' of a normal random variable ''X'', the distance from ''x'' to the mean ''μ'' of ''X'' expressed in units of standard deviation ''σ'' is<br />
<br />
<math>z = \frac {x-\mu}{\sigma}.</math><br />
<br />
Since we have subtracted off the mean (the center of the distribution) and factored out the standard deviation (the horizontal spread), this new value ''z'' is not only a rescaled version of ''x'', but is also a realization of a ''standard'' normal random variable. <br />
<br />
In this way, we can standardize any value from a generic normal distribution, transforming it into one from a standard normal distribution. Thus we reduce the problem of calculating probabilities for an event from a normal random variable to calculating probabilities for an event from a ''standard'' normal random variable.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Standardizing a Normal Random Variable<br />
|-<br />
|Let ''X'' have a normal distribution with mean ''μ'' and variance ''σ''<sup>2</sup>. Then the new random variable <br />
<br />
<center><math>Z = \frac {X - \mu}{\sigma}</math></center><br />
<br />
has a standard normal distribution.<br />
|}<br />
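The standardization above is a one-line computation; the sketch below applies it to the exam-score parameters used in the next section (μ = 70, σ = 14, our running example, not data from any real exam).<br />
<br />
```python
def standardize(x, mu, sigma):
    """Convert a value x on a Normal(mu, sigma^2) scale to a z-score."""
    return (x - mu) / sigma

# Scores 75 and 90 on a Normal(70, 14^2) scale, as in the example below:
print(round(standardize(75, 70, 14), 2))  # 0.36
print(round(standardize(90, 70, 14), 2))  # 1.43
```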
<br />
==Calculating Probabilities of Events Using a Standard Normal Distribution==<br />
<br />
Suppose that the test scores for first-year integral calculus final exams are normally distributed with mean 70 and standard deviation 14. Given that Pr(''Z'' ≤ 0.36) = 0.64 and Pr(''Z'' ≤ 1.43) = 0.92 for a standard normal random variable ''Z'', what percentage of final exam scores lie between 75 and 90?<br />
<br />
If we let ''X'' denote the score of a randomly selected final exam, then we know that ''X'' has a normal distribution with parameters μ = 70 and σ = 14. To find the percentage of final exam scores that lie between 75 and 90, we need to use the information about the probabilities of a ''standard'' normal random variable. Thus we must standardize ''X'' using the technique above. <br />
<br />
For our particular problem, we wish to compute<br />
<br />
<math> \text{Pr}(75 \leq X \leq 90).</math><br />
<br />
We proceed by standardizing the random variable ''X'' as well as the particular ''x'' values of interest. Thus, since ''X'' has mean 70 and standard deviation 14, we write<br />
<br />
<math> \text{Pr}(75 \leq X \leq 90) = \text{Pr}\left(\frac {75 - 70}{14} \leq \frac {X - 70}{14} \leq \frac {90 - 70}{14}\right).</math><br />
<br />
Now we have standardized our normal random variable so that <br />
<br />
<math>\frac {X - 70}{14} = Z,</math><br />
<br />
where ''Z'' ~ Normal(0,1).<br />
<br />
Simplifying the numerical expressions from above, we deduce that we must calculate<br />
<br />
<math> \text{Pr}(0.36 \leq Z \leq 1.43).</math><br />
<br />
Now we can use the information we were given, namely that Pr(''Z'' ≤ 0.36) = 0.64 and Pr(''Z'' ≤ 1.43) = 0.92. Using these values, we find <br />
<br />
<math>\begin{align}<br />
\text{Pr}(75 \leq X \leq 90) &= \text{Pr}(0.36 \leq Z \leq 1.43)\\<br />
&= \text{Pr}(Z\leq 1.43) - \text{Pr}(Z\leq 0.36)\\<br />
&= 0.92 - 0.64\\<br />
&= 0.28.<br />
\end{align}</math><br />
<br />
Therefore the percentage of first-year integral calculus final exam scores between 75 and 90 is 28%.<br />
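As a cross-check, Python's math.erf gives the standard normal CDF without tables (the helper name phi is our own). Since the supplied table values 0.64 and 0.92 are rounded to two decimals, the check agrees with the answer above only to that precision.<br />
<br />
```python
import math

def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Pr(75 <= X <= 90) = Pr(0.36 <= Z <= 1.43) for X ~ Normal(70, 14^2)
answer = phi(1.43) - phi(0.36)
print(round(answer, 2))  # 0.28
```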
<br />
Now suppose we wish to find the percentage of final exam scores larger than 90, as well as the percentage of final exam scores less than 65. To find the percentage of final exam scores larger than 90, we use our knowledge about probabilities of complementary events:<br />
<br />
<math>\begin{align}<br />
\text{Pr}(X > 90) &= 1 - \text{Pr}(X \leq 90)\\<br />
&= 1 - \text{Pr}(Z\leq 1.43)\\<br />
&= 1 - 0.92\\<br />
&= 0.08.<br />
\end{align}</math><br />
<br />
Thus, we find that 8% of exam scores are larger than 90.<br />
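The same complement calculation can be checked numerically (again via math.erf; the helper name phi is our own). The unrounded answer is about 0.076, which the two-decimal table value 0.92 turns into 0.08.<br />
<br />
```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Pr(X > 90) = 1 - Pr(Z <= 1.43) for X ~ Normal(70, 14^2)
print(round(1 - phi(1.43), 2))  # 0.08
```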
<br />
To find the percentage of final exam scores less than 65, we must exploit the symmetry of the normal distribution. Recall that our normal random variable ''X'' has mean 70. We are given information about the probability of a standard normal random variable assuming a value less than 0.36, which we have already seen corresponds to the probability of our normal random variable ''X'' assuming a value less than 75. Now notice that the ''x'' value 65 is the reflection of 75 through the mean of ''X''. That is, both scores 65 and 75 are exactly 5 units from the mean of our random variable ''X''. Thus we can take advantage of the symmetry property of the normal distribution.<br />
<br />
Using the symmetry identity from earlier in this section, we find that<br />
<br />
<math>\begin{align}<br />
\text{Pr}(X < 65) &= \text{Pr}(X < 70 - 5)\\<br />
&= \text{Pr}(X > 70 + 5)\\<br />
&= 1 - \text{Pr}(X\leq 75)\\<br />
&= 1 - \text{Pr}(Z\leq 0.36)\\<br />
&= 1 - 0.64\\<br />
&= 0.36.<br />
\end{align}</math><br />
<br />
Thus, we find that 36% of exam scores are smaller than 65.</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.3_Some_Common_Continuous_Distributions&diff=1719792.3 Some Common Continuous Distributions2012-05-30T05:01:14Z<p>EdKroc: Created page with "Let us consider some common continuous random variables that often arise in practice. We should stress that this is indeed a very small sample of common continuous distributio..."</p>
<hr />
<div>Let us consider some common continuous random variables that often arise in practice. We should stress that this is indeed a very small sample of common continuous distributions.<br />
<br />
==The Beta Distribution==<br />
<br />
Suppose the proportion of restaurants that make a profit in their first year of operation is modeled by a certain ''beta'' random variable ''X'', with probability density function:<br />
<br />
<math>f(p) = <br />
\begin{cases}<br />
12p(1 -p )^2 & \text{if } 0 \le p \le 1,\\<br />
0 & \text{elsewhere}.<br />
\end{cases}<br />
</math><br />
<br />
What is the probability that more than half of the restaurants will make a profit during their first year of operation? To answer this question, we calculate the probability as an area under the PDF curve as follows:<br />
<br />
<math> \begin{align}<br />
\mathrm{Pr}(0.5 \le X \le 1) &= \int_{0.5}^{1} f(p) dp \\<br />
&=\int_{0.5}^{1} 12p(1 -p )^2 dp \\<br />
&= \int_{0.5}^{1} \left(12p - 24p^2 + 12p^3\right) dp \\<br />
&= 6p^2 - 8p^3 + 3p^4 \Big|_{0.5}^1 \\<br />
&= (6 - 8 +3) - (1.5 - 1 + 0.1875) \\<br />
&= 0.3125<br />
\end{align} </math><br />
<br />
Therefore, Pr(0.5 &le; ''X'' &le; 1) = 0.3125.<br />
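The antiderivative computation above can be verified numerically. A sketch using a midpoint Riemann sum (the function names are our own) recovers both the probability 0.3125 and the fact that the density integrates to 1:<br />
<br />
```python
def beta_pdf(p):
    """PDF of the example beta random variable: 12 p (1 - p)^2 on [0, 1]."""
    return 12 * p * (1 - p) ** 2 if 0 <= p <= 1 else 0.0

def midpoint_integral(f, a, b, n=100_000):
    """Midpoint Riemann sum approximation of the integral of f on [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

print(round(midpoint_integral(beta_pdf, 0.5, 1.0), 4))  # 0.3125
```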
<br />
The example above is a particular case of a beta random variable. In general, a '''beta random variable''' has the generic PDF:<br />
<br />
<math>f(x) = \begin{cases}<br />
kx^{a-1}(1-x)^{b-1} & \text{if } 0 \le x \le 1,\\<br />
0 & \text{elsewhere}<br />
\end{cases}</math><br />
<br />
where the constants ''a'' and ''b'' are greater than zero, and the constant ''k'' is chosen so that the density ''f'' integrates to 1. <br />
<br />
We see that our previous example was a beta random variable given by the above density with ''a'' = 2 and ''b'' = 3. Let us find the associated cumulative distribution function ''F''(''p'') for this random variable. We compute:<br />
<br />
<math> \begin{align}<br />
F(p) &= \int_{-\infty}^{p} f(t) dt \\<br />
&= \int_0^p 12 t (1 - t)^2 dt \\<br />
&= 12 \int_0^p (t - 2t^2 + t^3) dt \\<br />
&= 12\Big( \frac{1}{2} t^2 - \frac{2}{3}t^3 + \frac{1}{4} t^4 \Big) \Big|_0^p \\<br />
&= p^2 ( 6 - 8p + 3p^2),<br />
\end{align} </math><br />
<br />
valid for 0 ≤ ''p'' ≤ 1.<br />
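The CDF just derived can be evaluated directly; the sketch below (function name ours) confirms that it runs from 0 to 1 and that its differences reproduce the probability computed earlier in this section.<br />
<br />
```python
def beta_cdf(p):
    """CDF F(p) = p^2 (6 - 8p + 3p^2) of the beta example, extended
    to all real p in the usual way."""
    if p < 0:
        return 0.0
    if p > 1:
        return 1.0
    return p * p * (6 - 8 * p + 3 * p * p)

# A CDF must run from 0 to 1, and its differences give probabilities:
assert beta_cdf(0) == 0
assert beta_cdf(1) == 1
print(beta_cdf(1) - beta_cdf(0.5))  # 0.3125
```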
<br />
==The Exponential Distribution==<br />
<br />
The lifespan of a lightbulb can be modeled by a continuous random variable, since lifespan (that is, ''time'') is a continuous quantity. A reasonable distribution for this random variable is what is known as an ''exponential distribution''.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! The Exponential Distribution<br />
|-<br />
|A random variable ''Y'' has an '''exponential distribution with parameter β''' > 0 if its PDF is given by <br />
<br />
<math>f(y) = \begin{cases}<br />
\frac 1{\beta}e^{-y/\beta} & \text{if } 0\leq y <\infty\\<br />
0 & \text{elsewhere}<br />
\end{cases}</math><br />
|}<br />
<br />
Suppose that the lifespan (in months) of lightbulbs manufactured at a certain facility can be modeled by an exponential random variable ''Y'' with parameter β = 4. What is the probability that a particular lightbulb lasts at least a year? Again, we can calculate this probability by evaluating an integral. Since there are 12 months in one year, we calculate<br />
<br />
<math> \begin{align}<br />
\mathrm{Pr}(Y \geq 12) &= \int_{12}^{\infty} f(y) dy \\<br />
&=\int_{12}^{\infty} \frac 14 e^{-y/4} dy \\<br />
&= -e^{-y/4} \Big|_{12}^{\infty} \\<br />
&= 0 - (-e^{-3}) \\<br />
&\approx 0.04979<br />
\end{align} </math><br />
<br />
Thus we can see that it is highly likely we would need to replace a lightbulb produced from this facility within one year of manufacture.<br />
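The integral above reduces to a single exponential, so the calculation is a one-liner; the sketch below (function name ours) reproduces Pr(''Y'' ≥ 12) = e<sup>-3</sup>.<br />
<br />
```python
import math

def exponential_tail(y, beta):
    """Pr(Y >= y) for an exponential random variable with parameter beta.

    Integrating (1/beta) e^(-t/beta) from y to infinity gives e^(-y/beta).
    """
    return math.exp(-y / beta)

# Lifespan in months with beta = 4; probability a bulb lasts a year:
print(round(exponential_tail(12, 4), 5))  # 0.04979
```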
<br />
==The Continuous Uniform Distribution==<br />
<br />
Our third example of a common continuous random variable is one that we have already encountered. Consider the experiment of randomly choosing a real number from the interval [a,b]. Letting ''X'' denote this random outcome, we say that ''X'' has a ''continuous uniform'' distribution on [a,b] if the probability that we choose a value in some subinterval of [a,b] is given by the relative size of that subinterval in [a,b]. More explicitly, we have the following:<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! The Continuous Uniform Distribution<br />
|-<br />
|A random variable ''X'' has a '''continuous uniform''' distribution on [a,b] if its PDF is constant on [a,b]; i.e. its PDF is given by <br />
<br />
<math>f(x) = \begin{cases}<br />
\frac 1{b-a} & \text{if } a\leq x \leq b\\<br />
0 & \text{elsewhere}<br />
\end{cases}</math><br />
|}<br />
<br />
The continuous uniform distribution has a particularly simple representation, just as its discrete counterpart does. Nevertheless, this random variable has great practical and theoretical utility. We will explore this distribution in more detail in the following example and in the exercises.<br />
<br />
===A Geometric Problem===<br />
<br />
Consider the square in the ''xy''-plane bounded by the lines ''x'' = 0, ''x'' = 1, ''y'' = 0 and ''y'' = 1. Now consider a vertical line with equation ''x'' = ''b'', where 0 ≤ ''b'' ≤ 1 is fixed. Note that this line will intersect the unit square just defined.<br />
<br />
Suppose we select a point inside this square, uniformly at random. If we let ''X'' be the ''x''-coordinate of this random point, what is the probability that ''X'' is in the interval [0 , ''b'']?<br />
<br />
An illustration of our problem is given in the figure below. Graphically, we are trying to find the probability that a randomly selected point inside the square lies to the ''left'' of the red line. <br />
<br />
[[File:MATH105ProbabilitySquareExample.jpg|400px]]<br />
<br />
The region to the left of the red line is a rectangle with area equal to ''b''. The probability that our random point lies inside this rectangle is proportional to the area of that rectangle, since the larger the area of the rectangle, the larger the probability is that the point is inside of it. <br />
<br />
* If the probability that the point is between 0 and ''b'' were equal to 0.5, then the red line would have to divide the square into two equal halves: so ''b'' = 0.5.<br />
* If the probability that the point is between 0 and ''b'' were equal to 0.25, then the red line would have to divide the square at 1/4: so ''b'' = 0.25.<br />
* If the probability that the point is between 0 and ''b'' were equal to 1, then the red line would have to lie on the rightmost edge of the square itself: so ''b'' = 1.<br />
<br />
In general, we see that we should have Pr(0 ≤ ''X'' ≤ ''b'') = ''b''. <br />
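This geometric conclusion can also be checked by simulation: draw many points uniformly at random from the unit square and count the fraction that land to the left of the line ''x'' = ''b''. The sketch below (function name, trial count, and seed are our own choices) should return a value close to ''b''.<br />
<br />
```python
import random

def estimate_prob_left_of(b, trials=200_000, seed=105):
    """Estimate Pr(0 <= X <= b) for the x-coordinate X of a point
    drawn uniformly from the unit square, by counting the fraction
    of sampled points that land to the left of the line x = b."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if rng.random() <= b)
    return hits / trials

for b in (0.25, 0.5, 1.0):
    print(b, round(estimate_prob_left_of(b), 2))
```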
<br />
Notice that this result matches with the definition of our random variable ''X''. Since we want to select a random point ''uniformly at random'' from the unit square, the random variable ''X'' giving the ''x''-coordinate of this random point should be a ''continuous uniform'' random variable on the interval [0,1]. Thus, the PDF of ''X'' is simply <math>f(x) = 1\!</math>, where <math>0\leq x\leq 1\!</math>.<br />
<br />
<br />
Therefore, <math>\textrm{Pr}(0\leq X\leq b) = \int_0^b dx = b\!</math>, which agrees with the answer we derived using purely geometric considerations.</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.2_The_Probability_Density_Function&diff=1719782.2 The Probability Density Function2012-05-30T05:00:29Z<p>EdKroc: Created page with "==An Important Distinction Between Continuous and Discrete Random Variables== What is Pr(''X'' = ''x'')? The answer clearly depends on the distribution of the random variable..."</p>
<hr />
<div>==An Important Distinction Between Continuous and Discrete Random Variables==<br />
<br />
What is Pr(''X'' = ''x'')? The answer clearly depends on the distribution of the random variable ''X''. For discrete random variables, we have already seen that if ''x'' is a possible value that ''X'' can assume, then Pr(''X'' = ''x'') is some positive number. But is this still true if ''X'' is a continuous random variable?<br />
<br />
In the context of our maximum outdoor air temperature example from the previous section, we may ask ''what is the probability that the maximum outdoor air temperature in downtown Vancouver on any given day in January is '''exactly''' 0°C?'' Since our measurements of the air temperature are never exact, this probability should be ''zero''. If we had instead asked for the probability that the maximum outdoor air temperature was within 0.005° of 0°C, then we would have arrived at a nonzero probability. All practical measurements of continuous data are always approximate. They may be very precise, but they can never be truly ''exact''. Hence, we cannot expect to measure the likelihood of an exact outcome, only an approximate one.<br />
<br />
In general, for any continuous random variable ''X'', we will always have Pr(''X'' = ''x'') = 0. We can prove this fact directly by appealing to our basic results about combining probabilities of disjoint events.<br />
<br />
Suppose we choose any interval [ ''x'' , ''x'' + ''Δx'']. The probability that the continuous random variable ''X'' lies inside of this interval is <br />
<br />
<math>\text{Pr}(x\le X \le x + \Delta x). </math><br />
<br />
Using our identity for probabilities of disjoint events, we can write this as the difference<br />
<br />
<math>\text{Pr}(X \le x + \Delta x) - \text{Pr}(X \le x). </math><br />
<br />
If we take the limit as ''Δx'' goes to zero, we obtain<br />
<br />
<math>\begin{align}\lim_{\Delta x \rightarrow 0} \text{Pr}(x \le X \le x + \Delta x) <br />
&= \lim_{\Delta x \rightarrow 0} \Big[ \text{Pr}(X \le x + \Delta x) - \text{Pr}(X \le x) \Big] \\<br />
&= \text{Pr}(X \le x ) - \text{Pr}(X \le x)\\<br />
& = 0<br />
\end{align}</math><br />
<br />
Notice that the crucial step in this argument is the evaluation of the limit in the second to last line. Since ''X'' is a ''continuous'' random variable, its CDF ''F''(''x'') is a continuous function; thus, we are allowed to pass the limit through to the argument of the function ''F''(''x'') = Pr(''X'' ≤ ''x''). Notice that if ''X'' were a discrete random variable, this evaluation would not be possible in general since its CDF would not be continuous.<br />
<br />
This gives a direct proof of the fact that Pr(''X'' = ''x'') = 0 for any continuous random variable ''X''. We will see that an even simpler proof will come for free for most continuous random variables via the Fundamental Theorem of Calculus. In order to do this however, we need to relate these probabilities to an integration of some appropriate function. It turns out that this function plays a vital role in describing the distribution of a continuous random variable and will be extremely useful for performing calculations. <br />
<br />
==The Probability Density Function==<br />
<br />
The "appropriate function" referred to above is called the '''probability density function''' (PDF). It can be defined for most continuous random variables, and is extremely useful for calculating probabilities of events associated to a continuous random variable. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! The Probability Density Function<br />
|-<br />
| Let ''F''(''x'') be the cumulative distribution function for a continuous random variable ''X''. <br />
The '''probability density function''' (PDF) of ''X'' is given by<br />
<br />
<center><math>f(x) = \frac{dF(x)}{dx}</math>,</center><br />
<br />
wherever the derivative exists.<br />
|}<br />
<br />
In short, the PDF of a continuous random variable is the derivative of its CDF. Using the Fundamental Theorem of Calculus, we see that the CDF ''F''(''x'') of a continuous random variable ''X'' may be expressed in terms of its PDF:<br />
<br />
<math>F(x) = \int_{-\infty}^x f(t)dt,</math><br />
<br />
where ''f'' denotes the PDF of ''X''.<br />
<br />
==Properties of the PDF==<br />
<br />
This formulation of the PDF via the Fundamental Theorem of Calculus allows us to derive the following properties. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Properties of the Probability Density Function<br />
|-<br />
| If ''f''(''x'') is a '''probability density function''' for a continuous random variable ''X'', then<br />
<br />
<math> \begin{align} <br />
1. & \ F(x) = \mathrm{Pr}(X \le x) = \int_{-\infty}^{x} f(t) dt \\<br />
2. & \ f(x) \ge 0 \ \text{for any value of} \ x \\<br />
3. & \int_{-\infty}^{\infty} f(t)dt = 1<br />
\end{align} </math><br />
|}<br />
<br />
The first property, as we have already seen, is just an application of the Fundamental Theorem of Calculus and relates the CDF of a continuous random variable to its PDF. <br />
<br />
The second property states that for a function to be a PDF, it must be nonnegative. This makes intuitive sense since probabilities are always nonnegative numbers. More precisely, we already know that the CDF ''F''(''x'') is a nondecreasing function of ''x''. Thus, its derivative ''f''(''x'') is nonnegative.<br />
<br />
The third property states that the area between the function and the ''x''-axis must be 1, or that all probabilities must integrate to 1. This must be true since <math>\lim_{x\rightarrow-\infty}F(x) = 0\text{ and } \lim_{x\rightarrow+\infty}F(x) = 1\!</math>; thus Property 3 follows from the Fundamental Theorem of Calculus.<br />
<br />
The PDF gives us a helpful geometrical interpretation of the probability of an event: the probability that a continuous random variable ''X'' is less than some value ''x''<sub>0</sub> is equal to the area under the PDF ''f''(''x'') on the interval (-∞,''x''<sub>0</sub>], as demonstrated in the following graph. <br />
<br />
[[File:MATH105CDFPDFRelation.jpg|300px]]<br />
<br />
Similarly, we have <math>\text{Pr}(a\leq X\leq b) = \int_a^bf(x)dx\!</math>.<br />
<br />
Now that we can interpret probabilities as integrals, it is clear that for a continuous random variable ''X'', we will always have Pr(''X'' = ''x'') = 0. This is simply because the area under a single point of a curve is always zero. In other words, if ''X'' is a continuous random variable, the probability that ''X'' is equal to a particular value will always be zero. We again note that this is an important difference between continuous and discrete random variables.<br />
<br />
The PDF of a continuous random variable plays a similar role as the PMF does for discrete random variables. In particular, they are both used to compute probabilities of events associated to a random variable. However, as the previous paragraph shows, PDFs and PMFs are different objects, just as continuous and discrete random variables are different concepts.<br />
<br />
==Example==<br />
<br />
Let ''f''(''x'') = ''k''(3''x''<sup>2</sup> + 1) for 0 ≤ ''x'' ≤ 2, and ''f''(''x'') = 0 elsewhere. <br />
<br />
# Find the value of ''k'' that makes the given function a PDF. <br />
# Let ''X'' be a continuous random variable whose PDF is ''f''(''x''). Compute the probability that ''X'' is between 1 and 2. <br />
# Find the cumulative distribution function of ''X''.<br />
# Find the probability that ''X'' is ''exactly'' equal to 1.<br />
<br />
==Solution==<br />
<br />
===Part 1)===<br />
<br />
<math> \begin{align}<br />
1 &= \int_{-\infty}^{\infty} f(x) dx \\<br />
&= \int_0^2 k(3x^2+1) dx \\<br />
&= k\Big(x^3 +x\Big)\Big|_0^2 \\<br />
&= k(10)<br />
\end{align} </math><br />
<br />
Therefore, ''k'' = 1/10. <br />
<br />
Notice that ''f''(''x'') ≥ 0 for all ''x''. Also notice that we can rewrite this PDF as a piecewise function:<br />
<br />
<math> f(x) = \begin{cases} \frac 1{10}(3x^2 + 1) & \text{if } 0\leq x\leq 2\\<br />
0 & \text{otherwise}<br />
\end{cases}</math><br />
<br />
===Part 2)===<br />
<br />
Using our value of ''k'' from Part 1:<br />
<br />
<math> \begin{align}<br />
\mathrm{Pr}(1 \le X \le 2) = \int_1^2 \frac{3x^2+1}{10} dx = \frac{x^3+x}{10} \Big|_1^2 = 1 - 2/10 = 4/5<br />
\end{align} </math><br />
<br />
Therefore, Pr(1 ≤ ''X'' ≤ 2) is 4/5.<br />
<br />
===Part 3)===<br />
<br />
Using the Fundamental Theorem of Calculus, the CDF of ''X'' at ''x'' in [0,2] is <br />
<br />
<math> \begin{align}<br />
\text{Pr}(X\leq x) = F(x) & = \int_{-\infty}^x f(t)dt \\<br />
& = \int_0^x\frac 1{10}(3t^2+1)dt\\<br />
& = \frac 1{10}(t^3+t)\Big|_0^x\\<br />
& = \frac 1{10}(x^3+x), \text{ for } 0\leq x\leq 2<br />
\end{align}</math><br />
<br />
A similar calculation easily verifies that ''F''(''x'') = 0 for all ''x'' < 0 and that ''F''(''x'') = 1 for all ''x'' > 2.<br />
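Parts 1 through 3 can all be verified numerically. The sketch below (function and helper names are our own) checks that the PDF with ''k'' = 1/10 integrates to 1, recomputes Pr(1 ≤ ''X'' ≤ 2) by direct integration, and recovers the same answer from the CDF of Part 3.<br />
<br />
```python
def f(x):
    """The example PDF with k = 1/10: f(x) = (3x^2 + 1)/10 on [0, 2]."""
    return (3 * x * x + 1) / 10 if 0 <= x <= 2 else 0.0

def F(x):
    """The CDF found in Part 3: F(x) = (x^3 + x)/10 on [0, 2]."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return (x ** 3 + x) / 10

def midpoint_integral(g, a, b, n=100_000):
    """Midpoint Riemann sum approximation of the integral of g on [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

print(round(midpoint_integral(f, 0, 2), 6))  # total probability: 1.0
print(round(midpoint_integral(f, 1, 2), 6))  # Pr(1 <= X <= 2): 0.8
print(round(F(2) - F(1), 6))                 # same answer via the CDF: 0.8
```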
<br />
===Part 4)===<br />
<br />
Since ''X'' is a continuous random variable, we immediately know that the probability that it equals any one particular value must be zero. More directly, we compute<br />
<br />
<math> \text{Pr}(X = 1) = \int_1^1f(t)dt = 0</math><br />
<br />
==An Important Subtlety==<br />
<br />
There is an important subtlety in the definition of the PDF of a continuous random variable. Notice that the PDF of a continuous random variable ''X'' can only be defined when the cumulative distribution function of ''X'' is ''differentiable''. <br />
<br />
As a first example, consider the experiment of randomly choosing a real number from the interval [0,1]. Let ''X'' denote the outcome of this experiment. Since the likelihood of picking a number in a given subinterval of [0,1] is proportional to the length of that subinterval, we see that the CDF ''F''(''x'') is given by<br />
<br />
<math> \text{Pr}(X\leq x) = F(x) = <br />
\begin{cases}<br />
0 & \text{if } x<0\\<br />
x & \text{if } 0\leq x\leq 1\\<br />
1 & \text{if } x>1<br />
\end{cases}</math><br />
<br />
This function is differentiable everywhere ''except'' at the points ''x'' = 0 and ''x'' = 1. So the PDF of ''X'' is defined at all points except for these two:<br />
<br />
<math> \frac {dF(x)}{dx} = f(x) =<br />
\begin{cases}<br />
1 & \text{if } 0<x<1\\<br />
0 & \text{if } x<0 \text{ or } x>1<br />
\end{cases}</math> <br />
<br />
Nevertheless, it can still make sense to define the PDF at the points where the CDF fails to be differentiable. We know that the integral over a single point is always zero, so we can always change the value of our PDF at any particular point (or at any finite set of points) without changing the probabilities of events associated to our random variable. Thus, we could define<br />
<br />
<math> \frac {dF(x)}{dx} := f(x) =<br />
\begin{cases}<br />
1 & \text{if } 0<x<1\\<br />
0 & \text{otherwise}<br />
\end{cases}</math> <br />
<br />
or<br />
<br />
<math> \frac {dF(x)}{dx} := f(x) =<br />
\begin{cases}<br />
1 & \text{if } 0\leq x\leq 1\\<br />
0 & \text{otherwise}<br />
\end{cases}</math> <br />
<br />
Both of these functions are also PDFs of the continuous random variable ''X''. These two formulations have the advantage of being defined for all real numbers.<br />
<br />
==Not All Continuous Random Variables Have PDFs==<br />
<br />
We can sometimes encounter continuous random variables that simply do not have a meaningful PDF at all. The simplest such example is given by a distribution function called the ''Cantor staircase''. <br />
<br />
The ''Cantor set'' is defined recursively as follows:<br />
<br />
* Start with the interval [0,1).<br />
* Delete the middle third of this interval. You are now left with two subintervals [0,1/3) and [2/3,1).<br />
* Delete the middle third of each of these remaining subintervals. Now we have four new subintervals: [0,1/9), [2/9,3/9), [6/9,7/9), and [8/9,1).<br />
* Repeat this middle third deletion for the new subintervals. Continue indefinitely.<br />
<br />
If we take this process to the limit, the set that remains is called the ''Cantor set''. It is extremely sparse in [0,1), yet is uncountable: it contains exactly as many points as the entire interval itself. In particular, notice that every point of the form ''x'' = 1 - 3<sup>-''k''</sup> lies in the Cantor set for every integer ''k'' ≥ 1.<br />
<br />
We can define a Cantor random variable to be one whose cumulative distribution function increases only on the Cantor set and remains constant off of it. We define this function as follows:<br />
<br />
* Let ''F''(''x'') be the CDF of our Cantor random variable ''X''. Define ''F''(''x'') = 0 for ''x'' < 0 and ''F''(''x'') = 1 for ''x'' > 1.<br />
* Define ''F''(''x'') = 1/2 on [1/3,2/3), i.e. on the first middle third deleted in the construction of the Cantor set.<br />
* Define ''F''(''x'') = 1/4 on [1/9,2/9) and ''F''(''x'') = 3/4 on [7/9,8/9).<br />
* Define ''F''(''x'') = 1/8, 3/8, 5/8, and 7/8 on the deleted middle thirds from the third step in our Cantor set construction.<br />
* Continue indefinitely.<br />
<br />
After a limiting argument and some technicalities with defining ''F''(''x'') on the Cantor set itself, this procedure defines a ''continuous'' function that begins at 0 and increases to 1. However, since this function is ''constant'' except on the Cantor set, we see that its derivative off of the Cantor set must be identically ''zero''. On the Cantor set the function is not differentiable and so has no natural PDF.<br />
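The staircase itself can be evaluated numerically via the standard ternary-digit construction: each ternary digit 0 or 2 of ''x'' contributes a binary digit 0 or 1 to ''F''(''x''), and the first ternary digit equal to 1 signals that ''x'' sits in a deleted middle third, where ''F'' is constant at the corresponding dyadic value. A minimal sketch (the truncation depth is an arbitrary choice):

```python
def cantor_cdf(x, depth=40):
    """Approximate the Cantor staircase F(x) via the ternary expansion of x."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = int(x)
        x -= digit
        if digit == 1:
            # x lies in a deleted middle third: F is constant there
            return value + scale
        value += scale * (digit // 2)  # ternary digit 2 -> binary digit 1
        scale /= 2
    return value

# Values from the construction above
assert cantor_cdf(0.5) == 0.5                 # first deleted middle third
assert abs(cantor_cdf(0.15) - 0.25) < 1e-9    # inside [1/9, 2/9)
assert abs(cantor_cdf(0.8) - 0.75) < 1e-9     # inside [7/9, 8/9)

# F is nondecreasing (up to the tiny truncation error of the approximation)
vals = [cantor_cdf(i / 100) for i in range(101)]
assert all(a <= b + 1e-9 for a, b in zip(vals, vals[1:]))
```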
<br />
What we see is that, for a Cantor random variable, there is no sensible way to define a PDF: any candidate is either identically zero or undefined. <br />
<br />
This is an interesting example of how identifying a random variable with its PDF can lead us astray. Thankfully, for our purposes, we will never need to consider continuous random variables that do not have PDFs defined everywhere (except possibly at finitely many points).</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.1_The_Cumulative_Distribution_Function_(Continuous_Case)&diff=1719772.1 The Cumulative Distribution Function (Continuous Case)2012-05-30T04:59:45Z<p>EdKroc: </p>
<hr />
<div>In the previous chapter, we defined random variables in general, but focused only on '''discrete''' random variables. In this chapter, we properly treat '''continuous''' random variables. <br />
<br />
If for example ''X'' is the height of a randomly selected person in British Columbia, or ''X'' is tomorrow's low temperature at Vancouver International Airport, then ''X'' is a continuously varying quantity. <br />
<br />
We previously defined a '''continuous random variable''' to be one where the values the random variable can assume are given by a ''continuum'' of values. For example, we can define a continuous random variable that can take on any value in the interval [1,2].<br />
<br />
To make this definition more precise, we recall the definition from Section 1.4 of a cumulative distribution function (CDF) that was given for ''any'' random variable. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="500"<br />
|- style="background-color:#f0f0f0;"<br />
! The Cumulative Distribution Function<br />
|-<br />
| The '''cumulative distribution function''' for ''any'' random variable ''X'', denoted by ''F''(''x''), is the probability that ''X'' assumes a value less than or equal to ''x'':<br />
<br />
<center><math>F(x) = \text{Pr}(X \le x)</math></center><br />
<br />
The cumulative distribution function has the following properties:<br />
<br />
* 0 ≤ ''F''(''x'') ≤ 1 for all values of ''x''<br />
* <math>\lim_{x\rightarrow-\infty}F(x) = 0</math><br />
* <math>\lim_{x\rightarrow+\infty}F(x) = 1</math><br />
* ''F''(''x'') is a nondecreasing function of ''x''<br />
|}<br />
<br />
This definition is independent of the type of random variable to which we are referring. From this, we can define a '''continuous random variable''' to be any random variable ''X'' whose CDF is a ''continuous'' function. Notice that this is in contrast to the case of discrete random variables where the corresponding CDF is always a discontinuous step-function.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="350"<br />
|- style="background-color:#f0f0f0;"<br />
! Continuous Random Variable<br />
|-<br />
| A '''continuous random variable''' is one that has a ''continuous'' cumulative distribution function.<br />
|}<br />
<br />
==Comparison to the Discrete Case==<br />
<br />
Recall the cumulative distribution function we had for the test scores example in the previous chapter. The graph of the cumulative distribution function is given below. <br />
<br />
[[File:MATH105CDFGrades.png|450px]] <br />
<br />
Observe that the graph of this function "increases in steps" at 30, 60, 80, 90, and 100. CDFs for discrete random variables are ''always'' step functions. A discrete random variable cannot assume a continuum of values; thus, its CDF can only increase at a finite or countably infinite set of points. <br />
<br />
For another example, consider the function whose graph is given below. <br />
<br />
[[File:MATH105CRVExample.jpg|360px]]<br />
<br />
This function cannot represent a CDF for a ''continuous'' random variable because the function ''F'' is not continuous for all values of ''x''. However, ''F'' could represent a cumulative distribution function for a ''discrete'' random variable since it increases from 0 to 1 in a finite number of steps.<br />
<br />
==Example: Maximum Outdoor Air Temperature==<br />
<br />
The maximum outdoor air temperature in downtown Vancouver on any given day in January can be expressed as a continuous random variable ''X''. A reasonable CDF for this random variable is given by the function<br />
<br />
<math>F(x) = \frac{1}{1 + e^{-kx}}, \ \text{ for some }k > 0</math><br />
<br />
For a particular ''k'', we have graphed this cumulative distribution function in the plot below. <br />
<br />
[[File:CumulativeDistribFunctExample.jpg|300px]]<br />
<br />
In the above plot, note that the horizontal ''x''-axis gives possible values of the maximum outdoor air temperature in downtown Vancouver on any day in January, and that the vertical probability-axis gives values between 0 and 1. The value of the cumulative distribution function ''F''(''x'') gives the probability that the maximum outdoor air temperature is no greater than ''x''.<br />
<br />
We can easily see that this function satisfies the basic properties of a CDF. Clearly, ''F''(''x'') ≥ 0 for all possible temperatures ''x''. Also, ''F''(''x'') ≤ 1 for all ''x'' since the denominator in the definition of ''F''(''x'') is always larger than the numerator. Since ''k'' > 0, we calculate<br />
<br />
<math>\lim_{x \rightarrow -\infty} F(x) = \lim_{x \rightarrow -\infty} \frac{1}{1 + e^{-kx}} = 0.</math><br />
<br />
Likewise,<br />
<br />
<math>\lim_{x \rightarrow +\infty} F(x) = \lim_{x \rightarrow +\infty} \frac{1}{1 + e^{-kx}} = 1.</math><br />
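These limit calculations, along with the other basic CDF properties, can be confirmed numerically. A minimal sketch, taking ''k'' = 1 as an arbitrary choice (the text leaves ''k'' unspecified):

```python
import math

def F(x, k=1.0):
    # Logistic CDF from the temperature example above
    return 1.0 / (1.0 + math.exp(-k * x))

# Limiting behaviour: F approaches 0 and 1 in the tails
assert F(-50) < 1e-10
assert F(50) > 1 - 1e-10

# 0 <= F <= 1 everywhere, and F is nondecreasing on a grid of points
xs = [x / 10 for x in range(-100, 101)]
values = [F(x) for x in xs]
assert all(0 <= v <= 1 for v in values)
assert all(a <= b for a, b in zip(values, values[1:]))
```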
<br />
To check that ''F'' is nondecreasing, we note that since ''F'' is everywhere differentiable, it suffices to show that the derivative of ''F'' is nonnegative. A quick calculation yields<br />
<br />
<math> \frac{d}{dx} F(x) = \frac{ke^{-kx}}{(1 + e^{-kx})^2}</math><br />
<br />
which is certainly never negative. Thus we see explicitly that this function ''F''(''x'') satisfies all the basic properties that a CDF should.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.8_Chapter_1_Summary&diff=1719731.8 Chapter 1 Summary2012-05-30T04:57:26Z<p>EdKroc: Created page with "Our treatment of discrete random variables has been brief, but the concepts we have introduced are fundamental to any random process. These fundamentals will be explored again..."</p>
<hr />
<div>Our treatment of discrete random variables has been brief, but the concepts we have introduced are fundamental to any random process. These fundamentals will be explored again in the next chapter when we apply them to continuous random variables. We will see that many similarities exist between the discrete and continuous cases, but we will also notice many important differences between the two as well. <br />
<br />
We summarize some of the important concepts that were introduced in Chapter 1.<br />
<br />
==The PMF and the CDF==<br />
<br />
The '''probability mass function''' (PMF) of a random variable ''X'' is the function that assigns probabilities to the possible outcomes of ''X''. We write <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="200"<br />
|- style="background-color:#ffffff;"<br />
| <br />
<center><math>\textrm{Pr}(X = x_k)</math></center><br />
|}<br />
<br />
to denote this function of the possible values ''x<sub>k</sub>'' of ''X''.<br />
<br />
The '''cumulative distribution function''' (CDF) of a random variable ''X'' is the function that accumulates the probabilities of all possible values of ''X'' up to and including a specified value. We define the CDF to be ''F''(''x'') = Pr(''X'' ≤ ''x'') and note that the CDF is intimately related to the PMF via our identity for the probability of disjoint events; i.e., the CDF is given by a sum of values of the PMF. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="200"<br />
|- style="background-color:#f0f0f0;"<br />
|-<br />
| <br />
<math>F(x_n) = \sum_{k=1}^{n} \textrm{Pr}(X = x_k)</math><br />
|} <br />
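The sum relating the CDF to the PMF translates directly into a running total over the values at or below ''x''. A minimal sketch, using the test-scores PMF from earlier in Chapter 1 as the example data:

```python
# Test-scores PMF from Chapter 1: Pr(X = x_k)
pmf = {30: 0.3, 60: 0.2, 80: 0.3, 90: 0.1, 100: 0.1}

def cdf(x, pmf):
    # F(x) = Pr(X <= x): accumulate the PMF over all values x_k <= x
    return sum(p for xk, p in pmf.items() if xk <= x)

assert abs(cdf(29, pmf) - 0.0) < 1e-12   # below the smallest value
assert abs(cdf(30, pmf) - 0.3) < 1e-12   # the first "step" of the CDF
assert abs(cdf(85, pmf) - 0.8) < 1e-12   # between the 80 and 90 steps
assert abs(cdf(100, pmf) - 1.0) < 1e-12  # all probability accumulated
```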
<br />
==Expected Value, Variance and Standard Deviation==<br />
<br />
The concepts of '''expectation''', '''variance''', and '''standard deviation''' are crucial and will be revisited when we explore continuous random variables. Students should know that<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="300"<br />
|- style="background-color:#ffffff;"<br />
| <br />
<math>\begin{align}<br />
\mathbb{E}(X) &= \sum_{k=1}^{N} x_k \textrm{Pr}(X=x_k) \\<br />
\text{Var}(X) & = \sum_{k=1}^{N} (x_k - \mathbb{E}(X))^2\cdot\textrm{Pr}(X=x_k)\\<br />
\sigma(X) &= \sqrt{\text{Var}(X)}\end{align}</math><br />
<br />
|}<br />
<br />
The expectation represents the "center" of a random variable, an expected value of an experiment, or the average of outcomes of an experiment repeated many times. The variance and standard deviation of a random variable are numerical measures of the spread, or dispersion, of the distribution of the random variable.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.7_Variance_and_Standard_Deviation&diff=1719721.7 Variance and Standard Deviation2012-05-30T04:56:43Z<p>EdKroc: </p>
<hr />
<div>Another important quantity related to a given random variable is its variance. The '''variance''' is a numerical description of the spread, or the ''dispersion'', of the random variable. That is, the variance of a random variable ''X'' is a measure of how spread out the values of ''X'' are, given how likely each value is to be observed.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Variance and Standard Deviation of a Discrete Random Variable<br />
|-<br />
| The variance, Var(''X''), of a discrete random variable ''X'' is <br />
<br />
<center><math>\text{Var}(X) = \sum_{k=1}^{N} \Big(x_k - \mathbb{E}(X)\Big)^2\textrm{Pr}(X=x_k)</math></center><br />
<br />
where ''N'' is the total number of possible values of ''X''. <br />
<br />
The '''standard deviation''', ''σ'', is the positive square root of the variance: <br />
<br />
<center><math>\sigma(X) = \sqrt{\text{Var}(X)} </math></center><br />
<br />
|}<br />
<br />
Observe that the variance of a random variable is always nonnegative (since probabilities are nonnegative, and the square of a number is also nonnegative). <br />
<br />
Observe also that, much like the expectation of a random variable ''X'', the variance (or standard deviation) is a weighted average, in this case of the squared deviations of the possible values of ''X'' from its expectation. More precisely, notice that <br />
<br />
<math>\text{Var}(X) = \mathbb{E}\left(\left[X - \mathbb{E}(X)\right]^2\right).</math><br />
<br />
==Example: Test Scores==<br />
<br />
Using the test scores example of the previous sections, calculate the variance and standard deviation of the random variable ''X'' associated to randomly selecting a single exam.<br />
<br />
==Solution==<br />
<br />
The variance of the random variable ''X'' is given by<br />
<br />
<math>\begin{align}<br />
\text{Var}(X)<br />
&= \sum_{k=1}^{N} (x_k - \mathbb{E}(X))^2 \textrm{Pr}(X=x_k) \\<br />
&= (30-64)^2 \frac{3}{10} + (60 - 64)^2\frac{2}{10} + (80 - 64)^2 \frac{3}{10} + (90-64)^2 \frac{1}{10} + (100-64)^2 \frac{1}{10} \\<br />
&= 624<br />
\end{align}</math><br />
<br />
The standard deviation of ''X'' is then<br />
<br />
<math>\sigma(X) = \sqrt{624}\approx 24.979992</math><br />
<br />
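The arithmetic above can be verified directly from the definitions. A minimal sketch, reusing the test-scores PMF:

```python
import math

# Test-scores PMF: Pr(X = x_k)
pmf = {30: 0.3, 60: 0.2, 80: 0.3, 90: 0.1, 100: 0.1}

mean = sum(x * p for x, p in pmf.items())                    # E(X)
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # Var(X)
sigma = math.sqrt(variance)                                  # standard deviation

assert abs(mean - 64) < 1e-9
assert abs(variance - 624) < 1e-9
assert abs(sigma - 24.979992) < 1e-4
```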
==Interpretation of the Standard Deviation==<br />
<br />
For most "nice" random variables, i.e. ones that are not too wildly distributed, the standard deviation has a convenient informal interpretation. Consider the intervals <math>S_m = \left[\mathbb{E}(X) - m\sigma(X),\ \mathbb{E}(X) + m\sigma(X)\right],</math> for some positive integer ''m''. As we increase the value of ''m'', these intervals will contain more of the possible values of the random variable ''X''. <br />
<br />
A good rule of thumb is that for "nicely distributed" random variables, all of the most likely possible values of the random variable will be contained in the interval ''S''<sub>3</sub>. Another way to say this is that, for discrete random variables, most of the PMF will live on the interval ''S''<sub>3</sub>. We will see in the next chapter that a similar interpretation holds for continuous random variables.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.7_Variance_and_Standard_Deviation&diff=1719711.7 Variance and Standard Deviation2012-05-30T04:56:14Z<p>EdKroc: </p>
<hr />
<div>Another important quantity related to a given random variable is its variance. The '''variance''' is a numerical description of the spread, or the ''dispersion'', of the random variable. That is, the variance of a random variable ''X'' is a measure of how spread out the values of ''X'' are, given how likely each value is to be observed.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Variance and Standard Deviation of a Discrete Random Variable<br />
|-<br />
| The variance, Var(''X''), of a discrete random variable ''X'' is <br />
<br />
<center><math>\text{Var}(X) = \sum_{k=1}^{N} \Big(x_k - \mathbb{E}(X)\Big)^2\cdot \textrm{Pr}(X=x_k)</math></center><br />
<br />
where ''N'' is the total number of possible values of ''X''. <br />
<br />
The '''standard deviation''', ''σ'', is the positive square root of the variance: <br />
<br />
<center><math>\sigma(X) = \sqrt{\text{Var}(X)} </math></center><br />
<br />
|}<br />
<br />
Observe that the variance of a random variable is always nonnegative (since probabilities are nonnegative, and the square of a number is also nonnegative). <br />
<br />
Observe also that much like the expectation of a random variable ''X'', the variance (or standard deviation) is a weighted average of an expression of observable and calculable values. More precisely, notice that <br />
<br />
<math>\text{Var}(X) = \mathbb{E}\left(\left[X - \mathbb{E}(X)\right]^2\right).</math><br />
<br />
==Example: Test Scores==<br />
<br />
Using the test scores example of the previous sections, calculate the variance and standard deviation of the random variable ''X'' associated to randomly selecting a single exam.<br />
<br />
==Solution==<br />
<br />
The variance of the random variable ''X'' is given by<br />
<br />
<math>\begin{align}<br />
\text{Var}(X)<br />
&= \sum_{k=1}^{N} (x_k - \mathbb{E}(X))^2 \cdot \textrm{Pr}(X=x_k) \\<br />
&= (30-64)^2 \frac{3}{10} + (60 - 64)^2\frac{2}{10} + (80 - 64)^2 \frac{3}{10} + (90-64)^2 \frac{1}{10} + (100-64)^2 \frac{1}{10} \\<br />
&= 624<br />
\end{align}</math><br />
<br />
The standard deviation of ''X'' is then<br />
<br />
<math>\sigma(X) = \sqrt{624}\approx 24.979992</math><br />
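The calculation above is easy to double-check with a short Python sketch (illustrative, not part of the original page), using the scores and probabilities of the running example:

```python
# Variance and standard deviation of the test-score variable X.
# Scores and their probabilities are taken from the running example.
scores = [30, 60, 80, 90, 100]
probs = [3 / 10, 2 / 10, 3 / 10, 1 / 10, 1 / 10]

mean = sum(x * p for x, p in zip(scores, probs))               # E(X) = 64
var = sum((x - mean) ** 2 * p for x, p in zip(scores, probs))  # Var(X) = 624
sd = var ** 0.5                                                # sigma(X)

print(mean, var, sd)
```

This reproduces E(''X'') = 64, Var(''X'') = 624, and σ(''X'') ≈ 24.98.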
<br />
==Interpretation of the Standard Deviation==<br />
<br />
For most "nice" random variables, i.e. ones that are not too wildly distributed, the standard deviation has a convenient informal interpretation. Consider the intervals <math>S_m = \left[\mathbb{E}(X) - m\sigma(X),\ \mathbb{E}(X) + m\sigma(X)\right],</math> for some positive integer ''m''. As we increase the value of ''m'', these intervals will contain more of the possible values of the random variable ''X''. <br />
<br />
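As an illustration of these intervals (a Python sketch using the test-score example from this page), we can compute how much of the PMF falls inside each ''S<sub>m</sub>'':

```python
# Fraction of the PMF of the test-score variable X that falls inside
# S_m = [E(X) - m*sigma(X), E(X) + m*sigma(X)] for m = 1, 2, 3.
scores = [30, 60, 80, 90, 100]
probs = [3 / 10, 2 / 10, 3 / 10, 1 / 10, 1 / 10]

mean = sum(x * p for x, p in zip(scores, probs))                     # 64
sd = sum((x - mean) ** 2 * p for x, p in zip(scores, probs)) ** 0.5  # ~24.98

masses = {}
for m in (1, 2, 3):
    lo, hi = mean - m * sd, mean + m * sd
    masses[m] = sum(p for x, p in zip(scores, probs) if lo <= x <= hi)

print(masses)
```

For this example half of the mass lies in ''S''<sub>1</sub> and all of it already lies in ''S''<sub>2</sub>, consistent with the rule of thumb below.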
A good rule of thumb is that for "nicely distributed" random variables, all of the most likely possible values of the random variable will be contained in the interval ''S''<sub>3</sub>. Another way to say this is that, for discrete random variables, most of the PMF will live on the interval ''S''<sub>3</sub>. We will see in the next chapter that a similar interpretation holds for continuous random variables.</div>

1.6 Expected Value (revision of 2012-05-30T04:55:09Z by EdKroc): https://wiki.ubc.ca/index.php?title=1.6_Expected_Value&diff=171969
<p>EdKroc: </p>
<hr />
<div>For an experiment or general random process, the outcomes are never fixed. We may replicate the experiment and generally expect to observe many different outcomes. Of course, in most reasonable circumstances we will expect these observed differences in the outcomes to collect with some level of concentration about some central value. One central value of fundamental importance is the ''expected value''.<br />
<br />
The '''expected value''' or '''expectation''' (also called the '''mean''') of a random variable ''X'' is the weighted average of the possible values of ''X'', weighted by their corresponding probabilities. Informally, the expectation of a random variable ''X'' is the average value that we would expect to see after repeated observation of the random process. Put another way, the expectation is the long-term average of the realized values of a random variable after repeated observation of the random variable.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="500"<br />
|- style="background-color:#f0f0f0;"<br />
! Expected Value of a Discrete Random Variable<br />
|-<br />
| The expected value, <math>\mathbb{E}(X)\!</math>, of a discrete random variable ''X'' is the weighted average of the possible values of ''X'' where each possible value of ''X'' is weighted by its corresponding probability:<br />
<br />
<center><math>\mathbb{E}(X) = \sum_{k=1}^{N} x_k \textrm{Pr}( X = x_k )</math></center><br />
<br />
where ''N'' is the total number of possible values of ''X''.<br />
|}<br />
<br />
Note the following: <br />
<br />
* Do not confuse the ''expected'' value with the ''average'' value of a set of observations: they are two different but related quantities. The average value of a random variable ''X'' would be just the ordinary average of the possible values of ''X''; that is, no possible value of ''X'' receives any special weight. Naturally, this ordinary average is given by <math>\frac 1N\sum_{k=1}^N x_k\!</math>. The expected value of ''X'' is a ''weighted'' average, where certain values get more or less weight depending on how likely or not they are to be observed. The expected value reduces to this ordinary average only when all weights (that is, all probabilities) are equal. <br />
* The definition of expected value requires numerical values for the ''x<sub>k</sub>''. So if the outcome for an experiment is something qualitative, such as "heads" or "tails", we could calculate the expected value if we assign heads and tails numerical values (0 and 1, for example). <br />
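The distinction in the first point can be seen numerically (an illustrative Python sketch, using the test-score data of the example on this page: distinct scores 30, 60, 80, 90, 100 with probabilities 3/10, 2/10, 3/10, 1/10, 1/10):

```python
# Ordinary (unweighted) average of the distinct possible scores,
# versus the probability-weighted average (the expected value).
values = [30, 60, 80, 90, 100]
probs = [3 / 10, 2 / 10, 3 / 10, 1 / 10, 1 / 10]   # Pr(X = x_k)

ordinary = sum(values) / len(values)                  # (1/N) * sum of x_k = 72
expected = sum(x * p for x, p in zip(values, probs))  # weighted average = 64
print(ordinary, expected)
```

The two quantities differ (72 versus 64) precisely because the probabilities are not all equal.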
<br />
==Example: Test Scores==<br />
<br />
Recall the test score example from Sections 1.03 and 1.04. We supposed that in a class of 10 people the grades on a test are given by 30, 30, 30, 60, 60, 80, 80, 80, 90, 100. A test is drawn from the collection at random and the score ''X'' is observed. What is the expected value of the random variable ''X''?<br />
<br />
The expected value of the random variable is given by the weighted average of its values:<br />
<br />
<math>\begin{align}<br />
\mathbb{E}(X)<br />
&= \sum_{k=1}^{N} x_k \textrm{Pr}(X = x_k) \\<br />
&= 30 \frac{3}{10} + 60 \frac{2}{10} + 80 \frac{3}{10} + 90 \frac{1}{10} + 100 \frac{1}{10} \\<br />
&= 9 + 12 + 24 + 9 + 10 \\<br />
&= 64<br />
\end{align}</math><br />
<br />
Notice that 64 is not actually a possible value for the random variable ''X''. Nevertheless, this expectation makes sense if we remember that what we have really calculated is the long-term average of repeatedly drawing a test score from this collection. If we drew a test score at random from this collection 100 times (remembering to replace the selected test each time so that we never alter our collection of tests) and then averaged all the observed outcomes, this average value would be very near the expected value of 64.<br />
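This long-run behaviour is easy to simulate; the following Python sketch (illustrative, with a fixed seed for reproducibility) draws repeatedly with replacement from the ten test scores and averages the results:

```python
import random

# Simulate drawing a test (with replacement) from the class of 10 and
# averaging the observed scores; the average settles near E(X) = 64.
scores = [30, 30, 30, 60, 60, 80, 80, 80, 90, 100]

random.seed(1)          # fixed seed for a reproducible illustration
n = 100_000
avg = sum(random.choice(scores) for _ in range(n)) / n
print(avg)              # close to 64
```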
<br />
==Expectation as a Measure of the Center of a Distribution==<br />
<br />
Another informal way to think of the expectation of a random variable is to notice that it gives a measure of the center of the associated distribution. For our test score example, the PMF of the randomly selected test score ''X'' is shown below. <br />
<br />
[[File:MATH105GradeDistribPDF.png|300px]]<br />
<br />
Notice that the expected value of our randomly selected test score, <math>\mathbb{E}(X) = 64\!</math>, lies near the "center" of the PMF. There are many different ways to quantify the "center of a distribution" - for example, computing the 50th percentile of the possible outcomes - but for our purposes we will concentrate our attention on the expected value.</div>

1.5 Some Common Discrete Distributions (revision of 2012-05-30T04:54:06Z by EdKroc): https://wiki.ubc.ca/index.php?title=1.5_Some_Common_Discrete_Distributions&diff=171967
<p>EdKroc: Created page with "A random variable is a theoretical representation of a physical or experimental process we wish to study. Formally, it is a function defined over a ''sample space'' of possibl..."</p>
<hr />
<div>A random variable is a theoretical representation of a physical or experimental process we wish to study. Formally, it is a function defined over a ''sample space'' of possible outcomes. For our simple coin tossing experiment, where we flip a fair coin once and observe the outcome, our sample space consists of the two outcomes, H or T. When tossing two fair coins sequentially, our sample space consists of the four outcomes HH, HT, TH or TT. <br />
<br />
Let us fix a sample space of ''n'' tosses of a fair coin. Experimentally, we may be interested in studying the number of "heads" observed after tossing the coin ''n'' times. Or we could be interested in studying the number of tosses needed to first observe "heads". Or we could be interested in studying how likely a certain sequence of "heads" and "tails" is to be observed. Each of these experiments is defined on the same sample space (the events generated by ''n'' tosses of a fair coin), yet each strives to quantify different things. Consequently, each experiment should be associated with a different random variable.<br />
<br />
<br />
==The Binomial Distribution==<br />
<br />
Let ''X<sub>n</sub>'' denote the random variable that counts the number of times we observe "heads" when flipping a fair coin ''n'' times. Clearly, ''X<sub>n</sub>'' can take on any integer value from 0 to ''n'', corresponding to the experimental outcome of observing 0 to ''n'' "heads". How likely is any particular outcome of this random variable? Notice that we do not care about the order of the observations here, so that if ''n'' = 3, the outcome THH is equivalent to the outcomes HTH and HHT. Each of these outcomes contains two "heads".<br />
<br />
The likelihood of any particular outcome is what is represented by the probability mass function (PMF) of the random variable. Suppose ''n'' = 2. Then we see that the PMF of ''X<sub>2</sub>'' is given by:<br />
<br />
* Pr''(X<sub>2</sub> = 0)'' = 1/4<br />
* Pr''(X<sub>2</sub> = 1)'' = 1/2<br />
* Pr''(X<sub>2</sub> = 2)'' = 1/4<br />
<br />
We say that ''X<sub>2</sub>'' is a '''binomial''' random variable with parameters 2 (the number of times we flip the fair coin) and 1/2 (the probability that we observe heads after a single flip of the coin). We can write ''X<sub>2</sub> ~'' Bin(2, 1/2). <br />
<br />
Just as we did with Bernoulli random variables, we can think of our coin tossing experiment a bit more abstractly. Specifically, we can think of observing "heads" as a success and observing "tails" as a failure. This abstraction will help us generalize our coin tossing procedure to more general experiments.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="300"<br />
|- style="background-color:#f0f0f0;"<br />
! Binomial PMF<br />
|-<br />
| If ''X'' is a binomial random variable associated to ''n'' independent trials, each with a success probability ''p'', then the probability mass function of ''X'' is:<br />
<math>\textrm{Pr}( X = k ) = \frac {n!}{k!(n-k)!}\, p^k(1-p)^{n-k},</math><br />
<br />
where ''k'' is any integer from 0 to ''n''. Recall that the ''factorial'' notation ''n''! denotes the product of the first ''n'' positive integers: ''n''! = 1·2·3···(''n''-1)·''n'', and that we observe the convention 0! = 1.<br />
|} <br />
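The boxed formula translates directly into code. As a quick check (an illustrative Python sketch, not part of the original page), it reproduces the PMF of ''X<sub>2</sub>'' ~ Bin(2, 1/2) listed above:

```python
from math import factorial

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr(X = k) for X ~ Bin(n, p), via the factorial formula in the box."""
    return factorial(n) / (factorial(k) * factorial(n - k)) * p**k * (1 - p)**(n - k)

# Reproduces the PMF of X_2 ~ Bin(2, 1/2) listed above: 1/4, 1/2, 1/4.
print([binomial_pmf(k, 2, 0.5) for k in (0, 1, 2)])
```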
<br />
For our coin tossing experiment, the probability of success - that is, the probability of observing "heads" - was the same as the probability of failure, observing "tails". In general, we may be interested in processes that have different probabilities of success and failure.<br />
<br />
For example, suppose that we know that 5% of all light bulbs produced by a particular manufacturer are defective. If we buy a package of 6 light bulbs and want to calculate the probability that at least one is defective, we can do so by identifying this experiment with a binomial random variable. Here, we can think of observing a defective bulb as a "success" and observing a functional bulb as a "failure". Then our experiment is given by the random variable ''X<sub>6</sub>'' ~ Bin(6, 1/20), since we will observe 6 bulbs in total and each has a probability of 5/100 = 1/20 of being defective.<br />
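For the light-bulb example, "at least one defective" is easiest to compute via the complement, the event that no bulb is defective. A quick numerical check (an illustrative Python sketch):

```python
# X_6 ~ Bin(6, 1/20) counts defective bulbs in a package of 6.
# Pr(at least one defective) = 1 - Pr(X_6 = 0) = 1 - (19/20)**6.
p_none = (19 / 20) ** 6
p_at_least_one = 1 - p_none
print(p_at_least_one)   # about 0.2649
```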
<br />
In general, we can think of observing ''n'' independent experimental trials and counting the number of "successes" that we witness. The probability distribution we associate with this setup is the '''binomial''' random variable with parameters ''n'' and ''p'', where ''p'' is the probability of "success." We can denote this distributional relationship to a random variable ''X'' by ''X'' ~ Bin(''n'', ''p''). <br />
<br />
==The Geometric Distribution==<br />
<br />
Now consider a slightly different experiment where we wish to flip our fair coin repeatedly until we first observe "heads". Since we can first observe heads on the first flip, the second flip, the third flip, or on any subsequent flip, we see that the possible values our random variable can take are 1, 2, 3,.... <br />
<br />
Of course, we can consider a more abstract experiment where we observe a sequence of trials until we first observe a success, where the probability of success is ''p''. If we let ''X'' denote such a random variable, then we say that ''X'' is a '''geometric''' random variable with parameter ''p''. We can denote this particular random variable by ''X'' ~ Geo(''p'').<br />
<br />
Letting S denote the outcome of "success" and F denote the outcome of "failure", we can summarize the possible outcomes of a geometric experiment and their likelihoods (the probability mass function) in the following table. Here, we write ''p'' for the probability of success and ''q'' for the probability of failure.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Experimental Outcome<br />
! Value of the Random Variable, ''X = x''<br />
! Probability <br />
|-<br />
| S<br />
| ''x'' = 1<br />
| ''p''<br />
|-<br />
| FS<br />
| ''x'' = 2<br />
| ''q·p''<br />
|-<br />
| FFS<br />
| ''x'' = 3<br />
| ''q<sup>2</sup>·p''<br />
|-<br />
| FFFS<br />
| ''x'' = 4<br />
| ''q<sup>3</sup>·p''<br />
|-<br />
| FFFFS<br />
| ''x'' = 5<br />
| ''q<sup>4</sup>·p''<br />
|-<br />
| ...<br />
| ...<br />
| ...<br />
<br />
|}<br />
<br />
When flipping a fair coin, we see that ''X'' ~ Geo(1/2), so that our PMF takes the particularly simple form Pr(''X = k'') = (1/2)<sup>''k''</sup> for any positive integer ''k''.<br />
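The pattern in the table, Pr(''X = k'') = ''q''<sup>''k''-1</sup>''p'', can be checked numerically (an illustrative Python sketch); for a fair coin the probabilities halve at each step and their total approaches 1:

```python
def geometric_pmf(k: int, p: float) -> float:
    """Pr(X = k) for X ~ Geo(p): k - 1 failures followed by one success."""
    return (1 - p) ** (k - 1) * p

# For a fair coin, p = q = 1/2, so Pr(X = k) = (1/2)**k.
print([geometric_pmf(k, 0.5) for k in (1, 2, 3, 4)])   # 0.5, 0.25, 0.125, 0.0625
# The probabilities over all k sum to 1 in the limit:
print(sum(geometric_pmf(k, 0.5) for k in range(1, 60)))
```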
<br />
==The Discrete Uniform Distribution==<br />
<br />
Now consider a coin tossing experiment of flipping a fair coin ''n'' times and observing the sequence of "heads" and "tails". Because each outcome of a single flip of the coin is equally likely, and because the outcome of a single flip does not affect the outcome of another flip, we see that the likelihood of observing any particular sequence of "heads" and "tails" will always be the same. Notice that for ''n'' = 2 or 6, we have already encountered this random variable (see Section 1.01 and Sections 1.02 - 1.04 respectively).<br />
<br />
We say that a random variable ''X'' has a '''discrete uniform''' distribution on ''n'' points if ''X'' can assume any one of ''n'' values, each with equal probability. Evidently then, if ''X'' takes integer values from 1 to ''n'', we find that the PMF of ''X'' must be Pr(''X = k'') = 1/''n'', for any integer ''k'' between 1 and ''n''.</div>

1.4 The Cumulative Distribution Function (revision of 2012-05-30T04:53:25Z by EdKroc): https://wiki.ubc.ca/index.php?title=1.4_The_Cumulative_Distribution_Function&diff=171966
<p>EdKroc: Created page with "Generally speaking, for any random variable ''X'', we define the '''cumulative distribution function''' (CDF) of ''X'' as follows: {| border="1" cellspacing="0" cellpadding="..."</p>
<hr />
<div>Generally speaking, for any random variable ''X'', we define the '''cumulative distribution function''' (CDF) of ''X'' as follows:<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="400"<br />
|- style="background-color:#f0f0f0;"<br />
! Cumulative Distribution Function<br />
|-<br />
| The cumulative distribution function (CDF) of a random variable ''X'' is denoted by ''F''(''x''), and is defined to be the function<br />
<br />
<center><math> F(x) = \textrm{Pr}( X \leq x ).</math></center><br />
<br />
|}<br />
<br />
In other words, the cumulative distribution function for a random variable at ''x'' gives the probability that the random variable ''X'' is less than or equal to that number ''x''.<br />
<br />
Given a ''discrete'' random variable and its associated probability mass function, the definition of the cumulative distribution function can be rewritten using our identity for the probability of disjoint events (see Section 1.02). <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="400"<br />
|- style="background-color:#f0f0f0;"<br />
! Cumulative Distribution Function of a Discrete Random Variable<br />
|-<br />
| If ''X'' is a discrete random variable, the cumulative distribution function (CDF) of ''X'' can be written as<br />
<br />
<center><math>F(x) = \sum_{k=1}^{n} \textrm{Pr}( X = x_k )</math></center><br />
<br />
where ''x<sub>n</sub>'' is the largest possible value of ''X'' that is less than or equal to ''x''.<br />
|}<br />
<br />
Note that in this formula for CDFs of discrete random variables, we always have <math> n \leq N\!</math>, where ''N'' is the number of possible outcomes of ''X''. <br />
<br />
Notice also that the CDF of a discrete random variable will remain constant on any interval of the form <math>[x_n,x_{n+1})\!</math>. That is, <math>F(x) = F(x_n) = \sum_{k=1}^n \textrm{Pr}( X = x_k ) \text{ for any } x\in [x_n,x_{n+1})\!</math>.<br />
<br />
The following properties are immediate consequences of our definition of a random variable and the probability it associates to an event.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="400"<br />
|- style="background-color:#f0f0f0;"<br />
! Properties of the CDF<br />
|-<br />
| <br />
* <math>0\leq F(x)\leq 1\text{ for all }x</math><br />
* <math>\lim_{x\rightarrow-\infty}F(x) = 0</math><br />
* <math>\lim_{x\rightarrow+\infty}F(x) = 1</math><br />
* ''F''(''x'') is a ''nondecreasing'' function of ''x''<br />
|}<br />
<br />
Recall that a function ''f''(''x'') is said to be ''nondecreasing'' if ''f''(''x<sub>1</sub>'') ≤ ''f''(''x<sub>2</sub>'') whenever ''x<sub>1</sub>'' < ''x<sub>2</sub>''.<br />
<br />
==Example: Rolling a Single Die==<br />
<br />
If ''X'' is the random variable we associated previously with rolling a fair six-sided die, then we can easily write down the CDF of ''X''.<br />
<br />
We already computed that the PMF of ''X'' is given by Pr(''X'' = ''k'') = 1/6 for ''k'' = 1,2,...,6. The CDF can be computed by summing these probabilities sequentially; we summarize as follows:<br />
<br />
* Pr(''X'' ≤ 1) = 1/6<br />
* Pr(''X'' ≤ 2) = 2/6<br />
* Pr(''X'' ≤ 3) = 3/6<br />
* Pr(''X'' ≤ 4) = 4/6<br />
* Pr(''X'' ≤ 5) = 5/6<br />
* Pr(''X'' ≤ 6) = 6/6 = 1<br />
<br />
Notice that Pr(''X'' ≤ ''x'') = 0 for any ''x'' < 1 since ''X'' cannot take values less than 1. Also, notice that Pr(''X'' ≤ ''x'') = 1 for any ''x'' > 6. Finally, note that the probabilities Pr(''X'' ≤ ''x'') are constant on any interval of the form [''k'',''k'' + 1) as required.<br />
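The six CDF values above are just cumulative sums of the PMF; a short Python sketch (illustrative, using exact fractions) reproduces them:

```python
from fractions import Fraction
from itertools import accumulate

# CDF of a fair six-sided die: cumulative sums of the constant PMF 1/6.
pmf = [Fraction(1, 6)] * 6
cdf = list(accumulate(pmf))   # F(1), F(2), ..., F(6)
print(cdf)                    # 1/6, 2/6, ..., 6/6 = 1
```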
<br />
==Example: Rolling Two Dice==<br />
<br />
Suppose that we have two fair six-sided dice, one yellow and one red as in the image below.<br />
<br />
[[File:MATH105TwoDice.jpg|300px]]<br />
<br />
We roll both dice at the same time and add the two numbers that are shown on the upward faces.<br />
<br />
Let ''Y'' be the discrete random variable associated to this sum.<br />
<br />
# How many possible outcomes are there? That is, how many different values can ''Y'' assume?<br />
# How is ''Y'' distributed? That is, what is the PMF of ''Y''?<br />
# What is the probability that ''Y'' is less than or equal to 6? <br />
# What is the CDF of ''Y''?<br />
<br />
==Solution==<br />
<br />
===Part 1)===<br />
<br />
There are 6 possible values we can observe of each die. The two dice are rolled independently (i.e. the value on one of the dice does not affect the value on the other die), so we see that there are 6 ✕ 6 = 36 different outcomes for a single roll of the two dice. Notice that all 36 outcomes are distinguishable since the two dice are different colors. So we can distinguish between a roll that produces a 4 on the yellow die and a 5 on the red die with a roll that produces a 5 on the yellow die and a 4 on the red die.<br />
<br />
However, we are interested in determining the number of possible outcomes for the ''sum'' of the values on the two dice, i.e. the number of different values for the random variable ''Y''. The smallest this sum can be is 1 + 1 = 2, and the largest is 6 + 6 = 12. Clearly, ''Y'' can also assume any value in between these two extremes; thus we conclude that the possible values for ''Y'' are 2,3,...,12.<br />
<br />
===Part 2)===<br />
<br />
To determine the probability distribution for ''Y'', first consider the probability that the sum of the dice equals 2. There is only one way that this can happen: both dice must show a 1. There are 36 distinguishable rolls of the dice, so the probability that the sum is equal to 2 is 1/36. <br />
<br />
The other possible values of the random variable ''Y'' and their corresponding probabilities can be calculated in a similar fashion. Some of these are listed in the table below. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Outcome (Yellow, Red)<br />
! Sum = Yellow + Red<br />
! Probability<br />
|-<br />
| (1,1)<br />
| 2<br />
| 1/36<br />
|-<br />
| (1,2), (2,1)<br />
| 3 <br />
| 2/36<br />
|-<br />
| (1,3), (2,2), (3,1)<br />
| 4<br />
| 3/36<br />
|-<br />
| (1,4), (2,3), (3,2), (4,1)<br />
| 5<br />
| 4/36<br />
|-<br />
| (1,5), (2,4), (3,3), (4,2), (5,1)<br />
| 6<br />
| 5/36<br />
|-<br />
| . . .<br />
| . . .<br />
| . . .<br />
|-<br />
| (6,6)<br />
| 12<br />
| 1/36<br />
|}<br />
<br />
The probability mass function of ''Y'' is displayed in the following graph. <br />
<br />
[[File:MATH105DiceDistPDF.png|300px]]<br />
<br />
Alternatively, if we let ''p<sub>k</sub>'' = Pr(''Y'' = ''k''), the probability that the random sum ''Y'' is equal to ''k'', then the PMF can be given by a single formula:<br />
<br />
<math>p_k = \frac {6 - |k-7|}{36}\text{ if } k = 2,3,\ldots,12</math><br />
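The closed form can be verified against a direct enumeration of the 36 equally likely (yellow, red) outcomes (an illustrative Python sketch):

```python
from fractions import Fraction

# Tally the sums over all 36 equally likely (yellow, red) outcomes.
counts = {}
for yellow in range(1, 7):
    for red in range(1, 7):
        counts[yellow + red] = counts.get(yellow + red, 0) + 1

# Check the closed form p_k = (6 - |k - 7|)/36 against the enumeration.
for k in range(2, 13):
    assert Fraction(counts[k], 36) == Fraction(6 - abs(k - 7), 36)
print("formula matches the enumeration for k = 2, ..., 12")
```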
<br />
===Part 3)===<br />
<br />
The probability that the sum is less than or equal to 6 can be written as Pr( ''Y'' ≤ 6), which is equal to ''F''(6), the value of the cumulative distribution function ''F''(''y'') of ''Y'' at ''y'' = 6. Using our identity for probabilities of disjoint events, we calculate <br />
<br />
<math>\mathrm{Pr}( Y \le 6) = F(6) = \sum_{k=1}^6 p_k = 0 + \frac{1}{36} + \frac{2}{36} + \frac{3}{36} + \frac{4}{36} + \frac{5}{36} = \frac{15}{36} = \frac{5}{12}</math> <br />
<br />
===Part 4)===<br />
<br />
To find the CDF of ''Y'' in general, we need to give a table, graph or formula for ''F''(''k'') = Pr(''Y'' ≤ ''k'') for any given ''k''. Using our table for the PMF of ''Y'', we can easily construct the corresponding CDF table:<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! ''Y'' = ''k''<br />
! ''F''(''k'') = Pr(''Y'' ≤ ''k'')<br />
|-<br />
| 2<br />
| 1/36<br />
|-<br />
| 3 <br />
| 3/36<br />
|-<br />
| 4<br />
| 6/36<br />
|-<br />
| 5<br />
| 10/36<br />
|-<br />
| 6<br />
| 15/36<br />
|-<br />
| . . .<br />
| . . .<br />
|-<br />
| 12<br />
| 36/36 = 1<br />
|}<br />
<br />
This table defines a step-function starting at 0 for ''y'' < 2 and increasing in steps to 1 for ''y'' ≥ 12. Notice that the CDF is constant over any half-closed integer interval from 2 to 12. For example, ''F''(''y'') = 3/36 for all ''y'' in the interval [3,4).<br />
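The table entries are again cumulative sums of the PMF. Continuing the enumeration idea in a Python sketch (illustrative, with exact fractions):

```python
from fractions import Fraction
from itertools import accumulate

# PMF of the sum Y of two fair dice, using p_k = (6 - |k - 7|)/36,
# and its CDF as cumulative sums.
pmf = [Fraction(6 - abs(k - 7), 36) for k in range(2, 13)]
cdf = list(accumulate(pmf))   # F(2), F(3), ..., F(12)

print(cdf[4])    # F(6) = 15/36 = 5/12
print(cdf[-1])   # F(12) = 1
```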
<br />
==Example: Test Scores==<br />
<br />
Consider the example of selecting a test score from a given collection that we explored in the previous section: in a class of 10 people, grades on a test were 30, 30, 30, 60, 60, 80, 80, 80, 90, 100. Let ''X'' be the score of a randomly drawn test from this collection. <br />
<br />
# Calculate the probability that a test drawn at random has a score less than or equal to 80.<br />
# Calculate the probability that a test drawn at random has a score less than or equal to ''x<sub>n</sub>'', where ''x<sub>n</sub>'' = 0, 10, 20, 30, ... , 100.<br />
<br />
==Solution==<br />
<br />
===Part 1)===<br />
<br />
Recall the probability mass function, calculated earlier:<br />
<br />
[[File:MATH105GradeDistribPDF.png|350px]]<br />
<br />
Let ''p<sub>k</sub>'' be the probability that the score of a randomly drawn test is ''x<sub>k</sub>'' = 10''k''. So, for example:<br />
<br />
* ''p''<sub>0</sub> is the probability that a randomly drawn test score is 0<br />
* ''p''<sub>1</sub> is the probability that a randomly drawn test score is 10<br />
* ''p''<sub>2</sub> is the probability that a randomly drawn test score is 20<br />
* ''p''<sub>3</sub> is the probability that a randomly drawn test score is 30<br />
<br />
and so on. Values for each of these probabilities are given in the above bar graph. Notice that many of these probabilities are zero.<br />
<br />
The probability that a test drawn at random has a score of no greater than 80 is exactly the value of the CDF of ''X'' at ''x'' = 80; i.e., <br />
<br />
<math>\begin{align}<br />
\mathrm{Pr}(X \le 80) &= F(80) \\<br />
&= \sum_{k=0}^{8} p_k \\<br />
&= p_0 + p_1 + p_2 + {\color{Blue}{p_3}} + p_4 + p_5 + {\color{Blue}{p_6}} + p_7 + {\color{Blue}{p_8}}\\<br />
&= 0 + 0 + 0 + \mathbf{\color{Blue}{\frac{3}{10}}} + 0 + 0 + {\color{Blue}{\frac{2}{10}}} + 0 + {\color{Blue}{\frac{3}{10}}}\\<br />
&= \frac{8}{10} \\<br />
&= \frac{4}{5}<br />
\end{align}<br />
</math><br />
<br />
The color blue was used in the above calculation to highlight nonzero probabilities. <br />
<br />
Given the sample space of our experiment, a randomly selected grade that is less than or equal to 80 can only be 30, 60, or 80. Intuitively, the probability that a randomly selected test has a grade of 30, 60, or 80 is the ''sum'' of the probabilities that the score is one of these possibilities, which agrees with our identity concerning probabilities of disjoint events from Section 1.02.<br />
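The value ''F''(80) = 4/5 can also be read directly off the raw data, since each of the ten tests is equally likely to be drawn. A minimal sketch (our own variable names):

```python
from fractions import Fraction

# The ten grades from the example; each test is drawn with probability 1/10.
grades = [30, 30, 30, 60, 60, 80, 80, 80, 90, 100]

# Pr(X <= 80) is the fraction of tests with score at most 80.
prob = Fraction(sum(1 for g in grades if g <= 80), len(grades))

assert prob == Fraction(4, 5)
```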
<br />
===Part 2)===<br />
<br />
Now we want to calculate the probability that a test drawn at random has a score less than or equal to ''x<sub>k</sub>'' = 10''k'' for ''k'' = 0,1,...,10. Again, we identify this as simply finding the value of the CDF of ''X'' at each of these ''x<sub>k</sub>'' values.<br />
<br />
<math><br />
\mathrm{Pr}(X \le 0) = F(0) = \sum_{k=0}^{0} p_k = p_0 = 0 <br />
</math><br />
<br />
Similarly, <math>F(10) = F(20) = 0</math>, since ''p''<sub>1</sub> = ''p''<sub>2</sub> = 0. ''F''(30) is the first non-zero value:<br />
<br />
<math><br />
\mathrm{Pr}(X \le 30) = F(30) = \sum_{k=0}^{3} p_k = 0 + 0 + 0 + \frac{3}{10} = \frac{3}{10}<br />
</math><br />
<br />
Notice that ''F''(40) is equal to ''F''(30), since ''p''<sub>4</sub> = 0. <br />
<br />
Other values of ''F'' are calculated in the same way using the definition of the cumulative distribution function. The following table contains the values of the CDF of ''X'' for ''x<sub>k</sub>'' = 0, 10, 20, 30, ... 100. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! ''k''<br />
! ''x<sub>k</sub>''<br />
! ''F''(''x<sub>k</sub>'')<br />
|-<br />
| 0<br />
| 0<br />
| 0<br />
|-<br />
| 1<br />
| 10<br />
| 0<br />
|-<br />
| 2<br />
| 20<br />
| 0<br />
|-<br />
| 3<br />
| 30<br />
| 0.3<br />
|-<br />
| 4<br />
| 40<br />
| 0.3<br />
|-<br />
| 5<br />
| 50<br />
| 0.3<br />
|-<br />
| 6<br />
| 60<br />
| 0.5<br />
|-<br />
| 7<br />
| 70<br />
| 0.5<br />
|-<br />
| 8<br />
| 80<br />
| 0.8<br />
|-<br />
| 9<br />
| 90<br />
| 0.9<br />
|-<br />
| 10<br />
| 100<br />
| 1.0<br />
|}<br />
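The whole CDF table can be generated in one pass over the grade list, since ''F''(''x<sub>k</sub>'') is just the fraction of tests scoring at most ''x<sub>k</sub>''. A sketch reproducing the table above (variable names are our own):

```python
from fractions import Fraction

grades = [30, 30, 30, 60, 60, 80, 80, 80, 90, 100]

# F(x) = Pr(X <= x) for x = 0, 10, 20, ..., 100.
cdf = {x: Fraction(sum(1 for g in grades if g <= x), 10)
       for x in range(0, 101, 10)}

assert cdf[20] == 0                  # no grades at or below 20
assert cdf[30] == Fraction(3, 10)
assert cdf[60] == Fraction(5, 10)
assert cdf[100] == 1                 # every grade is at most 100
```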
<br />
Collectively, our calculations give the CDF of the random variable ''X''. This cumulative distribution function is graphed in the figure below.<br />
<br />
[[File:MATH105CDFGrades.png|450px]]</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.3_The_Probability_Mass_Function&diff=1719651.3 The Probability Mass Function2012-05-30T04:52:38Z<p>EdKroc: Created page with "Usually we are interested in experiments where there is more than one outcome, each having a possibly different probability. The '''probability mass function''' of a discrete ..."</p>
<hr />
<div>Usually we are interested in experiments where there is more than one outcome, each having a possibly different probability. The '''probability mass function''' of a discrete random variable is simply the collection of all these probabilities.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="300"<br />
|- style="background-color:#f0f0f0;"<br />
! Probability Mass Function<br />
|-<br />
| The probability mass function (PMF) of a discrete random variable ''X'' provides the probabilities Pr(''X'' = ''x'') for all possible values of ''x''. This function can be represented in a table, graph or formula.<br />
|}<br />
<br />
==Example: Different Colored Balls==<br />
<br />
Although random variables usually assume numerical values, this need not always be the case. Suppose that a box contains 10 balls:<br />
<br />
* 5 of the balls are red<br />
* 2 of the balls are green<br />
* 2 of the balls are blue<br />
* 1 ball is yellow<br />
<br />
Suppose we take one ball out of the box. Let ''X'' be the random variable that represents the color of the ball. As 5 of the balls are red, and there are 10 balls in total, the probability that a red ball is drawn from the box is Pr(''X'' = Red) = 5/10 = 1/2. [Note: this random variable does not have numerical outcomes, but we could easily fix this by assigning different numbers to the different colors we could observe.]<br />
<br />
Similarly, there are 2 green balls, so the probability that ''X'' is green is 2/10. Similar calculations for the other colors yield the probability mass function of ''X'' given by the following table. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Ball Color<br />
! Probability<br />
|-<br />
| red<br />
| 5/10<br />
|-<br />
| green<br />
| 2/10<br />
|-<br />
| blue<br />
| 2/10<br />
|-<br />
| yellow<br />
| 1/10<br />
|}<br />
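The PMF table above can be represented as a dictionary mapping each outcome to its probability. A hedged sketch (the variable names are our own choice):

```python
from fractions import Fraction

# Counts of each ball color in the box.
counts = {"red": 5, "green": 2, "blue": 2, "yellow": 1}
total = sum(counts.values())

# The PMF: each color's probability is its count over the total.
pmf = {color: Fraction(n, total) for color, n in counts.items()}

assert pmf["red"] == Fraction(1, 2)
assert sum(pmf.values()) == 1  # probabilities sum to one
```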
<br />
==Example: A Six-Sided Die ==<br />
<br />
Consider again the experiment of rolling a six-sided die. A six-sided die can land on any of its six faces, so that a single experiment has six possible outcomes. <br />
<br />
For a "fair die", we anticipate getting each of the results with an equal probability, i.e. if we were to repeat the same experiment many times, we would expect that, on average, the six possible events would occur with similar frequencies (we say that such events are uniformly distributed).<br />
<br />
There are six possible outcomes: 1, 2, 3, 4, 5, or 6. The probability mass function could be given by the following table. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Outcome<br />
! Probability<br />
|-<br />
| 1<br />
| 1/6<br />
|-<br />
| 2<br />
| 1/6<br />
|-<br />
| 3<br />
| 1/6<br />
|-<br />
| 4<br />
| 1/6<br />
|-<br />
| 5<br />
| 1/6<br />
|-<br />
| 6<br />
| 1/6<br />
<br />
|}<br />
<br />
The PMF could also be given by the equation Pr(''D'' = ''k'') = 1/6, for ''k'' = 1, 2, 3, ... , 6, where ''D'' denotes the random variable associated to rolling a fair die once. Thus we see that discrete uniform random variables have PMFs which are particularly easy to represent.<br />
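That single-formula representation translates directly into a one-line function (the function name is ours, for illustration):

```python
from fractions import Fraction

def die_pmf(k):
    # Pr(D = k) = 1/6 for k = 1, ..., 6; zero for any other value.
    return Fraction(1, 6) if k in range(1, 7) else Fraction(0)

# The six probabilities sum to one, as required of any PMF.
assert sum(die_pmf(k) for k in range(1, 7)) == 1
```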
<br />
==Example: Test Scores==<br />
<br />
Suppose that in a class of 10 people the grades on a test are given by 30, 30, 30, 60, 60, 80, 80, 80, 90, 100. Suppose a test is drawn from the pile at random and the score ''X'' is observed. We would like to calculate the probability mass function for the randomly drawn test score.<br />
<br />
Looking at the test scores, we see that out of 10 grades:<br />
<br />
* the grade 30 occurred 3 times<br />
* the grade 60 occurred 2 times<br />
* the grade 80 occurred 3 times<br />
* the grade 90 occurred 1 time<br />
* the grade 100 occurred 1 time<br />
<br />
This tells us the probability mass function of the randomly chosen test score ''X'' which we present formally in the following table.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Grade, ''x<sub>k</sub>''<br />
! Probability, Pr(''X'' = ''x<sub>k</sub>'' )<br />
|-<br />
| 30<br />
| 3/10<br />
|-<br />
| 60<br />
| 2/10<br />
|-<br />
| 80<br />
| 3/10<br />
|-<br />
| 90<br />
| 1/10<br />
|-<br />
| 100<br />
| 1/10<br />
|}<br />
<br />
We have described the PMF in a table, but an equivalent representation could be given in a graph that plots the possible outcomes of ''X'' on the horizontal axis, and the probabilities associated to these outcomes on the vertical axis. Below is the graph of the probability mass function for the random variable ''X''.<br />
<br />
[[File:MATH105GradeDistribPDF.png|300px]]</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.2_Probability_Basics&diff=1719641.2 Probability Basics2012-05-30T04:51:50Z<p>EdKroc: Created page with "==Random Variables and their Observed Values== We commonly use uppercase letters to denote random variables, and lowercase letters to denote particular values that our random..."</p>
<hr />
<div>==Random Variables and their Observed Values==<br />
<br />
We commonly use uppercase letters to denote random variables, and lowercase letters to denote particular values that our random variables can assume. <br />
<br />
For example, consider a six-sided die, pictured below.<br />
<br />
[[File:MATH105RedDie.jpg|300px]]<br />
<br />
We could let ''X'' be the random variable that gives the value observed on the upper face of the six-sided die after a single roll. Then if ''x'' denotes a particular value of the upper face, the expression ''X = x'' becomes well-defined. Specifically, the notation ''X = x'' signifies the event that the ''random variable'' ''X'' assumes the ''particular value'' ''x''. For the six-sided die example, ''x'' can be any integer from 1 to 6. So the expression ''X = 4'' would express the event that a random roll of the die would result in observing the value 4 on the upper face of the die.<br />
<br />
==Probabilities==<br />
<br />
We have already defined the notation Pr(''X'' = ''x'') to denote the probability that a random variable ''X'' is equal to a particular value ''x''. Similarly, Pr(''X'' ≤ ''x'') would denote the probability that the random variable ''X'' is less than or equal to the value ''x''. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Notation<br />
|-<br />
| Pr(''a'' ≤ ''X'' ≤ ''b'') denotes the probability that the random variable ''X'' lies between values ''a'' and ''b'', inclusively.<br />
|}<br />
<br />
With this notation, it now makes sense to write, for example, Pr(''X'' > ''a''), the probability that a random variable assumes a particular value strictly greater than ''a''. Similarly, we can make sense of the expressions Pr(''X'' < ''b''), Pr(''X'' ≠ ''x''), Pr(''X'' = ''x''<sub>1</sub> or ''X'' = ''x''<sub>2</sub>), among others. <br />
<br />
Notice that this notation allows us to do a kind of algebra with probabilities. For example, we notice the equivalence of the following two expressions: Pr(''X'' ≥ ''a'' and ''X'' < ''b'') = Pr(''a'' ≤ ''X'' < ''b''). An important consequence of this symbolism is the following:<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Probabilities of Complementary Events<br />
|-<br />
| Pr(''X'' = ''x'') = 1 - Pr(''X'' ≠ ''x'')<br />
|-<br />
| Pr(''X'' > ''x'') = 1 - Pr(''X'' ≤ ''x'')<br />
|-<br />
| Pr(''X'' ≥ ''x'') = 1 - Pr(''X'' < ''x'')<br />
|}<br />
<br />
Notice that the first identity is simply a restatement of Discrete Probability Rule #3 from the previous section.<br />
<br />
These three identities are simple consequences of our notation and of the fact that the sum of ''all'' probabilities must always equal 1 for any random variable. The events ''X'' = ''x'' and ''X'' ≠ ''x'' are called '''complementary''' because exactly one of the events must take place; i.e. both events cannot occur simultaneously, but one of the two must occur. The other expressions above also define complementary events.<br />
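These identities are easy to verify mechanically for a concrete distribution. A sketch using a fair six-sided die (the helper names are our own, not from the text):

```python
from fractions import Fraction

# Uniform PMF of a fair die: Pr(X = k) = 1/6 for k = 1, ..., 6.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def pr(cond):
    # Probability of the event described by the predicate `cond`.
    return sum(p for k, p in pmf.items() if cond(k))

x = 4
assert pr(lambda k: k == x) == 1 - pr(lambda k: k != x)
assert pr(lambda k: k > x) == 1 - pr(lambda k: k <= x)
assert pr(lambda k: k >= x) == 1 - pr(lambda k: k < x)
```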
<br />
For discrete random variables, we also have the identity:<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Algebra of Disjoint Events<br />
|-<br />
| If ''a'' ≠ ''b'', then Pr(''X'' = ''a'' or ''X'' = ''b'') = Pr(''X'' = ''a'') + Pr(''X'' = ''b'')<br />
|}<br />
<br />
==Six-Sided Die Example==<br />
<br />
Using our six-sided die example above, we have the random variable ''X'' which represents the value we observe on the upper face of the six-sided die after a single roll. Then the probability that ''X'' is equal to 5 can be written as: <br />
<br />
<math>\mathrm{Pr}( X = 5 ) = \frac{1}{6}</math><br />
<br />
Using our identities for complementary events and for disjoint events, we find that the probability that ''X'' is equal to 1, 2, 3 or 4 can be computed as:<br />
<br />
<math>\begin{align}<br />
\mathrm{Pr}( 1\leq X\leq 4 ) <br />
&= \mathrm{Pr}( X = 1,\ \mathrm{or}\ X = 2,\ \mathrm{or}\ X = 3,\ \mathrm{or}\ X = 4 )\\<br />
&= 1 - \mathrm{Pr}( X = 5,\ \mathrm{or}\ X = 6 )\\<br />
&= 1 - [\mathrm{Pr}( X = 5 ) + \mathrm{Pr}( X = 6 )]\\<br />
&= 1 - (\frac 16 + \frac 16)\\<br />
&= \frac 23<br />
\end{align}</math><br />
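The computation above can be checked in a few lines via the complement (variable names are ours):

```python
from fractions import Fraction

# For a fair die, Pr(1 <= X <= 4) = 1 - [Pr(X = 5) + Pr(X = 6)].
p = Fraction(1, 6)
prob = 1 - (p + p)

assert prob == Fraction(2, 3)
```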
<br />
Notice that ''X'' ~ Uniform(6); i.e. ''X'' has a uniform distribution on the integers from 1 to 6. Indeed, the probability of observing any one of these integer values (the value on the upper face of the rolled die) is the same for any value. Thus, ''X'' must be a uniform random variable.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.1_Random_Variables&diff=1719631.1 Random Variables2012-05-30T04:50:54Z<p>EdKroc: Created page with "In many areas of science we are interested in quantifying the '''probability''' that a certain outcome of an experiment occurs. We can use a '''random variable''' to identi..."</p>
<hr />
<div>In many areas of science we are interested in quantifying the '''probability''' that a certain outcome of an experiment occurs. We can use a '''random variable''' to identify numerical events that are of interest in an experiment. In this way, a random variable is a theoretical representation of the physical or experimental process we wish to study. More precisely, a random variable is a quantity without a fixed value, but which can assume different values depending on how likely these values are to be observed; these likelihoods are probabilities.<br />
<br />
To quantify the probability that a particular value, or set of values (called an '''event'''), occurs, we use a number between 0 and 1. A probability of 0 implies that the event ''cannot'' occur, whereas a probability of 1 implies that the event ''must'' occur. Any value in the interval (0, 1) means that the event will only occur some of the time. Equivalently, if an event occurs with probability ''p'', then this means there is a (100''p'')% chance of observing this event.<br />
<br />
Conventionally, we denote random variables by capital letters, and particular values that they can assume by lowercase letters. So we can say that ''X'' is a random variable that can assume certain particular values ''x'' with certain probabilities. <br />
<br />
We use the notation Pr(''X'' = ''x'') to denote the probability that the random variable ''X'' assumes the particular value ''x''. The range of values ''x'' for which this expression makes sense is of course dependent on the possible values of the random variable ''X''. We distinguish between two key cases.<br />
<br />
If ''X'' can assume only finitely many or countably many values, then we say that ''X'' is a '''discrete random variable'''. Saying that ''X'' can assume only ''finitely many or countably many'' values means that we should be able to ''list'' the possible values for the random variable ''X''. If this list is finite, we can say that ''X'' may take any value from the list ''x<sub>1</sub>'', ''x<sub>2</sub>'',..., ''x<sub>n</sub>'', for some positive integer ''n''. If the list is (countably) infinite, we can list the possible values for ''X'' as ''x<sub>1</sub>'', ''x<sub>2</sub>'',.... This is then a list without end (for example, the list of all positive integers).<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="600"<br />
|- style="background-color:#f0f0f0;"<br />
! Discrete Random Variables<br />
|-<br />
|<br />
# A discrete random variable ''X'' is a quantity that can assume any value ''x'' from a discrete list of values with a certain probability.<br />
# The probability that the random variable ''X'' assumes the particular value ''x'' is denoted by Pr(''X'' = ''x''). This collection of probabilities, along with all possible values ''x'', is the '''probability distribution''' of the random variable ''X''.<br />
# A discrete list of values is any collection of values that is finite or countably infinite (i.e. can be written in a list).<br />
|}<br />
<br />
This terminology is in contrast to a '''continuous random variable''', where the values the random variable can assume are given by a continuum of values. For example, we could define a random variable that can take any value in the interval [1,2]. The values ''X'' can assume are then any real number in [1,2]. We will discuss continuous random variables in detail in the second chapter. For now, we deal strictly with discrete random variables.<br />
<br />
We state a few facts that should be intuitively obvious for probabilities in general. Namely, the chance of some particular event occurring should always be nonnegative and no greater than 100%. Also, the chance that ''something'' happens should be certain. From these facts, we can conclude that the chance of witnessing a particular event should be 100% less the chance of seeing ''anything but'' that particular event.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Discrete Probability Rules<br />
|-<br />
|<br />
1. Probabilities are numbers between 0 and 1 inclusive: 0 ≤ Pr(''X'' = ''x<sub>k</sub>'') ≤ 1 for all ''k''<br />
<br />
2. The sum of all probabilities for a given experiment (random variable) is equal to one: <br />
<center><math>\sum_k \text{Pr}(X = x_k) = 1\!</math></center><br />
<br />
3. The probability of an event is 1 minus the probability that any other event occurs: <br />
<center><math>\text{Pr}(X = x_n) = 1 - \sum_{k\neq n}\text{Pr}(X = x_k)</math></center><br />
|}<br />
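The three rules can be checked for any concrete PMF. A minimal sketch with an example distribution of our own choosing (the values are illustrative, not from the text):

```python
from fractions import Fraction

# An example PMF over four outcomes; probabilities chosen to sum to 1.
pmf = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]

# Rule 1: every probability lies in [0, 1].
assert all(0 <= p <= 1 for p in pmf)

# Rule 2: the probabilities sum to one.
assert sum(pmf) == 1

# Rule 3: Pr(X = x_n) equals 1 minus the sum of all other probabilities.
n = 0
assert pmf[n] == 1 - sum(p for i, p in enumerate(pmf) if i != n)
```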
<br />
<br />
==Example: Tossing a Fair Coin Once==<br />
<br />
If we toss a coin into the air, there are only two possible outcomes: it will land as either "heads" (H) or "tails" (T). If the tossed coin is a "fair" coin, it is equally likely that the coin will land as tails or as heads. In other words, there is a 50% chance (1/2 probability) that the coin will land heads, and a 50% chance (1/2 probability) that the coin will land tails. Notice that the sum of these probabilities is 1 and that each probability is a number in the interval [0,1].<br />
<br />
We can define the random variable ''X'' to represent this coin tossing experiment. That is, we define ''X'' to be the discrete random variable that takes the value 0 with probability 1/2 and takes the value 1 with probability 1/2. Notice that with this notation, the experimental event that "we toss a fair coin and observe heads" is the same as the theoretical event that "the random variable ''X'' is observed to take the value 0"; i.e. we identify the number 0 with the outcome of "heads", and identify the number 1 with the outcome of "tails". We say that ''X'' is a '''Bernoulli random variable''' with parameter 1/2 and can write ''X'' ~ Ber(1/2). <br />
<br />
==Example: Tossing a Fair Coin Twice==<br />
<br />
Similarly, if we toss a fair coin two times, there are four possible outcomes. Each outcome is a sequence of heads (H) or tails (T):<br />
<br />
* HH<br />
* HT<br />
* TH<br />
* TT<br />
<br />
Because the coin is fair, each outcome is equally likely to occur. There are 4 possible outcomes, so we assign each outcome a probability of 1/4. <br />
<br />
Equivalently, we notice that for any of the four possible events to occur, we must observe two distinct events from two separate flips of a fair coin. So for example, to observe the sequence HH, we must flip a fair coin once and observe H, then flip a fair coin again and observe H once again. (We say that these two events are '''independent''' since the outcome of one event has no effect on the outcome of the other.) Since the probability of observing H after a flip of a fair coin is 1/2, we see that the probability of observing the sequence HH should be (1/2)×(1/2) = 1/4. <br />
<br />
Observe that again, all of our probabilities sum to 1, and each probability is a number on the interval [0, 1]. Just as before, we can identify each outcome of our experiment with a numerical value. Let us make the following assignments:<br />
<br />
* HH -> 0<br />
* HT -> 1<br />
* TH -> 2<br />
* TT -> 3<br />
<br />
This assignment defines a numerical discrete random variable ''Y'' that represents our coin tossing experiment. We see that ''Y'' takes the value 0 with probability 1/4, 1 with probability 1/4, 2 with probability 1/4, and 3 with probability 1/4. Using our general notation to describe this probability distribution, we can summarize by writing<br />
<br />
<math> \text{Pr}(Y = k) = 1/4,\text{ for } k = 0,1,2,3. </math><br />
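The enumeration and encoding described above can be sketched as follows (the encoding dictionary mirrors the assignments in the text; variable names are ours):

```python
from fractions import Fraction
from itertools import product

# All four equally likely outcomes of two tosses of a fair coin.
outcomes = ["".join(t) for t in product("HT", repeat=2)]

# The numerical assignment HH -> 0, HT -> 1, TH -> 2, TT -> 3.
encode = {"HH": 0, "HT": 1, "TH": 2, "TT": 3}

# Each outcome has probability 1/4, so Y is uniform on {0, 1, 2, 3}.
pmf = {encode[o]: Fraction(1, 4) for o in outcomes}

assert pmf == {k: Fraction(1, 4) for k in range(4)}
```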
<br />
Notice that with this notation, the experimental event that "we toss two fair coins and observe first tails, then heads" is the same as the theoretical event that "the random variable ''Y'' is observed to take the value 2". We say that ''Y'' is a '''uniform discrete random variable''' with parameter 4 since ''Y'' takes each of its four possible values with equal, or uniform, probability. To denote this distributional relationship, we can write ''Y'' ~ Uniform(4).</div>EdKrochttps://wiki.ubc.ca/index.php?title=2.5_-_Expected_Value,_Variance,_and_Standard_Deviation&diff=1719622.5 - Expected Value, Variance, and Standard Deviation2012-05-30T04:48:26Z<p>EdKroc: </p>
<hr />
<div>Analogous to the discrete case, we can define the expected value, variance, and standard deviation of a continuous random variable. These quantities have the same interpretation as in the discrete setting. The expectation of a random variable is a measure of the center of the distribution, its mean value. The variance and standard deviation are measures of the horizontal spread or dispersion of the random variable.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="600"<br />
|- style="background-color:#f0f0f0;"<br />
! Expected Value, Variance, and Standard Deviation of a Continuous Random Variable<br />
|-<br />
| The '''expected value''' (also called the '''expectation''' or '''mean''') of a continuous random variable ''X'', with probability density function ''f''(''x''), is the number given by<br />
<br />
<center><math>\mathbb{E}(X) = \int_{-\infty}^{\infty} x f(x) dx</math>.</center><br />
<br />
The '''variance''' of ''X'' is:<br />
<br />
<center><math>\text{Var}(X) = \int_{-\infty}^{\infty} \big(x - \mathbb{E}(X)\big)^2 f(x) dx </math>.</center><br />
<br />
As in the discrete case, the '''standard deviation''', σ, is the positive square root of the variance: <br />
<br />
<center><math>\sigma(X) = \sqrt{\text{Var}(X)} </math>.</center><br />
<br />
|}<br />
<br />
==Simple Example==<br />
<br />
A random variable ''X'' is given by the following PDF. Check that this is a valid PDF and calculate the standard deviation of ''X''.<br />
<br />
<math>f(x) = \begin{cases}<br />
2 (1 - x) & \text{if } 0 \le x \le 1,\\<br />
0 & \text{otherwise} <br />
\end{cases}<br />
</math><br />
<br />
===Solution===<br />
<br />
====Part 1====<br />
<br />
To verify that ''f''(''x'') is a valid PDF, we must check that it is everywhere nonnegative and that it integrates to 1.<br />
<br />
We see that 2(1-x) = 2 - 2x ≥ 0 precisely when x ≤ 1, so ''f''(''x'') is nonnegative on [0, 1]; since ''f''(''x'') = 0 outside this interval, ''f''(''x'') is everywhere nonnegative.<br />
<br />
To check that ''f''(''x'') has unit area under its graph, we calculate<br />
<br />
<math>\begin{align}<br />
\int_{-\infty}^{\infty} f(x) dx = 2 \int_{0}^{1} (1 - x) dx =2 \Big( x - \frac{x^2}{2} \Big) \Big|_0^1=1 <br />
\end{align}</math><br />
<br />
So ''f''(''x'') is indeed a valid PDF.<br />
<br />
====Part 2====<br />
<br />
To calculate the standard deviation of ''X'', we must first find its variance. Calculating the variance of ''X'' requires its expected value:<br />
<br />
<math>\begin{align}<br />
\mathbb{E}(X) &= \int_{-\infty}^{\infty} x f(x) dx \\<br />
&= \int_{0}^{1} x \Big[ 2 (1 - x) \Big] dx \\<br />
&= 2 \int_{0}^{1} \Big( x - x^2 \Big) dx \\<br />
&= 2 \Big( \frac{x^2}{2} - \frac{x^3}{3} \Big) \Big|_0^1 \\<br />
&= 1/3<br />
\end{align}</math><br />
<br />
Using this value, we compute the variance of ''X'' as follows<br />
<br />
<math>\begin{align}<br />
\text{Var}(X) &= \int_{-\infty}^{\infty} \big(x - \mathbb{E}(X)\big)^2 f(x) dx \\<br />
&= \int_0^1 \big( x - 1/3\big)^2\cdot 2(1-x) dx \\<br />
&= 2 \int_0^1\big( x^2 -\frac{2}{3} x + \frac{1}{9} \big) (1-x) dx \\<br />
&= 2 \int_0^1\big( -x^3 + \frac{5}{3}x^2 -\frac{7}{9} x +\frac{1}{9} \big) dx \\<br />
&= 2 \big( -\frac{1}{4}x^4 + \frac{5}{9}x^3 -\frac{7}{18} x^2 +\frac{1}{9}x \big)\Big|_0^1 \\<br />
&= 2 \big( -\frac{1}{4} + \frac{5}{9} -\frac{7}{18} +\frac{1}{9} \big) \\<br />
&= \frac{1}{18}<br />
\end{align}</math><br />
<br />
Therefore, the standard deviation of ''X'' is<br />
<br />
<math>\begin{align}<br />
\sigma &= \sqrt{\text{Var}(X)}\\<br />
&= \frac 1{3\sqrt{2}}<br />
\end{align}</math><br />
<br />
===An Alternative Formula for Variance===<br />
<br />
There is an alternative formula for the variance of a random variable that is less tedious than the above definition. <br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Alternate Formula for the Variance of a Continuous Random Variable<br />
|-<br />
| The '''variance''' of a continuous random variable ''X'' with PDF ''f''(''x'') is the number given by<br />
<br />
<center><math>\text{Var}(X) = \mathbb{E}(X^2) - [\mathbb{E}(X)]^2</math>.</center><br />
<br />
|}<br />
<br />
The derivation of this formula is a simple manipulation and has been relegated to the exercises. We should note that a completely analogous formula holds for the variance of a discrete random variable, with the integral signs replaced by sums.<br />
<br />
==Simple Example Revisited==<br />
<br />
We can use this alternate formula for variance to find the standard deviation of the random variable ''X'' defined above.<br />
<br />
Remembering that the expectation of ''X'' was found to be 1/3, we compute the variance of ''X'' as follows:<br />
<br />
<math>\begin{align}<br />
\text{Var}(X) &= \mathbb{E}(X^2) - [\mathbb{E}(X)]^2\\<br />
&= \int_{-\infty}^{\infty} x^2 f(x) dx - \left(\frac 13\right)^2 \\<br />
&= 2 \int_0^1 x^2 (1-x) dx - \frac{1}{9}\\<br />
&= 2 \int_0^1\big( x^2 - x^3 \big) dx- \frac{1}{9} \\<br />
&= 2 \big( \frac{1}{3}x^3 - \frac{1}{4}x^4 \big) \big|_0^1 - \frac{1}{9} \\<br />
&= 2 \big( \frac{1}{3} - \frac{1}{4} \big) - \frac{1}{9} \\<br />
&= \frac{1}{18}<br />
\end{align}</math><br />
<br />
In the exercises, you will compute the expectations, variances and standard deviations of many of the random variables we have introduced in this chapter, as well as those of many new ones.</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.1_-_Random_Variables&diff=1719611.1 - Random Variables2012-05-30T04:47:30Z<p>EdKroc: </p>
<hr />
<div>In many areas of science we are interested in quantifying the '''probability''' that a certain outcome of an experiment occurs. We can use a '''random variable''' to identify numerical events that are of interest in an experiment. In this way, a random variable is a theoretical representation of the physical or experimental process we wish to study. More precisely, a random variable is a quantity without a fixed value, but which can assume different values depending on how likely these values are to be observed; these likelihoods are probabilities.<br />
<br />
To quantify the probability that a particular value, or set of values (called an '''event'''), occurs, we use a number between 0 and 1. A probability of 0 implies that the event ''cannot'' occur, whereas a probability of 1 implies that the event ''must'' occur. Any value in the interval (0, 1) means that the event will only occur some of the time. Equivalently, if an event occurs with probability ''p'', then this means there is a (100''p'')% chance of observing this event.<br />
<br />
Conventionally, we denote random variables by capital letters, and particular values that they can assume by lowercase letters. So we can say that ''X'' is a random variable that can assume certain particular values ''x'' with certain probabilities. <br />
<br />
We use the notation Pr(''X'' = ''x'') to denote the probability that the random variable ''X'' assumes the particular value ''x''. The range of values ''x'' for which this expression makes sense is of course dependent on the possible values of the random variable ''X''. We distinguish between two key cases.<br />
<br />
</div>EdKrochttps://wiki.ubc.ca/index.php?title=1.1_-_Random_Variables&diff=1719601.1 - Random Variables2012-05-30T04:47:13Z<p>EdKroc: </p>
<hr />
<div>In many areas of science we are interested in quantifying the '''probability''' that a certain outcome of an experiment occurs. We can use a '''random variable''' to identify numerical events that are of interest in an experiment. In this way, a random variable is a theoretical representation of the physical or experimental process we wish to study. More precisely, a random variable is a quantity with no fixed value: it can assume any of its possible values, each with a certain likelihood of being observed; these likelihoods are probabilities.<br />
<br />
To quantify the probability that a particular value, or set of values (called an '''event'''), occurs, we use a number between 0 and 1. A probability of 0 implies that the event ''cannot'' occur, whereas a probability of 1 implies that the event ''must'' occur. Any value in the interval (0, 1) means that the event will only occur some of the time. Equivalently, if an event occurs with probability ''p'', then there is a (100 × ''p'')% chance of observing this event.<br />
<br />
Conventionally, we denote random variables by capital letters, and particular values that they can assume by lowercase letters. So we can say that ''X'' is a random variable that can assume certain particular values ''x'' with certain probabilities. <br />
<br />
We use the notation Pr(''X'' = ''x'') to denote the probability that the random variable ''X'' assumes the particular value ''x''. The range of values ''x'' for which this expression makes sense is of course dependent on the possible values of the random variable ''X''. We distinguish between two key cases.<br />
<br />
If ''X'' can assume only finitely many or countably many values, then we say that ''X'' is a '''discrete random variable'''. Saying that ''X'' can assume only ''finitely many or countably many'' values means that we should be able to ''list'' the possible values for the random variable ''X''. If this list is finite, we can say that ''X'' may take any value from the list ''x<sub>1</sub>'', ''x<sub>2</sub>'',..., ''x<sub>n</sub>'', for some positive integer ''n''. If the list is (countably) infinite, we can list the possible values for ''X'' as ''x<sub>1</sub>'', ''x<sub>2</sub>'',.... This is then a list without end (for example, the list of all positive integers).<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center" width="600"<br />
|- style="background-color:#f0f0f0;"<br />
! Discrete Random Variables<br />
|-<br />
|<br />
# A discrete random variable ''X'' is a quantity that can assume any value ''x'' from a discrete list of values with a certain probability.<br />
# The probability that the random variable ''X'' assumes the particular value ''x'' is denoted by Pr(''X'' = ''x''). This collection of probabilities, along with all possible values ''x'', is the '''probability distribution''' of the random variable ''X''.<br />
# A discrete list of values is any collection of values that is finite or countably infinite (i.e. can be written in a list).<br />
|}<br />
<br />
This terminology is in contrast to a '''continuous random variable''', where the values the random variable can assume are given by a continuum of values. For example, we could define a random variable that can take any value in the interval [1,2]. The values ''X'' can assume are then any real number in [1,2]. We will discuss continuous random variables in detail in the second chapter. For now, we deal strictly with discrete random variables.<br />
<br />
We state a few facts that should be intuitively obvious for probabilities in general. Namely, the chance of some particular event occurring should always be nonnegative and no greater than 100%. Also, the chance that ''something'' happens should be certain. From these facts, we can conclude that the chance of witnessing a particular event should be 100% less the chance of seeing ''anything but'' that particular event.<br />
<br />
{| border="1" cellspacing="0" cellpadding="4" align="center"<br />
|- style="background-color:#f0f0f0;"<br />
! Discrete Probability Rules<br />
|-<br />
|<br />
1. Probabilities are numbers between 0 and 1 inclusive: 0 ≤ Pr(''X'' = ''x<sub>k</sub>'') ≤ 1 for all ''k''<br />
<br />
2. The sum of all probabilities for a given experiment (random variable) is equal to one: <br />
<center><math>\sum_k \text{Pr}(X = x_k) = 1\!</math></center><br />
<br />
3. The probability of an event is 1 minus the probability that any other event occurs: <br />
<center><math>\text{Pr}(X = x_n) = 1 - \sum_{k\neq n}\text{Pr}(X = x_k)</math></center><br />
|}<br />
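As an aside not found in the original article, the three rules above can be checked mechanically for any finite distribution. A minimal Python sketch (the helper name <code>check_distribution</code> is invented for illustration):

```python
# Illustrative helper (not from the article): verify the three
# discrete probability rules for a finite distribution.
def check_distribution(dist, tol=1e-9):
    """dist maps each possible value x_k to Pr(X = x_k)."""
    probs = list(dist.values())
    # Rule 1: every probability lies in [0, 1].
    assert all(0 <= p <= 1 for p in probs)
    # Rule 2: the probabilities sum to one.
    assert abs(sum(probs) - 1) < tol
    # Rule 3: Pr(X = x_n) = 1 - (sum of all other probabilities).
    for x_n, p_n in dist.items():
        rest = sum(p for x, p in dist.items() if x != x_n)
        assert abs(p_n - (1 - rest)) < tol
    return True

# A fair six-sided die satisfies all three rules.
print(check_distribution({k: 1 / 6 for k in range(1, 7)}))  # True
```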
<br />
<br />
==Example: Tossing a Fair Coin Once==<br />
<br />
If we toss a coin into the air, there are only two possible outcomes: it will land as either "heads" (H) or "tails" (T). If the tossed coin is a "fair" coin, it is equally likely that the coin will land as tails or as heads. In other words, there is a 50% chance (1/2 probability) that the coin will land heads, and a 50% chance (1/2 probability) that the coin will land tails. Notice that the sum of these probabilities is 1 and that each probability is a number in the interval [0,1].<br />
<br />
We can define the random variable ''X'' to represent this coin tossing experiment. That is, we define ''X'' to be the discrete random variable that takes the value 0 with probability 1/2 and takes the value 1 with probability 1/2. Notice that with this notation, the experimental event that "we toss a fair coin and observe heads" is the same as the theoretical event that "the random variable ''X'' is observed to take the value 0"; i.e. we identify the number 0 with the outcome of "heads", and identify the number 1 with the outcome of "tails". We say that ''X'' is a '''Bernoulli random variable''' with parameter 1/2 and can write ''X'' ~ Ber(1/2). <br />
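To connect the theoretical distribution of ''X'' to observed frequencies, one can simulate the experiment. The following Python sketch is our illustration, not part of the article; it uses the article's identification of 0 with "heads" and 1 with "tails":

```python
import random

# Simulate n tosses of a fair coin, X ~ Ber(1/2), with 0 = "heads"
# and 1 = "tails".  A fixed seed makes the run reproducible.
def toss_fair_coin(n, seed=0):
    rng = random.Random(seed)
    return [rng.randrange(2) for _ in range(n)]

tosses = toss_fair_coin(100_000)
freq_heads = tosses.count(0) / len(tosses)
print(freq_heads)  # close to 0.5, as Pr(X = 0) = 1/2 predicts
```

For a large number of tosses, the observed frequency of heads settles near the theoretical probability 1/2.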
<br />
==Example: Tossing a Fair Coin Twice==<br />
<br />
Similarly, if we toss a fair coin two times, there are four possible outcomes. Each outcome is a sequence of heads (H) or tails (T):<br />
<br />
* HH<br />
* HT<br />
* TH<br />
* TT<br />
<br />
Because the coin is fair, each outcome is equally likely to occur. There are 4 possible outcomes, so we assign each outcome a probability of 1/4. <br />
<br />
Equivalently, we notice that each of the four possible outcomes is produced by two separate flips of a fair coin. So, for example, to observe the sequence HH, we must flip a fair coin once and observe H, then flip a fair coin again and observe H once again. (We say that these two flips are '''independent''' since the outcome of one has no effect on the outcome of the other.) Since the probability of observing H on a flip of a fair coin is 1/2, the probability of observing the sequence HH should be (1/2)×(1/2) = 1/4. <br />
<br />
Observe that again, all of our probabilities sum to 1, and each probability is a number in the interval [0, 1]. Just as before, we can identify each outcome of our experiment with a numerical value. Let us make the following assignments:<br />
<br />
* HH -> 0<br />
* HT -> 1<br />
* TH -> 2<br />
* TT -> 3<br />
<br />
This assignment defines a numerical discrete random variable ''Y'' that represents our coin tossing experiment. We see that ''Y'' takes the value 0 with probability 1/4, 1 with probability 1/4, 2 with probability 1/4, and 3 with probability 1/4. Using our general notation to describe this probability distribution, we can summarize by writing<br />
<br />
<math> \text{Pr}(Y = k) = 1/4,\text{ for } k = 0,1,2,3. </math><br />
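The enumeration above can be reproduced mechanically. The following Python sketch (an illustration, not part of the original article) builds the distribution of ''Y'' from the stated value assignment and the independence of the two flips:

```python
from itertools import product

# The article's assignment of numerical values to two-flip sequences:
# HH -> 0, HT -> 1, TH -> 2, TT -> 3.
assignment = {"HH": 0, "HT": 1, "TH": 2, "TT": 3}

# The flips are fair and independent, so every sequence has probability
# (1/2) * (1/2) = 1/4, giving the uniform distribution of Y.
dist = {assignment["".join(seq)]: 0.5 * 0.5
        for seq in product("HT", repeat=2)}

print(dist)  # {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
```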
<br />
Notice that with this notation, the experimental event that "we toss two fair coins and observe first tails, then heads" is the same as the theoretical event that "the random variable ''Y'' is observed to take the value 2". We say that ''Y'' is a '''uniform discrete random variable''' with parameter 4 since ''Y'' takes each of its four possible values with equal, or uniform, probability. To denote this distributional relationship, we can write ''Y'' ~ Uniform(4).</div>EdKrochttps://wiki.ubc.ca/index.php?title=UBC_Wiki:Books/Mprob&diff=171958UBC Wiki:Books/Mprob2012-05-30T04:37:20Z<p>EdKroc: Created page with "{{saved_book}} == Probability Appendix == ;Discrete Random Variables :1.1 - Random Variables :1.2 - Basic Probability :1.3 - The Probability Mass Function :[[1.4 ..."</p>
<hr />
<div>{{saved_book}}<br />
<br />
== Probability Appendix ==<br />
;Discrete Random Variables<br />
:[[1.1 - Random Variables]]<br />
:[[1.2 - Basic Probability]]<br />
:[[1.3 - The Probability Mass Function]]<br />
:[[1.4 - The Cumulative Distribution Function]]<br />
:[[1.5 - Some Common Discrete Distributions]]<br />
:[[1.6 - Expected Value]]<br />
:[[1.7 - Variance and Standard Deviation]]<br />
:[[1.8 - Chapter 1 Summary]]<br />
;Continuous Random Variables<br />
:[[2.1 - The Cumulative Distribution Function of a Continuous Random Variable]]<br />
:[[2.2 - The Probability Density Function]]<br />
:[[2.3 - Some Common Continuous Distributions]]<br />
:[[2.4 - The Normal Distribution]]<br />
:[[2.5 - Expected Value, Variance, and Standard Deviation]]<br />
:[[2.6 - Sample Problem]]<br />
:[[2.7 - Chapter 2 Summary]]<br />
<br />
[[Category:Books|Books/Mprob]]</div>EdKroc