1.5 - Some Common Discrete Distributions

From UBC Wiki

A random variable is a theoretical representation of a physical or experimental process we wish to study. Formally, it is a function defined over a sample space of possible outcomes. For our simple coin tossing experiment, where we flip a fair coin once and observe the outcome, our sample space consists of the two outcomes, H or T. When tossing two fair coins sequentially, our sample space consists of the four outcomes HH, HT, TH or TT.

Let us fix a sample space of n tosses of a fair coin. Experimentally, we may be interested in studying the number of "heads" observed after tossing the coin n times. Or we could be interested in studying the number of tosses needed to first observe "heads". Or we could be interested in studying how likely a certain sequence of "heads" and "tails" is to be observed. Each of these experiments is defined on the same sample space (the events generated by n tosses of a fair coin), yet each strives to quantify something different. Consequently, each experiment should be associated with a different random variable.

The Binomial Distribution

Let Xn denote the random variable that counts the number of times we observe "heads" when flipping a fair coin n times. Clearly, Xn can take on any integer value from 0 to n, corresponding to the experimental outcome of observing 0 to n "heads". How likely is any particular outcome of this random variable? Notice that we do not care about the order of the observations here, so that if n = 3, the outcome THH is equivalent to the outcomes HTH and HHT. Each of these outcomes contains two "heads".

The likelihood of any particular outcome is what is represented by the probability mass function (PMF) of the random variable. Suppose n = 2. Then we see that the PMF of X2 is given by:

  • Pr(X2 = 0) = 1/4
  • Pr(X2 = 1) = 1/2
  • Pr(X2 = 2) = 1/4

We say that X2 is a binomial random variable with parameters 2 (the number of times we flip the fair coin) and 1/2 (the probability that we observe heads after a single flip of the coin). We can write X2 ~ Bin(2, 1/2).
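The PMF above can be checked by brute force. The following is a minimal Python sketch, not part of the original notes, that lists every equally likely sequence of n flips and counts the "heads" in each; the function name heads_count_pmf is a hypothetical helper chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_count_pmf(n):
    """PMF of the number of heads in n fair coin flips, by enumeration."""
    # Each of the 2^n sequences of H and T is equally likely.
    sequences = list(product("HT", repeat=n))
    # Order does not matter: only the number of heads in each sequence.
    counts = Counter(seq.count("H") for seq in sequences)
    return {k: Fraction(counts[k], len(sequences)) for k in range(n + 1)}


pmf_x2 = heads_count_pmf(2)
for k, prob in pmf_x2.items():
    print(f"Pr(X2 = {k}) = {prob}")
```

For n = 2 this prints the probabilities 1/4, 1/2 and 1/4 listed above. Exact fractions are used so that no rounding is involved.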

Just as we did with Bernoulli random variables, we can think of our coin tossing experiment a bit more abstractly. Specifically, we can think of observing "heads" as a success and observing "tails" as a failure. This abstraction will help us generalize our coin tossing procedure to more general experiments.

Binomial PMF
If X is a binomial random variable associated to n independent trials, each with a success probability p, then the probability mass function of X is:

$$\Pr(X = k) = \frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k},$$

where k is any integer from 0 to n. Recall that the factorial notation n! denotes the product of the first n positive integers: n! = 1·2·3···(n-1)·n, and that we observe the convention 0! = 1.
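As an illustration, the binomial PMF can be written as a short Python function. This is a sketch under the definitions given here: the name binomial_pmf is an assumption for illustration, and math.comb(n, k) from the standard library computes n!/(k!(n-k)!).

```python
from math import comb


def binomial_pmf(k, n, p):
    """Pr(X = k) for X ~ Bin(n, p)."""
    if not 0 <= k <= n:
        return 0.0  # X only takes integer values from 0 to n
    # comb(n, k) equals n! / (k! (n - k)!)
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Two flips of a fair coin: X2 ~ Bin(2, 1/2)
probs = [binomial_pmf(k, 2, 0.5) for k in range(3)]
print(probs)
```

With n = 2 and p = 1/2 this reproduces the values 1/4, 1/2 and 1/4 found earlier for X2.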

For our coin tossing experiment, the probability of success - that is, the probability of observing "heads" - was the same as the probability of failure, observing "tails". In general, we may be interested in processes that have different probabilities of success and failure.

For example, suppose that we know that 5% of all light bulbs produced by a particular manufacturer are defective. If we buy a package of 6 light bulbs and want to calculate the probability that at least one is defective, we can do so by identifying this experiment with a binomial random variable. Here, we can think of observing a defective bulb as a "success" and observing a functional bulb as a "failure". Then our experiment is given by the random variable X6 ~ Bin(6, 1/20), since we will observe 6 bulbs in total and each has a probability of 5/100 = 1/20 of being defective.
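One way to carry out this calculation is through the complement: the probability of at least one defective bulb is 1 minus the probability of no defective bulbs. The Python sketch below assumes the bulbs are defective independently of one another; binomial_pmf is a hypothetical helper that implements the binomial PMF given above.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, p):
    """Pr(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 6, Fraction(1, 20)  # 6 bulbs, each defective with probability 1/20

p_none = binomial_pmf(0, n, p)  # Pr(X6 = 0) = (19/20)^6
p_at_least_one = 1 - p_none     # Pr(X6 >= 1)

print(p_at_least_one, float(p_at_least_one))
```

The result is 1 - (19/20)^6, or roughly 0.265, so a little more than a quarter of such packages would be expected to contain at least one defective bulb.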

In general, we can think of observing n independent experimental trials and counting the number of "successes" that we witness. The probability distribution we associate with this setup is the binomial distribution with parameters n and p, where p is the probability of "success". We write X ~ Bin(n, p) to denote that a random variable X has this distribution.

The Geometric Distribution

Now consider a slightly different experiment where we wish to flip our fair coin repeatedly until we first observe "heads". Since we can first observe heads on the first flip, the second flip, the third flip, or on any subsequent flip, we see that the possible values our random variable can take are 1, 2, 3,....

Of course, we can consider a more abstract experiment where we observe a sequence of trials until we first observe a success, where the probability of success is p. If we let X denote such a random variable, then we say that X is a geometric random variable with parameter p. We can denote this particular random variable by X ~ Geo(p).

Letting S denote the outcome of "success" and F denote the outcome of "failure", we can summarize the possible outcomes of a geometric experiment and their likelihoods (the probability mass function) in the following table. Here, we write p for the probability of success and q = 1 - p for the probability of failure.

Experimental Outcome | Value of the Random Variable, X = x | Probability
S                    | x = 1                               | p
FS                   | x = 2                               | q·p
FFS                  | x = 3                               | q^2·p
FFFS                 | x = 4                               | q^3·p
FFFFS                | x = 5                               | q^4·p
...                  | ...                                 | ...

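In general, the first success occurs on trial k exactly when k - 1 failures are followed by a success, so Pr(X = k) = q^(k-1)·p for any positive integer k. When flipping a fair coin, p = q = 1/2 and X ~ Geo(1/2), so that our PMF takes the particularly simple form Pr(X = k) = (1/2)^k for any positive integer k.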
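The pattern in the table, k - 1 failures followed by one success, can be sketched in Python as follows. The function name geometric_pmf is an assumption for illustration, and X is taken to count the trials up to and including the first success, as in the table.

```python
from fractions import Fraction


def geometric_pmf(k, p):
    """Pr(X = k) for X ~ Geo(p), for k = 1, 2, 3, ..."""
    if k < 1:
        return 0
    q = 1 - p  # probability of failure on a single trial
    # k - 1 failures, then one success
    return q ** (k - 1) * p


p = Fraction(1, 2)  # fair coin
for k in range(1, 6):
    print(k, geometric_pmf(k, p))

# The probabilities over k = 1, 2, 3, ... sum to 1;
# the first 20 terms already come very close.
partial = sum(geometric_pmf(k, p) for k in range(1, 21))
print(float(partial))
```

For the fair coin the loop prints 1/2, 1/4, 1/8, 1/16 and 1/32, matching the table rows with p = q = 1/2.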

The Discrete Uniform Distribution

Now consider a coin tossing experiment of flipping a fair coin n times and observing the sequence of "heads" and "tails". Because each outcome of a single flip of the coin is equally likely, and because the outcome of a single flip does not affect the outcome of another flip, we see that the likelihood of observing any particular sequence of "heads" and "tails" will always be the same: each of the 2^n possible sequences occurs with probability (1/2)^n. Notice that for n = 2 or 6, we have already encountered this random variable (see Section 1.01 and Sections 1.02 - 1.04 respectively).

We say that a random variable X has a discrete uniform distribution on n points if X can assume any one of n values, each with equal probability. Evidently then, if X takes integer values from 1 to n, we find that the PMF of X must be Pr(X = k) = 1/n, for any integer k between 1 and n.
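A minimal Python sketch of this PMF is given below, with discrete_uniform_pmf as a hypothetical helper name. It also checks, for the illustrative choice of 3 flips, that the sequences of "heads" and "tails" from a fair coin are uniformly distributed over all possible sequences.

```python
from fractions import Fraction
from itertools import product


def discrete_uniform_pmf(k, n):
    """Pr(X = k) when X is uniform on the integers 1, 2, ..., n."""
    if 1 <= k <= n:
        return Fraction(1, n)
    return Fraction(0)


# Uniform on 6 points: each value from 1 to 6 has probability 1/6.
six_points = [discrete_uniform_pmf(k, 6) for k in range(1, 7)]
print(six_points)

# Sequences of 3 fair coin flips: 2^3 = 8 sequences, each with
# probability (1/2)^3, so the sequence is uniform on 8 points.
sequences = list(product("HT", repeat=3))
seq_prob = Fraction(1, 2) ** 3
print(len(sequences), seq_prob)
```

In both cases the probabilities are equal and add up to 1, which is what forces each of the n values to have probability 1/n.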