Introduction to Probability
Introducing the basic concepts and vocabulary of probability
A computer program specifies a sequence of operations to be performed in order to accomplish some task. Often, the operations are deterministic: given a fixed input, the behavior of the program should always be the same for that input. Yet there are many problems for which an efficient and/or elegant solution can be obtained using randomization.
In a randomized program, some operations may be chosen randomly rather than deterministically according to the program’s input. Thus, different executions of the same program with the same input may give different behavior. While the potential unpredictability of randomized programs may seem at odds with program correctness and efficiency, we will see that for many randomized procedures it is possible to argue that undesirable outcomes are extremely unlikely. In some cases, a randomized solution is so much simpler than a deterministic one that the slight chance of an undesirable outcome is outweighed by the efficiency and simplicity of the randomized procedure.
The goal of this note is to introduce some of the basic concepts, vocabulary, and notation of probability. This material will serve as the foundation for reasoning about the randomized procedures we will encounter going forward. For simplicity, we focus on discrete (finite) probability.
Probability Spaces
Perhaps the most familiar examples of randomness we encounter in our everyday lives occur in games of chance. In many games, randomness is achieved by rolling dice, flipping coins, or dealing cards from a shuffled deck. Randomness affords a game a level of unpredictability. Yet patterns emerge from these random processes: a coin flipped repeatedly will typically yield an approximately equal number of heads and tails; a poker player is almost never dealt a royal flush in a game of five-card stud. Probability is the quantitative study of random processes. That is, probability seeks to quantify the likelihood of different outcomes of a random process.
The basic object of study in probability is a probability space. A (finite) probability space consists of:
- a sample space $\Omega$ whose elements are called outcomes, and
- a probability measure $P$ that associates a real number $P(\omega)$, the probability of $\omega$, to each outcome $\omega$ in $\Omega$, satisfying: $0 \le P(\omega) \le 1$ for all outcomes $\omega$, and $\sum_{\omega \in \Omega} P(\omega) = 1$.
Example 1. We model the randomness of tossing a (fair) coin. In this case, there are two possible outcomes of a coin toss: heads and tails. We take our sample space to be $\Omega = \{H, T\}$, and since the coin is fair, $P(H) = P(T) = 1/2$.
Example 2. Rolling a standard (six-sided) die has six outcomes which are equally likely. Thus we take $\Omega = \{1, 2, 3, 4, 5, 6\}$ and $P(i) = 1/6$ for each $i$ in $\Omega$.
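As a minimal sketch, a finite probability space can be represented in Python as a dictionary mapping each outcome to its probability. The names `coin`, `die`, and `is_probability_space` are illustrative and not part of these notes; the check simply tests that every probability lies between 0 and 1 and that the probabilities sum to 1. Using `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction


def is_probability_space(measure):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    in_range = all(0 <= p <= 1 for p in measure.values())
    return in_range and sum(measure.values()) == 1


# Fair coin: two equally likely outcomes.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# Fair six-sided die: six equally likely outcomes.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(is_probability_space(coin))  # True
print(is_probability_space(die))   # True
```

A dictionary whose values do not sum to 1, such as `{"H": Fraction(1, 2)}`, fails the check.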
Events
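An event is a subset $E$ of the sample space $\Omega$, that is, a set of outcomes. The probability of an event is the sum of the probabilities of its outcomes: $P(E) = \sum_{\omega \in E} P(\omega)$.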
Example 3. Consider rolling a six-sided die, and let
Example 4. Consider the process of rolling two dice. The first die can take any value from 1 to 6, and similarly with the second die. Thus, we can represent the outcome of a roll as a pair $(i, j)$, where $i$ is the value of the first die and $j$ is the value of the second.
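The two-dice sample space is small enough to enumerate directly. The sketch below assumes both dice are fair, so that each of the 36 ordered pairs has probability 1/36, and computes the probability of an event by summing the probabilities of its outcomes. The event `at_least_one_six` and the helper `probability` are hypothetical names chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for two dice: all ordered pairs (i, j) with 1 <= i, j <= 6.
omega = list(product(range(1, 7), repeat=2))

# Assumption: both dice are fair, so every outcome has probability 1/36.
measure = {outcome: Fraction(1, 36) for outcome in omega}


def probability(event, measure):
    """Probability of an event (a set of outcomes): sum over its outcomes."""
    return sum((measure[outcome] for outcome in event), Fraction(0))


# Hypothetical event for illustration: at least one die shows a 6.
at_least_one_six = {(i, j) for (i, j) in omega if i == 6 or j == 6}

print(len(omega))                              # 36
print(probability(at_least_one_six, measure))  # 11/36
```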
Exercise 1. Consider the scenario described in Example 4. Consider the event
Random Variables
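Given a probability space $(\Omega, P)$, a random variable is a function $X \colon \Omega \to \mathbb{R}$ that assigns a real number $X(\omega)$ to each outcome $\omega$.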
Example 5. Going back to the coin toss example above, we can define a random variable
Example 6. For the die rolling example, we can define the random variable
Expected Values
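Given a random variable $X$ on a probability space with outcomes $\omega_1, \omega_2, \ldots, \omega_n$, the expected value of $X$ is $E(X) = X(\omega_1) P(\omega_1) + X(\omega_2) P(\omega_2) + \cdots + X(\omega_n) P(\omega_n)$.
We can denote the expression above more succinctly using summation notation as $E(X) = \sum_{\omega \in \Omega} X(\omega) P(\omega)$.
The expected value of a random variable quantifies the average value we expect the random variable to take if we repeat an experiment (e.g., a coin flip or die roll) many times.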
Example 7. Going back to the coin flip example, we can compute
This tells us that if we repeatedly play the coin flip betting game described above, neither player has an advantage: the players expect to win about as much as they lose.
Example 8. For the die rolling example, we compute the expected value of the roll: $\frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \frac{21}{6} = 3.5$.
Thus an “average” die roll is $3.5$.
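The interpretation of the expected value as a long-run average can be checked numerically. The sketch below assumes a fair die and a random variable equal to the face shown (the names `expected_value` and `face_value` are illustrative). It computes the weighted sum exactly, then compares it with the average of many simulated rolls.

```python
import random
from fractions import Fraction


def expected_value(variable, measure):
    """E(X): sum over outcomes w of X(w) * P(w)."""
    return sum((variable(w) * p for w, p in measure.items()), Fraction(0))


# Assumed probability space: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}


def face_value(w):
    """Random variable whose value is the face shown."""
    return w


exact = expected_value(face_value, die)
print(exact)  # 7/2

# Simulate many rolls; a fixed seed makes the run repeatable.
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(100_000)]
print(sum(rolls) / len(rolls))  # close to 3.5
```

The simulated average will not equal 3.5 exactly, but it tends to get closer as the number of rolls grows.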
Probability Distributions
Often, when speaking about random variables we omit reference to the underlying probability space. In this case, we speak only of the probability that a random variable $X$ takes a particular value $x$, written $P(X = x)$.
The function
For the coin flipping example above,
which gives rise to the PDF
Note that the PDF does not reference the underlying sample space
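To illustrate how a distribution forgets the sample space it came from, the following sketch computes $P(X = x)$ for every value $x$ by summing the probabilities of the outcomes that $X$ maps to $x$. The two random variables here, `heads` on a fair coin and `even` on a fair die, are hypothetical examples chosen for illustration; they live on different sample spaces yet have the same distribution.

```python
from collections import defaultdict
from fractions import Fraction


def distribution(variable, measure):
    """Map each value x to P(X = x), summing P(w) over outcomes with X(w) = x."""
    dist = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for w, p in measure.items():
        dist[variable(w)] += p
    return dict(dist)


coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
die = {face: Fraction(1, 6) for face in range(1, 7)}


def heads(w):
    """1 if the coin shows heads, 0 otherwise."""
    return 1 if w == "H" else 0


def even(w):
    """1 if the die shows an even face, 0 otherwise."""
    return 1 if w % 2 == 0 else 0


print(distribution(heads, coin) == distribution(even, die))  # True
```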
Notice that, like our variable
Consider a game where the play is determined by a coin flip and a die roll. For the examples above, the random variable
Independence
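Definition. Suppose $X$ and $Y$ are random variables defined on the same probability space. We say that $X$ and $Y$ are independent if for all values $x$ and $y$, $P(X = x \text{ and } Y = y) = P(X = x) \cdot P(Y = y)$.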
For our examples above with the coin flip and the die roll,
Let
We claim that
Similar calculations show that similar equalities hold for all possible values of
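Checking independence by hand means verifying one equality for every pair of values, which is easy to automate. The sketch below uses the usual product criterion, $P(X = x \text{ and } Y = y) = P(X = x) \cdot P(Y = y)$, and assumes a joint experiment in which a fair coin is flipped and a fair die is rolled, with all 12 combined outcomes equally likely. The helper names and the variable `total` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Assumed joint sample space: (coin, die) pairs, each with probability 1/12.
measure = {(c, d): Fraction(1, 12) for c, d in product("HT", range(1, 7))}


def prob(predicate, measure):
    """Total probability of the outcomes satisfying `predicate`."""
    return sum((p for w, p in measure.items() if predicate(w)), Fraction(0))


def are_independent(x, y, measure):
    """Check P(X=a and Y=b) == P(X=a) * P(Y=b) for all values a, b."""
    x_values = {x(w) for w in measure}
    y_values = {y(w) for w in measure}
    return all(
        prob(lambda w: x(w) == a and y(w) == b, measure)
        == prob(lambda w: x(w) == a, measure) * prob(lambda w: y(w) == b, measure)
        for a in x_values
        for b in y_values
    )


def coin_value(w):
    """1 for heads, 0 for tails."""
    return 1 if w[0] == "H" else 0


def die_value(w):
    """The face shown by the die."""
    return w[1]


def total(w):
    """Coin value plus die value; it depends on the coin."""
    return coin_value(w) + die_value(w)


print(are_independent(coin_value, die_value, measure))  # True
print(are_independent(coin_value, total, measure))      # False
```

The second check fails because, for example, heads together with a total of 1 is impossible, while each of those two events has positive probability on its own.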
Linearity of Expectation
Given two random variables
Proposition. Suppose
Proof. For the first equality, by definition we compute
Using the fact that
The fourth equality holds because
These equations give the desired results.
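As a numerical sanity check of the identity $E(X + Y) = E(X) + E(Y)$ that gives this section its name, the sketch below assumes two fair dice with all 36 outcomes equally likely and compares the expected value of their sum with the sum of their expected values. The names `first`, `second`, and `total` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair dice, each ordered pair with probability 1/36.
measure = {w: Fraction(1, 36) for w in product(range(1, 7), repeat=2)}


def expected_value(variable, measure):
    """E(X): sum over outcomes w of X(w) * P(w)."""
    return sum((variable(w) * p for w, p in measure.items()), Fraction(0))


def first(w):
    """Value of the first die."""
    return w[0]


def second(w):
    """Value of the second die."""
    return w[1]


def total(w):
    """Sum of the two dice."""
    return first(w) + second(w)


print(expected_value(first, measure))   # 7/2
print(expected_value(second, measure))  # 7/2
print(expected_value(total, measure))   # 7
```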
The equation
Exercise. Prove that
Exercise. Give an example of random variables