Category Archives: teaching

computer science edutainment math teaching

Numberphile video on the Josephus Problem

Recently, the following Numberphile video on the Josephus Problem has been making the rounds on math-related social media. I watched the video, and I thought Daniel Erman did a remarkably good job at explaining how to solve a mathematical problem. Daniel’s approach is similar to the techniques described in Polya‘s “How to Solve It.” Yet the particular story that Daniel tells also has an appealing narrative arc.

Daniel’s video adheres to the following principles, which I think are fairly universal in mathematical problem solving.

  1. Start with a concrete problem. If the problem has a nice story to go along with it, all the better. The Josephus Problem is a great example of a concrete mathematical question. Given a method by which the soldiers kill one another and the number of soldiers, where should Josephus stand to be the last living soldier?
  2. Formalize and generalize the problem. What is special about the number 41? The mechanism by which the soldiers kill one another works just as well for any number of soldiers, so consider the problem for \(n\) soldiers.
  3. Consider simpler versions of the general problem. Now that we have the general \(n\)-soldier Josephus problem, we can easily work out a few examples when \(n\) is small. To quote Polya, “If you can’t solve a problem, then there is an easier problem you can’t solve: find it.” This process of finding simpler and simpler related problems until you find one you can solve is to me the most important general problem solving method.
  4. Solve enough of the “simple” problems until you see a pattern. Solving the simpler problems gives one both data and intuition that will allow you to conjecture about a general solution.
  5. Generalize the pattern as much as you can so that it fits the examples you’ve solved. Even if the pattern doesn’t give a complete answer (for example, Daniel’s observation that if \(n\) is a power of \(2\), soldier \(1\) is the last living soldier), even a partial solution is likely valuable to understanding a complete solution.
  6. Prove your generalization of the pattern to obtain a solution to the general problem. Often, this doesn’t happen all at once. The Numberphile video happens to give a particularly elegant solution in a very short period of time. Don’t get discouraged when not everything falls into place the first time you try to solve the problem!
  7. Apply your general solution to the original concrete problem.

In my own research, I follow the strategies above. In particular, Polya’s advice regarding finding and solving simpler problems (steps 3 and 4) is maybe the most valuable single piece of problem solving advice I know of. I think math could be characterized as the art of generalizing simple observations. Often, the simple observations arise by wasting a lot of paper trying to solve simple problems.

The narrative outlined in the steps above is also valuable from a pedagogic standpoint. By starting with a tangible (if slightly morbid) problem, the student/participant immediately has some intuition about the problem before beginning formal analysis. In my experience, one of the biggest challenges students face is connecting abstract statement and theorems to concrete problems. By introducing the concrete problem first and using this problem to motivate the formal development needed to solve the problem, students can begin using their imagination earlier in the problem solving process. This makes learning more interactive, memorable, and effective.

expository math teaching

Probability Primer


This post is a very brief introduction to some basic concepts in probability theory. We encounter uncertainty often in our everyday lives, for example, in the weather, games of chance (think of rolling dice or shuffling a deck of cards), financial markets, etc. Probability theory provides a language to quantify uncertainty, thereby allowing us to reason about future events whose outcomes are not yet known. In this note, we only consider events where the number of potential outcomes is finite.

The basic object of study in probability is a probability space. A (finite) probability space consists of a (finite) sample space \(\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}\) together with a function \(P : \Omega \to [0,1]\) which assigns probabilities to the outcomes \(\omega_i \in \Omega\). The probability function \(P\) satisfies \(P(\omega_i) \geq 0\) for all \(i\) and
\[
\sum_{i = 1}^n P(\omega_i) = 1.
\]
The interpretation is that \(P\) assigns probabilities or likelihoods to the events in \(\Omega\).

Example. We model the randomness of tossing a coin. In this case, there are two possible outcomes of a coin toss, heads or tails, so we take our sample space to be \(\Omega = \{H, T\}\). Since heads and tails are equally likely, we have \(P(H) = P(T) = 1/2\).

Example. Rolling a standard (six-sided) die has six outcomes which are equally likely. Thus we take \(\Omega = \{1, 2, 3, 4, 5, 6\}\) and \(P(i) = 1/6\) for \(i = 1, 2, \ldots, 6\).

Definition. Let \(\Omega\) be a sample space and \(P\) a probability function on \(\Omega\). A random variable is a function \(X : \Omega \to \mathbf{R}\).

Example. Going back to the coin toss example above, we can define a random variable \(C\) on the coin toss probability space defined by \(C(H) = 1\) and \(C(T) = -1\). This random variable may arise as a simple game: two players toss a coin and bet on the outcome. If the outcome is heads, player 1 wins one dollar, while if the outcome is tails, player 1 loses one dollar.

Example. For the die rolling example, we can define the random variable \(R\) by \(R(i) = i\). The value of \(R\) is simply the value showing on die after it is rolled.

Definition. A fundamental value assigned to random variable is its expected value, expectation or average, which is defined by the formula
\[
E(X) = \sum_{i = 1}^n X(\omega_i) P(\omega_i).
\]
The expected value of a random variable quantifies the behavior we expect to see in a random variable if we repeat an experiment (e.g. a coin flip or die roll) many times over.

Example. Going back to the coin flip example, we can compute
\[
E(C) = C(H) P(H) + C(T) P(T) = 1 \cdot \frac{1}{2} + (-1) \cdot \frac{1}{2} = 0.
\]
This tells us that if we repeatedly play the coin flip betting game described above, neither player has an advantage. The players expect to win about as much as they lose.

For the die rolling example, we compute
\[
E(R) = R(1) P(1) + \cdots + R(6) P(1) = 1 \cdot \frac{1}{6} + \cdots + 6 \cdot \frac{1}{6} = 3.5.
\]
Thus an “average” die roll is \(3.5\) (even though \(3.5\) cannot be the outcome of a single die roll).

Often, when speaking about random variables we omit reference to the underlying probability space. In this case, we speak only of the probability that a random variable \(X\) takes on various values. For the coin flipping example above, we could have just defined \(C\) by
\[
P(C = 1) = P(C = -1) = 1/2
\]
without saying anything about the underlying sample space \(\Omega\). The danger in this view is that if we don’t explicitly define \(C\) as a function on some probability space, it may make comparison of random variables difficult. To see an example of this, consider the random variable \(S\) defined on the die roll sample space, \(\Omega = \{1,\ldots, 6\}\) by
\[
S(i) =
\begin{cases}
1 &\text{ if } i \text{ is even}\\
-1 &\text{ if } i \text{ is odd}.
\end{cases}
\]
Notice that, like our variable \(C\) defined for coin flips, we have \(P(S = 1) = P(S = -1) = 1/2\), so in some sense \(C\) and \(S\) are “the same.” However, they are defined on different sample spaces: \(C\) is defined on the sample space of coin flips, while \(S\) is defined on the sample space of die rolls.

Consider a game where the play is determined by a coin flip and a die roll. For the examples above, the random variable \(C\) depends only on the outcome of the coin flip, while \(R\) and \(S\) depend only on the outcome of the die roll. Since the outcome of the coin flip has no effect on the outcome of the die roll, the variables \(C\) and \(R\) are independent of one another, as are \(C\) and \(S\). However, \(R\) and \(S\) depend on the same outcome (the die roll) so their values may depend on each other. In fact, the value of \(S\) is completely determined by the value of \(R\)! So knowing the value of \(R\) allows us to determine the value of \(S\), and knowing the value of \(S\) tells us something about the value of \(R\) (namely whether \(R\) is even or odd).

Definition. Suppose \(X\) and \(Y\) are random variables defined on the same probability space (i.e., \(X, Y : \Omega \to \mathbf{R}\)). We say that \(X\) and \(Y\) are independent if for all possible values \(x\) of \(X\) and \(y\) of \(Y\) we have
\[
P(X = x \text{ and } Y = y) = P(X = x) P(Y = y).
\]

For our examples above with the coin flip and the die roll, \(C\) and \(R\) cannot be said to be independent because they are defined on different probability spaces. The variables \(R\) and \(S\) are both defined on the die roll sample space, so they can be compared. However, they are not independent. For example, we have \(P(R = 1) = 1/6\) and \(P(S = 1) = 1/2\). Since \(S(i) = 1\) only when \(i\) is even, we have
\[
P(R = 1 \text{ and } S = 1) = 0 \neq \frac{1}{6} \cdot \frac{1}{2}.
\]

Let \(W\) be the random variable on the die roll sample space defined by
\[
W(i) =
\begin{cases}
1 & \text{ if } i = 1, 4\\
2 & \text{ if } i = 2, 5\\
3 & \text{ if } i = 3, 6.
\end{cases}
\]
We claim that \(W\) and \(S\) are independent. This can be verified by brute force calculation. For example, note that we have \(S = 1\) and \(W = 1\) only when the outcome of the die roll is 4. Therefore,
\[
P(S = 1 \text{ and } W = 1) = \frac{1}{6} = P(S = 1) P(W = 1).
\]
Similar calculations show that similar equalities hold for all possible values of \(S\) and \(W\), hence these random variables are independent.

Given two random variables \(X\) and \(Y\) defined on the same probability space, we can define their sum \(X + Y\) and product \(X Y\).

Proposition. Suppose \(X, Y : \Omega \to \mathbf{R}\) are independent random variables. Then
\[
E(X + Y) = E(X) + E(Y) \quad\text{and}\quad E(X Y) = E(X) E(Y).
\]

Proof. For the first equality, by definition we compute
\[
E(X + Y) = \sum_{x, y} P(X = x \text{ and } Y = y) (x + y).
\]
Using the fact that \(X\) and \(Y\) are independent, we find
\[
\begin{align}
E(X + Y) &= \sum_{x, y} P(X = x \text{ and } Y = y) (x + y)\\
&= \sum_{x, y} P(X = x) P(Y = y) (x + y)\\
&= \sum_{x} P(X = x) x \sum_y P(Y = y) + \sum_y P(Y = y) y \sum_x P(X = x)\\
&= \sum_x P(X = x) x + \sum_y P(Y = y) y\\
&= E(X) + E(Y).
\end{align}
\]
The fourth equality holds because \(P\) satisfies \(\sum_x P(X = x) = 1\) and \(\sum_y P(Y = y) = 1\). Similarly, we compute
\[
\begin{align}
E(X Y) &= \sum_{x, y} P(X = x \text{ and } Y = y) x y\\
&= \sum_{x, y} P(X = x) P(Y = y) x y\\
&= \left(\sum_{x} P(X = x) x\right) \left(\sum_{y} P(Y = y) y\right)\\
&= E(X) E(Y).
\end{align}
\]
These equations give the desired results. ∎

The equation \(E(X + Y) = E(X) + E(Y)\) is satisfied even if \(X\) and \(Y\) are not independent. This fundamental fact about probability is known as the linearity of expectation. However, in order to have \(E(X Y) = E(X) E(Y)\), \(X\) and \(Y\) must be independent.

Exercise. Prove that \(E(X + Y) = E(X) + E(Y)\) without assuming that \(X\) and \(Y\) are independent.

Exercise. Give an example of random variables \(X\) and \(Y\) for which \(E(X Y) \neq E(X) E(Y)\). (Note that \(X\) and \(Y\) cannot be independent.

teaching

Undergraduate Panel Notes

A couple weeks ago in Math 495 (Teaching College Mathematics) we had a panel of undergraduate students at UCLA talk about their experiences in the math department. We tried to get a good cross section of students, from those whose focus is math, to students who only took math classes as a departmental requirement. Given the diverse background of our panel, it was somewhat surprising to me how much they agreed about how a math discussion section should be run. Below I’ve summarized the panel’s advice to TAs as to what makes for good recitation class.

  1. Be organized. Write a brief lesson plan/list of topics on the board before class and follow it. The students rely on this structure to remind them of the big picture for the class. (Author’s note: Connie Chung, the undergraduate advisor in the math department said that disorganization is the most common issue that students cite in poor course evaluations. So it is unsurprising that this was the first piece of advice given by the undergraduate panel.)
  2. Do concrete examples, and do a lot of them. Don’t dwell too much on theory in discussion–a brief (~5 minute) review usually suffices. Students tend to have more difficulty in connecting the theory with applications than understanding the theory itself.
  3. Do examples you find “interesting” for upper division classes, but stick with more “standard” examples for lower division courses. Most students in lower division classes are in your discussion section to see how to solve the types of problems likely to appear on their homework/exams.
  4. Communicate with the instructor/lecturer for the course. Make sure that you are reviewing material that the students have seen in lecture. This is usually more helpful than giving a preview of material to come.
  5. Give detailed solutions or don’t bother giving a solution at all. The students want to see problems solved to completion, not just set up and cast aside. Often students struggle with the details in finishing a problem. Only when they see a full detailed solution are they confident they understand how to solve the problem.
  6. Gauge the level of your class. Don’t assume that because nobody asked questions about an example or topic that everyone understands it. Often, the students who don’t understand are afraid to speak up. So ask questions like, “What should I do next?” rather than “Are there any questions?” If you have to wait a while for suggestions on how to proceed, you need to spend more time on that topic.
  7. Use icebreakers. Get the students talking to each other and open avenues for them to interact more. This is best done at the beginning of the term to get everyone talking on day 1. (Author’s note: I always hated ice breakers as a student because I thought they were a waste of time. I was surprised that our panel unanimously supported them and found them valuable. I may be a convert yet…)
  8. Your discussion session should be informal. Students want to be comfortable asking and answering questions
  9. Always start the class with “easy” examples. This will make sure the students are not overwhelmed early on, and will help you diagnose problems if they students have difficulty. (See #6).
  10. Make your lessons interactive. If you want to solve a problem as an example, don’t jump right into the solution. Write the question on the board, and give students time to think about how to solve it. Solicit suggestions from the class on how to approach the problem, and help guide them to a solution. The students get the most out of the class when they are actively participating in the problem solving, rather than just copying down your solution. (This approach also helps you understand better what your students need the most help with. See #6.)
  11. Choose content/applications from student interest. For example, if you have a lot of physics majors in your class, they might want to see examples of how the material in your course is relevant to mechanics or electricity and magnetism. You might need to do a bit of homework yourself to find connections with biology, etc, but the students will appreciate it.
  12. You are never going too slow. (Not a single panelist had ever been in a discussion where they felt the TA was going too slowly. Not. One.)

Looking over all of the advice and suggestions our undergraduate panel gave the new TAs in the math department, there are very few things I found surprising. Nonetheless, I thought the advice was more impactful coming from students rather than professors or TAs. Overall, the students preferred discussion sections that employed active learning. Many studies show that active learning generally leads to higher student achievement, and the anecdotal evidence from the panel shows that not only is active learning effective, but the students prefer it (at least in discussion sections). A recent series of articles/blog posts from the AMS serve as a great primer for active learning strategies in math.

math teaching

A tricky limit from the Math 32A final


The following limit appeared on yesterday’s Math 32A final:
\[
\lim_{(x, y) \to (0, 0)} \frac{\sin x^2 + \tan^2 y}{x^2 + \tan y^2}.
\]
The limit seemed to cause a lot of trouble. In fact, no one in the lecture of 200 students had a perfect solution! I thought I would write up a solution here to give some closure on the problem. I will use only tools that appear in a standard calculus course: the squeeze theorem and the fact that \(\lim_{x \to 0} \sin(x) / x = 1\).

The first thing to note is that evaluating the limit along the \(x\)-axis (taking \(y = 0\)) gives a limit of \(1\):
\[
\lim_{x \to 0} \frac{\sin x^2}{x^2} = \lim_{u \to 0} \frac{\sin u}{u} = 1.
\]
This tells us that if the original limit exists, it must be \(1\). However, this argument does not allow us to conclude that the limit does indeed exist. Choosing any other path through \((0, 0)\) will also give a limit of \(1\), so one should anticipate that the limit exists and is equal to \(1\). We remark that, in general, \(\lim_{(x, y) \to (a, b)} f(x, y) = L\) if and only if \(\lim_{(x, y) \to (a, b)} |f(x, y) – L| = 0\). This reformulation is convenient because \(0 \leq |f(x, y) – L|\), so applying the squeeze theorem is less fussy: we only need to bound \(|f(x, y) – L| \leq g(x, y)\) where \(\lim_{(x, y) \to (a, b)} g(x, y) = 0\). Thus, we wish to show that
\[
\lim_{(x, y) \to (0, 0)} \left| \frac{\sin x^2 + \tan^2 y}{x^2 + \tan y^2} – 1\right| = 0.
\]
To this end, we compute the following bound
\[
\begin{align*}
\left| \frac{\sin x^2 + \tan^2 y}{x^2 + \tan y^2} – 1\right| &= \left| \frac{\sin x^2 + \tan^2 y}{x^2 + \tan y^2} – \frac{x^2 + \tan y^2}{x^2 + \tan y^2}\right|\\
&= \left| \frac{\sin x^2 – x^2 + \tan^2 y – \tan y^2}{x^2 + \tan y^2}\right|\\
&\leq \left| \frac{\sin x^2 – x^2}{x^2 + \tan y^2}\right| + \left| \frac{\tan^2 y – \tan y^2}{x^2 + \tan y^2}\right|\\
&\leq \left| \frac{\sin x^2 – x^2}{x^2}\right| + \left| \frac{\tan^2 y – \tan y^2}{\tan y^2}\right|.
\end{align*}
\]
The first inequality holds by the triangle inequality (\(|a + b| \leq |a| + |b|\)), while the second inequality holds because \(\tan y^2 \geq 0\) (when \(y\) is close to \(0\)) and \(x^2 \geq 0\). Since we have
\[
0 \leq \left| \frac{\sin x^2 + \tan^2 y}{x^2 + \tan y^2} – 1\right| \leq \left| \frac{\sin x^2 – x^2}{x^2}\right| + \left| \frac{\tan^2 y – \tan y^2}{\tan y^2}\right|,
\]
it suffices to show that the limit of the expression on the right is \(0\) as \((x, y) \to (0, 0)\). The expression on the right is much easier to deal with than the original limit because it is the sum of two terms, each of which only involves a single variable. Thus, one can use familiar methods from single variable calculus to compute the limits. One can show that in fact we have
\[
\lim_{x \to 0} \left| \frac{\sin x^2 – x^2}{x^2}\right| = 0 \quad\text{and}\quad \lim_{y \to 0} \left| \frac{\tan^2 y – \tan y^2}{\tan y^2}\right| = 0.
\]
The first limit follows easily from the fact that \(\lim_{x \to 0} \sin(x) / x = 1\), while the second limit requires a little more ingenuity. We compute
\[
\begin{align*}
\lim_{y \to 0} \left| \frac{\tan^2 y – \tan y^2}{\tan y^2}\right| &= \left| \lim_{y \to 0} \frac{\tan^2 y}{\tan y^2} – \lim_{y \to 0} \frac{\tan y^2}{\tan y^2} \right|\\
&= \left| \lim_{y \to 0} \frac{\tan^2 y}{\tan y^2} – 1 \right|\\
&= \left| \lim_{y \to 0} \frac{\tan^2 y}{y^2} \frac{y^2}{\tan y^2} – 1 \right|\\
&= \left| \lim_{y \to 0} \left(\frac{\tan y}{y}\right)^2 \frac{y^2}{\tan y^2} – 1 \right|\\
&= | (1)^2 (1) – 1 |\\
&= 0.
\end{align*}
\]
The penultimate equality follows from the fact that \(\lim_{y \to 0} \tan(y) / y = 1\), which can be deduced by writing \(\tan y = (\sin y) / (\cos y)\) and using the familiar \(\sin(y) / y\) limit identity (or an application of L’Hopital’s rule). At any rate, we may now deduce from the squeeze theorem that
\[
\lim_{(x, y) \to (0, 0)} \frac{\sin x^2 + \tan^2 y}{x^2 + \tan y^2} = 1.
\]
Whew.

math teaching

Math 32A Final Review Problems

I have just finished putting together some problems for the final review for Math 32A, which are available here. I have included answers to the problems. I haven’t gotten a chance to double check my work on the solutions, so let me know in the comments below if you disagree with any of the answers. I will go over the solutions this evening again and update the answers if I find any problems.

teaching

Two Tricky Linear Algebra Problems


There are a couple questions from the homework that seem to have given people (myself included) a fair amount of trouble. Since I wasn’t able to give satisfactory answers to these questions in office hours, I thought I’d write up clean solutions to the problems, available here. The questions both involve projections \(E : V \to V\) where \(V\) is an inner product space. The problems ask you to prove:

  1. If \(E\) is idempotent (\(E^2 = E\)) and normal (\(E^* E = E E^*\)), then \(E\) is self-adjoint (\(E = E^*\)).
  2. If \(E\) satisfies \(\|E v\| \leq \|v\|\) for all \(v \in V\), then \(E\) is an orthogonal projection onto its image.
math teaching

Math 32A Midterm 2 Review Questions

I have compiled a few review questions for the second midterm for Math 32A, which are available here. The questions are on the harder side of what I would expect to see on the actual exam, but hopefully you find them interesting and/or helpful in your studying. Let me know in the comments below if you have questions about any of the problems.

math teaching

Math 115AH Midterm 1 Review Questions

I’ve written up a few review questions for the first midterm for Math 115AH, available here. I will go over solutions to any of the problems during class or the review session (which will be Thursday, April 30 from 5 to 7 PM in Boelter 2444). Let me know in the comments below if you notice any typos or if you have questions about any of the problems.

expository math teaching

Logic and Sets

I have just uploaded notes on Basic Logic and Naive Set Theory for math 115AH. Please let me know in the comments below if you notice typos or if anything is unclear.

math teaching

Math 61 Final Review Questions

I have complete some final review questions for Math 61! I will go over as many as these problems as possible during Wednesday’s review session. As usual, let me know in the comments below if you have any questions about the problems.