Monthly Archives: November 2016

computer science edutainment math teaching

Numberphile video on the Josephus Problem

Recently, the following Numberphile video on the Josephus Problem has been making the rounds on math-related social media. I watched the video, and I thought Daniel Erman did a remarkably good job at explaining how to solve a mathematical problem. Daniel’s approach is similar to the techniques described in Polya‘s “How to Solve It.” Yet the particular story that Daniel tells also has an appealing narrative arc.

Daniel’s video adheres to the following principles, which I think are fairly universal in mathematical problem solving.

  1. Start with a concrete problem. If the problem has a nice story to go along with it, all the better. The Josephus Problem is a great example of a concrete mathematical question. Given a method by which the soldiers kill one another and the number of soldiers, where should Josephus stand to be the last living soldier?
  2. Formalize and generalize the problem. What is special about the number 41? The mechanism by which the soldiers kill one another works just as well for any number of soldiers, so consider the problem for \(n\) soldiers.
  3. Consider simpler versions of the general problem. Now that we have the general \(n\)-soldier Josephus problem, we can easily work out a few examples when \(n\) is small. To quote Polya, “If you can’t solve a problem, then there is an easier problem you can’t solve: find it.” This process of finding simpler and simpler related problems until you find one you can solve is to me the most important general problem solving method.
  4. Solve enough of the “simple” problems until you see a pattern. Solving the simpler problems gives one both data and intuition that will allow you to conjecture about a general solution.
  5. Generalize the pattern as much as you can so that it fits the examples you’ve solved. Even if the pattern doesn’t give a complete answer (for example, Daniel’s observation that if \(n\) is a power of \(2\), soldier \(1\) is the last living soldier), even a partial solution is likely valuable to understanding a complete solution.
  6. Prove your generalization of the pattern to obtain a solution to the general problem. Often, this doesn’t happen all at once. The Numberphile video happens to give a particularly elegant solution in a very short period of time. Don’t get discouraged when not everything falls into place the first time you try to solve the problem!
  7. Apply your general solution to the original concrete problem.

In my own research, I follow the strategies above. In particular, Polya’s advice regarding finding and solving simpler problems (steps 3 and 4) is maybe the most valuable single piece of problem solving advice I know of. I think math could be characterized as the art of generalizing simple observations. Often, the simple observations arise by wasting a lot of paper trying to solve simple problems.

The narrative outlined in the steps above is also valuable from a pedagogic standpoint. By starting with a tangible (if slightly morbid) problem, the student/participant immediately has some intuition about the problem before beginning formal analysis. In my experience, one of the biggest challenges students face is connecting abstract statement and theorems to concrete problems. By introducing the concrete problem first and using this problem to motivate the formal development needed to solve the problem, students can begin using their imagination earlier in the problem solving process. This makes learning more interactive, memorable, and effective.

computer science expository math

Testing Equality in Networks

Yesterday, I went to an interesting talk by Klim Efremenko about testing equality in networks. The talk was based on his joint paper with Noga Alon and Benny Sudakov. The basic problem is as follows. Suppose there is a network with \(k\) nodes, and each node \(v\) has an input in the form of an \(n\)-bit string \(M_v \in \{0, 1\}^n\). All of the nodes in the network want to verify that all of their strings are equal, i.e., that \(M_v = M_u\) for all nodes \(v\) and \(u\), and they may only communicate with their neighbors in the network. How many bits must be communicated in total?

For concreteness, suppose the network is a triangle. That is, there are three nodes–say played by Alice, Bob, and Carol–and each pair of nodes can communicate. Let’s first consider some upper bounds on the number of bits sent to check that all inputs are the same. The simplest thing to do would be for each player to send their input to the other two players. After this step, everyone knows everyone else’s input, so each player can check that all of the inputs are the same. The total communication required for this protocol is \(6 n\), as each player sends their entire input to the two other players.

It is not hard to see that this procedure is wasteful. It is enough that one player–say Alice–knows whether all of the inputs are the same, and she can tell the other players as much. Thus, we can make a more efficient procedure. Bob and Carol both send their inputs to Alice, and Alice checks if both inputs are the same as her input. She sends a single bit back to Bob and Carol depending on if all inputs are equal. The total communication required for this procedure is \(2 n + 2\). So we’ve improved the communication cost by a factor of \(3\)–not too shabby. But can we do better?

Let’s talk a little bit about lower bounds. To this end, consider the case where there are only two players, Alice and Bob. As in the previous paragraph, they can check that their inputs are equal using \(n + 1\) bits of communication. To see that \(n\) bits are in fact required, we can use the “fooling set” technique from communication complexity. Suppose to the contrary that Alice and Bob can check equality with \(b < n\) bits of communication. For any \(x \in \{0, 1\}^n\), consider the case where both Alice and Bob have input \(x\). Let \(m_x\) be the ``transcript'' of Alice and Bob's conversation when they both have \(x\) as their input. By assumption, the transcript contains at most \(b < n\) bits. Therefore, there are at most \(2^b\) distinct transcripts. Thus, by the pigeonhole principle, there are two values \(x \neq y\) that give the same transcript: \(m_x = m_y\). Now suppose Alice is given input \(x\) and Bob has input \(y\). In this case, when Alice and Bob communicate, they will generate the same transcript \(m_x (= m_y)\). Since the communication transcript determines whether Alice and Bob think they have the same input, they will both be convinced their inputs are the same, contradicting the assumption that \(x \neq y\)! Therefore, we must have \(b \geq n\)--Alice and Bob need to communicate at least \(n\) bits to check equality.

To obtain lower bounds for three players, we can use the same ideas as the two player lower bound of \(n\) bits. Assume for a moment that Bob and Carol know they have the same input. How much must Alice communicate with Bob/Carol to verify that her input is the same as theirs? By the argument in the previous paragraph, Alice must exchange a total of \(n\) bits with Bob/Carol. Similarly, Bob must exchange \(n\) bits with Alice/Carol, and Carol must exchange \(n\) bits with Alice/Bob. So the total number of bits exchanged must be at least \(3 n / 2\). The factor of \(1/2\) occurs because we count each bit exchanged between, say Alice and Bob, twice: once when we consider the communication between Alice and Bob/Carol and once when we consider the communication between Bob and Alice/Carol.

As it stands we have a gap in the communication necessary to solve the problem: we have an upper bound of \(2 n + 2\) bits, and a lower bound of \(3 n / 2\) bits. Which of these bounds is correct? Or is the true answer somewhere in between? Up to this point, the techniques we’ve used to understand the problem are fairly routine (at least to those who have studied some communication complexity). In what follows–the contribution of Klim, Noga, and Benny–we will see that it is possible match the lower bound of \(3 n / 2\) bits using a very clever encoding of the inputs.

The main tool that the authors employ is the existence of certain family of graphs, which I will refer to as Rusza-Szemeredi graphs. Here is the key lemma:

Lemma (Rusza-Szemeredi, 1978). For every \(m\), there exists a tri-partite graph \(H\) on \(3 m\) vertices which contains \(m^2 / 2^{O(\sqrt{\log m})}\) triangles such that no two triangles share a common edge.

The lemma says that there is a graph with a lot of triangles such that each triangle is determined by any one of its edges. To see how this lemma is helpful (or even relevant) to the problem of testing equality, consider the following modification of the original problem. Instead of being given a string in \(\{0, 1\}^n\), each player is player is given a triangle in the graph \(H\) from the lemma. Each triangle consists of \(3\) vertices, but the condition of the lemma ensures that each triangle is determined by any two of the three vertices–i.e., a single edge of the triangle. Thus, the three players can verify that they all have the same triangle by sharing information in the following way: Alice sends Bob the first vertex in her triangle; Bob sends Carol the second vertex in his triangle, and Carol sends Alice the third vertex in her triangle.

Miraculously, this procedure reveals enough information for Alice, Bob, and Carol to determine if they were all given the same triangle! Suppose, for example, that Alice and Bob have the same triangle \(T\), but Carol has a triangle \(T’ \neq T\). By the condition of the lemma, \(T’\) and \(T\) can share at most one vertex, so they must differ in at least two vertices. In particular, it must be the case that the second or third (or both) vertices of \(T\) and \(T’\) differ. If the second vertex of \(T\) and \(T’\) differ, then Carol will see that her triangle is not the same as Bob’s triangle (since Bob sends Carol his second vertex). If the third vertex of \(T\) and \(T’\) differ, then Alice will see that her triangle differs from Carol’s, as Carol sends Alice her triangle’s third vertex.

Now consider the case where Alice, Bob, and Carol are all given different triangles \(T_a = \{u_a, v_a, w_a\}\), \(T_b = \{u_b, v_b, w_b\}\), and \(T_c = \{u_c, v_c, w_c\}\). Suppose that all three players accept the outcome of the communication protocol. This means that \(u_a = u_b\) (since Alice sends \(u_a\) to Bob), and similarly \(v_b = v_c\) and \(w_c = w_a\). In particular this implies that \(H\) contains the edges
\[
\{u_a, v_b\} \in T_b, \{v_b, w_a\} \in T_c, \{w_a, u_a\} \in T_a.
\]
Together these three edges form form a triangle \(T’\) which is present in the graph \(H\). However, observe that \(T’\) shares an edge \(\{w_a, u_a\}\) with \(T_a\), contradicting the property of \(H\) guaranteed by the lemma. Therefore, Alice, Bob, and Carol cannot all accept if they are each given different triangles! Thus they can determine if they were all given the same triangle by each sending a single vertex of their triangle, which requires \(\log m\) bits.

To solve the original problem–where each player is given an \(n\)-bit string–we encode each string as a triangle in a suitable graph \(H\). In order to make this work, we need to choose \(m\) (the number of vertices in \(H\)) large enough that there is one triangle for each possible string. Since there are \(2^n\) possible strings, using the lemma, we need to take \(n\) sufficiently large that
\[
m^2 / 2^{O(\sqrt{\log m})} \geq 2^n.
\]
Taking the logarithm of both sides, we find that
\[
\log m – O(\sqrt{\log m}) \geq \frac 1 2 n
\]
is large enough. Thus, to send the identity of a single vertex of the triangle which encodes a player’s input, they must send \(\log m\) bits, which is roughly \(\frac 1 2 n\). Therefore, the total communication in this protocol is roughly \(\frac 3 2 n\), which matches the lower bounds. We summarize this result in the following theorem.

Theorem. Suppose \(3\) players each hold \(n\)-bit strings. Then
\[
\frac 3 2 n + O(\sqrt{n}).
\]
bits of communication are necessary and sufficient to test if all three strings are equal.

The paper goes on to generalize this result, but it seems that all of the main ideas are already present in the \(3\) player case.