I am currently taking a course on communication complexity with Alexander Sherstov. Much of communication complexity involves matrix analysis, so yesterday we did a brief review of results from linear algebra. In the review, Sherstov gave the following definition for the rank of a matrix \(M \in \mathbf{F}^{n \times m}\):

\[

\mathrm{rank}(M) = \min\{k | M = A B, A \in \mathbf{F}^{n \times k}, B \in \mathbf{F}^{k \times m}\}

\]

Since this “factor” definition of rank appears very different from the standard definition given in a linear algebra course, I thought I would prove that the two definitions are equivalent.

The “standard” definition of rank is

\[

\mathrm{rank}(M) = \mathrm{dim}\ \mathrm{span} \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\}

\]

where \(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\) are the columns of \(M\). Equivalently, \(\mathrm{rank}(M)\) is the number of linearly independent columns of \(M\).

An easy consequence of the standard definition of rank is that the rank of a product of matrices is at most the rank of the first matrix: \(\mathrm{rank}(A B) \leq \mathrm{rank}(A)\). Indeed, the columns of \(AB\) are linear combinations of the columns of \(A\), hence the span of the columns of \(AB\) is a subspace of the span of the columns of \(A\). Thus, if we can write \(M = A B\) with \(A \in \mathbf{F}^{n \times k}\) and \(B \in \mathbf{F}^{k \times m}\), we must have \(\mathrm{rank}(M) \leq k\).

It remains to show that if \(\mathrm{rank}(M) = k\) then we can factor \(M = A B\) with \(A \in \mathbf{F}^{n \times k}\) and \(B \in \mathbf{F}^{k \times m}\). To this end, assume without loss of generality that the first \(k\) columns of \(M\) are linearly independent. Write these columns as vectors

\(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\). Denote the remaining columns by \(\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_{m-k}\). Since these vectors lie in the span of the first \(k\) columns of \(M\), we can write the \(\mathbf{w}_j\) as linear combinations of the \(\mathbf{v}_i\):

\[

\mathbf{w}_j = \sum_{i = 1}^k b_{i, j} \mathbf{v}_i \quad\text{for}\quad j = 1, 2, \ldots, m – k.

\]

Now define the matrices

\[

A = (\mathbf{v}_1\ \mathbf{v}_2\ \cdots\ \mathbf{v}_k)

\]

and

\[

B = (\mathbf{e}_1\ \mathbf{e}_2\ \cdots\ \mathbf{e}_k\ \mathbf{b}_1\ \mathbf{b}_2\ \cdots\ \mathbf{b}_{m – k})

\]

where \(\mathbf{e}_j\) is the \(j\)th standard basis vector in \(\mathbf{F}^k\) and

\[

\mathbf{b}_j =

\begin{pmatrix}

b_{1, j}\\

b_{2, j}\\

\vdots\\

b_{k, j}

\end{pmatrix}

\quad\text{for}\quad j = 1, 2, \ldots, m – k.

\]

It is straightforward to verify that we have \(M = A B\), which proves the equivalence of the two definitions of rank.