Rank

Identities about rank

 * $$\text{rank}(BC) \le \min\{\text{rank}(B), \text{rank}(C)\}$$
 * If $$A=BC$$ with $$B \in \mathbb{R}^{m\times r}$$ and $$C \in \mathbb{R}^{r\times n}$$, then $$\text{rank}(A) \le r$$. This property is very useful for fast matrix-vector multiplication: to obtain the product $$Ax$$, we can compute $$B(Cx)$$ instead, which takes on the order of $$r(m+n)$$ multiplications rather than $$mn$$. If $$r$$ is much smaller than $$\min\{m,n\}$$, this speeds up the computation a lot.
 * $$\text{rank}(A) + \dim \mathcal{N}(A) = n$$, where $$n$$ is the number of columns of $$A$$ (see also Fundamental Theorem of Linear Algebra).
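
The factorization and rank-nullity identities above can be checked numerically. The following is a minimal NumPy sketch; the sizes `m`, `n`, `r` and the random matrices are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 600, 400, 5          # hypothetical sizes with r << min(m, n)

B = rng.standard_normal((m, r))
C = rng.standard_normal((r, n))
A = B @ C                      # dense m x n matrix whose rank is at most r
x = rng.standard_normal(n)

y_dense = A @ x                # roughly m*n multiplications
y_fact = B @ (C @ x)           # roughly r*(m+n) multiplications

rank_A = int(np.linalg.matrix_rank(A))
null_dim = n - rank_A          # rank-nullity: rank(A) + dim N(A) = n

print(np.allclose(y_dense, y_fact), rank_A, null_dim)
```

The parentheses in `B @ (C @ x)` matter: evaluating `(B @ C) @ x` would first form the dense product and lose the saving.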

Ranks of partitioned matrices and of matrix products
The following results are from Sections 17.2 and 17.4 of Harville. They mostly use the concept of essential disjointness to sharpen inequalities regarding the ranks of partitioned matrices or multiplied matrices.
 * If $$\mathcal C (\mathbf A)$$ and $$\mathcal{C}(\mathbf B)$$ are essentially disjoint, then $$\text{rank}(\mathbf A, \mathbf B)=\text{rank}(\mathbf A)+ \text{rank}(\mathbf B).$$ If they are not, then $$\text{rank}(\mathbf A, \mathbf B)<\text{rank}(\mathbf A)+ \text{rank}(\mathbf B)$$.
 * For any appropriately sized matrices $$\mathbf A$$ and $$\mathbf B$$, $$\mathcal{C}(\mathbf A)$$ and $$\mathcal{C}[(\mathbf{I-AA^-})\mathbf B]$$ are essentially disjoint. In intuitive terms, this result says that $$\mathcal{C}[(\mathbf{I-AA^-})\mathbf B]$$ captures the part of the column space of $$\mathbf B$$ that cannot be reached by $$\mathcal{C}(\mathbf A)$$. A direct consequence of this is that $$\text{rank}[\mathbf A, (\mathbf{I-AA^-})\mathbf B] = \text{rank}(\mathbf A) + \text{rank}[(\mathbf{I-AA^-})\mathbf B].$$
 * For any $$\mathbf{T,U,V}$$ $$\text{rank}\begin{bmatrix} \mathbf T & \mathbf U \\ \mathbf V & \mathbf 0 \end{bmatrix} = \text{rank}(\mathbf V) + \text{rank}[\mathbf U, \mathbf{T(I-V^-V)}].$$
 * For any $$\mathbf{T,U,V}$$ $$\text{rank}\begin{bmatrix} \mathbf T & \mathbf U \\ \mathbf V & \mathbf 0 \end{bmatrix} \le \text{rank}(\mathbf V) + \text{rank} (\mathbf U) + \text{rank} (\mathbf T)$$ with equality holding iff $$\mathcal{R}( \mathbf T)$$ and $$\mathcal{R}( \mathbf V)$$ are essentially disjoint and $$\mathcal{C}( \mathbf T)$$ and $$\mathcal{C}( \mathbf U)$$ are also essentially disjoint. (The column-space condition pairs $$\mathbf T$$ with $$\mathbf U$$ because these two share the same number of rows, just as $$\mathbf T$$ and $$\mathbf V$$ share the same number of columns.)
 * $$\text{rank}[(\mathbf{I-AA^-})\mathbf B] = \text{rank}(\mathbf B)-\dim[\mathcal{C}( \mathbf A)\cap\mathcal{C}( \mathbf B)]$$
 * $$\text{rank}(\mathbf{A+B}) \le \text{rank}(\mathbf{A})+\text{rank}(\mathbf B)$$ (see C4.5.9)
 * $$\text{rank}(\mathbf{A+B}) \le \text{rank}(\mathbf{A,B}) \le \text{rank}(\mathbf{A})+\text{rank}(\mathbf{B})$$. The rightmost inequality holds with equality iff $$\mathcal{C}( \mathbf A)$$ and $$\mathcal{C}( \mathbf B)$$ are essentially disjoint (see T17.4.2). The two outer quantities are equal, i.e., $$\text{rank}(\mathbf{A+B}) = \text{rank}(\mathbf{A})+\text{rank}(\mathbf{B})$$, iff $$\mathcal{R}( \mathbf A)$$ and $$\mathcal{R}( \mathbf B)$$ as well as $$\mathcal{C}( \mathbf A)$$ and $$\mathcal{C}( \mathbf B)$$ are essentially disjoint (see T18.5.7).
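
A few of the partitioned-matrix results above can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the Moore-Penrose pseudoinverse (`np.linalg.pinv`) stands in for the generalized inverse $$\mathbf A^-$$, ranks are computed with an explicit absolute tolerance, and all matrices and sizes are hypothetical.

```python
import numpy as np

def rank(M, tol=1e-9):
    # Numerical rank with an explicit absolute tolerance (a choice made for this sketch).
    return int(np.linalg.matrix_rank(M, tol=tol))

def ginv(M):
    # The pseudoinverse is one particular generalized inverse.
    return np.linalg.pinv(M, rcond=1e-10)

rng = np.random.default_rng(1)
m = 8
S = rng.standard_normal((m, 2))                     # columns shared by A and B
A = np.hstack([S, rng.standard_normal((m, 1))])     # 8 x 3, rank 3
B = np.hstack([S, rng.standard_normal((m, 2))])     # 8 x 4, rank 4

# Dimension of the intersection of the column spaces via the dimension formula.
dim_common = rank(A) + rank(B) - rank(np.hstack([A, B]))

# (I - A A^-) B keeps the part of C(B) that C(A) cannot reach.
PB = (np.eye(m) - A @ ginv(A)) @ B
rank_PB = rank(PB)
rank_A_PB = rank(np.hstack([A, PB]))

# Block matrix [[T, U], [V, 0]] with low-rank blocks.
T = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))   # 6 x 5
U = rng.standard_normal((6, 1)) @ rng.standard_normal((1, 3))   # 6 x 3
V = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))   # 4 x 5
block = np.block([[T, U], [V, np.zeros((4, 3))]])
lhs = rank(block)
rhs = rank(V) + rank(np.hstack([U, T @ (np.eye(5) - ginv(V) @ V)]))

print(dim_common, rank_PB, rank_A_PB, lhs, rhs)
```

By construction $$\mathbf A$$ and $$\mathbf B$$ share two columns, so their column spaces are not essentially disjoint and $$\text{rank}(\mathbf A, \mathbf B)$$ falls short of $$\text{rank}(\mathbf A)+\text{rank}(\mathbf B)$$.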


 * $$\text{rank}(\mathbf{A'A}) = \text{rank}(\mathbf A)$$ (see C7.4.5)
 * $$\text{rank}(\mathbf{A,B}) = \text{rank} (\mathbf A)+\text{rank}(\mathbf B)-\dim[\mathcal{C}( \mathbf A)\cap\mathcal{C}( \mathbf B)]$$
 * $$\text{rank}(\mathbf{AB}) \ge \text{rank} (\mathbf A) + \text{rank}(\mathbf B) - n$$, where $$n$$ is the number of columns of $$\mathbf A$$ (equivalently, the number of rows of $$\mathbf B$$), with equality holding iff $$\mathbf{(I-BB^-)(I-A^-A)=0}$$ (see T17.5.4).
 * $$\text{rank}(\mathbf {AB}) = \text{rank}(\mathbf B)-\dim[\mathcal{C}( \mathbf B)\cap \mathcal N(\mathbf A)] = \text{rank}(\mathbf A)-\dim[\mathcal{C}( \mathbf A')\cap \mathcal N(\mathbf B')]$$ (see Sec. 4.5). The second equality follows by applying the first to $$\mathbf{(AB)'=B'A'}$$.
 * $$\text{rank}(\mathbf{A+B}) \ge |\text{rank}(\mathbf{A})-\text{rank}(\mathbf{B})|$$
 * Small perturbations can't reduce rank: for any $$E$$ with sufficiently small norm (see p216), $$\text{rank}(A+E) \ge \text{rank}(A)$$
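
Several of the product and sum results in this list can be verified on random low-rank matrices. The sketch below is an illustration only: the matrices, their sizes, the helper names `rank` and `null_basis`, and the tolerance `1e-9` are all assumptions made for the example.

```python
import numpy as np

def rank(M, tol=1e-9):
    # Numerical rank with an explicit absolute tolerance (a choice made for this sketch).
    return int(np.linalg.matrix_rank(M, tol=tol))

def null_basis(M, tol=1e-9):
    # Orthonormal basis of N(M): right singular vectors whose singular value is (near) zero.
    _, s, Vt = np.linalg.svd(M)
    r = int(np.sum(s > tol))
    return Vt[r:].T

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 6))   # 5 x 6, rank 3
B = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # 6 x 4, rank 2
n = A.shape[1]

# dim[C(B) ∩ N(A)] = dim C(B) + dim N(A) - dim(C(B) + N(A))
N = null_basis(A)
dim_cap = rank(B) + N.shape[1] - rank(np.hstack([B, N]))

rank_AB = rank(A @ B)
sylvester_lower = rank(A) + rank(B) - n     # lower bound on rank(AB)
rank_gram = rank(A.T @ A)                   # should match rank(A)

C = rng.standard_normal((5, 6))             # second summand, same shape as A
rank_sum = rank(A + C)

E = 1e-6 * rng.standard_normal(A.shape)     # small perturbation
rank_perturbed = rank(A + E)

print(rank_AB, dim_cap, sylvester_lower, rank_gram, rank_sum, rank_perturbed)
```

Note that the perturbation check depends on the tolerance: with a threshold well below the size of `E`, the perturbed matrix typically gains rank rather than merely keeping it.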

Rank in real applications
Depending on how they were obtained, matrices fall into at least two classes: those that are derived analytically (e.g., a DCT basis) and those that are derived from data. For the former, the concept of rank makes perfect sense; for the latter, it may make little sense. The approximate rank of a matrix can be much more useful for such matrices.

Why can the rank be of little use in real life?
To understand why, consider the problem of estimating $$x$$ from observations $$y$$, i.e., $$y = Ax$$, where $$A$$ is square (the argument extends easily to non-square matrices). More realistically, this problem can be formulated with some noise $$v$$:

$$y=Ax+v$$

If $$A$$ were rank-deficient, it would be easy to say that the problem is unsolvable. However, we can add tiny perturbations to the elements of $$A$$ and make it nonsingular (apparently with probability 1). Would this "magic", which suddenly makes $$A$$ invertible, really work? The answer is no, and an SVD analysis of the perturbed matrix would reveal why: some of its singular values would be tiny, so its approximate rank would still be less than full. To see what goes wrong, apply this invertible $$A$$ to estimate $$x$$:

$$A^{-1}y = A^{-1}Ax+A^{-1}v \implies x = A^{-1}y - A^{-1}v$$

Now, when $$A$$ is made artificially invertible with small perturbations, $$A^{-1}$$ is a huge matrix (i.e., it has a large norm, equal to the reciprocal of the smallest singular value of $$A$$), and the term $$A^{-1}v$$ in the equation above shows that the noise can be amplified enormously in the estimate.
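
The noise amplification can be made concrete with a small experiment. This is a minimal sketch under assumed values: a hypothetical $$6\times 6$$ matrix of exact rank 4, a perturbation of size `1e-8`, and observation noise of size `1e-3`.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6

# Exactly rank-deficient square matrix (rank 4 by construction).
A0 = rng.standard_normal((n, 4)) @ rng.standard_normal((4, n))
# A tiny perturbation makes it invertible.
A = A0 + 1e-8 * rng.standard_normal((n, n))

x_true = rng.standard_normal(n)
v = 1e-3 * rng.standard_normal(n)           # small observation noise
y = A @ x_true + v

x_hat = np.linalg.solve(A, y)               # A^{-1} y = x_true + A^{-1} v
err = np.linalg.norm(x_hat - x_true)        # size of A^{-1} v
noise = np.linalg.norm(v)

sigma = np.linalg.svd(A, compute_uv=False)  # singular values, largest first
inv_norm = 1.0 / sigma[-1]                  # spectral norm of A^{-1}

print(f"smallest singular value: {sigma[-1]:.2e}")
print(f"noise norm: {noise:.2e}, estimation error: {err:.2e}")
```

The estimation error is bounded by `inv_norm * noise`, and with a smallest singular value near the perturbation size the error ends up many orders of magnitude larger than the noise itself.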

The problem with matrices $$A$$ collected from real data is that we might not even suspect that there is a problem: real data always contain some noise, so $$A$$ can appear to be full rank even though it has tiny singular values that make its inverse huge.
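
One way to detect this situation is to inspect the singular values rather than the exact rank. The sketch below counts singular values above a relative threshold; the function name `approximate_rank`, the threshold `rel_tol=1e-3`, and the synthetic "data" matrix are assumptions for illustration, and other definitions of approximate rank are possible.

```python
import numpy as np

def approximate_rank(A, rel_tol=1e-3):
    """Count singular values larger than rel_tol times the largest one."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(4)

# Hypothetical data matrix: an exact rank-3 signal plus small measurement noise.
signal = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 20))
data = signal + 1e-6 * rng.standard_normal((50, 20))

exact_rank = int(np.linalg.matrix_rank(data))   # the noise makes it look full rank
approx_rank = approximate_rank(data)            # the noise floor is ignored
cond = np.linalg.cond(data)                     # ratio of largest to smallest singular value

print(exact_rank, approx_rank, f"{cond:.1e}")
```

A large gap between the exact and approximate rank, together with a large condition number, is the warning sign that an inverse or least-squares solution would amplify noise.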