Rank

Identities about rank

 * $$\text{rank}(BC) \le \min\{\text{rank}(B), \text{rank}(C)\}$$
 * If $$A=BC$$ with $$B \in \mathbb{R}^{m\times r}$$ and $$C \in \mathbb{R}^{r\times n}$$, then $$\text{rank}(A) \le r$$. This property is very useful for fast matrix-vector multiplication: to obtain the product $$Ax$$, we can instead compute $$B(Cx)$$, which takes on the order of $$r(m+n)$$ operations rather than $$mn$$. If $$r$$ is very small compared to $$m$$ and $$n$$, this speeds up the computation a lot.
 * $$\text{rank}(A) + \dim N(A) = n$$ for $$A \in \mathbb{R}^{m\times n}$$, where $$N(A)$$ is the null space of $$A$$ (see also Fundamental Theorem of Linear Algebra).
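
The identities above can be checked numerically. The following is a minimal sketch using NumPy, assuming random Gaussian factors; the sizes `m`, `n`, `r` and all variable names are illustrative choices, not part of the original notes. It builds $$A=BC$$, confirms that its rank is at most $$r$$, compares the dense product $$Ax$$ with the factored product $$B(Cx)$$, and verifies that the rank and the null-space dimension add up to $$n$$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): A is m x n, built from rank-r factors.
m, n, r = 200, 150, 5
B = rng.standard_normal((m, r))
C = rng.standard_normal((r, n))
A = B @ C
x = rng.standard_normal(n)

# rank(BC) <= min(rank(B), rank(C)) <= r
rank_A = np.linalg.matrix_rank(A)

# Dense product: about m*n multiplications.
y_dense = A @ x
# Factored product: about r*(m + n) multiplications.
y_factored = B @ (C @ x)

# Null space basis from the SVD: the right singular vectors that
# correspond to (numerically) zero singular values.
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[rank_A:].T  # shape (n, n - rank_A)
nullity = null_basis.shape[1]

print("rank(A) =", rank_A, " nullity =", nullity, " n =", n)
print("products agree:", np.allclose(y_dense, y_factored))
print("A maps null basis to ~0:", np.allclose(A @ null_basis, 0.0, atol=1e-8))
```

The parentheses in `B @ (C @ x)` matter: `(B @ C) @ x` would first form the full $$m \times n$$ matrix and lose the saving.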

Rank in real applications
Depending on how they were obtained, matrices fall into at least two classes: those that are derived analytically (e.g., a DCT basis) and those that are derived from data. For the former, the concept of rank makes perfect sense; for the latter, it may make little sense.

To understand why, consider the problem of estimating $$x$$ from observations $$y$$, i.e., $$y = Ax$$, where $$A$$ is square (the argument extends easily to non-square matrices). More realistically, this problem can be formulated with some noise $$v$$:

$$y=Ax+v$$

If we had a rank-deficient $$A$$, then it would be easy to say that the problem cannot be solved. However, we can add tiny perturbations to the elements of $$A$$ and make it nonsingular (a random perturbation does so with probability 1). Would this "magic", which suddenly makes $$A$$ invertible, really work? The answer is no, and an SVD analysis of the matrix reveals why. When we apply this invertible $$A$$ to estimate $$x$$, we get

$$A^{-1}y = A^{-1}Ax+A^{-1}v \implies x = A^{-1}y - A^{-1}v$$

Now, when $$A$$ is made artificially invertible with small perturbations, $$A^{-1}$$ is a huge matrix (i.e., it has a large norm, since $$\|A^{-1}\|_2 = 1/\sigma_{\min}(A)$$ and the smallest singular value $$\sigma_{\min}(A)$$ is tiny), and the equation above shows that the noise term $$A^{-1}v$$ can be hugely amplified in the estimate.
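
A small experiment makes the amplification concrete. The sketch below is illustrative only: the dimension, the perturbation size `1e-8`, and the noise level `1e-3` are assumptions chosen for the demonstration. It builds a singular matrix, perturbs its entries slightly, and then estimates $$x$$ from $$y = Ax + v$$ by solving the linear system.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

# A singular n x n matrix: the product of n x (n-1) and (n-1) x n factors
# has rank at most n - 1.
A_singular = rng.standard_normal((n, n - 1)) @ rng.standard_normal((n - 1, n))

# "Magic" fix: tiny random perturbations of the entries (assumed size 1e-8).
A = A_singular + 1e-8 * rng.standard_normal((n, n))

x_true = rng.standard_normal(n)
v = 1e-3 * rng.standard_normal(n)   # small observation noise (assumed level)
y = A @ x_true + v

# Estimate x as A^{-1} y; the error equals A^{-1} v up to rounding.
x_hat = np.linalg.solve(A, y)

noise_norm = np.linalg.norm(v)
error_norm = np.linalg.norm(x_hat - x_true)
sigma = np.linalg.svd(A, compute_uv=False)

print("smallest singular value of A:", sigma[-1])
print("norm of A^{-1} (= 1 / sigma_min):", 1.0 / sigma[-1])
print("noise norm:", noise_norm)
print("estimation error norm:", error_norm)
print("amplification factor:", error_norm / noise_norm)
```

The perturbed matrix is invertible, yet the estimation error is typically many orders of magnitude larger than the noise that caused it.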

The problem with matrices $$A$$ collected from real data is that we might not even suspect that there is a problem: real data always contain some noise, so $$A$$ can appear to be full rank even though it has tiny singular values that make its inverse huge.
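
One way to detect this situation is to inspect the singular values directly. The following sketch simulates a "data" matrix as a low-rank signal plus small noise; the sizes, the noise level, and the helper `numerical_rank` with its relative tolerance are hypothetical choices made for illustration. A default rank computation reports full rank, while the singular value spectrum shows that only a few directions carry signal.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data matrix (assumption): rank-3 signal plus small noise.
m, n, r = 100, 20, 3
clean = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
A = clean + 1e-6 * rng.standard_normal((m, n))

# Singular values, sorted in decreasing order.
s = np.linalg.svd(A, compute_uv=False)

# Rank with NumPy's default tolerance (near machine precision).
default_rank = np.linalg.matrix_rank(A)


def numerical_rank(singular_values, rel_tol):
    """Hypothetical helper: count singular values above rel_tol * largest."""
    return int(np.sum(singular_values > rel_tol * singular_values[0]))


# Rank with a tolerance matched to the (assumed) noise level.
effective_rank = numerical_rank(s, rel_tol=1e-4)
condition_number = s[0] / s[-1]

print("default rank:", default_rank)
print("effective rank:", effective_rank)
print("condition number:", condition_number)
print("largest singular values:", s[:r + 1])
```

The right tolerance depends on how noisy the data are, so `rel_tol` has to be chosen per application; a sharp drop in the singular value spectrum is usually the clearest sign of where the signal ends.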