Cayley-Hamilton theorem

Let $$p(s) = a_0 +a_1 s+\dots +a_k s^k$$ be a polynomial. We overload $$p$$ so that it also accepts a square matrix $$A$$ as input:

$$p(A) = a_0 I +a_1 A + \dots +a_k A^k$$

The 'Cayley-Hamilton theorem' states that for any $$A \in \mathbb{R}^{n\times n}$$ we have $$\mathcal{X}(A) = 0$$, where $$\mathcal{X}(s) = \det(sI - A)$$ is the characteristic polynomial of $$A$$ and $$\mathcal{X}(A)$$ is its version overloaded for matrices. In other words, every square matrix satisfies its own characteristic equation.
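The statement can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `char_poly_at_matrix` and the example matrix are illustrative choices, not part of the theorem. It uses `np.poly`, which returns the coefficients of the monic characteristic polynomial of a square matrix (highest degree first), and evaluates that polynomial at the matrix itself with Horner's scheme.

```python
import numpy as np

def char_poly_at_matrix(A):
    """Evaluate the characteristic polynomial of A at A itself (hypothetical helper)."""
    n = A.shape[0]
    coeffs = np.poly(A)          # monic characteristic polynomial, highest degree first
    result = np.zeros((n, n))
    for c in coeffs:             # Horner's scheme with matrix products
        result = result @ A + c * np.eye(n)
    return result

# Example matrix chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

residual = char_poly_at_matrix(A)
# Up to floating-point rounding, the residual should be the zero matrix.
print(np.allclose(residual, 0.0))
```

Because `np.poly` obtains the coefficients from computed eigenvalues, the residual is only zero up to rounding error, which is why the check uses `np.allclose` rather than exact equality.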

What's the point of the C-H theorem?
One use of the C-H theorem is the following corollary, which shows that every power of a matrix can be expressed as a linear combination of its first $$n$$ powers, from the $$0$$th power $$A^0 = I$$ up to $$A^{n-1}$$.

'Corollary': for every $$k \in \mathbb{Z}_+$$, we have:

$$A^k \in \text{span}\{I, A, A^2, \dots, A^{n-1}\}$$.

If $$A$$ is invertible, this also holds for every $$k \in \mathbb{Z}$$, i.e. for negative powers as well.
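One way to see the corollary: dividing $$s^k$$ by $$\mathcal{X}(s)$$ gives $$s^k = q(s)\mathcal{X}(s) + r(s)$$ with $$\deg r < n$$, and since $$\mathcal{X}(A) = 0$$ this leaves $$A^k = r(A)$$. For invertible $$A$$ the constant term of $$\mathcal{X}$$ is nonzero, so $$\mathcal{X}(A) = 0$$ can be solved for $$A^{-1}$$. The sketch below illustrates both steps, assuming NumPy; the function names `power_in_span`, `inverse_in_span` and `combine` are hypothetical helpers introduced here for illustration.

```python
import numpy as np

def power_in_span(A, k):
    """Coefficients r_0..r_{n-1} with A^k = sum_j r_j A^j, for integer k >= 0."""
    n = A.shape[0]
    chi = np.poly(A)                 # characteristic polynomial, highest degree first
    s_k = np.zeros(k + 1)
    s_k[0] = 1.0                     # the monomial s^k, highest degree first
    _, rem = np.polydiv(s_k, chi)    # s^k = q(s) * chi(s) + rem(s), deg rem < n
    r = np.zeros(n)
    r[:len(rem)] = rem[::-1]         # reorder to lowest degree first, pad with zeros
    return r

def inverse_in_span(A):
    """Coefficients r_0..r_{n-1} with A^{-1} = sum_j r_j A^j (A assumed invertible)."""
    chi = np.poly(A)                 # [1, c_1, ..., c_n]
    # A^n + c_1 A^{n-1} + ... + c_n I = 0
    #   =>  A^{-1} = -(A^{n-1} + c_1 A^{n-2} + ... + c_{n-1} I) / c_n
    return -chi[:-1][::-1] / chi[-1]

def combine(A, r):
    """Form sum_j r_j A^j from a coefficient vector r (lowest degree first)."""
    return sum(c * np.linalg.matrix_power(A, j) for j, c in enumerate(r))

# Example matrix chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

A_cubed = combine(A, power_in_span(A, 3))
A_inv = combine(A, inverse_in_span(A))
print(np.allclose(A_cubed, np.linalg.matrix_power(A, 3)))
print(np.allclose(A_inv, np.linalg.inv(A)))
```

Other negative powers follow by applying the same reduction to powers of $$A^{-1}$$, which itself lies in the span.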