Cayley-Hamilton theorem

Let $$p(s) = a_0 + a_1 s + \dots + a_k s^k$$ be a polynomial. We overload it so that it also accepts a square matrix $$A$$ as input:

$$p(A) = a_0 I +a_1 A + \dots +a_k A^k$$

The Cayley-Hamilton theorem states that for any $$A \in \mathbb{R}^{n\times n}$$ we have $$\mathcal{X}(A) = 0$$ (the $$n \times n$$ zero matrix), where $$\mathcal{X}(s) = \det(sI - A)$$ is the characteristic polynomial of $$A$$ and $$\mathcal{X}(A)$$ is its version overloaded for matrices.
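The statement can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `poly_at_matrix` and the example matrix are illustrative choices, not part of the theorem. It evaluates the overloaded polynomial with Horner's scheme and uses `np.poly`, which returns the coefficients of $$\det(sI - A)$$ for a square matrix.

```python
import numpy as np

def poly_at_matrix(coeffs, A):
    """Evaluate p(A) = a_0 I + a_1 A + ... + a_k A^k.

    coeffs is [a_0, a_1, ..., a_k] (lowest degree first).
    """
    n = A.shape[0]
    result = np.zeros((n, n))
    # Horner's scheme: ((a_k A + a_{k-1} I) A + a_{k-2} I) A + ...
    for a in reversed(coeffs):
        result = result @ A + a * np.eye(n)
    return result

# Hypothetical example matrix.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# np.poly gives characteristic polynomial coefficients, highest degree first,
# so reverse them to match the a_0, ..., a_k ordering used above.
char_coeffs = np.poly(A)[::-1]

chi_of_A = poly_at_matrix(char_coeffs, A)

# Up to floating-point error, chi_of_A is the zero matrix.
print(np.allclose(chi_of_A, 0.0))
```

Because `np.poly` works from numerically computed eigenvalues, the result is zero only up to rounding, which is why the comparison uses `np.allclose`.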

What's the point of the C-H theorem?
One use of the C-H theorem is the following corollary, which shows that every power of a matrix can be expressed as a linear combination of its first $$n$$ powers, i.e. powers $$0$$ through $$n-1$$.

Corollary: for every $$k \in \mathbb{Z}_+$$, we have:

$$A^k \in \text{span}\{I, A, A^2, \dots, A^{n-1}\}$$.

If $$A$$ is invertible, this also holds for every $$k \in \mathbb{Z}$$, i.e. for negative powers as well.
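The corollary can be illustrated numerically by solving for the coefficients of $$A^k$$ in the basis $$I, A, \dots, A^{n-1}$$. This is a sketch under assumptions: the function name `power_in_span` and the example matrix are hypothetical, and least squares is used only as a convenient way to find the coefficients, not as the method the corollary's proof relies on.

```python
import numpy as np

def power_in_span(A, k):
    """Find c_0, ..., c_{n-1} with A^k ≈ c_0 I + c_1 A + ... + c_{n-1} A^{n-1}.

    Returns the coefficients and the reconstructed matrix.
    Negative k requires A to be invertible.
    """
    n = A.shape[0]
    basis = [np.linalg.matrix_power(A, j) for j in range(n)]
    # Each basis matrix becomes one column of an (n^2 x n) system.
    M = np.column_stack([B.ravel() for B in basis])
    target = np.linalg.matrix_power(A, k).ravel()
    coeffs, *_ = np.linalg.lstsq(M, target, rcond=None)
    reconstructed = sum(c * B for c, B in zip(coeffs, basis))
    return coeffs, reconstructed

# Hypothetical invertible example matrix (n = 2).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

coeffs_pos, A5 = power_in_span(A, 5)     # a high positive power
coeffs_neg, Ainv = power_in_span(A, -1)  # a negative power

print(np.allclose(A5, np.linalg.matrix_power(A, 5)))
print(np.allclose(Ainv, np.linalg.inv(A)))
```

If the reconstruction matches $$A^k$$ up to rounding, the power lies in the span, as the corollary predicts.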

Reachability
An important implication of this corollary is that if a discrete LDS with a matrix $$A\in\mathbb{R}^{n\times n}$$ can reach a certain state $$x(t)$$, it can do so in at most $$n$$ time steps; that is, time steps beyond the $$n$$th cannot add more reachable points to the system.
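This can be observed numerically. The sketch below assumes the system has the form $$x(t+1) = A x(t) + B u(t)$$ with $$x(0) = 0$$; the input matrix $$B$$, the input $$u$$, and the example values are assumptions introduced for illustration and do not appear above. Under that assumption, the states reachable in $$t$$ steps form the range of $$[B, AB, \dots, A^{t-1}B]$$, so the rank of this matrix should stop growing once $$t$$ reaches $$n$$.

```python
import numpy as np

def reachability_matrix(A, B, t):
    """Build [B, AB, ..., A^(t-1) B] for the assumed system x(t+1) = A x(t) + B u(t)."""
    blocks = [np.linalg.matrix_power(A, j) @ B for j in range(t)]
    return np.hstack(blocks)

# Hypothetical example with n = 3: a shift matrix driven through the last state.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
B = np.array([[0.0],
              [0.0],
              [1.0]])

n = A.shape[0]
# Dimension of the reachable subspace after t = 1, ..., 2n steps.
ranks = [int(np.linalg.matrix_rank(reachability_matrix(A, B, t)))
         for t in range(1, 2 * n + 1)]
print(ranks)
```

In this example the rank grows until $$t = n$$ and stays constant afterwards, since by the corollary each $$A^k B$$ with $$k \ge n$$ is a linear combination of $$B, AB, \dots, A^{n-1}B$$.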