LTI systems and convolution

Professors in engineering schools often mention that the output of an LTI (Linear Time-Invariant) system can be obtained via convolution. While this statement is taken almost as an axiom, the reason why it holds is seldom mentioned, even though it actually has a very simple explanation.

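Let $$\mathcal L\{\cdot\}$$ be a linear system, and $$x[n]$$ a discrete sequence. Let $$y[n]:=\mathcal{L}\{x[n]\}$$ be the output of the system. Since the Kronecker delta $$\delta[n-k]$$ equals 1 when $$k=n$$ and 0 otherwise, the input $$x[n]$$ can be written as an infinite sum of shifted Kronecker deltas, each weighted by the corresponding sample $$x[k]$$: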

$$x[n] = \sum_{k=-\infty}^{\infty} x[k] \delta[n-k]$$
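
The decomposition can be checked numerically on a finite sequence. The following is a minimal sketch, not part of the original derivation: the helper names `delta` and `decompose` are made up for illustration, and the sum over $$k$$ is truncated to the finite support of the example sequence (the samples outside it are assumed to be zero).

```python
def delta(n):
    """Kronecker delta: 1 when n == 0, 0 otherwise."""
    return 1 if n == 0 else 0


def decompose(x):
    """Rebuild x[n] as sum_k x[k] * delta[n - k].

    The infinite sum is truncated to k = 0..len(x)-1, assuming x is zero
    outside that range.
    """
    N = len(x)
    return [sum(x[k] * delta(n - k) for k in range(N)) for n in range(N)]


x = [3, -1, 4, 1, 5]      # arbitrary example values
rebuilt = decompose(x)
print(rebuilt == x)       # compares the rebuilt sequence with the input
```

For each fixed $$n$$, only the term with $$k=n$$ survives in the inner sum, which is why the rebuilt sequence matches the input sample by sample.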