LTI systems and convolution

Professors in engineering schools often mention that the output of an LTI (Linear-Time Invariant) system can be obtained via convolution. While this statement is taken almost as an axiom, the reason of why it holds is seldom mentioned, even though it actually has a very simple explanation.

Let $$\mathcal L\{\cdot\}$$ be a linear system, and $$x[n]$$ a discrete sequence. Let $$y[n]:=\mathcal{L}\{x[n]\}$$ be the output of the system. Clearly, the input $$x[n]$$ can be written as an infinite sum of Kronecker delta functions:

$$x[n] = \sum_{k=-\infty}^{\infty} x[k] \delta[n-k]$$.

Therefore, the output can be written as:

$$y[n] = \mathcal{L}\{x[n]\}= \mathcal{L}\{\sum_{k=-\infty}^{\infty} x[k] \delta[n-k]\} = \sum_{k=-\infty}^{\infty}x[k] \mathcal{L}\{\delta[n-k]\} =\sum_{k=-\infty}^{\infty}x[k] h[n-k]\} $$

where $$h[n]:=\mathcal{L}\{\delta[n]$$ is the famous impulse response of the system. Thus, the output of a linear system to a given input can be obtained by convolving the input sequence with the impulse response of the system:

$$y[n] = \mathcal{L}\{x[n]\}= \sum_{k=-\infty}^{\infty}x[k] h[n-k]\} $$

Relation to system function (or transfer function)
It is also somewhat of a mystery why the transfer function is so much studied. For discrete systems, transfer function is obtained through $$z$$-transformation. Since $$z$$ transform shares the convolution property with Fourier transform (i.e., convolution in time domain is equal to multiplication in $$z$$-domain), it holds that

$$Y(z) = X(z)H(z)$$

This justifies the usage of the famous transfer function $$H(z)$$,

$$H(z)=\frac{Y(z)}{X(z)}$$,

to characterize a system