I was asked this question recently. Here is a proof based on Information Theory.
Assume that we have to sort an array of n elements, all of which are distinct. Hence, there are $n! $ permutations possible, out of which only one is sorted.
Now let us assume the lower bound on the comparisons to be $f(n) $. Since each comparison has only 2 outcomes, thus,
$2^{f(n)} \geqslant n! $
has to hold.
Therefore,
$f(n) = \log _2 ( n!) $
which, by Stirling’s approximation is $\Omega(n \log_2 n)$. Knuth also gives a similar explanation in Volume 3.