I was asked this question recently. Here is a proof based on Information Theory.

Assume that we have to sort an array of n elements, all of which are distinct. Hence, there are $n! $ permutations possible, out of which only one is sorted.

Now let us assume the lower bound on the comparisons to be $f(n) $. Since each comparison has only 2 outcomes, thus,

$2^{f(n)} \geqslant n! $

has to hold.

Therefore,

$f(n) = \log _2 ( n!) $

which, by Stirlingâ€™s approximation is $\Omega(n \log_2 n)$. Knuth also gives a similar explanation in Volume 3.