The central limit theorem (CLT) states that the sampling distribution of the mean is approximately normal and centered on the population mean when the sample size is sufficiently large, even if the population itself is not normally distributed, provided the population has finite variance. I do not currently have the math chops to prove the CLT, but I can provide the following simulation to demonstrate the theorem in action.
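For reference, here is one common way to write the statement down (the classical version for independent, identically distributed draws). This is only a statement, not a proof. The symbols X_i, \mu, \sigma and n are my notation rather than anything defined above, and the block assumes the amsmath package for \text.

```latex
% Assumes X_1, ..., X_n are independent draws from one population
% with mean \mu and finite variance \sigma^2.
\[
  \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i ,
  \qquad
  \frac{\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr)}{\sigma}
  \xrightarrow{\;d\;} \mathcal{N}(0, 1)
  \quad \text{as } n \to \infty .
\]
% Informally, for large n the sample mean is approximately
% normal with mean \mu and variance \sigma^2 / n.
```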
Suppose our population is gamma distributed with a shape parameter of 3.5 and a scale parameter of 2, which gives it a mean of 3.5 × 2 = 7 and a visible right skew.
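A minimal sketch of this population in Python, using only the standard library. The seed, the number of draws, and the variable names are my own choices for illustration; only the shape and scale values come from the text. It compares the theoretical mean with an empirical one and uses the median to show the right skew.

```python
import random
import statistics

SHAPE = 3.5   # gamma shape parameter (k), from the text
SCALE = 2.0   # gamma scale parameter (theta), from the text

# Theoretical moments of a gamma(k, theta) distribution.
population_mean = SHAPE * SCALE            # k * theta = 7.0
population_variance = SHAPE * SCALE ** 2   # k * theta^2 = 14.0

# Draw a large batch of values to get a feel for the population.
# The seed and batch size are arbitrary, chosen only for reproducibility.
rng = random.Random(42)
draws = [rng.gammavariate(SHAPE, SCALE) for _ in range(100_000)]

empirical_mean = statistics.fmean(draws)
empirical_median = statistics.median(draws)

print("theoretical mean:", population_mean)
print("empirical mean:  ", round(empirical_mean, 3))
print("empirical median:", round(empirical_median, 3))
```

In a right-skewed distribution the median sits below the mean, which is a quick sign that the population is not normal.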
We draw 2500 random samples of size 500 from this population and store the mean of each sample in a vector called means.vector.
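The sampling step could be sketched as follows, again with only the Python standard library. Here `means_vector` stands in for means.vector (a dot is not valid in a Python name), and the seed is an arbitrary assumption. If the CLT holds, the standard deviation of the means should be close to sqrt(14 / 500), roughly 0.167.

```python
import math
import random
import statistics

SHAPE, SCALE = 3.5, 2.0   # population parameters from the text
N_SAMPLES = 2500          # number of samples drawn
SAMPLE_SIZE = 500         # observations per sample

rng = random.Random(42)   # arbitrary seed, for reproducibility only

# One mean per sample; means_vector plays the role of means.vector.
means_vector = [
    statistics.fmean([rng.gammavariate(SHAPE, SCALE) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

# Compare the simulation with what the CLT predicts.
expected_sd = math.sqrt(SHAPE * SCALE ** 2 / SAMPLE_SIZE)
print("mean of sample means:", round(statistics.fmean(means_vector), 4))
print("sd of sample means:  ", round(statistics.stdev(means_vector), 4))
print("CLT-predicted sd:    ", round(expected_sd, 4))
```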
Plotting a histogram of means.vector shows that the sample means appear approximately normally distributed around the population mean of 7 (3.5 × 2).
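As a dependency-free stand-in for a plotted histogram, the sketch below bins the sample means and prints a text histogram. The bin count, the seed, and the scaling of one `#` per 10 samples are arbitrary assumptions. The sample means are regenerated so that the block runs on its own.

```python
import random
import statistics

rng = random.Random(42)   # arbitrary seed
means_vector = [
    statistics.fmean([rng.gammavariate(3.5, 2.0) for _ in range(500)])
    for _ in range(2500)
]

N_BINS = 15               # arbitrary choice
lo, hi = min(means_vector), max(means_vector)
width = (hi - lo) / N_BINS

# Count how many sample means fall into each equal-width bin.
counts = [0] * N_BINS
for m in means_vector:
    idx = min(int((m - lo) / width), N_BINS - 1)  # keep the maximum in the last bin
    counts[idx] += 1

# Print one row per bin: left edge, then one '#' per 10 sample means.
for i, c in enumerate(counts):
    left_edge = lo + i * width
    print(f"{left_edge:6.3f} | {'#' * (c // 10)}")
```

The rows should bulge in the middle, near 7, and taper off on both sides.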
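Performing a Shapiro-Wilk test for normality on the means vector does not reject the hypothesis that it is normally distributed. Strictly speaking, this is a failure to find evidence against normality rather than a confirmation of it.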
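One way to run the test in Python is `scipy.stats.shapiro`, sketched below. SciPy is a third-party package, so the import is guarded and the test is skipped if it is missing. The seed and the 0.05 threshold are my assumptions, and the p-value will vary with the seed, so no particular result is promised here.

```python
import random
import statistics

try:
    from scipy import stats   # third-party; provides the Shapiro-Wilk test
except ImportError:
    stats = None

rng = random.Random(42)       # arbitrary seed
means_vector = [
    statistics.fmean([rng.gammavariate(3.5, 2.0) for _ in range(500)])
    for _ in range(2500)
]

ALPHA = 0.05                  # conventional significance level (an assumption)

if stats is not None:
    # Null hypothesis: the data were drawn from a normal distribution.
    w_stat, p_value = stats.shapiro(means_vector)
    print(f"W = {w_stat:.4f}, p-value = {p_value:.4f}")
    if p_value < ALPHA:
        print("Normality is rejected at the chosen level.")
    else:
        print("No evidence against normality at the chosen level.")
else:
    print("SciPy is not installed; skipping the Shapiro-Wilk test.")
```

Because the sample means still carry a small amount of the population's skew, a large enough number of means can lead the test to reject normality even when the approximation is very good.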