## Central Limit Theorem

### Examples

The CLT is responsible for this remarkable result:

The distribution of an average tends to be Normal, even when the distribution from which the average is computed is decidedly non-Normal.

Thus, the Central Limit Theorem is the foundation for many statistical procedures, including Quality Control Charts: the distribution of the phenomenon under study does not have to be Normal, because the distribution of its average will be approximately Normal.

Furthermore, this Normal distribution will have the same mean as the parent distribution, and a variance equal to the variance of the parent divided by the sample size.
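The mean and variance claims can be checked by simulation. The following is a minimal sketch, not a definitive implementation: it assumes NumPy is available, uses an Exponential parent with mean 1 and variance 1 as an arbitrary non-Normal example, and the function name `sample_mean_stats` is a hypothetical helper introduced here.

```python
import numpy as np

def sample_mean_stats(n, reps=20_000, seed=0):
    """Mean and variance of `reps` averages, each computed from n Exponential(1) draws."""
    rng = np.random.default_rng(seed)
    # Each row is one sample of size n; average across each row.
    means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    return means.mean(), means.var(ddof=1)

# Parent distribution (an assumption for illustration): mean 1, variance 1.
for n in (5, 30, 100):
    m, v = sample_mean_stats(n)
    print(f"n={n:4d}  mean of averages={m:.4f}  "
          f"variance of averages={v:.5f}  parent variance / n={1.0 / n:.5f}")
```

The mean of the averages should stay near 1 for every n, while their variance should track 1/n.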

### The Fine-Print:

The distribution of an average will tend to be Normal as the sample size increases, regardless of the distribution from which the average is taken, except when the moments of the parent distribution do not exist. All practical distributions in statistical engineering have defined moments, and thus the CLT applies.

### Statistical Moments:

Readers have requested further explanation of the fine print, so a slight digression is in order. Statistical moments are analogous to moments in physics, where we consider a force multiplied by its distance from the centroid or fulcrum. The first statistical moment is the mean, which is the sum of the distances from zero, each multiplied by the probability of being at that distance:

$$\mu = \sum_i x_i \, p(x_i)$$

If the density is continuous, rather than discrete, the sum becomes an integral (equation 1):

$$\mu = \int_{-\infty}^{\infty} x \, f(x) \, dx \tag{1}$$

The mean of a random variable X is also referred to as the expected value of X, written E X or E(X).

The variance is the second statistical moment, taken about the mean: it is the sum of the squared distances from the mean, each multiplied by the probability of being at that distance. Higher-order moments, skewness (asymmetry) and kurtosis (peakedness), are similarly defined, with the distances (x - μ), where μ is the mean, raised to the 3rd and 4th power, respectively.
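For a discrete distribution these definitions reduce to weighted sums, which the sketch below computes for a fair six-sided die (an example chosen here, not from the text). The helper name `moments` is hypothetical. One assumption to flag: skewness and kurtosis are divided by the appropriate power of the variance, which is the common standardized convention but is not spelled out in the text.

```python
def moments(values, probs):
    """Mean, variance, standardized skewness and kurtosis of a discrete distribution."""
    mean = sum(x * p for x, p in zip(values, probs))

    def central(k):
        # k-th moment about the mean: sum of (x - mean)^k weighted by probability
        return sum((x - mean) ** k * p for x, p in zip(values, probs))

    var = central(2)
    skew = central(3) / var ** 1.5   # assumption: standardized by variance^(3/2)
    kurt = central(4) / var ** 2     # assumption: standardized by variance^2
    return mean, var, skew, kurt

faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6
mean, var, skew, kurt = moments(faces, probs)
print(f"mean={mean:.4f}  variance={var:.4f}  skewness={skew:.4f}  kurtosis={kurt:.4f}")
```

The die is symmetric, so its skewness is zero, and its mean of 3.5 is the balance point of the six equally weighted faces.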

### Sometimes the Moments Diverge:

The Cauchy is an example of a pathological distribution with nonexistent moments. The density is

$$f(x) = \frac{1}{\pi \, (1 + x^2)}, \qquad -\infty < x < \infty$$

The density looks like this:

[Figure: the Cauchy density]

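The Cauchy is a proper density, however, since it integrates to one.

This can be easily seen since

$$\frac{d}{dx} \arctan(x) = \frac{1}{1 + x^2}$$

so that

$$\int_{-\infty}^{\infty} \frac{1}{\pi \, (1 + x^2)} \, dx = \frac{1}{\pi} \Big[ \arctan(x) \Big]_{-\infty}^{\infty} = \frac{1}{\pi} \left( \frac{\pi}{2} - \left( -\frac{\pi}{2} \right) \right) = 1$$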
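As a numerical cross-check, the sketch below assumes the standard Cauchy density 1/(π(1 + x²)) and compares the area on [-M, M] obtained from the arctangent antiderivative with a plain midpoint rule. The function names are hypothetical helpers introduced for illustration.

```python
import math

def cauchy_pdf(x):
    """Standard Cauchy density, 1 / (pi * (1 + x^2))."""
    return 1.0 / (math.pi * (1.0 + x * x))

def exact_area(a, b):
    """Area under the density on [a, b], using the arctan antiderivative."""
    return (math.atan(b) - math.atan(a)) / math.pi

def midpoint_area(a, b, steps=100_000):
    """Same area by a plain midpoint rule, as an independent cross-check."""
    h = (b - a) / steps
    return h * sum(cauchy_pdf(a + (i + 0.5) * h) for i in range(steps))

for M in (1, 10, 100, 10_000):
    print(f"M={M:>6}  exact={exact_area(-M, M):.6f}  "
          f"midpoint={midpoint_area(-M, M):.6f}")
```

The area approaches one as M grows, but slowly, which reflects how much probability sits in the tails.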

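But the mean (the first statistical moment) doesn't exist. (In fact, none of the moments exists.) That is, the integral defined by equation 1 diverges. Showing that the moment integrals do not converge is somewhat complicated. The moment-generating function can't be used, since the moment-generating function for a Cauchy doesn't exist. Casella and Berger, however, use a clever computational trick to show that E|X| does not exist and thus neither does E X:

$$E|X| = \int_{-\infty}^{\infty} \frac{|x|}{\pi \, (1 + x^2)} \, dx = \frac{2}{\pi} \int_0^{\infty} \frac{x}{1 + x^2} \, dx$$

Now, for any positive number, M,

$$\int_0^M \frac{x}{1 + x^2} \, dx = \left[ \frac{\log(1 + x^2)}{2} \right]_0^M = \frac{\log(1 + M^2)}{2}$$

Therefore

$$E|X| = \lim_{M \to \infty} \frac{2}{\pi} \int_0^M \frac{x}{1 + x^2} \, dx = \frac{1}{\pi} \lim_{M \to \infty} \log(1 + M^2) = \infty$$

Since E|X| does not exist, neither does E X. The mean of the Cauchy density does not exist.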
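The divergence is easy to see numerically. Assuming the standard Cauchy density 1/(π(1 + x²)), the truncated integral (2/π)∫₀ᴹ x/(1 + x²) dx has the closed form log(1 + M²)/π, and the sketch below simply evaluates it for growing M. The name `truncated_abs_mean` is a hypothetical helper.

```python
import math

def truncated_abs_mean(M):
    """(2/pi) * integral from 0 to M of x / (1 + x^2) dx, i.e. log(1 + M^2) / pi."""
    return math.log1p(M * M) / math.pi

# The value keeps growing with the truncation point M instead of settling down.
for M in (10, 1e3, 1e6, 1e12):
    print(f"M={M:.0e}  truncated E|X| = {truncated_abs_mean(M):.3f}")
```

Each thousandfold increase in M adds roughly the same amount, which is the signature of logarithmic divergence: slow, but without a finite limit.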

### Summary:

The Central Limit Theorem describes the relation of a sample mean to the population mean. If the population mean doesn't exist, then the CLT doesn't apply and the characteristics of the sample mean, Xbar, are not predictable. Attention to detail is needed here: you can always compute the numerical mean of a finite number of observations from any density (if every observation is finite). But the population mean is defined as an integral, which diverges for the Cauchy, so even though every sample mean is finite, the population mean does not exist.

The Cauchy has another interesting property: the distribution of the sample average is the same as the distribution of an individual observation, so the scatter never diminishes, regardless of sample size.
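A simulation makes the contrast with a well-behaved parent visible. This is a sketch under stated assumptions: NumPy is available, scatter is measured by the interquartile range (chosen here because the Cauchy has no variance to estimate), and the helper names are hypothetical.

```python
import numpy as np

def iqr_of_sample_means(draw, n, reps=10_000, seed=0):
    """Interquartile range of `reps` sample means, each computed from n draws."""
    rng = np.random.default_rng(seed)
    means = draw(rng, (reps, n)).mean(axis=1)
    q25, q75 = np.percentile(means, [25, 75])
    return q75 - q25

def draw_cauchy(rng, size):
    return rng.standard_cauchy(size)

def draw_normal(rng, size):
    return rng.standard_normal(size)

for n in (1, 10, 100):
    print(f"n={n:4d}  Cauchy IQR={iqr_of_sample_means(draw_cauchy, n):.3f}  "
          f"Normal IQR={iqr_of_sample_means(draw_normal, n):.3f}")
```

The quartiles of a standard Cauchy are at -1 and +1, so the Cauchy column should stay near 2 for every n, while the Normal column shrinks in proportion to one over the square root of n.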

### Caveat:

The Central Limit Theorem almost always holds, but caution is required in its application. If the population mean doesn't exist, then the CLT is not applicable (the classical theorem also requires a finite variance). Further, even if the mean does exist, the CLT convergence to a Normal density might be slow, requiring hundreds or even thousands of observations, rather than the few dozen in these examples. The prudent practitioner will know the limitations of any rule, algorithm or function, in statistics or in engineering.
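One rough way to gauge "slow" is through skewness. For independent observations from a parent with a finite third moment, the skewness of the average equals the parent skewness divided by the square root of the sample size. The sketch below uses that fact; the target skewness of 0.1 is an arbitrary illustrative threshold, not a rule from the text, and the function names are hypothetical.

```python
import math

def lognormal_skewness(sigma):
    """Skewness of a lognormal whose underlying Normal has standard deviation sigma."""
    w = math.exp(sigma ** 2)
    return (w + 2.0) * math.sqrt(w - 1.0)

def n_for_target_skewness(parent_skewness, target):
    """Smallest n with parent_skewness / sqrt(n) <= target."""
    return math.ceil((parent_skewness / target) ** 2)

# Exponential parent has skewness 2; lognormal skewness grows quickly with sigma.
parents = {
    "exponential": 2.0,
    "lognormal (sigma=1)": lognormal_skewness(1.0),
}
for name, skew in parents.items():
    print(f"{name:22s} skewness={skew:6.3f}  "
          f"n for skewness <= 0.1: {n_for_target_skewness(skew, 0.1)}")
```

By this yardstick a moderately skewed parent already needs hundreds of observations and a heavily skewed one needs thousands, in line with the caution above.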

### Reference:

Casella and Berger, Statistical Inference, 2nd ed., Duxbury, 2002