Goodness-of-Fit Tests for Statistical Distributions

Of the many quantitative goodness-of-fit techniques (e.g., Kolmogorov-Smirnov, Anderson-Darling, Shapiro-Wilk, Cramér-von Mises), I prefer the Anderson-Darling test because it is more sensitive to deviations in the tails of the distribution than the older Kolmogorov-Smirnov test.

Note:   The Anderson-Darling test (or Kolmogorov-Smirnov or Shapiro-Wilk) does not tell you that you do have a Normal density.  It only tells you when the data make it unlikely that you do.(1)

Anderson-Darling can be applied to any distribution, but finding tables of critical values isn't so easy. Included here are two of the most useful tables: one for the normal and lognormal, and one for the Weibull, exponential, and Gumbel.

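For the normal and lognormal distributions, the test statistic, A², is calculated from

$$A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln w_i + \ln\left(1 - w_{n-i+1}\right)\right]$$

where n is the sample size, the observations x are sorted in ascending order, and w is the standard normal cdf, Φ[(x-μ)/σ], with μ and σ estimated from the sample.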

This formula needs to be modified for small samples,

$$A^2_{\mathrm{mod}} = A^2\left(1 + \frac{0.75}{n} + \frac{2.25}{n^2}\right)$$

and then compared to an appropriate critical value from the table below.

α         0.1     0.05    0.025   0.01
A²crit    0.631   0.752   0.873   1.035

(Reference:  D'Agostino and Stephens, Goodness-of-Fit Techniques, Marcel Dekker, New York, 1986, Table 4.7, p.123.  All of Chapter 4, pp.97-193, deals with goodness-of-fit tests based on empirical distribution function (EDF) statistics.)
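The calculation for the normal case can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's code: it takes the usual form of the Anderson-Darling sum over the sorted sample, A² = -n - (1/n) Σ (2i-1)[ln w_i + ln(1 - w_(n-i+1))], the small-sample factor (1 + 0.75/n + 2.25/n²), and a sample standard deviation with the n-1 divisor. The names `normal_cdf`, `anderson_darling_normal`, `rejects_normal` and `A2_CRIT_NORMAL` are made up for this sketch, and the sample values are invented.

```python
import math

# Critical values for the modified statistic, copied from the table above
# (normal or lognormal, mean and standard deviation estimated from the data).
A2_CRIT_NORMAL = {0.10: 0.631, 0.05: 0.752, 0.025: 0.873, 0.01: 1.035}


def normal_cdf(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def anderson_darling_normal(data):
    """Return (A2, modified A2) for a normal fit to the sample."""
    x = sorted(data)
    n = len(x)
    mean = sum(x) / n
    # Assumption: sample standard deviation with the n - 1 divisor.
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    w = [normal_cdf((v - mean) / sd) for v in x]
    # The formula indexes i = 1..n; with 0-based j = i - 1 the weight
    # (2i - 1) becomes (2j + 1) and w_(n-i+1) becomes w[n - 1 - j].
    total = sum(
        (2 * j + 1) * (math.log(w[j]) + math.log(1.0 - w[n - 1 - j]))
        for j in range(n)
    )
    a2 = -n - total / n
    a2_mod = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
    return a2, a2_mod


def rejects_normal(data, alpha=0.05):
    """True when the modified statistic exceeds the tabulated critical value."""
    _, a2_mod = anderson_darling_normal(data)
    return a2_mod > A2_CRIT_NORMAL[alpha]


# Made-up sample, for illustration only.
sample = [9.1, 10.4, 9.8, 10.9, 10.1, 9.5, 10.6, 9.9, 10.2, 11.3]
a2, a2_mod = anderson_darling_normal(sample)
print(f"A2 = {a2:.3f}, modified A2 = {a2_mod:.3f}, "
      f"reject at 0.05: {rejects_normal(sample)}")
```

For a lognormal fit, the same function would be applied to the logarithms of the observations. A result of `False` from `rejects_normal` means only that the data give no grounds to reject normality, not that normality has been demonstrated.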

The other popular family of distributions includes the Weibull, for distributions of minima, and the Gumbel, for distributions of maxima. The Gumbel variable X and the Weibull variable Y are related by X = ln(1/Y). A Weibull distribution with the shape parameter equal to one produces the exponential distribution as a special case.
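The relation between the two variables can be checked directly. As a short sketch, write η for the Weibull scale and β for the Weibull shape, and assume the two-parameter Weibull cdf P(Y ≤ y) = 1 - exp[-(y/η)^β]; the location μ and scale σ below are labels introduced here for the resulting Gumbel distribution.

$$P(X \le x) = P\left(\ln\frac{1}{Y} \le x\right) = P\left(Y \ge e^{-x}\right) = \exp\left[-\left(\frac{e^{-x}}{\eta}\right)^{\beta}\right] = \exp\left\{-\exp\left[-\frac{x-\mu}{\sigma}\right]\right\}, \qquad \mu = -\ln\eta, \quad \sigma = \frac{1}{\beta}$$

The right-hand side is the Gumbel cdf for maxima, so a Weibull sample can be moved to the Gumbel scale (and back) by this transformation.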

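For the Weibull (2) (and Gumbel) distributions, the test statistic, A², is again calculated from

$$A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln w_i + \ln\left(1 - w_{n-i+1}\right)\right]$$

just as for the normal, but w is the cdf for the distribution under consideration. For the Weibull this is

$$w = 1 - \exp\left[-\left(\frac{x}{\eta}\right)^{\beta}\right]$$

where η and β are the model scale and shape parameters.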

This formula needs to be modified for small samples,

$$A^2_{\mathrm{mod}} = A^2\left(1 + \frac{0.2}{\sqrt{n}}\right)$$

and then compared to an appropriate critical value from the table below.

α         0.1     0.05    0.025   0.01
A²crit    0.637   0.757   0.877   1.038

(Ref: D'Agostino and Stephens, 1986, Table 4.17, p.146)
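The Weibull calculation can be sketched the same way. The code below is a hedged illustration rather than a reference implementation. It assumes the two-parameter Weibull cdf w = 1 - exp[-(x/η)^β], the small-sample factor (1 + 0.2/√n), and scale and shape estimated by maximum likelihood, which the text above does not spell out. The helper names `fit_weibull_mle` and `anderson_darling_weibull`, the bisection solver, and the sample values are all made up for this sketch.

```python
import math

# Critical values for the modified statistic, copied from the table above.
A2_CRIT_WEIBULL = {0.10: 0.637, 0.05: 0.757, 0.025: 0.877, 0.01: 1.038}


def fit_weibull_mle(data, tol=1e-10):
    """Maximum-likelihood (eta, beta) for a two-parameter Weibull."""
    if min(data) <= 0 or min(data) == max(data):
        raise ValueError("need positive data with at least two distinct values")
    n = len(data)
    logs = [math.log(v) for v in data]
    mean_log = sum(logs) / n

    def g(beta):
        # Likelihood equation for the shape; g increases with beta.
        xb = [v ** beta for v in data]
        weighted = sum(p * l for p, l in zip(xb, logs)) / sum(xb)
        return weighted - 1.0 / beta - mean_log

    lo, hi = 1e-3, 1.0
    while g(hi) < 0.0:          # expand the bracket until the root is inside
        hi *= 2.0
    while hi - lo > tol * hi:   # plain bisection
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    eta = (sum(v ** beta for v in data) / n) ** (1.0 / beta)
    return eta, beta


def anderson_darling_weibull(data):
    """Return (A2, modified A2, eta, beta) for a Weibull fit to the sample."""
    x = sorted(data)
    n = len(x)
    eta, beta = fit_weibull_mle(x)
    t = [(v / eta) ** beta for v in x]
    log_w = [math.log(-math.expm1(-ti)) for ti in t]   # ln w
    log_1mw = [-ti for ti in t]                        # ln(1 - w) = -(x/eta)^beta
    # 0-based j = i - 1: weight (2i - 1) -> (2j + 1), w_(n-i+1) -> index n - 1 - j.
    total = sum((2 * j + 1) * (log_w[j] + log_1mw[n - 1 - j]) for j in range(n))
    a2 = -n - total / n
    a2_mod = a2 * (1.0 + 0.2 / math.sqrt(n))
    return a2, a2_mod, eta, beta


# Made-up lifetimes, for illustration only.
sample = [42.0, 55.0, 61.0, 70.0, 78.0, 85.0, 93.0, 104.0, 118.0, 140.0]
a2, a2_mod, eta, beta = anderson_darling_weibull(sample)
print(f"eta = {eta:.2f}, beta = {beta:.2f}, modified A2 = {a2_mod:.3f}, "
      f"reject at 0.05: {a2_mod > A2_CRIT_WEIBULL[0.05]}")
```

Writing ln(1 - w) as -(x/η)^β avoids taking the logarithm of a number rounded to zero in the upper tail.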

_____________

Notes:

1. The Anderson-Darling test does not tell you that you have a Normal density.  It only tells you when the data make it unlikely that you do.  Engineers (and I'm one) hate this kind of statistical double-talk.   But the fact remains:  any frequentist test is constructed to disprove something.  A dry sidewalk is evidence that it didn't rain, but a wet sidewalk might be caused by rain or by the sprinkler system.  So a wet sidewalk can't prove that it rained, while a dry one is evidence that it did not.

2. Although the Weibull, a distribution of "weakest-link" minima, is more widely known, it may not always be the best choice; its sister, the Gumbel, the asymptotic distribution of maxima, is sometimes more appropriate.

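3. Useful as the Anderson-Darling test is, good engineering practice requires use of the IntraOcular Trauma Test (plot the data against the fitted distribution and see whether the fit, or lack of it, hits you between the eyes) to confirm goodness-of-fit and other preliminary findings.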

4. R² is a common criterion for goodness-of-fit for regression models, but it isn't very good.