Contrasting the Statistical with the Mathematical
Page 4 of 4

Results

The results are interesting. While NESSUS converged to within acceptable tolerance of the apparent beta (based on the sample means and sample standard deviations), the range for the true betas was very large (-9.383 < beta < -2.294). The true betas were calculated from the converged, unscaled values for W, E, B, H, and L (see Figure 2, repeated here) and the true means and standard deviations.
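To give a feel for how wide that range of true betas is, the following minimal sketch converts the two quoted bounds into failure probabilities. It assumes the convention p = Phi(beta), with Phi the standard normal CDF and beta negative, which is consistent with the nominal p = 1×10^-6 used in this example but is not stated explicitly in the text. The helper name `normal_cdf` is illustrative only.

```python
import math

def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed with the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

# Bounds on the true beta quoted in the text.
beta_low, beta_high = -9.383, -2.294

# Assumption: failure probability p = Phi(beta), with beta negative.
p_low = normal_cdf(beta_low)
p_high = normal_cdf(beta_high)

print(f"beta = {beta_low}: p = {p_low:.3e}")
print(f"beta = {beta_high}: p = {p_high:.3e}")
```

Under this assumption the two bounds correspond to failure probabilities that differ by many orders of magnitude, one far below and one far above the nominal rate.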
About 58% of the NESSUS solutions were overdesigned and 42% were underdesigned (anticonservative), but the consequences of over- and under-design are not symmetric: underestimating beta results in many more failures than overestimating beta reduces them, because the nominal failure rate is already very small, p = 1×10⁻⁶. Thus, the average failure rate (the total expected number of failures divided by the 10,000 evaluations of this example) was 53 times (5,300%) the nominal failure rate.

Note that the higher average failure rate is not related to "confidence." The average failure rate is much higher than the nominal rate because the sample parameters (means and standard deviations) are not the true values but only estimates of them. This is an unwelcome but unavoidable consequence of the statistical properties of NESSUS.

Of course, this reality is not limited to NESSUS. It is true of any probabilistic calculation that relies on sample estimates as stand-ins for true parameter values, that is, of all real (as contrasted with idealized) calculations (§). This fact is well known in the statistical community, but we engineers have been slow to appreciate its negative implications, i.e. that our calculated nominal failure probabilities may considerably underestimate the true risk (*).

(§) This explains why Monte Carlo studies often appear to confirm NESSUS results, when in reality they are both wrong.

(*) For an interesting discussion of the row this caused among statisticians 80 years ago, see Joan Fisher Box, R. A. Fisher: The Life of a Scientist, Wiley, 1978.
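The asymmetry described above can be illustrated with a small calculation. This is only a sketch under stated assumptions: it takes p = Phi(beta) with beta negative, uses a nominal beta of about -4.753 (which gives p close to 1×10^-6), and scatters a handful of hypothetical "true" betas symmetrically about that nominal value. The offsets are invented for illustration and are not the article's 10,000 NESSUS evaluations.

```python
import math

def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed with the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

# Nominal design point: Phi(-4.753) is approximately 1e-6.
beta_nominal = -4.753
p_nominal = normal_cdf(beta_nominal)

# Hypothetical true betas, symmetric about the nominal value
# (illustrative offsets only, not data from the study).
offsets = [-1.0, -0.5, 0.0, 0.5, 1.0]
true_betas = [beta_nominal + d for d in offsets]

# Failure probability at each hypothetical true beta, and their average.
p_values = [normal_cdf(b) for b in true_betas]
p_average = sum(p_values) / len(p_values)

# Ratio of the average failure rate to the nominal failure rate.
ratio = p_average / p_nominal
print(f"nominal p = {p_nominal:.3e}")
print(f"average p = {p_average:.3e}")
print(f"average / nominal = {ratio:.1f}")
```

Even though the hypothetical betas are spread evenly on both sides of the nominal value, the average failure probability comes out well above the nominal one, because the cases with the weakest designs dominate the sum.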
Acknowledgement: