While it may be true that only real data can provide real experience, even
fighter pilots use simulators. The spreadsheet provides an easy mechanism for acquiring
experience with things such as the influence of sample size, the most effective
experimental design, the influence of data from less than optimal experiments, dealing
with convergence difficulties, and as important as anything, the influence of happenstance
when repeated, nominally identical, experiments produce sometimes surprising differences.
Simulated responses can be generated by creating a cell (C5) containing the probability of
a single observation, based on current values of L and S, as shown in Table
IF(RAND() > $G5, 0,
@IF(@RAND > $G5, 0,
@IF(@RAND > $G5, 0,
RAND() generates a uniform random
variate on (0,1) and the IF compares it with p (in column G), which is linked to
the explanatory variable, and returns a 1 if true and 0 if false. Copy this cell into the
column to contain the "response" data (column C), and the first realization of
outputs is generated. Since this column will be automatically recalculated any time any
cell is updated, its values need to be fixed before they can be analyzed. To do this,
highlight the column, choose COPY, then PASTE VALUES. (This is usually done with icons and
a few mouse clicks.) The column will no longer contain the generating functions, only
zeros and ones.
Leaving a generator cell unchanged at the top of the column (unused
in row 5) makes it easy to copy it into the entire column and generate another
"realization" of data. Other realizations can be generated by pressing the
RECALCULATE NOW key before changing the column from functions to numbers. A cell (C64)
containing the sum of the responses (Sy)
provides a qualitative feel for the data. (Were half the "observations" ones?
Did you remember to redefine this sum for any change in sample size?) The spreadsheet is a
great learning tool because of it immediacy. Results are plotted (almost) instantly and
its ease of use encourages experimentation.
The modern P/C spreadsheet can provide far greater access to
Generalized Linear Models than is currently afforded by the relative paucity and high cost
of sophisticated statistical software. The techniques described here allow the quality
practitioner to compute maximum likelihood parameter estimates and their likelihood ratio
confidence contours and plot the resulting model with its 95% confidence bounds, or a
specific point of interest like a90/95. The methods are easy to
understand and implement and encourage the user to learn by experimenting.
I wish to thank my engineering colleagues for their support and my
statistical brethren, especially Bill Meeker, for their tutelage and patience.
[First page] [back] [next] [Spreadsheets] [References]