THE TEACHER'S STATISTICAL PACKAGE
by George W. Goth
These programs are shareware. You may use them as much
as you like for as long as you like. You may also give as
many copies as you like to any friends, post them on
bulletin boards, use them as party favors and so forth. If
you find them useful, would you eventually please send
$10.00 to
George Goth
Skyline College
3300 College Drive
San Bruno, CA 94066
Those contributing their share will become registered users
and will receive updates as they become available.
          
GETTING STARTED
It is STRONGLY recommended that you make a backup copy
of the disk and put the original in a safe place. The disk
is not copy protected. Be sure to copy BOTH sides of the
disk.
Once you have made the backup copy, boot it and in a
few moments the monitor screen will display the following:
THE TEACHER'S STATISTICAL
PACKAGE
WOULD YOU LIKE TO CALCULATE:
M: MEANS AND STANDARD DEVIATIONS
T: T-TEST COMPARISON OF MEANS
A: ANALYSIS OF VARIANCE
C: CHI-SQUARED ANALYSIS
L: LEAST SQUARE FITS
R: MULTIPLE REGRESSIONS
S: SEE MENU OF MORE PROGRAMS
Q: QUIT THE PROGRAMS
PLEASE ENTER A LETTER < >
This is called the PROGRAM MENU and you use it to select the
statistical programs. At the bottom of the screen
you see the phrase "PLEASE ENTER A LETTER < >" with a
flashing box, called the cursor, inside the brackets. You
select the program you wish to run by typing the letter
corresponding to the program. For example, typing the letter
M will run the program that calculates means and standard
deviations and so forth. You need not type a carriage
return after typing the letter.
[A NOTE ON ENTERING INFORMATION INTO THE COMPUTER: To use
these programs you have to enter two types of information,
choices and data. CHOICES are just that; you are given
a list of options and you choose one. Whenever you must make
a choice the available choices are either indicated by a
list (or menu) or will be a "Y" (for yes) or "N" (for no) to
a question (such as "SEND RESULTS TO PRINTER? < >").
Choices are made with single keystrokes with no carriage
return. DATA consists of the actual numerical results you
want to analyze and the names you give the data. To enter
the data, type in the correct number or name and then type a
carriage return. If you have any qualms about whether
to type a carriage return, enter the information and
wait a second; if the computer does not immediately do
something, it means it is waiting for a carriage return.]
MEAN AND STANDARD DEVIATION
Type a capital letter M to select the Mean and Standard
Deviation Program and we will begin our tour of statistics.
If you acquire data, such as by making measurements of
something or grading a group of tests, you will almost never
find that all the measurements (or tests) have the same
value. Rather the values range from a minimum to a maximum,
with perhaps a clustering toward the middle. What we would
want to know about this data are:
a) what is the best representation of its average
value; and
b) what is the best representation of its variability
(i. e. are the values tightly clustered together or
spread out).
There are several ways of expressing the average,
the three best known being:
a) the mean (also called the arithmetic mean), which is the
sum of all the values divided by the number of values.
Thus the mean of 1, 2, 3 and 3 is
    mean = x bar = (1 + 2 + 3 + 3) / 4 = 2.25
The mean is the most useful of the averages. If the
data collected was from the entire population, the mean is
called the population mean; if the data represented only a
portion of the population, the mean is the sample mean. We
will generally be working with sample means, often
designated by an x with a line over it (called x bar).
b) the median, which is the middle value; that is, half the
other values are smaller than this and half are larger.
For our data from a) the median is 2.5 (midway between
the two middle values, 2 and 3).
c) the mode (from French for 'fashionable'), which is
the value that occurs most often. For our data, the
mode is 3, as there are two 3's. If two values occur
with the same maximum frequency (as in the case
1,2,3,3,4,5,5), the distribution is bimodal.
There are also several measures of the variability
(or dispersion) of the data. The three best known are:
a) the range, the difference between the maximum and
minimum values. For our example from above, the range
would be 3 - 1 = 2. The range is not a particularly useful
measure of dispersion. For instance, you could have a
thousand 3's and a single value of 53, and the range
would be 50. The range is also 50 for a single 3 and a
single 53.
b) the mean deviation. If you were to subtract the mean
value from each of the data points (to calculate the
deviation of each point) and sum the result, you would
get zero.
    1 - 2.25 = -1.25
    2 - 2.25 =  -.25
    3 - 2.25 =   .75
    3 - 2.25 =   .75
         sum     0
The mean deviation is the sum of the absolute values
of the deviations, divided by the number of values. In
this case it is (1.25 + .25 +.75 + .75)/4 = .75. The
mean deviation tends to underestimate the variability
in many cases.
c) the variance is the sum of the squares of the
individual deviations divided by the number of
observations less 1. The positive square root of the
variance is called the standard deviation, s. For our
sample the standard deviation is
    s = sqrt( [(-1.25)^2 + (-.25)^2 + (.75)^2 + (.75)^2] / (4 - 1) ) = .96
(here ^2 means 'squared' and sqrt means 'the square root of').
The standard deviation is the most useful of the
measures of dispersion, as we will see later.
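For readers who would like to check the arithmetic above on a modern computer, here is a minimal sketch in Python (it is not part of the package). It uses Python's standard statistics module on the same four values; the variable names are ours, chosen only for illustration.

```python
import statistics

data = [1, 2, 3, 3]

# The three averages discussed above.
mean = statistics.mean(data)        # sum of values / number of values
median = statistics.median(data)    # midway between the two middle values
mode = statistics.mode(data)        # the value that occurs most often

# The three measures of dispersion discussed above.
value_range = max(data) - min(data)
mean_deviation = sum(abs(x - mean) for x in data) / len(data)
std_dev = statistics.stdev(data)    # sample form: divides by (n - 1)

print("mean           ", mean)
print("median         ", median)
print("mode           ", mode)
print("range          ", value_range)
print("mean deviation ", mean_deviation)
print("std. deviation ", round(std_dev, 2))
```

Note that statistics.stdev divides by the number of observations less 1, which is the same definition used in c) above.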
With those definitions in mind, let's use the program
to calculate some results. When you pressed the letter M,
after a few moments you saw the following on the monitor
screen:
MEAN AND STANDARD DEVIATION
(C) 1986 GEORGE GOTH V 1.00
I: INPUT MODULE
E: EDIT MODULE
C: CALCULATE MODULE
D: DISK MODULE
Q: QUIT
PLEASE ENTER A CHOICE < >
All the programs are constructed around modules
which allow you to enter data (INPUT), change data (EDIT),
calculate results (CALCULATE) or store data on disk (DISK).
Since no data has been entered yet, let's do that by typing
I (no carriage return) to get to the INPUT MODULE.
Once in the Input Module you are asked for a NAME OF DATA SET.
This is a tag or ID of the data. If you don't have a name, press
return. However, let's assume that you want to know the averages of
the ages of a group of mathematics students. So name the Data set
MATH STUDENT AGES. Now you see a line saying "Point #1" with a
flashing cursor after it. You are being asked to enter the age of the
first student. Suppose the ages were
18, 17, 19, 19, 22, 23, 23, 18, 20, 22
Enter each of these numbers (followed by a carriage return)
and type the word END to end the data entry and return to the Main
Menu.
Suppose you think you made an error in the data entry.
Now choose E from the Main Menu to go to the EDIT MODULE. If
you do this you will see a new screen and across the bottom
are the words:
View Data Order Data
Change Data Main Menu
Delete Data
The View Data Option (V) allows you to look at the data
to see if you made a mistake. You will be asked if you want
to send the data to the printer for a hardcopy (Y for yes,
N for no). If there are more than 10 data points, the
program will pause at every 10th value. The points are
numbered sequentially (the first age of 18 is given the
number 1, etc).
The Change Data Option (C) allows you to change either
the Data Set Name or a Value of a point by selecting either
D or V. If you wish to change a value, enter the number of
the point you wish to change. Its value will be displayed,
and you can either type in a new value (to correct it) or
press return (to keep it).
The Delete Data Option (D) allows you to remove a value
from the data. If you wish to do this, type the number of
the point and, after its value is displayed, type either a Y
for yes or a N for no to delete or keep the point. If a
point at the beginning or in the middle of the data is
deleted, all the others are renumbered. Thus, if you deleted
the first 18, the value of point 1 would become 17.
The Order Data Option (O) allows you to arrange the
values from lowest to highest. This is useful if you
want to draw a graph of the data later.
The Main Menu Option (M) returns you to the Main Menu
when you have completed your editing.
After editing the data, return to the Main Menu and
Choose C for the Calculation Module. Before the calculation
is done, you will be asked if you want to send the results
to the printer (answer Y or N) and if you want to calculate
the median and mode [1]. Let's calculate the median and mode
in this case.
In a few moments, the following appears
NAME OF DATA SET: MATH STUDENT AGES
NUMBER OF DATA POINTS: 10
MEDIAN: 19.5
MODE: 23
MEAN: 20.1
STANDARD DEVIATION: 2.23358208

SAVE MEAN AND STD. DEV.    Z SCORES
MAIN MENU
PLEASE ENTER A CHOICE < >
The Save Mean and Standard Deviation Option (S) allows
you to save the mean and standard deviation [2] in memory
for use later. When you press S, the computer will beep,
indicating the data has been saved.
The Z Scores Option (Z) will display a set of
statistics called, most appropriately, the z scores.
A z score for a point is:
    z = (value of point - mean) / (standard deviation)
Thus the z score for age of 18 in this collection of ages would be:
    z = (18 - 20.1) / 2.23358208 = -.9401
The closer the size of the z score is to zero, the closer
the point is to the mean value. Z scores are discussed more
fully below.
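As a cross-check on the screen shown above, the following Python sketch (our own illustration, not the package's code) recomputes the statistics for MATH STUDENT AGES and the z score for an age of 18. The helper z_score is a hypothetical name. Taking the largest of several tied modes mirrors the behavior described in footnote [1]; the last digit of the z score may differ from the manual's figure depending on rounding.

```python
import statistics

ages = [18, 17, 19, 19, 22, 23, 23, 18, 20, 22]

mean = statistics.mean(ages)
std_dev = statistics.stdev(ages)          # divides by (n - 1)
median = statistics.median(ages)

# Several ages are tied for most frequent; report the largest one.
largest_mode = max(statistics.multimode(ages))


def z_score(value, mean, std_dev):
    """Distance of a value from the mean, in standard deviations."""
    return (value - mean) / std_dev


z_18 = z_score(18, mean, std_dev)

print("median", median)
print("mode", largest_mode)
print("mean", mean)
print("standard deviation", std_dev)
print("z score for 18:", z_18)
```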
If you have not already done so, save these results
using the S option and return to the Main Menu.
Choose the Input option. Notice that you are warned
that data are already in memory. Since you are about to
calculate another mean, clear it. Now do the following problem.
Problem: A group of 12 students in an English class have the
following ages: 22, 23, 21, 19, 18, 22, 20, 18, 23, 24, 18,
23. Calculate the mean and standard deviation of the ages of
this group. Save the results when you are done and return
to the Main Menu.
[Answer: mean is 20.9166667, standard deviation is 2.23437335.]
You are now ready to explore the wonders of the Disk
Option (D). When you press D, you see the following:
S: Save to Disk
F: Fetch from Disk
C: Catalog Disk
L: Lock a File
U: Unlock a File
R: Rename a File
D: Delete a File
M: Main Menu
The Save to Disk Option (S) allows you to save data on
the floppy disk. The disk must have sufficient room on it for the data
to be stored, so you should have a ProDOS-formatted disk named
STAT.PRODOS available. You will be asked whether you wish to save Raw
Data (the data values only) or a Set of Means (the means and standard
deviations of the data sets you have stored in memory, along with
their data set names). Choose S to save the set of means, for we will
be using them later, and give the set an appropriate name [3], such as
STUDENT AGES.
The Fetch from Disk Option (F) allows you to recover
raw, unanalyzed data from the disk.
The Catalog Option (C) allows you to examine the
contents of the disk to see what programs and files are
stored on it.
The Lock a File Option (L) allows you to "lock" a file,
that is, prevent another file with an identical name from
replacing it.
The Unlock a File Option (U) unlocks the file, so that
another file with the same name can replace it or so that it
can be deleted.
The Rename a File Option (R) allows you to give the
file a new name.
The Delete a File Option (D) permanently removes the
file from the disk.
When done, return to the Main Menu and choose Q (for
Quit), which will get you back to the program menu. You can
either take a break now or go on to the next section.
FOOTNOTES ON MEAN
[1] If there are several modes, the program will find only
the one with the largest value. Median and mode calculations
involve ordering and ranking the data, and can take a
considerable time if there is a large number of data points.
[2] The program will also store the sum of the squares of
the data. This information is used in the ANOVA program.
[3] See list of file suffixes at the end of this manual.
WHAT ARE WE GOING TO DO WITH THE MEAN?
Now that we know how to calculate the mean and standard
deviation, what are we going to do with them? In the next several
programs we will compare one mean with another (or others) to see if
they could be representatives of the same population. Generally the
two (or more) samples of different populations will not have exactly
the same value for their means, but because the data are dispersed (as
measured by the standard deviation), it could well be possible that
the populations do have the same mean. So, we wish to develop a
series of tests, using the samples drawn from the populations, for
determining if the two (or more) populations have identical means.
The method of doing this is called hypothesis testing.
A hypothesis is a statement or claim about the nature
of the populations. We will focus on what is called the
null hypothesis (designated H0) and develop tests for
accepting or rejecting this null hypothesis as true or
false.
As an example, suppose we wish to see if a coin is
'fair', that is if we tossed it a very large number of times
it would come up heads half the time and tails the other
half. But we don't have the time to flip it a very large
number of times, we only can flip it, say, 50 times. We do
so and it comes up heads 26 times and tails 24 times. Is it
truly a fair coin? Or, to put it into statistical jargon, is
    H0: p(heads) = 0.5
to be accepted or rejected? (The symbols mean "the null
hypothesis is that the coin will come up heads 50 per cent of the
time.")
If we can reject the null hypothesis, then we accept
what is called the alternative hypothesis, which in our case
would be that the coin does not come up heads 50 per cent of the
time.
It is obvious from our results (26 heads, 24 tails)
that it did not come up heads exactly 50 per cent of the time, but
it would not be unreasonable to say that even if we did have
a strictly fair coin and flipped it 50 times over and over,
some of those times we would get 26 heads and 24 tails. If
a fair coin gave a result like this fairly often (say more
than 5 times out of 100), then we would have no grounds for
rejecting the null hypothesis at the 5 per cent level of
significance, and we would accept that our coin was fair. If
a fair coin gave a result like this only very occasionally
(say 1 time out of 1,000,000), we would be very doubtful that
our coin was truly a fair one; in other words, we would
reject the null hypothesis at the 5 per cent level of
significance.
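To see how often a strictly fair coin really gives 26 heads in 50 flips, we can count the outcomes directly. The Python sketch below is our own illustration (the function name prob_heads is hypothetical) and uses the exact binomial count: the number of ways to get k heads divided by the total number of equally likely sequences.

```python
import math

flips = 50


def prob_heads(k, n=flips):
    """Probability that a fair coin shows exactly k heads in n flips."""
    return math.comb(n, k) / 2 ** n


# Chance of exactly 26 heads and 24 tails.
p_exactly_26 = prob_heads(26)

# Chance of any split other than a perfect 25/25, that is, a result
# at least as uneven as the one we observed.
p_at_least_as_uneven = 1 - prob_heads(25)

print("P(exactly 26 heads)          =", round(p_exactly_26, 4))
print("P(at least as uneven as 26/24) =", round(p_at_least_as_uneven, 4))
```

Both figures are far above 5 times out of 100, so a 26/24 split gives no reason to doubt the coin.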
We do not have the space here to discuss, much less to
develop, the theories behind the procedures that
are used to test hypotheses. As the programs were designed
to carry out the calculations and state the results,
we will present merely a brief discussion of conditions
under which the tests are used. However, we must begin with
a short discussion of the term "normal distribution."
For many phenomena, the data are arranged in what is
popularly called a 'bell shaped curve.' The curve is more
properly called a normal curve, or a Gaussian curve, and it
is determined by the value of its mean and its
standard deviation; that is, if you know these two values,
you can calculate the value of the normal distribution at
any point. The mean determines where the central maximum
lies and the standard deviation determines the 'flatness' of
the curve; that is, if the standard deviation is small, the
normal distribution becomes spikelike, if the standard
deviation is large, the normal distribution becomes
flattened. If a set of data are normally distributed, then
the probability that a value x in the data set will be found
is directly related to the value of the normal curve for x.
For instance, if the mean of a set of normally distributed
data is 10 and the standard deviation is 2, then the
probability of finding a data point of value 14 or more is
0.0228 (2.28 per cent). For the same data, the probability of
finding a data point of value 16 or more is 0.0013 (0.13 per
cent). We will be
assuming for the first several programs that the data are
normally distributed.
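The two probabilities quoted above correspond to the area under the normal curve at or beyond the stated value (the upper tail). A minimal Python sketch, assuming that reading and using the standard library's NormalDist class, recovers both figures:

```python
from statistics import NormalDist

# Normal curve with mean 10 and standard deviation 2, as in the text.
dist = NormalDist(mu=10, sigma=2)

# cdf(x) is the area below x, so 1 - cdf(x) is the area at or above x.
p_14 = 1 - dist.cdf(14)
p_16 = 1 - dist.cdf(16)

print("P(value >= 14) =", round(p_14, 4))
print("P(value >= 16) =", round(p_16, 4))
```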
As for the tests we use to test the hypothesis, they
are
a) z scores. As mentioned above, a z score is given by
    z = (value of point - mean) / (standard deviation)
Therefore a point far from the mean (assuming the
standard deviation is small), will have a large z
score. Since large z scores imply small
probabilities, the likelihood of finding such a point
will be small. For instance, the probability of finding
a point with a z of 3.09 or more is only 0.1 per cent.
As a further example, it can be shown that if you flip a
fair coin 500 times, it will come up heads with a mean of
250 times and a standard deviation of 11. If we flip a
coin 500 times and it comes up heads 280 times, is it a
fair coin? First we calculate the z score
    z = (280 - 250) / 11 = 2.73
If we look up in the appropriate table (found in any
elementary statistics book), we will find that this
value of z corresponds to a probability of 0.0032
(0.32 per cent). That is, if we were to flip a truly fair coin
for 10,000 cases of 500 times each (a total of 5
million flips!), only in about 32 cases would we have
come up with 280 (or more) heads. Thus it would be
reasonable to reject the null hypothesis that our coin
(the one with a z of 2.73) was a fair one.
The z scores are used to test the null hypothesis in
the MANN-WHITNEY U, SIGN, RUNS and RANK CORRELATION
programs.
b) Student's t. When we have two normally distributed
sets of data, with known means and standard deviations,
we can test to see if the means are the same (i.e., the
null hypothesis is that the means are the same) at a
given level of significance by computing a particular
statistic called t and checking whether the
value of t occurs with the level of significance (or
better) we desire. This method of comparing
means was developed by William Gosset, who published his
results under the name Student. The particular value of
t will also depend on what is called the 'degrees of
freedom' (df), which is related to the sample size.[1]
The t test is used in the programs T-TEST COMPARISON OF
MEANS and ANOVA (what is termed a 'multiple t test' is
used in the second).
c) F test (named after Sir Ronald Fisher) is used when
three or more sets of data are to be compared to see if
all the means are identical. It is used in the program
ANOVA (which stands for analysis of variance).
d) chi squared test. The chi squared test is used in
several statistical applications. In these
programs the chi squared test is used to test
hypotheses for data which may not be normally
distributed and, in fact, may not be ordinal
(that is, have true numerical values). For instance, we
assign in our classes the grades A, B, C and so forth.
These letters are not ordinal (they are what is termed
nominal, that is used in naming classes and
could be replaced by 1, 2, 3 etc) much less normally
distributed and yet it is still possible to do
statistics on such data sets.
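The coin example under z scores in a) can be checked with a few lines of Python. This is only a sketch of the calculation described there, not the package's code: the standard deviation for heads in n flips of a fair coin is taken as the square root of n x 0.5 x 0.5, the manual's rounded value of 11 is used for z, and the table lookup is replaced by the standard library's NormalDist.

```python
import math
from statistics import NormalDist

flips = 500
p_heads = 0.5

mean = flips * p_heads                                  # 250 heads expected
std_dev = math.sqrt(flips * p_heads * (1 - p_heads))    # about 11.18

# The manual rounds the standard deviation to 11 before computing z.
z = (280 - mean) / 11

# Area under the standard normal curve at or above z.
tail = 1 - NormalDist().cdf(z)

# How many runs of 500 flips, out of 10,000 such runs, would be
# expected to show 280 or more heads with a fair coin.
expected_cases = tail * 10_000

print("z =", round(z, 2))
print("probability =", round(tail, 4))
print("expected cases in 10,000 runs =", round(expected_cases))
```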
With all that in mind, let's check some hypotheses.
FOOTNOTE
[1] If the degrees of freedom are 30 or more, the value of t
is approximately equal to the value of z.
T-TEST COMPARISON OF MEANS
The first statistical test we will carry out is
Student's t test for comparing means. Boot the disk (if it
is not on already) and from the program menu choose T to
select T-TEST COMPARISON OF MEANS. Shortly you will see a
menu very much like that used in the MEANS program (all but
one of these programs have this as their main menu). Since
we have already calculated some means and stored the result
as the file AGES, let's analyze that.
Type D to get to the disk menu and then F to fetch a
file. Type the word AGES to load that file and then M to
return to the main menu. Then type C for calculation and in
a few moments you will see the following result:
NAME OF DATA SET: AGES
NUMBER OF VALUES: 2
# NAME NO MEAN STD. DEV.
1 ENGLISH 12 2.09E+01 2.23E+00
2 MATH 10 2.01E+01 2.23E+00
STD. ERROR OF DIFFERENCE: 1.00E+00
T-TEST: 8.14E-01 DEG. OF FREE.: 20
NULL HYPOTHESIS: MEANS ARE THE SAME
ACCEPT NULL HYPOTHS AT 5.0% LEVEL OF SIG
Notice that strange looking numbers (like
2.09E+01) occur. This is because some of the numbers are
expressed in exponential (or scientific) notation. This
allows the program to fit numbers of various lengths
(for example 2.09, .0000289) in slots of the same size on
the screen and makes for a pleasant looking format. In case
you are not familiar with this notation, the 'E' means
'times 10 raised to the power of', so:
2.09E+01 is the same as 2.09 x 10^1 = 20.9
2.23E+00 is the same as 2.23 x 10^0 = 2.23
8.14E-01 is the same as 8.14 x 10^-1 = 0.814
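Most modern programming languages read and write this notation directly. As a small illustration (in Python, which is our choice and has nothing to do with the package itself), the sketch below converts the three screen values to ordinary numbers and then formats a number back into the same three-digit E form.

```python
# Values as they appear on the program's screen.
screen_values = ["2.09E+01", "2.23E+00", "8.14E-01"]

# float() understands E notation: digits, then E, then the power of 10.
numbers = [float(text) for text in screen_values]

for text, number in zip(screen_values, numbers):
    print(text, "is the same as", number)

# Going the other way: ".2E" keeps two digits after the decimal point.
formatted = format(20.9166667, ".2E")
print(formatted)
```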
The program prints out the set names, means, standard
deviations for each of the data sets compared, along with
the standard error of the difference (a statistic used in
the t test), the value of t and the degrees of freedom. Of
most interest to us are the last two lines, the statement of
the null hypothesis (the means are the same) and the
conclusion, which in this case is that we should accept the
null hypothesis at the 5.0 per cent level of significance.
Note that if the null hypothesis is accepted at
the 5.0 per cent level of significance, it would also be accepted at
any smaller level, such as 1.0 per cent or 0.1 per cent. To find out
if it were accepted at a larger level, say 10 per cent, you would have
to look up the value of t at 10 per cent for 20 degrees of freedom and
see if it were greater than our calculated t (0.814), in which case
our hypothesis would be accepted at 10 per cent, or less than 0.814,
in which case it would be rejected. The program will only accept or
reject at levels of 5.0 per cent, 1.0 per cent and 0.1 per cent, which
are the most commonly used levels in hypothesis testing with the t
test.
Let's try to add some more data. Return to the main menu by
pressing M, then get to the Input Module with an I, indicate you do
not wish to clear the current data with an N and then add the
following data for a sample of ages of telecommunications students:
Name = TELE COMM, number = 17, mean = 29.6, standard
deviation = 3.8. Then type END to end the input module and
then C to get back to the calculation.
Now the program will halt and ask you which of the
three means you wish to compare. [If you can't remember which
number corresponds to which mean, return to the main menu
and use the Edit module to view the data.] Let's compare 1
and 3 (that is the ages of English students with those in
telecommunications). So type 1 (return) and 3 (return) and
you will see
NAME OF DATA SET: AGES
NUMBER OF VALUES: 3
# NAME NO MEAN STD. DEV.
1 ENGLISH 12 2.09E+01 2.23E+00
3 TELE COMM 17 2.96E+01 3.80E+00
STD. ERROR OF DIFFERENCE: 1.27E+00
T-TEST: 6.85E+00 DEG. OF FREE.: 27
NULL HYPOTHESIS: MEANS ARE THE SAME
REJECT NULL HYPOTHS AT 0.1% LEVEL OF SIG
This time we see that we are to reject the null hypothesis
and conclude that there is a difference in the ages of the
two types of students at the 0.1 per cent level of significance.
If the two mean ages were really the same, there would be less
than a 1 in 1000 chance of finding a difference this large.
ANALYSIS OF VARIANCE
Suppose you wanted to compare more than two means;
how would you do it? You could, perhaps, do a series of
t tests between the individual means, but there are both
practical and theoretical reasons for not doing this.
Practically, even a few means to compare would
result in many t tests (9 different means would result in a
total of 36 different tests, which, even with a computer,
would take a long time). Also, when comparing by pairs other
complications arise; in particular, the possibility of what
is called a Type I error is increased [1]. The preferred
method of comparing more than two means is called an Analysis
of Variance (ANOVA), which, not surprisingly, analyses the
variance between several means.
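Both objections to testing by pairs can be put into numbers. The Python sketch below is our own rough illustration: pairwise_tests counts the distinct pairs among k means, and familywise_error estimates the chance of at least one Type I error across all the tests. The second function assumes the tests are independent, which pairwise tests sharing the same samples are not, so its result should be read only as an indication of how quickly the risk grows.

```python
import math


def pairwise_tests(k):
    """Number of distinct pairs that can be formed from k means."""
    return math.comb(k, 2)


def familywise_error(tests, alpha=0.05):
    """Chance of at least one false rejection in `tests` tests,
    assuming (as a simplification) that the tests are independent."""
    return 1 - (1 - alpha) ** tests


tests_needed = pairwise_tests(9)          # 36, as stated in the text
risk = familywise_error(tests_needed)     # each test run at the 5% level

print("tests needed for 9 means:", tests_needed)
print("approximate chance of at least one Type I error:", round(risk, 2))
```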
Without going into the details, the ANOVA computes two
statistics, the Within Groups Sums of Squares (SSE),
a measure of the variability within the samples because of their
standard deviations, and the Between Groups Sums of Squares
(SSB), a measure of the variability between the samples due
to the differences in their means. Appropriate numbers of
degrees of freedom and a statistic F are then computed. The
null hypothesis that all the means are the same is then
rejected or accepted at a level of significance by comparing
the calculated F with appropriate values from a table. The
levels of significance tested by this program are 5 per cent and 1 per
cent, the most commonly used ones.
For instance, suppose we wished to see if school
attendance varied with the day of the week. On 18 randomly
selected days we take attendance and get the following
results:
Mondays: 143, 128, 110
Tuesdays: 162, 136, 144, 158
Wednesdays: 160, 132, 180, 160, 138
Thursdays: 138, 168, 120
Fridays: 110, 130, 135
We would first need to calculate the means, standard
deviations and an additional term called the sum of the
squares, which is done for us by the MEAN program. After
we calculate these values and store them on disk, we can use
the ANOVA program to analyze the results. We get to the
ANOVA program by typing A from the Program Menu.
After we are in the ANOVA program, we fetch the data on
the means we have stored by using the Disk module and then
go to the Calculate module. We are asked how many of the
means we wish to compare, and since we want to analyze all
of them, we type 5. Shortly afterwards, the program
displays:
NAME OF DATA SET: ATTENDANCE
NUMBER OF DATA VALUES: 5
# NAME # NAME
1 MONDAY 2 TUESDAY
3 WEDNESDAY 4 THURSDAY
5 FRIDAY
SOURCE OF VAR DF SS MS
BETWEEN GRPS 4 2.52E+03 6.29E+02
AMONG GRPS 13 4.00E+03 3.08E+02
TOTAL 17 6.52E+03
F = 2.05E+00
NULL HYPOTHSIS: ALL MEANS ARE SAME
ACCEPT NULL HYPOTHSIS AT 5% LEVEL OF SIG
The result, surprisingly, is that the null hypothesis is accepted at
the 5 per cent level of significance. There is no evidence that the
attendance varies with the day of the week.
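The figures in the table above can be reproduced from the raw attendance data. The following Python sketch is our own outline of a one-way analysis of variance, not the package's code; the variable names are chosen for illustration. It computes the between groups and within groups sums of squares, their degrees of freedom, the mean squares and F.

```python
attendance = {
    "MONDAY": [143, 128, 110],
    "TUESDAY": [162, 136, 144, 158],
    "WEDNESDAY": [160, 132, 180, 160, 138],
    "THURSDAY": [138, 168, 120],
    "FRIDAY": [110, 130, 135],
}

all_values = [v for group in attendance.values() for v in group]
grand_mean = sum(all_values) / len(all_values)

# Between groups: how far each group mean lies from the grand mean,
# weighted by the number of observations in the group.
ss_between = sum(
    len(group) * (sum(group) / len(group) - grand_mean) ** 2
    for group in attendance.values()
)

# Within groups: how far each observation lies from its own group mean.
ss_within = sum(
    (v - sum(group) / len(group)) ** 2
    for group in attendance.values()
    for v in group
)

df_between = len(attendance) - 1
df_within = len(all_values) - len(attendance)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print("between:", df_between, round(ss_between, 1), round(ms_between, 1))
print("within: ", df_within, round(ss_within, 1), round(ms_within, 1))
print("total:  ", df_between + df_within, round(ss_between + ss_within, 1))
print("F =", round(f_stat, 2))
```

The calculated F would then be compared with the tabulated value for 4 and 13 degrees of freedom at the chosen level of significance, as the program does.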
[Problem: For a biology class project a student grows bean
plants utilizing four different fertilizers. She then
collects and weighs the beans produced by each plant. From
the data given below, determine if the fertilizers have any
effect on the yield of the beans. Also determine which, if
any, of the fertilizers, produce significantly different
yields.]
Fertilizer     Yield (in kilograms)
Macrogro       3.7, 4.4, 3.8, 4.2
Multiyield     3.6, 3.5, 4.1, 3.9, 3.8
Polybean       6.3, 5.9, 4.8, 6.0, 5.8, 4.9
Legumeorama    2.9, 3.3, 2.8
[Answer: Using the MEAN program to calculate the means and
other needed statistics and then the ANOVA program to do the
analysis, we conclude that, at the 1 per cent level of significance,
the null hypothesis that the mean yields are the same should
be rejected. But how do we answer the second part, that is,
how do we determine which, if any, of the fertilizers differ
from each other? To do this, we use the T-test Option (T)
at the bottom of the Calculation Module screen. This allows
us to compare any two means using a multiple t test. If
we compare Macrogro (1) with Multiyield (2), we see that
at the 5 per cent level of significance, the means are the same.
Similarly, the yields of Macrogro and Polybean (3) are
different at the 0.1 per cent level of significance and so on.]
FOOTNOTES ON ANOVA
[1] A Type I error is when the null hypothesis is true but
is incorrectly rejected. A Type II error is when the null
hypothesis is false but is incorrectly accepted. Most null
hypothesis testing is based on type I errors.
THE METHOD OF LEAST SQUARES
In the section on ANOVA we learned how a group of data
sets could be compared to learn if their means were the same
within a certain error. Thus, for instance, if a student
placed varying amounts of fertilizer on bean plants and
measured the yields, she could determine if all the means
were the same, or if not, which were different. However, we
can extract more information from this type of experiment. We can
learn if the amount of fertilizer and the yield are 'linearly
correlated'.
The term linearly correlated means that if the amount
of fertilizer is increased, the yield increases (or perhaps
decreases) by a fixed amount; that is the two variables are
related by a straight line. A straight line is fully
characterized by its slope and intercept. The slope is the
amount the dependent variable (in this case the yield)
changes when the independent variable (in this case the
amount of fertilizer) increases by one unit, and it is
generally given the letter 'm'.
The intercept is the value of the dependent variable when
the independent has a value of 0 and is most often given the
symbol 'b'. The independent variable is generally given the
letter 'x' and the dependent the letter 'y'. The equation
for the straight line is then:
y = m * x + b
(the * symbol means multiply).
We could attempt to find the relationship by plotting
the dependent variable (yield) on the vertical axis (the ordinate) and
the independent variable on the horizontal axis (the abscissa) of a
graph, drawing a straight line through the data points and then
extracting the slope and intercept by hand. However, the question
would then be, "since an infinite number of straight lines can be
drawn on the graph, how do we know if we have drawn the one that best
represents the correlation?" In other words, how do we know we have
drawn the 'line of best fit'?
The line of best fit tells us if the dependent variable
(the yield in this case) is directly affected by the independent
variable (the amount of fertilizer). The line of best fit is
calculated by a method called 'least squares' [1], and the least
squares program on this disk does this and reports the slope and
intercept of the best fit line.
A statistic, called Pearson's r, is also calculated.
Pearson's r is a measure of the 'goodness' of the fit, and
it has the following properties:
a) if r = 1 the data are totally correlated;
b) if r = 0 then either there is no relationship
between x and y, or the relationship is not linear;
c) the closer the value of r is to 1, the more closely
the calculated values lie to the measured ones. An r of
.9 indicates a good fit. [2]
Choose L from the program menu to run the least squares
program, which we can use to analyze the following data.
A student puts varying amounts (in grams) of fertilizer
on a series of bean plants. At a fixed time, she harvests
the beans and measures the yield (in grams). She obtains
the following data:
fertilizer (grams) .50 1.0 2.0 3.6 5.0 7.5
yield (grams) 11.8 15.0 14.5 22.0 26.5 30.5
What is the equation for the best line relating the yield
and fertilizer? What is the value of r?
We choose the Input module, give the data set the name
BEAN PROJECT, the Independent variable the name FERTILIZER
(GRAMS), the Dependent YIELD (GRAMS) and enter the numerical
data and end with an END. Then from the main menu, we
choose G for graph and then L for a linear fit [3], the
program calculates the results and....
Lo and behold! A graph! Yes, the program draws a scatter diagram
of the data and the best straight line. Plus, it gives us at the
bottom of the screen, the values for the slope, intercept and fit!
We can get a hardcopy of these results (the values,
not, alas, the graph) by pressing any key and then typing an
R (for send Results to printer). This gives us:
INDEPENDENT VARIABLE (X) : FERTILIZER (GRAMS)
DEPENDENT VARIABLE (Y) : YIELD (GRAMS)
SLOPE : 2.74981414
ERROR ON SLOPE : .291349548 [4]
PERCENT ERROR ON SLOPE : 10.5%
INTERCEPT : 11.0443587
ERROR ON INTERCEPT : 1.18742069
PERCENT ERROR ON INTERCEPT : 10.7%
PEARSON'S R ('FIT') : .978277057
TYPE OF FIT : LINEAR
Thus, in this case, the equation for the straight line
relating the yield and fertilizer would be (with rounding)
yield = (2.75) * (fertilizer) + 11.04
Since Pearson's r is close to 1, the fit is a good one.
It is also possible to interpolate a value, that is
determine what the calculated value for y would be for a
given value of x. For example, if from the Graph menu we
choose I for interpolate, and enter 10 for the abscissa (the
x value), we get a value of the ordinate (y) of 39.0 grams.
[Problem: A group of 19 students in an English class are
given a vocabulary test and a spelling test. The number of
errors for each are given below. Are the data correlated?
What is the equation for the line between them? If a student
got 13 errors on the vocabulary test, what would be the
likely value for the number of errors on the spelling test?
Errors on
Vocabulary  14 19 13 15 14 14 17 10 15 12  9 16 11 15 16  8 15 17 16
Spelling     6 11  4  8  6  3 11  3  6  5  4  9  3 10 10  4  9 11  9]
[Answer: The data are somewhat correlated. The points are
scattered around the line and the fit is only 0.843.
The equation would be:
spelling errors = (.873) * (vocabulary errors) - 5.27
For a student with 13 vocabulary errors, the most likely
number of spelling errors would be 6.]
FOOTNOTES TO THE LEAST SQUARE PROGRAM
[1] The line of best fit has the following characteristic:
if y' is the value of the dependent variable as calculated
from the equation for the line for each value of the
independent variable, x, and y is the actual measured value
of the dependent variable for each value of x, then the
best fit line has the smallest possible value for the term:
     sum over all the x's of (y' - y)^2
Since, to find the best line, we want to minimize a term that is a
square, the procedure is called THE METHOD OF LEAST SQUARES.
[2] Pearson's r ranges from +1 to -1. A value of +1 means the
data are totally correlated and that as x increases, y increases. A
value of -1 means that the data are totally correlated and that as x
increases y decreases. A value of 0 for Pearson's r means that the
data are totally uncorrelated (that is, the value of x has no effect
on the value of y). This program reports only the absolute value of
Pearson's r. You can tell if it is negative or positive by looking at
the graph.
[3] This program also allows you to transform the data to
see if the relationship is linear in some other form. The
other fits that we can do are:
Exponential: log (y) = a * x + b ; or y = B * e^(a * x)
Logarithmic: y = a * log (x) + b ; or e^((y - b)/a) = x
Power:       log (y) = a * log (x) + b ; or y = B * x^a
Here log is the natural logarithm and B = e^b.
These are useful mainly in scientific applications.
[4] The error on the slope is essentially the standard
deviation of the slope parameter. The error on the intercept
is, essentially, the standard deviation of the intercept.
MULTIPLE REGRESSION
What if a variable is a function of two other
variables? Can we still determine a relationship between the
three of them? For instance, could we find the relationship
between the yield of the bean plants, the amount of
fertilizer and the amount of water provided? We can, using
the method of multiple regression. Multiple regression can
be expanded to find the relationship between any number of
independent variables and a dependent one, but the program
on this disk limits you to two independent variables.
We get into the multiple regression program by typing
the letter R from the program menu. As an example, suppose
that a mathematics teacher believes that the students'
grades on his first calculus test will be a linear function
of the grades on algebra and geometry placement exams given
at the beginning of the course. Here is the data:
Algebra placement score    14  14   9  10   9  15  19  12  18
Geometry placement score    7   6   4   4   7   8   9   8   7
Calculus exam grade        76  64  42  57  69  82  91  76  84
When we enter the data and run the calculation we get the
following:
DATA SET NAME: PREDICTION OF CALCULUS
NO. OF PNTS : 9
SYM VARIABLE
Y DEPENDENT CALCULUS GRADE
X1 1ST INDEP ALGEBRA SCORE
X2 2ND INDEP GEOMETRY SCORE
Y = 1.54E+00 * X1 + 5.84E+00 * X2
+ 1.17E+01
CORR. COEF. = 9.23E-01
This means that the relationship is
Calc. grade = 1.54 * (alg scr) + 5.84 * (geo scr) + 11.7
The 'corr. coef.' is the coefficient of correlation, a
statistic similar to Pearson's r, which is a measure of the
fit. A value of .923 indicates that the fit is fairly good.
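The calculation behind a two-variable regression can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the program's code: the function name two_variable_regression is hypothetical, and the fit is found by solving the two normal equations directly. The printed coefficients can be compared with the printout above. The last value returned is the squared multiple correlation; that this is the same statistic the program labels CORR. COEF. is an assumption on our part.

```python
algebra  = [14, 14, 9, 10, 9, 15, 19, 12, 18]      # X1
geometry = [7, 6, 4, 4, 7, 8, 9, 8, 7]             # X2
calculus = [76, 64, 42, 57, 69, 82, 91, 76, 84]    # Y

def two_variable_regression(x1, x2, y):
    """Fit y = b1*x1 + b2*x2 + b0 by least squares; return (b1, b2, b0, r_squared)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Sums of squares and cross products about the means
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    syy = sum((c - my) ** 2 for c in y)
    # Solve the two normal equations by Cramer's rule
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    r_squared = (b1 * s1y + b2 * s2y) / syy
    return b1, b2, b0, r_squared

b1, b2, b0, r_squared = two_variable_regression(algebra, geometry, calculus)
print("coefficient on algebra score  =", round(b1, 2))
print("coefficient on geometry score =", round(b2, 2))
print("constant term                 =", round(b0, 1))
print("squared correlation           =", round(r_squared, 3))
```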
[Problem: A physics instructor believes that the location of
a falling object is a function of the time since its release
and the square of the time. He takes the following
measurements:
Time (seconds)                     1      2      3       4
Time squared (seconds squared)     1      4      9      16
Location (meters)               12.0   14.2    6.6   -10.8
Are the variables correlated? What is the relationship
between them? What is the location of the body at 5 seconds?
[Answer: They are perfectly correlated (this is, in fact,
the data for an object in free fall in a vacuum). The
equation is
y = 16.9 * x1 - 4.9 * x2 + 0
At 5 seconds (and 25 seconds squared) the object would be at
-38 meters.]
THE CHI SQUARE TEST
A. ONE WAY
So far, we have been looking at situations where we have made
some observations (called the observed, O) and compared them with some
theoretical or expected values (called the expected, E) to see if the
difference between the two is a significant deviation (such as a bias)
or only due to a sampling error (one that results because our sample is
of finite size). In all the cases so far, the data has been assumed
to be normally distributed. What if it is not normally distributed
or, horrors, it is by class (A, B, C, etc) instead of numerical? Can
we still determine if there is a significant difference between the
observed and the expected?
Indeed we can. We use the chi square test. The formula
for the calculation of chi square is:
     X^2 = sum of [ (O - E)^2 / E ]
The X is the Greek letter chi, which looks very much like an X. The
chi square test can be applied to any case where we know the observed
and expected values, even one where the data are normally distributed.
We reject or accept our null hypothesis (which is that there is no
difference between the observed and expected) by checking our value of
chi square in a table of the chi square distribution (for a certain
number of degrees of freedom). If our value exceeds the value in the
table, for a specified level of significance, we reject the null
hypothesis; if the two are equal or ours is lower, we accept the null
hypothesis.
For instance, suppose that in a certain course, the grade
distribution had traditionally been:
Grade A B C D F
Percent receiving 5 20 50 20 5
But when a particular instructor teaches the course, his
grade distribution is 26 A's, 92 B's, 270 C's, 101 D's and
14 F's. Is there a significant difference between this
instructor's grade distribution and the traditional one?
We type C from the program menu to get to the chi
square programs and then O to select the one way program (one
way is when we have only one column of data, two way is when
we have rows and columns). Giving the data set the name
GRADE DISTRIBUTION and entering the observed data, we are
next asked if the expected values are to be entered as
numbers (that is values for each point), as percents, or to be
assumed equal for all classes. We will choose percents and
then enter the expected percentages. Calling the calculation
module, we get
NAME OF DATA SET: GRADE DISTRIBUTION
DATA POINTS: 5 DEGREES OF FREEDOM: 4
CALCULATED CHI SQUARE: 7.07E+00
NULL HYPOTHS: OBSERVED AND EXPECTED SAME
ACCEPT NULL HYPOTHS AT 5.0% LEVEL OF SIG
So, we conclude that there is no significant difference
between the teacher's grade distribution and the traditional
distribution. We can also examine the chi square values
point by point, in which case we would see
CLASS NAME    CHI SQUARE
A             2.87E-02
B             7.35E-01
C             1.36E+00
D             1.59E-03
F             4.94E+00
We see here that the largest contribution to the total chi
square came from the low number of F's. We can also use the Edit
module to look at the data. Since there is a complex relationship
between the observed and expected values, the Edit module will only
allow you to change the data set and class names. If you want to edit
the values, you must reenter them using the input module.
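As a check on the grade distribution example, the one way chi square can be sketched in Python as follows. This is our own illustration, not the program's code; the name one_way_chi_square is hypothetical, and the 5 per cent critical value for 4 degrees of freedom (9.488) is taken from a standard chi square table.

```python
observed = {"A": 26, "B": 92, "C": 270, "D": 101, "F": 14}
expected_percent = {"A": 5, "B": 20, "C": 50, "D": 20, "F": 5}

def one_way_chi_square(observed, expected_percent):
    """Return (total chi square, contribution of each class)."""
    total = sum(observed.values())
    contributions = {}
    for name, o in observed.items():
        e = total * expected_percent[name] / 100.0   # expected count for this class
        contributions[name] = (o - e) ** 2 / e
    return sum(contributions.values()), contributions

chi_square, by_class = one_way_chi_square(observed, expected_percent)
CRITICAL_5_PERCENT_DF4 = 9.488   # table value: 4 degrees of freedom, 5% level

print("chi square =", round(chi_square, 2))
for name, value in by_class.items():
    print(name, round(value, 4))
if chi_square > CRITICAL_5_PERCENT_DF4:
    print("REJECT null hypothesis at 5% level")
else:
    print("ACCEPT null hypothesis at 5% level")
```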
[Problem: An instructor believes that the further a student
sits from the lecture table, the more likely the student is
to be absent. The instructor takes attendance for a week
and sums the number of absences for each row of seats. The
following results are obtained:
Row        1   2   3   4   5   6   7   8   9  10  11  12
Absences   7   9   6  11  12   9  10   8  13  10  11  13
Is the instructor correct in this belief?
[Answer: To find out, we run a one way chi square. The null
hypothesis is that the observed and expected number of
absences are the same. The choice for the expected number
of absences is that they are all the same, regardless of
row. Thus, if we reject the null hypothesis we have reason
to believe the instructor is correct. However, the
calculation indicates that the null hypothesis is to be
accepted, that is, that the observed and expected are the same.
Since the expected number was independent of row, there is
no evidence that the instructor was correct.]
B. TWO WAY
A two way chi square is when you have both rows and
columns of data (for example the number of students classified by
both age and sex). The data entry, calculation and
presentation of results are carried out in much the same way
as in the one way chi square, with one addition. When you
enter the expected values, you can select the Independence
Values option.
To illustrate what the independence values are, suppose
we had the following observed division of students by age
and sex:
                         Age
Sex        Less than 25   More than 25   Total
Males           18             12          30
Females         35             37          72
Total           53             49         102 (grand total)
The independence values are calculated by assuming that only
the totals in the rows and columns affect the expected
values. For example, there are 102 students in total, 53,
or 52.0 per cent, of whom are under 25. Since we have a total of 30
males, we would expect 52.0 per cent of 30, or 16 (rounded to a whole
number), to be less than 25. Calculated the same way, the full set of
expected values is:
                         Age
Sex        Less than 25   More than 25   Total
Males           16             14          30
Females         37             35          72
Total           53             49         102 (grand total)
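The rule used above (row total times column total, divided by the grand total) can be sketched in Python. The function name independence_values is our own choice for this illustration and is not part of the program.

```python
observed = [[18, 12],    # males:   less than 25, more than 25
            [35, 37]]    # females: less than 25, more than 25

def independence_values(table):
    """Expected count for each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = independence_values(observed)
# Round to whole numbers, as in the table of expected values above
rounded = [[round(e) for e in row] for row in expected]
print(rounded)
```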
As an example, let's assume that at a particular school, the
distribution of male and female instructors by highest earned degree
was:
          Bachelor   Master   Doctor
Male          4        69       65
Female       12        31        9
Is there a difference between the observed and the expected
independence values? Running the two way chi square program
we get:
NAME OF DATA SET: EDUCATION OF FACULTY
DATA POINTS : 6 DEGREES OF FREEDOM: 2
CALCULATED CHI SQUARE: 3.04E+01
NULL HYPOTHS: OBSERVED AND EXPECTED SAME
REJECT NULL HYPOTHS AT 0.1% LEVEL OF SIG
The observed and expected values are very different. This
means that there is a significant difference between
the two, that is, the number of faculty in a particular category
depends significantly on the category's education
and sex values. We can also look at the chi square by
individual category and at the listing of the data to see
that the largest contributions to the total chi square come
from the high number of female bachelors (12, expect 4, chi
square 16.0) and the low number of female doctors (9, expect
20, chi square 6.05). Take this up with your bargaining
agent immediately!
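A minimal Python sketch of the two way calculation follows. It is an illustration only, and the name two_way_chi_square is hypothetical. The figures quoted above (for example "expect 4, chi square 16.0") suggest that the expected values are rounded to whole numbers before the chi square is computed; that is an inference from the printed results, not something we know about the program, so the sketch offers both the rounded and the unrounded calculation.

```python
observed = [[4, 69, 65],    # male:   bachelor, master, doctor
            [12, 31, 9]]    # female: bachelor, master, doctor

def two_way_chi_square(table, round_expected=False):
    """Return (chi square, degrees of freedom) against the independence values."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            if round_expected:
                e = round(e)   # assumption: whole-number expected values
            chi_square += (o - e) ** 2 / e
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, dof

rounded_chi, dof = two_way_chi_square(observed, round_expected=True)
exact_chi, _ = two_way_chi_square(observed)
print("degrees of freedom          =", dof)
print("chi square (rounded E)      =", round(rounded_chi, 1))
print("chi square (unrounded E)    =", round(exact_chi, 1))
```

Either way the value is far above the table value for 2 degrees of freedom, so the conclusion is the same.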
[Problem: A student survey finds the following results for
the family income and the likelihood that a student has a
part-time job:
               Family Income (in 1000's of $)
            <10   10-20   20-30   30-40   >40
Has Job      16     59      53      17      3
No Job       12     78      67      52     41
What is the chi square for these data and what can you
conclude?
[Answer: The chi square is 31.5 and we reject the null
hypothesis that the observed and expected are the same.
That is, whether the student has a job does depend on
the family income. Looking at the data and the individual
chi square values, we see that the largest contributions to
the total chi square come from the low number of high income
students with jobs (3, expect 16, chi square 10.6) and the
high number of low income students with jobs (16, expect 10,
chi square 3.6).]
OTHER STATISTICAL TESTS
The remainder of the statistical tests on this disk are
reached by pressing S (See more programs) from the Program
menu. It will be necessary to flip the disk in the drive
to access them.
SIGN TEST
In most of the analyses we have examined so far we have
assumed that the data are normally distributed. What if this
is not the case? Can we still analyze it? Yes, but we use
what are called nonparametric or distribution free tests.
The results of these tests are generally easier to
understand than the classical tests and are useful when the
data cannot be quantitatively expressed (for instance, when
it is ranked, as one would rank a group of wines). Interestingly,
while the data may not be normally distributed, it is possible to
calculate statistics from the data which are normally distributed.
The Sign Test (more properly, the Two-sample Sign Test)
is one of the simplest of these tests. It is called the Sign
Test because it determines whether the difference between
paired observations is positive or negative and then uses
the number of plus and minus signs to calculate a statistic
for accepting or rejecting the null hypothesis.
For example, suppose you wished to see if an exercise
program affected the blood pressures of the students in a PE
class. Here is the data you wish to analyze:
Student Name    Pressure Before    Pressure After
                   Exercise           Exercise
Tom                  134                139
Dick                 124                124
Harry                172                175
Bob                  149                140
Ted                  167                155
Alice                145                152
Carol                148                140
Using the Sign Test program, you would enter something such
as "BLOOD PRESSURE" for the Name of Data Set, "STUDENT" for
the Name of Variable, "BEFORE EXERCISE" for the 1st Condition Name and
"AFTER EXERCISE" for the 2nd Condition Name. Then you would enter the
names [1] and before and after exercise blood pressures and type "END"
for the STUDENT entry when this is done. Selecting Calculation from
the Main Menu, you get the following results:
NAME OF DATA SET: BLOOD PRESSURE
NAME OF VARIABLE: STUDENT
# CONDITION EXAMINED MEAN
1 BEFORE EXERCISE 1.48E+02
2 AFTER EXERCISE 1.46E+02
NUMBER: 7 EFFECTIVE NO.: 6
NULL HYPTHS: CONDITION HAS NO EFFECT
PLUSES: 3 MINUSES: 3 TIES: 1
CALCULATED PROBABILITY: .65625
ACCEPT NULL HYPOTHS AT 5% LEVEL OF SIG
The result NUMBER (7) is the number of entries (students),
EFFECTIVE NO. (6) is the number whose condition changed
(pressure before exercise differed from the pressure after),
the NULL HYPTHS is the null hypothesis, namely that there
was no effect on blood pressure because of exercise, PLUSES (3),
MINUSES (3) and TIES (1) are the number of entries with
positive, negative and no changes in the blood pressures,
and CALCULATED PROBABILITY [2] (.65625) is probability that,
in this case, six independent samples, each of which could
either increase or decrease with equal probability, would
result in a distribution with 3 or more plus changes.
The result you will most likely be interested in
is the final line, which says "ACCEPT NULL HYPOTHS AT 5%
LEVEL OF SIG." In this case, it means that we have, at a 5%
level of significance, no evidence that the exercise program
had any effect on blood pressure.
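The counting and the probability in the printout can be reproduced with a short Python sketch. It is our own illustration (the name sign_test is hypothetical) and uses the exact binomial calculation, which applies when the effective number is small.

```python
from math import comb

before = [134, 124, 172, 149, 167, 145, 148]
after  = [139, 124, 175, 140, 155, 152, 140]

def sign_test(first, second):
    """Return (pluses, minuses, ties, probability of that many pluses or more)."""
    pluses = sum(1 for a, b in zip(first, second) if b > a)
    minuses = sum(1 for a, b in zip(first, second) if b < a)
    ties = len(first) - pluses - minuses
    n = pluses + minuses   # effective number: ties are dropped
    # Chance of `pluses` or more plus signs in n trials when + and - are equally likely
    probability = sum(comb(n, k) for k in range(pluses, n + 1)) / 2 ** n
    return pluses, minuses, ties, probability

pluses, minuses, ties, probability = sign_test(before, after)
print("PLUSES:", pluses, "MINUSES:", minuses, "TIES:", ties)
print("CALCULATED PROBABILITY:", probability)
```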
Problem: To study the effect of room temperature on student
performance, 19 students were given two quizzes, one when
the temperature was 65 degrees, another when it was 70
degrees. Here are the results. At the 5 per cent level of
significance, does temperature have any effect on quiz
performance?
Student  Low Temp  High Temp     Student  Low Temp  High Temp
          Grade      Grade                 Grade      Grade
A           12         8         B           11        12
C            6         6         D            8         9
E           10        12         F            4         7
G           12        12         H            8         8
I            7         6         J            7         9
K            4         3         L            3         2
M           11        10         N            6         8
O            9        10         P            6         9
Q           12         9         R           10        12
S            5         6
[Answer: The null hypothesis is accepted at the 5 per cent level of
significance, that is temperature has no effect on the grade.]
NOTES ON THE SIGN TEST
[1] When entering the data under the variable column, a carriage
return will number the entries in order.
[2] If the effective sample size is 30 or more, the calculated
statistic is a z-score.
THE MANN-WHITNEY U TEST
(A RANK SUM TEST)
The assumption made when the means of samples drawn
from two populations are compared using Student's t-test (or
for more than two populations using the analysis of variance
program) is that the populations are normally distributed.
Often this is not the case. For instance, the ages of students at
community colleges are not normally distributed: we have a group of
people in their teens and twenties, many fewer in their thirties to
fifties and then a smaller bunch in their sixties. The distribution
might look something like:
[Figure: histogram of the age of community college students, with
the age axis running from 0 to 70]
This is not a normally distributed population and to compare
it with a sample from another population requires something
other than the programs we have used so far. The method we
will use for comparing samples drawn from non-normally
distributed populations is a rank-sum test, specifically
the Mann-Whitney U test.
Basically, the Mann-Whitney U test (or U-test) combines
the two samples into one, ranks the combined sample from low
to high and then determines the sum of the ranks for each of
the individual samples. For example, suppose sample A
consisted of the values 1.0 and 3.2 and sample B of the
values 2.6, 4.9 and 3.7.
Sample A: 1.0, 3.2
Sample B: 2.6, 4.9, 3.7
Combined Sample:          1.0, 3.2, 2.6, 4.9, 3.7
             identity -->  A    A    B    B    B
Ordered Combined Sample:  1.0, 2.6, 3.2, 3.7, 4.9
                           A    B    A    B    B  <-- identity
Ranked Combined Sample:    1,   2,   3,   4,   5
Rank sum of A : 1 + 3 = 4
Rank sum of B : 2 + 4 + 5 = 11
To perform the U-test, we calculate a statistic called U based on the
size of the samples and the rank sums, and calculate the mean and
standard deviation that U would have under the null hypothesis
that the two samples were identically distributed. Using
the values of U, the mean and the standard deviation, we can then
calculate another statistic (in this case a z-score) to see if the
null hypothesis is accepted or rejected at a given level of
significance.
Example: The ages of 17 Canada students and 16 Skyline
students follow.
Canada: 22, 45, 37, 19, 63, 58, 18, 63, 46, 22, 37, 47, 29,
38, 48, 35, 27
Skyline: 19, 21, 19, 26, 58, 43, 22, 19, 31, 18, 22, 18, 50,
20, 20, 29
The question (null hypothesis) is: "Do the two colleges have
identical age distributions?" To find out, we run the U-test
program, calling the Data Set Name "AGES" and the Sample
Names "CANADA STUDENTS" and "SKYLINE STUDENTS", respectively. After
entering the data (enter END to end data entry) and selecting the
Calculation Mode, we get
DATA SET NAME : AGES
# SAMPLE NAME NO. RANK SUM
1 CANADA STUDENTS 17 353.5
2 SKYLINE STUDENTS 16 207.5
MEAN 1: 38.4705883 MEAN 2: 27.1875
MEAN OF U : 136
STD. DEV. OF U : 27.7608838
CALCULATED U : 200.5
CALCULATED Z : 2.32341306
NULL HYPOTHESIS: SAMPLE DISTRIBS. SAME
REJECT NULL HYPOTHIS AT 5% LEVEL OF SIG
The rank sums, mean of U, standard deviation of U, calculated U and
calculated Z are given for those interested. Focusing on the
conclusion, we see that the null hypothesis (that the two
distributions are the same) is rejected at the 5 per cent level of
significance [however, it would be accepted at the 2 per cent and
lower levels]. We therefore conclude that the age distributions at the
two campuses are different. Since the mean age of the Skyline students
[27.19] is less than that of the Canada students [38.47], we
conclude that Skyline students are, on the average, younger.
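For those interested in where the rank sums, U and Z come from, here is a minimal Python sketch. It is an illustration under our own assumptions, not the program's code: the names average_ranks and u_test are hypothetical, tied values share the average of the ranks they occupy, and no further correction for ties is applied. With these choices the results agree with the printout above.

```python
import math

canada = [22, 45, 37, 19, 63, 58, 18, 63, 46, 22, 37, 47, 29, 38, 48, 35, 27]
skyline = [19, 21, 19, 26, 58, 43, 22, 19, 31, 18, 22, 18, 50, 20, 20, 29]

def average_ranks(values):
    """Map each distinct value to its rank (1 = lowest); ties share the average rank."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2   # average of ranks i+1 through j
        i = j
    return ranks

def u_test(sample1, sample2):
    """Return (rank sum 1, rank sum 2, U, mean of U, std. dev. of U, z)."""
    ranks = average_ranks(sample1 + sample2)
    n1, n2 = len(sample1), len(sample2)
    rank_sum1 = sum(ranks[v] for v in sample1)
    rank_sum2 = sum(ranks[v] for v in sample2)
    u = rank_sum1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return rank_sum1, rank_sum2, u, mean_u, sd_u, z

rank_sum1, rank_sum2, u, mean_u, sd_u, z = u_test(canada, skyline)
print("RANK SUMS      :", rank_sum1, rank_sum2)
print("MEAN OF U      :", mean_u)
print("STD. DEV. OF U :", round(sd_u, 4))
print("CALCULATED U   :", u)
print("CALCULATED Z   :", round(z, 4))
```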
Problem: A class of 9 students is given extra assistance in
the Learning Center while a similar class of 10 students
receives no such assistance. The final grades are given
below:
With assistance: 96, 67, 43, 83, 72, 68, 51, 87, 77
Without assistance: 38, 38, 81, 64, 29, 64, 54, 50, 56, 61
Use the U-test to determine if the distributions are the
same at the 5 per cent level of significance.
[Answer: The null hypothesis is rejected at the 5 per cent level of
significance. Thus we conclude that the distributions are
different. Since the mean grade of those who had assistance [71.56]
is higher than that of those who did not [53.5], we can conclude that
assistance from the Learning Center is beneficial.]
THE RUNS TEST
Is the data you are analyzing truly random? That is,
are you sure that no bias, either intentional or accidental,
entered into the sampling procedure? You can determine the
randomness of the sample by means of the Runs Test, which is
based on the order in which the data was collected.
A run is defined as an unbroken subsequence of identical
symbols. For instance, the answers on a true-false test might
be:
T T F T F F T T F F T F
which, when broken down into runs, gives us:
T T / F / T / F F / T T / F F / T / F
Thus we have 4 runs of T and 4 of F. If a sample has too few
runs (e. g. T T T T T T T T T T T T = 1 run) or too many
(for example T F T F T F T F T F T F = 12 runs), it would not be
random. The Runs Test calculates the number of runs of each
type, the mean and standard deviation of the number of runs
for a random sample of the same size and then calculates a
statistic (a z score) upon which we can accept or reject the
null hypothesis that the sequence is random.
Example: The correct answers to a true-false portion of a
test are:
T F F T F F T T T F F T F F T F F T T F T T T F F T T F F F
T T T F T F T T F F T T F F T F T F T F T F T F F T F T T F
F T T F F T
Is the sample random or is there a pattern?
Using the Runs Test, we enter an appropriate name for
the Data Set and indicate that the data are by class (see
below) and that the two classes will be TRUE and FALSE.
After entering the data as T and F (the program will
automatically assign different symbols for the two classes,
even if the classes are, for instance, TOUGH and TOUGHER) and
concluding with an END, the results are calculated and we
get the following:
NAME OF DATA SET: TRUE FALSE TEST
DATA IS BY CLASS
# CLASS EXAMINED NUMBER
1 TRUE 3.30E+01
2 FALSE 3.30E+01
NULL HYPOTHS: SEQUENCE IS RANDOM
NUMBER OF RUNS: 41 TOTAL NO.: 66
MEAN OF NUMBER OF RUNS: 3.40E+01
STANDARD DEVIATION OF RUNS: 4.03E+00
CALCULATED Z: 1.74E+00
ACCEPT NULL HYPTHS AT 5% LEVEL OF SIG
We see that, at the 5 per cent level of significance, the null
hypothesis is accepted. The sequence is random.
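The run counting and the z score for this example can be sketched in Python. This is our own illustration, not the program's code: the name runs_test is hypothetical, and the mean and standard deviation use the usual large-sample formulas for the number of runs in a random sequence of two classes. With those formulas the results agree with the printout above.

```python
import math

answers = """
T F F T F F T T T F F T F F T F F T T F T T T F F T T F F F
T T T F T F T T F F T T F F T F T F T F T F T F F T F T T F
F T T F F T
""".split()

def runs_test(sequence):
    """Return (number of runs, mean of runs, std. dev. of runs, z) for two classes."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("sequence must contain exactly two classes")
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    # A new run starts wherever a symbol differs from the one before it
    runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    mean_runs = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    sd_runs = math.sqrt(variance)
    z = (runs - mean_runs) / sd_runs
    return runs, mean_runs, sd_runs, z

runs, mean_runs, sd_runs, z = runs_test(answers)
print("NUMBER OF RUNS:", runs, " TOTAL NO.:", len(answers))
print("MEAN OF NUMBER OF RUNS:", mean_runs)
print("STANDARD DEVIATION OF RUNS:", round(sd_runs, 2))
print("CALCULATED Z:", round(z, 2))
```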
The program will also accept numerical data for analysis. In this
case, the median is calculated and the data are examined for runs of
values above or below the median.
Problem: A chemistry instructor believes that he need not
grade his finals as, he believes, the first 10 per cent of the
students handing in the tests will be F's, the next 10 per cent will
be A's, the next 70 per cent will be B's and C's and the last 10 per
cent will be F's again. Nonetheless, he does grade the exams and, in
the order they were handed in, the grades were:
26, 13, 38, 76, 65, 84, 44, 91, 78, 31, 68, 86, 94, 41, 78,
84, 96, 12, 87, 43, 34, 78
Should the instructor's hypothesis be accepted or rejected
at the 5 per cent level?
[Answer: The hypothesis, as stated, is too complex to analyze with
these programs. However, when we do a Runs test on the data, we find
that the sequence is randomly distributed about the median (at the 5
per cent level of significance), so the instructor's hypothesis must
be rejected. However, he continues to believe it.]
RANK CORRELATION
The least squares method allows you to determine if two normally
distributed variables are correlated. If they are not normally
distributed, it is still possible to determine if there is a correlation
by means of the Rank Correlation. For example, we might wish to
determine if the opinions of two judges of seven horses
are correlated. The opinions of a judge need not be normally
distributed since it might well be possible, for example, that, in the
judge's opinion, four of the horses are excellent (although
different), two are mediocre (but again different) and the
seventh is a real dog. While the horses can be ranked from 1
to 7, the distribution is not normal.
The rank correlation coefficient, r, was introduced by Spearman
in 1904 to test if there is a correlation between ranked variables.
This coefficient is the statistic that is examined to test the null
hypothesis. The null hypothesis in this test, by the way, is that the
two rankings are DIFFERENT.
For instance, suppose a group of faculty and a group of
administrators independently rank twelve different applicants for the
position of chief executive officer of a school. Here are the ranks:
Applicant    Rank by Faculty    Rank by Administrators
A                   6                     1
B                   3                     5
C                  10                     3
D                   7                    10
E                   1                     9
F                  11                     4
G                   5                     2
H                   9                     7
I                   2                     8
J                   8                     6
K                   4                    11
L                  12                    12
Are the rankings of the faculty and administrators the same
or different? Running the Rank Correlation program, entering
RANKING OF APPLICANTS for the Data Set Name, APPLICANT for
the Variable and RANK BY FACULTY for the 1st Condition and
RANK BY ADMINISTRATORS for the 2nd Condition, then entering
the ranks and finally doing the calculation, we get:
NAME OF DATA SET: RANKING OF APPLICANTS
NAME OF VARIABLE: APPLICANT
# CONDITION EXAMINED MEAN
1 RANK BY FACULTY 6.50E+00
2 RANK BY ADMINISTRATORS 6.50E+00
NUMBER: 12
NULL HYPTHS: CONDITIONS ARE DIFFERENT
RANK CORRELATION COEFFICIENT: -5.59E-02
STD. DEV. OF RANK CORR COEF: 3.02E-01
CALCULATED Z: -1.86E-01
ACCEPT NULL HYPTHS AT 5% LEVEL OF SIG
The rank correlation coefficient [-.0559] is very close to zero, which
generally means that the null hypothesis should be accepted. Indeed,
the calculated z score is small (the fact that it is negative is
irrelevant) and the null hypothesis is accepted. Thus we would
conclude that the rankings by the faculty and the rankings by the
administrators are different.
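The rank correlation for this example can be checked with a short Python sketch. It is an illustration only: the name rank_correlation is hypothetical, the coefficient is Spearman's formula for untied ranks, and taking the standard deviation as 1/sqrt(n - 1) is an assumption on our part that happens to match the value in the printout for 12 applicants.

```python
import math

faculty        = [6, 3, 10, 7, 1, 11, 5, 9, 2, 8, 4, 12]
administrators = [1, 5, 3, 10, 9, 4, 2, 7, 8, 6, 11, 12]

def rank_correlation(rank1, rank2):
    """Return (Spearman's r, its std. dev. under no correlation, z) for untied ranks."""
    n = len(rank1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank1, rank2))
    r = 1 - 6 * sum_d2 / (n * (n * n - 1))
    sd = 1 / math.sqrt(n - 1)   # assumed form of the standard deviation
    z = r / sd
    return r, sd, z

r, sd, z = rank_correlation(faculty, administrators)
print("RANK CORRELATION COEFFICIENT:", round(r, 4))
print("STD. DEV. OF RANK CORR COEF :", round(sd, 3))
print("CALCULATED Z                :", round(z, 3))
```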
Problem: The program will allow you to enter numerical data (which it
will then rank). For instance, suppose we wished to see if the IQ's
of the members of married couples are correlated. Here's the data:
COUPLE    HUSBAND    WIFE
A            96       110
B           114       118
C           119        99
D           134       137
E           142       124
F           109       108
G           127       101
H            99        98
I           147       153
J           125       138
K           127       104
[Answer: When the rank correlation coefficient is calculated, we see
that the null hypothesis should be rejected at the 2 per cent level,
that is, the IQ's of the husband and the wife are correlated.]
PROBABILITY
The Probability program on the Program menu will allow you to
calculate the probability of a value, or range of values, in the
binomial, normal and Poisson distributions.
For the binomial distribution, you enter the total number in the
sample and the probability that a single event will occur. You then
ask for the probability that either a specific number will occur or a
range of numbers will occur. For example, you could say you have 2
objects in your sample and that the probability of a single event
happening is 0.5. If you ask what is the probability that 2 events
will happen, the program will tell you it is 0.25. If the number
of objects in the sample is greater than 30, either the normal or the
Poisson approximation to the binomial distribution is used.
For the normal distribution, you must enter the mean
and standard deviation. The other information is the same
as the binomial. The program will only calculate a probability for
several events within plus or minus three standard deviations of the
mean.
For the Poisson distribution, you must enter the mean. The
probability of any number of events occurring will be calculated but
if the value is far from the mean, the calculation could take a long
time as factorials are being calculated.
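The binomial and Poisson probabilities described above follow directly from their formulas, as in this minimal Python sketch. The function names are our own, and the sketch always uses the exact formulas; it does not switch to the normal or Poisson approximation for large samples as the program does.

```python
from math import comb, exp, factorial

def binomial_probability(n, p, k):
    """Probability of exactly k events in a sample of n, each with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_range(n, p, low, high):
    """Probability that the number of events lies from low to high inclusive."""
    return sum(binomial_probability(n, p, k) for k in range(low, high + 1))

def poisson_probability(mean, k):
    """Probability of exactly k events when the mean number of events is `mean`."""
    return mean ** k * exp(-mean) / factorial(k)

# The example from the text: 2 objects, single-event probability 0.5, 2 events
print(binomial_probability(2, 0.5, 2))
print(binomial_range(2, 0.5, 1, 2))
print(poisson_probability(2.0, 0))
```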
FILE SUFFIXES
When you save a file to disk, the program will add a suffix to it
as an identification tag. Here is a list of programs, suffixes and
type of data stored:
Program                Suffix    Type of data
MEAN                   .MEAN     Raw data
MEAN                   .MEST     Analyzed data
ONE WAY CHI SQUARE     .CS1D     Raw data
TWO WAY CHI SQUARE     .CS2D     Raw data
LEAST SQUARES          .LSF      Raw data
MULTIPLE REGRESSION    .MTRG     Raw data
SIGN                   .SIGNS    Raw data
MANN WHITNEY U         .MWUT     Raw data
RUNS                   .RUNS     Raw data
RANK CORRELATION       .RCD      Raw data
APPLE II FOREVER!