1. Provide citation and reference to the material(s) you discuss. Describe what you found interesting regarding this topic, and why.
2. Describe how you will apply that learning in your daily life, including your work life.
3. Describe what may be unclear to you, and what you would like to learn.
Reference/Module
Module 13: Comparing More Than Two Groups
Using Designs with Three or More Levels of an Independent Variable
Comparing More than Two Kinds of Treatment in One Study
Comparing Two or More Kinds of Treatment with a Control Group
Comparing a Placebo Group to the Control and Experimental Groups
Analyzing the Multiple-Group Design
One-Way Between-Subjects ANOVA: What It Is and What It Does
Review of Key Terms
Module Exercises
Critical Thinking Check Answers
Module 14: One-Way Between-Subjects Analysis of Variance (ANOVA)
Calculations for the One-Way Between-Subjects ANOVA
Interpreting the One-Way Between-Subjects ANOVA
Graphing the Means and Effect Size
Assumptions of the One-Way Between-Subjects ANOVA
Tukey’s Post Hoc Test
Review of Key Terms
Module Exercises
Critical Thinking Check Answers
Chapter 7 Summary and Review
Chapter 7 Statistical Software Resources
In this chapter, we discuss the common types of statistical analyses used with designs involving more than two groups. The inferential statistics discussed in this chapter differ from those presented in the previous two chapters. In Chapter 5, single samples were being compared to populations (z test and t test), and in Chapter 6, two independent or correlated samples were being compared. In this chapter, the statistics are designed to test differences between more than two equivalent groups of subjects.
Several factors influence which statistic should be used to analyze the data collected. For example, the type of data collected and the number of groups being compared must be considered. Moreover, the statistic used to analyze the data will vary depending on whether the study involves a between-subjects design (designs in which different subjects are used in each group) or a correlated-groups design. (Remember, correlated-groups designs are of two types: within-subjects designs, in which the same subjects are used repeatedly in each group, and matched-subjects designs, in which different subjects are matched between conditions on variables that the researcher believes are relevant to the study.)
We will look at the typical inferential statistics used to analyze interval-ratio data for between-subjects designs. In Module 13 we discuss the advantages and rationale for studying more than two groups; in Module 14 we discuss the statistics appropriate for use with between-subjects designs involving more than two groups.
Learning Objectives
•Explain what additional information can be gained by using designs with more than two levels of an independent variable.
•Explain and be able to use the Bonferroni adjustment.
•Identify what a one-way between-subjects ANOVA is and what it does.
•Describe what between-groups variance is.
•Describe what within-groups variance is.
•Understand conceptually how an F-ratio is derived.
The experiments described so far have involved manipulating one independent variable with only two levels—either a control group and an experimental group or two experimental groups. In this module, we discuss experimental designs involving one independent variable with more than two levels. Examining more levels of an independent variable allows us to address more complicated and interesting questions. Often, experiments begin as two-group designs and then develop into more complex designs as the questions asked become more elaborate and sophisticated.
Using Designs with Three or More Levels of an Independent Variable
Researchers may decide to use a design with more than two levels of an independent variable for several reasons. First, it allows them to compare multiple treatments. Second, it allows them to compare multiple treatments to no treatment (the control group). Third, more complex designs allow researchers to compare a placebo group to control and experimental groups (Mitchell & Jolley, 2001).
Comparing More than Two Kinds of Treatment in One Study
To illustrate this advantage of more complex experimental designs, imagine that we want to compare the effects of various types of rehearsal on memory. We have participants study a list of 10 words using either rote rehearsal (repetition) or some form of elaborative rehearsal. In addition, we specify the type of elaborative rehearsal to be used in the different experimental groups. Group 1 (the control group) uses rote rehearsal, Group 2 uses an imagery mnemonic technique, and Group 3 uses a story mnemonic device. You may be wondering why we do not simply conduct three studies or comparisons. Why don’t we compare Group 1 to Group 2, Group 2 to Group 3, and Group 1 to Group 3 in three different experiments? There are several reasons why this is not recommended.
You may remember from Module 11 that a t test is used to compare performance between two groups. If we do three experiments, we need to use three t tests to determine any differences. The problem is that using multiple tests inflates the Type I error rate. Remember, a Type I error means that we reject the null hypothesis when we should have failed to reject it; that is, we claim that the independent variable has an effect when it does not. For most statistical tests, we use the .05 alpha level, meaning that we are willing to accept a 5% risk of making a Type I error. Although the chance of making a Type I error on one t test is .05, the overall chance of making a Type I error increases as more tests are conducted.
Imagine that we conducted three t tests or comparisons among the three groups in the memory experiment. The probability of a Type I error on any single comparison is .05. The probability of a Type I error on at least one of the three tests, however, is considerably greater. To determine the chance of a Type I error when making multiple comparisons, we use the formula 1 − (1 − α)^c, where c equals the number of comparisons performed. Using this formula for the present example, we get
1 − (1 − .05)^3 = 1 − (.95)^3 = 1 − .86 = .14
Thus, the probability of a Type I error on at least one of the three tests is .14, or 14%.
Bonferroni adjustment A means of setting a more stringent alpha level in order to minimize Type I errors.
One way of counteracting the increased chance of a Type I error is to use a more stringent alpha level. The Bonferroni adjustment , in which the desired alpha level is divided by the number of tests or comparisons, is typically used to accomplish this. For example, if we were using the t test to make the three comparisons just described, we would divide .05 by 3 and get .017. By not accepting the result as significant unless the alpha level is .017 or less, we minimize the chance of a Type I error when making multiple comparisons. We know from discussions in previous modules, however, that although using a more stringent alpha level decreases the chance of a Type I error, it increases the chance of a Type II error (failing to reject the null hypothesis when it should have been rejected—missing an effect of an independent variable). Thus, the Bonferroni adjustment is not the best method of handling the problem. A better method is to use a single statistical test that compares all groups rather than using multiple comparisons and statistical tests. Luckily for us, there is a statistical technique that will do this—the analysis of variance (ANOVA), which will be discussed shortly.
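To make these numbers concrete, the familywise error rate and the Bonferroni-adjusted alpha can be computed directly. The short Python sketch below simply restates the 1 − (1 − α)^c formula and the division by the number of comparisons; Python is used here only for illustration and is not part of the module's procedures.

```python
# Familywise Type I error rate for c comparisons at alpha = .05,
# and the Bonferroni-adjusted per-comparison alpha.
alpha = 0.05

def familywise_error(alpha, c):
    """Probability of at least one Type I error across c comparisons."""
    return 1 - (1 - alpha) ** c

def bonferroni_alpha(alpha, c):
    """Per-comparison alpha after the Bonferroni adjustment."""
    return alpha / c

for c in (3, 6):  # 3 comparisons (memory example); 6 comparisons (Critical Thinking Check 13.1)
    print(f"c = {c}: familywise error = {familywise_error(alpha, c):.3f}, "
          f"Bonferroni alpha = {bonferroni_alpha(alpha, c):.3f}")
# c = 3: familywise error = 0.143, Bonferroni alpha = 0.017
# c = 6: familywise error = 0.265, Bonferroni alpha = 0.008
```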
Another advantage of comparing more than two kinds of treatment in one experiment is that it reduces both the number of experiments conducted and the number of subjects needed. Once again, refer back to the three-group memory experiment. If we do one comparison with three groups, we can conduct only one experiment, and we need subjects for only three groups. If, however, we conduct three comparisons, each with two groups, we need to perform three experiments, and we need participants for six groups or conditions.
Comparing Two or More Kinds of Treatment with a Control Group
Using more than two groups in an experiment also allows researchers to determine whether each treatment is more or less effective than no treatment (the control group). To illustrate this, imagine that we are interested in the effects of aerobic exercise on anxiety. We hypothesize that the more aerobic activity engaged in, the more anxiety will be reduced. We use a control group that does not engage in any aerobic activity and a high aerobic activity group that engages in 50 minutes per day of aerobic activity—a simple two-group design. Assume, however, that when using this design, we find that both those in the control group and those in the experimental group have high levels of anxiety at the end of the study—not what we expected to find. How could a design with more than two groups provide more information? Suppose we add another group to this study—a moderate aerobic activity group (25 minutes per day)—and get the following results:
Control Group | High Anxiety
Moderate Aerobic Activity | Low Anxiety
High Aerobic Activity | High Anxiety
Based on these data, we have a V-shaped function. Up to a certain point, aerobic activity reduces anxiety. However, when the aerobic activity exceeds a certain level, anxiety increases again. If we had conducted only the original study with two groups, we would have missed this relationship and erroneously concluded that there was no relationship between aerobic activity and anxiety. Using a design with multiple groups allows us to see more of the relationship between the variables.
Figure 13.1 illustrates the difference between the results obtained with the three-group versus the two-group design in this hypothetical study. It also shows the other two-group comparisons—control compared to moderate aerobic activity, and moderate aerobic activity compared to high aerobic activity. This set of graphs allows you to see how two-group designs limit our ability to see the full relationship between variables.
FIGURE 13.1 Determining relationships with three-group versus two-group designs: (a) three-group design; (b) two-group comparison of control to high aerobic activity; (c) two-group comparison of control to moderate aerobic activity; (d) two-group comparison of moderate aerobic activity to high aerobic activity
Figure 13.1a shows clearly how the three-group design allows us to assess more fully the relationship between the variables. If we had only conducted a two-group study, such as those illustrated in Figure 13.1b, c, or d, we would have drawn a much different conclusion from that drawn from the three-group design. Comparing only the control to the high aerobic activity group (Figure 13.1b) would have led us to conclude that aerobic activity does not affect anxiety. Comparing only the control to the moderate aerobic activity group (Figure 13.1c) would have led to the conclusion that increasing aerobic activity reduces anxiety. Comparing only the moderate aerobic activity group to the high aerobic activity group (Figure 13.1d) would have led to the conclusion that increasing aerobic activity increases anxiety.
Being able to assess the relationship between the variables means that we can determine the type of relationship that exists. In the previous example, the variables produced a V-shaped function. Other variables may be related in a straight linear manner or in an alternative curvilinear manner (for example, a J-shaped or S-shaped function). In summary, adding levels to the independent variable allows us to determine more accurately the type of relationship that exists between the variables.
Comparing a Placebo Group to the Control and Experimental Groups
A final advantage of designs with more than two groups is that they allow for the use of a placebo group—a group of subjects who believe they are receiving treatment but in reality are not. A placebo is an inert substance that participants believe is a treatment. How can adding a placebo group improve an experiment? Consider an often-cited study by Paul (1966, 1967) involving children who suffered from maladaptive anxiety in public-speaking situations. Paul used a control group, which received no treatment; a placebo group, which received a placebo that they were told was a potent tranquilizer; and an experimental group, which received desensitization therapy. Of the participants in the experimental group, 85% showed improvement, compared with only 22% in the control condition. If the placebo group had not been included, the difference between the therapy and control groups (85% − 22% = 63%) would overestimate the effectiveness of the desensitization program. The placebo group showed 50% improvement, meaning that the therapy’s true effectiveness is much less (85% − 50% = 35%). Thus, a placebo group allows for a more accurate assessment of a therapy’s effectiveness because, in addition to spontaneous remission, it controls for participant expectation effects.
DESIGNS WITH MORE THAN TWO LEVELS OF AN INDEPENDENT VARIABLE
Advantages | Considerations
Allows comparisons of more than two types of treatment | Type of statistical analysis (e.g., multiple t tests or ANOVA)
Fewer subjects are needed | Multiple t tests increase chance of Type I error; Bonferroni adjustment increases chance of Type II error
Allows comparisons of all treatments to control condition |
Allows for use of a placebo group with control and experimental groups |
Imagine that a researcher wants to compare four different types of treatment. The researcher decides to conduct six individual studies to make these comparisons. What is the probability of a Type I error, with α = .05, across these six comparisons? Use the Bonferroni adjustment to determine the suggested alpha level for these six tests.
Analyzing the Multiple-Group Design
ANOVA (analysis of variance) An inferential parametric statistical test for comparing the means of three or more groups.
As noted previously, t tests are not recommended for comparing performance across groups in a multiple-group design because of the increased probability of a Type I error. For multiple-group designs in which interval-ratio data are collected, the recommended parametric statistical analysis is the ANOVA (analysis of variance). As its name indicates, this procedure allows us to analyze the variance in a study. You should be familiar with variance from Chapter 3 on descriptive statistics. Nonparametric analyses are also available for designs in which ordinal data are collected (the Kruskal-Wallis analysis of variance) and for designs in which nominal data are collected (chi-square test). We will discuss some of these tests in later modules.
We will begin our coverage of statistics appropriate for multiple-group designs by discussing those used with data collected from a between-subjects design. Recall that a between-subjects design is one in which different participants serve in each condition. Imagine that we conducted the study mentioned at the beginning of the module in which subjects are asked to study a list of 10 words using rote rehearsal or one of two forms of elaborative rehearsal. A total of 24 participants are randomly assigned, 8 to each condition. Table 13.1 lists the number of words correctly recalled by each participant.
one-way between-subjects ANOVA An inferential statistical test for comparing the means of three or more groups using a between-subjects design.
Because these data represent an interval-ratio scale of measurement and because there are more than two groups, an ANOVA is the appropriate statistical test to analyze the data. In addition, because this is a between-subjects design, we use a one-way between-subjects ANOVA . As discussed earlier, the term between-subjects indicates that participants have been randomly assigned to conditions in a between-subjects design. The term one-way indicates that the design uses only one independent variable—in this case, type of rehearsal. We will discuss statistical tests appropriate for correlated-groups designs and tests appropriate for designs with more than one independent variable in Chapter 8. Please note that although the study used to illustrate the ANOVA procedure in this section has an equal number of subjects in each condition, this is not necessary to the procedure.
TABLE 13.1 Number of words recalled correctly in rote, imagery, and story conditions
ROTE REHEARSAL | IMAGERY | STORY
2 | 4 | 6
4 | 5 | 5
3 | 7 | 9
5 | 6 | 10
2 | 5 | 8
7 | 4 | 7
6 | 8 | 10
3 | 5 | 9
X̄ = 4 | X̄ = 5.5 | X̄ = 8
Grand Mean = 5.833
One-Way Between-Subjects ANOVA: What It Is and What It Does
The analysis of variance (ANOVA) is an inferential statistical test for comparing the means of three or more groups. In addition to helping maintain an acceptable Type I error rate, the ANOVA has the added advantage over using multiple t tests of being more powerful and thus less susceptible to a Type II error. In this section, we will discuss the simplest use of ANOVA—a design with one independent variable with three levels.
Let’s continue to use the experiment and data presented in Table 13.1. Remember that we are interested in the effects of rehearsal type on memory. The null hypothesis (H0) for an ANOVA is that the sample means represent the same population (H0: μ1 = μ2 = μ3). The alternative hypothesis (Ha) is that they represent different populations (Ha: at least one μ ≠ another μ). When a researcher rejects H0 using an ANOVA, it means that the independent variable affected the dependent variable to the extent that at least one group mean differs from the others by more than would be expected based on chance. Failing to reject H0 indicates that the means do not differ from each other more than would be expected based on chance. In other words, there is not enough evidence to suggest that the sample means represent at least two different populations.
grand mean The mean performance across all participants in a study.
error variance The amount of variability among the scores caused by chance or uncontrolled variables.
In our example, the mean number of words recalled in the rote rehearsal condition is 4, for the imagery condition it is 5.5, and in the story condition it is 8. If you look at the data from each condition, you will notice that most subjects in each condition did not score exactly at the mean for that condition. In other words, there is variability within each condition. The overall mean, or grand mean , across all participants in all conditions is 5.83. Because none of the participants in any condition recalled exactly 5.83 words, there is also variability between conditions. What we are interested in is whether this variability is due primarily to the independent variable (differences in rehearsal type) or to error variance —the amount of variability among the scores caused by chance or uncontrolled variables (such as individual differences between subjects).
within-groups variance The variance within each condition; an estimate of the population error variance.
The amount of error variance can be estimated by looking at the amount of variability within each condition. How will this give us an estimate of error variance? Each subject in each condition was treated similarly—they were each instructed to rehearse the words in the same manner. Because the subjects in each condition were treated in the same manner, any differences observed in the number of words recalled are attributable only to error variance. In other words, some participants may have been more motivated, or more distracted, or better at memory tasks—all factors that would contribute to error variance in this case. Therefore, the within-groups variance (the variance within each condition or group) is an estimate of the population error variance.
Now compare the means between the groups. If the independent variable (rehearsal type) had an effect, we would expect some of the group means to differ from the grand mean. If the independent variable had no effect on number of words recalled, we would still expect the group means to vary from the grand mean, but only slightly, as a result of error variance attributable to individual differences. In other words, all subjects in a study will not score exactly the same. Therefore, even when the independent variable has no effect, we do not expect that the group means will exactly equal the grand mean, but they should be close to the grand mean. If there were no effect of the independent variable, then the variance between groups would be due to error.
between-groups variance An estimate of the effect of the independent variable and error variance.
Between-groups variance may be attributed to several sources. There could be systematic differences between the groups, referred to as systematic variance. The systematic variance between the groups could be due to the effects of the independent variable (variance due to the experimental manipulation). However, it could also be due to the influence of uncontrolled confounding variables (variance due to extraneous variables). In addition, there will always be some error variance included in any between-groups variance estimate. In sum, between-groups variance is an estimate of systematic variance (the effect of the independent variable and any confounds) and error variance.
F-ratio The ratio of between-groups variance to within-groups variance.
By looking at the ratio of between-groups variance to within-groups variance, known as an F-ratio , we can determine whether most of the variability is attributable to systematic variance (due, we hope, to the independent variable and not to confounds) or to chance and random factors (error variance):
F = Between-groups variance / Within-groups variance = (Systematic variance + Error variance) / Error variance
Looking at the F-ratio, we can see that if the systematic variance (which we assume is due to the effect of the independent variable) is substantially greater than the error variance, the ratio will be substantially greater than 1. If there is no systematic variance, then the ratio will be approximately 1.00 (error variance over error variance). There are two points to remember regarding F-ratios. First, in order for an F-ratio to be significant (show a statistically meaningful effect of an independent variable), it must be substantially greater than 1 (we will discuss exactly how much larger than 1 in the next module). Second, if an F-ratio is approximately 1, then the between-groups variance equals the within-groups variance and there is no effect of the independent variable.
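A small simulation can make this intuition concrete. The sketch below (an optional illustration in Python, assuming NumPy and SciPy are available) repeatedly draws three groups from identical populations and computes the average F; with no systematic variance the average F stays near 1, while building in a real mean difference pushes it well above 1.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def mean_f(shift, reps=2000, n=8):
    """Average F-ratio over many simulated three-group experiments.

    shift: true difference added to the third group's mean
           (0 means the null hypothesis is true).
    """
    fs = []
    for _ in range(reps):
        g1 = rng.normal(5, 2, n)
        g2 = rng.normal(5, 2, n)
        g3 = rng.normal(5 + shift, 2, n)
        fs.append(stats.f_oneway(g1, g2, g3).statistic)
    return np.mean(fs)

print(f"No effect  : mean F = {mean_f(0):.2f}")   # hovers close to 1
print(f"True effect: mean F = {mean_f(3):.2f}")   # well above 1
```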
Refer back to Table 13.1, and think about the within-groups versus between-groups variance in this study. Notice that the amount of variance within the groups is small—the scores within each group vary from each individual group mean, but not by very much. The between-groups variance, on the other hand, is large—two of the means across the three conditions vary from the grand mean to a greater extent. With these data, then, it appears that we have a relatively large between-groups variance and a smaller within-groups variance. Our F-ratio will therefore be larger than 1.00. To assess how large it is, we will need to conduct the appropriate calculations (described in the next module). At this point, however, you should have a general understanding of how an ANOVA analyzes variance to determine if there is an effect of the independent variable.
ONE-WAY BETWEEN-SUBJECTS ANOVA
Concept | Description
Null hypothesis (H0) | The independent variable had no effect—the samples all represent the same population
Alternative hypothesis (Ha) | The independent variable had an effect—at least one of the samples represents a different population than the others
F-ratio | The ratio formed when the between-groups variance is divided by the within-groups variance
Between-groups variance | An estimate of the variance of the group means about the grand mean; includes both systematic variance and error variance
Within-groups variance | An estimate of the variance within each condition in the experiment; also known as error variance, or variance due to chance factors
Imagine that the following table represents data from the study just described (the effects of type of rehearsal on number of words recalled from the 10 words given). Do you think that the between-groups and within-groups variances are large, moderate, or small? Would the corresponding F-ratio be greater than, equal to, or less than 1.00?
Rote Rehearsal | Imagery | Story
2 | 4 | 5
4 | 2 | 2
3 | 5 | 4
5 | 3 | 2
2 | 2 | 3
7 | 7 | 6
6 | 6 | 3
3 | 2 | 7
X̄ = 4 | X̄ = 3.88 | X̄ = 4
Grand Mean = 3.96
REVIEW OF KEY TERMS
ANOVA (analysis of variance) (p. 222)
between-groups variance (p. 224)
Bonferroni adjustment (p. 218)
error variance (p. 223)
F-ratio (p. 224)
grand mean (p. 223)
one-way between-subjects ANOVA (p. 222)
within-groups variance (p. 224)
MODULE EXERCISES
(Answers to odd-numbered questions appear in Appendix B.)
1.What is/are the advantage(s) of conducting a study with three or more levels of the independent variable?
2.What does the term one-way mean with respect to an ANOVA?
3.Explain between-groups variance and within-groups variance.
4.If a researcher decides to use multiple comparisons in a study with three conditions, what is the probability of a Type I error across these comparisons? Use the Bonferroni adjustment to determine the suggested alpha level.
5.If H0 is true, what should the F-ratio equal or be close to?
6.If Ha is supported, should the F-ratio be greater than, less than, or equal to 1?
CRITICAL THINKING CHECK ANSWERS
Critical Thinking Check 13.1
The probability of a Type I error would be 26.5% [1 − (1 − .05)^6 = 1 − (.95)^6 = 1 − .735 = .265]. Using the Bonferroni adjustment, the alpha level would be .008 for each comparison.
Critical Thinking Check 13.2
Both the within-groups and between-groups variances are moderate. This should lead to an F-ratio of approximately 1.
Learning Objectives
•Identify what a one-way between-subjects ANOVA is and what it does.
•Use the formulas provided to calculate a one-way between-subjects ANOVA.
•Interpret the results from a one-way between-subjects ANOVA.
•Calculate η2 for a one-way between-subjects ANOVA.
•Calculate Tukey’s post hoc test for a one-way between-subjects ANOVA.
Calculations for the One-Way Between-Subjects ANOVA
To see exactly how ANOVA works, we begin by calculating the sums of squares (SS). This should sound somewhat familiar to you because we calculated sums of squares as part of the calculation for standard deviation in Module 5. The sums of squares in that formula represented the sum of the squared deviation of each score from the overall mean. Determining the sums of squares is the first step in calculating the various types or sources of variance in an ANOVA.
Several types of sums of squares are used in the calculation of an ANOVA. In describing them in this module, I provide definitional formulas for each. The definitional formula follows the definition for each sum of squares and should give you the basic idea of how each SS is calculated. When dealing with very large data sets, however, the definitional formulas can become somewhat cumbersome. Thus, statisticians have transformed the definitional formulas into computational formulas. A computational formula is easier to use in terms of the number of steps required. However, computational formulas do not follow the definition of the SS and thus do not necessarily make sense in terms of the definition for each SS. If your instructor would prefer that you use the computational formulas, they are provided in Appendix D.
TABLE 14.1 Calculation of SSTotal using the definitional formula
ROTE REHEARSAL | IMAGERY | STORY
X | (X − X̄G)² | X | (X − X̄G)² | X | (X − X̄G)²
2 | 14.69 | 4 | 3.36 | 6 | .03
4 | 3.36 | 5 | .69 | 5 | .69
3 | 8.03 | 7 | 1.36 | 9 | 10.03
5 | .69 | 6 | .03 | 10 | 17.36
2 | 14.69 | 5 | .69 | 8 | 4.70
7 | 1.36 | 4 | 3.36 | 7 | 1.36
6 | .03 | 8 | 4.70 | 10 | 17.36
3 | 8.03 | 5 | .69 | 9 | 10.03
Σ = 50.88 | Σ = 14.88 | Σ = 61.56
SSTotal = 50.88 + 14.88 + 61.56 = 127.32
Note: All numbers have been rounded to two decimal places.
total sum of squares The sum of the squared deviations of each score from the grand mean.
The first sum of squares that we need to describe is the total sum of squares (SSTotal)—the sum of the squared deviations of each score from the grand mean. In a definitional formula, this would be represented as Σ(X − X̄G)², where X represents each individual score and X̄G represents the grand mean. In other words, we determine how much each individual subject varies from the grand mean, square that deviation score, and sum all of the squared deviation scores. Using the study from the previous module on the effects of rehearsal type on memory (see Table 13.1), the total sum of squares (SSTotal) = 127.32. To see where this number comes from, refer to Table 14.1. (For the computational formula, see Appendix D.) Once we have calculated the sum of squares within and between groups (see below), they should equal the total sum of squares when added together. In this way, we can check our calculations for accuracy. If the sum of squares within and between do not equal the sum of squares total, then you know there is an error in at least one of the calculations.
within-groups sum of squares The sum of the squared deviations of each score from its group mean.
Because an ANOVA analyzes the variance between groups and within groups, we need to use different formulas to determine the amount of variance attributable to these two factors. The within-groups sum of squares is the sum of the squared deviations of each score from its group or condition mean and is a reflection of the amount of error variance. In the definitional formula, it would be Σ(X − X̄g)², where X refers to each individual score and X̄g refers to the mean for each group or condition. In order to determine this, we find the difference between each score and its group mean, square these deviation scores, and then sum all of the squared deviation scores. The use of this definitional formula to calculate SSWithin is illustrated in Table 14.2. The computational formula appears in Appendix D. Thus, rather than comparing every score in the entire study to the grand mean of the study (as is done for SSTotal), we compare each score in each condition to the mean of that condition. Thus, SSWithin is a reflection of the amount of variability within each condition. Because the subjects in each condition were treated in a similar manner, we would expect little variation among the scores within each group. This means that the within-groups sum of squares (SSWithin) should be small, indicating a small amount of error variance in the study. For our memory study, the within-groups sum of squares (SSWithin) = 62.
TABLE 14.2 Calculation of SSWithin using the definitional formula
ROTE REHEARSAL | IMAGERY | STORY
X | (X − X̄g)² | X | (X − X̄g)² | X | (X − X̄g)²
2 | 4 | 4 | 2.25 | 6 | 4
4 | 0 | 5 | .25 | 5 | 9
3 | 1 | 7 | 2.25 | 9 | 1
5 | 1 | 6 | .25 | 10 | 4
2 | 4 | 5 | .25 | 8 | 0
7 | 9 | 4 | 2.25 | 7 | 1
6 | 4 | 8 | 6.25 | 10 | 4
3 | 1 | 5 | .25 | 9 | 1
Σ = 24 | Σ = 14 | Σ = 24
SSWithin = 24 + 14 + 24 = 62
NOTE: All numbers have been rounded to two decimal places.
between-groups sum of squares The sum of the squared deviations of each group’s mean from the grand mean, multiplied by the number of subjects in each group.
The between-groups sum of squares is the sum of the squared deviations of each group’s mean from the grand mean, multiplied by the number of subjects in each group. In the definitional formula, this would be Σ[(X̄g − X̄G)²n], where X̄g refers to the mean for each group, X̄G refers to the grand mean, and n refers to the number of subjects in each group. The use of the definitional formula to calculate SSBetween is illustrated in Table 14.3. The computational formula appears in Appendix D. The between-groups variance is an indication of the systematic variance across the groups (the variance due to the independent variable and any confounds) and error. The basic idea behind the between-groups sum of squares is that if the independent variable had no effect (if there were no differences between the groups), then we would expect all the group means to be about the same. If all the group means were similar, they would also be approximately equal to the grand mean and there would be little variance across conditions. If, however, the independent variable caused changes in the means of some conditions (caused them to be larger or smaller than other conditions), then the condition means not only will differ from each other but will also differ from the grand mean, indicating variance across conditions. In our memory study, SSBetween = 65.33.
TABLE 14.3 Calculation of SSBetween using the definitional formula
Rote Rehearsal | (X̄g − X̄G)²n = (4 − 5.833)²(8) = (−1.833)²(8) = (3.36)(8) = 26.88
Imagery | (X̄g − X̄G)²n = (5.5 − 5.833)²(8) = (−.333)²(8) = (.11)(8) = .88
Story | (X̄g − X̄G)²n = (8 − 5.833)²(8) = (2.167)²(8) = (4.696)(8) = 37.57
SSBetween = 26.88 + .88 + 37.57 = 65.33
We can check the accuracy of our calculations by adding the SSWithin to the SSBetween. When added, these numbers should equal SSTotal. Thus, SSWithin (62) + SSBetween (65.33) = 127.33. The SSTotal that we calculated earlier was 127.32 and is essentially equal to SSWithin + SSBetween, taking into account rounding errors.
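If you want to verify these hand calculations, the definitional formulas translate directly into a few lines of code. The following Python sketch (an optional check, using the Table 13.1 data) computes SSTotal, SSWithin, and SSBetween and confirms that the last two sum to the first.

```python
# Definitional sums of squares for the Table 13.1 memory data (optional check).
rote    = [2, 4, 3, 5, 2, 7, 6, 3]
imagery = [4, 5, 7, 6, 5, 4, 8, 5]
story   = [6, 5, 9, 10, 8, 7, 10, 9]
groups  = [rote, imagery, story]

def mean(scores):
    return sum(scores) / len(scores)

all_scores = rote + imagery + story
grand_mean = mean(all_scores)                              # 5.833

ss_total   = sum((x - grand_mean) ** 2 for x in all_scores)
ss_within  = sum((x - mean(g)) ** 2 for g in groups for x in g)
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

print(f"SS_Total   = {ss_total:.2f}")                      # 127.33
print(f"SS_Within  = {ss_within:.2f}")                     # 62.00
print(f"SS_Between = {ss_between:.2f}")                    # 65.33
print(f"Within + Between = {ss_within + ss_between:.2f}")  # matches SS_Total
```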
mean square An estimate of either variance between groups or variance within groups.
Calculating the sums of squares is an important step in the ANOVA. It is not, however, the end. Now that we have determined SSTotal, SSWithin, and SSBetween, we must transform these scores into the mean squares. The term mean square (MS) is an abbreviation of mean squared deviation scores. The MS scores are estimates of variance between and within the groups. In order to calculate the MS for each group (MSWithin and MSBetween), we divide each SS by the appropriate df (degrees of freedom). The reason for this is that the MS scores are our variance estimates. You may remember from Module 5 that when calculating standard deviation and variance, we divide the sum of squares by N (or N – 1 for the unbiased estimator) in order to get the average deviation from the mean. In the same manner, we must divide the SS scores by their degrees of freedom (the number of scores that contributed to each SS minus 1).
To do this for the present example, we first need to determine the degrees of freedom for each type of variance. Let’s begin with the dfTotal, which we will use to check our accuracy when calculating dfWithin and dfBetween. In other words, dfWithin and dfBetween should sum to the dfTotal. We determined SSTotal by calculating the deviations around the grand mean. We therefore had one restriction on our data—the grand mean. This leaves us with N − 1 total degrees of freedom (the total number of participants in the study minus the one restriction). For our study on the effects of rehearsal type on memory,
dfTotal = 24 − 1 = 23
Using a similar logic, the degrees of freedom within each group would then be n − 1 (the number of subjects in each condition minus 1). However, we have more than one group; we have k groups, where k refers to the number of groups or conditions in the study. The degrees of freedom within groups is therefore k(n − 1), or N − k. For our example,
dfWithin = 24 − 3 = 21
Lastly, the degrees of freedom between groups is the variability of k means around the grand mean. Therefore, dfBetween equals the number of groups (k) minus 1, or k − 1. For our study, this would be
dfBetween = 3 − 1 = 2
Notice that the sum of the dfWithin and dfBetween equals dfTotal: 21 + 2 = 23. This allows you to check your calculations for accuracy. If the degrees of freedom between and within do not sum to the degrees of freedom total, you know there is a mistake somewhere.
Now that we have calculated the sums of squares and their degrees of freedom, we can use these numbers to calculate our estimates of the variance between and within groups. As stated previously, the variance estimates are called mean squares and are determined by dividing each SS by its corresponding df. In our example,
MSBetween = SSBetween/dfBetween = 65.33/2 = 32.67
MSWithin = SSWithin/dfWithin = 62/21 = 2.95
We can now use the estimates of between-groups and within-groups variance to determine the F-ratio:
F = MSBetween/MSWithin = 32.67/2.95 = 11.07
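Continuing the optional sketch from the sums-of-squares section, the mean squares and F-ratio follow in two short steps; the numbers below are simply those from the memory study.

```python
# Mean squares and F-ratio for the memory study (k = 3 groups, N = 24).
ss_between, ss_within = 65.33, 62.0
k, N = 3, 24

df_between = k - 1        # 2
df_within  = N - k        # 21

ms_between = ss_between / df_between   # 32.67
ms_within  = ss_within / df_within     # 2.95

f_ratio = ms_between / ms_within
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")  # F(2, 21) = 11.07
```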
The definitional formulas for the sums of squares, along with the formulas for the degrees of freedom, mean squares, and the final F-ratio, are summarized in Table 14.4. The ANOVA summary table for the F-ratio just calculated is presented in Table 14.5. This is a common way of summarizing ANOVA findings. You will sometimes see ANOVA summary tables presented in journal articles because they provide a concise way of presenting the data from an analysis of variance.
TABLE 14.4 ANOVA summary table: definitional formulas
Source | df | SS | MS | F
Between-groups | k − 1 | Σ[(X̄g − X̄G)²n] | SSB/dfB | MSB/MSW
Within-groups | N − k | Σ(X − X̄g)² | SSW/dfW |
Total | N − 1 | Σ(X − X̄G)² | |
TABLE 14.5 ANOVA summary table for the memory study
SOURCE | df | SS | MS | F
Between-groups | 2 | 65.33 | 32.67 | 11.07
Within-groups | 21 | 62 | 2.95 |
Total | 23 | 127.33 | |
Interpreting the One-Way Between-Subjects ANOVA
Our obtained F-ratio of 11.07 is obviously greater than 1.00. However, we do not know whether it is large enough to let us reject the null hypothesis. To make this decision, we need to compare the obtained F (Fobt) of 11.07 with an Fcv—the critical value that determines the cutoff for statistical significance. The underlying F distribution is actually a family of distributions, each based on the degrees of freedom between and within each group. Remember that the alternative hypothesis is that the population means represented by the sample means are not from the same population. Table A.4 in Appendix A provides the critical values for the family of F distributions when α = .05 and when α = .01.
To use the table, look at the dfWithin running down the left side of the table and the dfBetween running across the top of the table. Fcv is found where the row and column of these two numbers intersect. For our example, dfWithin = 21 and dfBetween = 2. Because there is no 21 in the dfWithin column, we use the next lower number, 20. According to Table A.4, the Fcv for the .05 level is 3.49. Because our Fobt exceeds this, it is statistically significant at the .05 level. Let’s check the .01 level also. The critical value for the .01 level is 5.85. Our Fobt is larger than this critical value also. We can therefore conclude that the Fobt is significant at the .01 level. In APA publication format, this would be written as F(2, 21) = 11.07, p < .01. This means that we reject H0 and support Ha. In other words, at least one group mean differs significantly from the others. The calculation of this ANOVA using Excel, SPSS, and the TI-84 calculator is presented in the Statistical Software Resources section at the end of this chapter.
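If a statistical library is available, the critical values and exact p-value can be checked without the table. The sketch below uses SciPy's F distribution; note that it evaluates the exact dfWithin of 21, so its critical values differ slightly from the Table A.4 values quoted above, which use the table's next lower entry of 20.

```python
from scipy import stats

f_obt, df_between, df_within = 11.07, 2, 21

# Critical values from the F distribution (compare with Table A.4).
f_cv_05 = stats.f.ppf(0.95, df_between, df_within)
f_cv_01 = stats.f.ppf(0.99, df_between, df_within)
p_value = stats.f.sf(f_obt, df_between, df_within)   # area beyond F_obt

print(f"F_cv (.05) = {f_cv_05:.2f}")   # about 3.47
print(f"F_cv (.01) = {f_cv_01:.2f}")   # about 5.78
print(f"p = {p_value:.4f}")            # about .0005, so p < .01
```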
Let’s consider what factors might affect the size of the final Fobt. Because the Fobt is derived using the between-groups variance as the numerator and the within-groups variance as the denominator, anything that increases the numerator or decreases the denominator will increase the Fobt.
What might increase the numerator? Using stronger controls in the experiment could have this effect because it would make any differences between the groups more noticeable or larger. This means that the MSBetween (the numerator in the F-ratio) would be larger and therefore lead to a larger final F-ratio.
What would decrease the denominator? Once again, using better control to reduce overall error variance would have this effect; so would increasing the sample size, which increases dfWithin and ultimately decreases MSWithin. Why would each of these affect the F-ratio in this manner? Each would decrease the size of the MSWithin, which is the denominator in the F-ratio. Dividing by a smaller number would lead to a larger final F-ratio and, therefore, a greater chance that it would be significant.
Graphing the Means and Effect Size
As noted in Module 11, we usually graph the means when a significant difference is found between them. As in our previous graphs, the independent variable is placed on the x-axis and the dependent variable on the y-axis. A graph representing the mean performance of each group is shown in Figure 14.1. In this experiment, those in the Rote condition remembered an average of 4 words, those in the Imagery condition remembered an average of 5.5 words, and those in the Story condition remembered an average of 8 words.
eta-squared An inferential statistic for measuring effect size with an ANOVA.
In addition to graphing the data, we should also assess the effect size. Based on the Fobt, we know that there was more variability between groups than within groups. In other words, the between-groups variance (the numerator in the F-ratio) was larger than the within-groups variance (the denominator in the F-ratio). However, it would be useful to know how much of the variability in the dependent variable can be attributed to the independent variable. In other words, it would be useful to have a measure of effect size. For an ANOVA, effect size can be estimated using eta-squared (η2), which is calculated as follows:
η² = SSBetween/SSTotal
Because SSBetween reflects the differences between the means from the various levels of an independent variable and SSTotal reflects the total differences between all scores in the experiment, η2 reflects the proportion of the total differences in the scores that is associated with differences between sample means, or how much of the variability in the dependent variable (memory) is attributable to the manipulation of the independent variable (rehearsal type). In other words, η2 indicates how accurately the differences in scores can be predicted using the levels (conditions) of the independent variable. Referring to the summary table for our example in Table 14.5, η2 would be calculated as follows:
η² = 65.33/127.33 = .51
FIGURE 14.1 Number of words recalled as a function of rehearsal type
In other words, approximately 51% of the variance among the scores can be attributed to the rehearsal condition to which the participant was assigned. In this example, the independent variable of rehearsal type is fairly important in determining the number of words recalled by subjects because the η2 of 51% represents a considerable effect. According to Cohen (1988), an η2 of .01 represents a small effect size; .06, a medium effect size; and .14, a large effect size. Thus, our obtained η2 represents a large effect size. In addition, as with most measures of effect size, η2 is useful in determining whether or not a result of statistical significance is also of practical significance. In other words, although researchers might find that an F-ratio is statistically significant, if the corresponding η2 is negligible, then the result is of little practical significance (Huck & Cormier, 1996).
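As a quick check, eta-squared and its rough Cohen (1988) label can be computed in a couple of lines (an optional sketch; the benchmark cutoffs are simply those quoted above).

```python
# Eta-squared for the memory study, labeled with Cohen's (1988) benchmarks.
ss_between, ss_total = 65.33, 127.33

eta_squared = ss_between / ss_total          # about .51

if eta_squared >= 0.14:
    size = "large"
elif eta_squared >= 0.06:
    size = "medium"
elif eta_squared >= 0.01:
    size = "small"
else:
    size = "negligible"

print(f"eta-squared = {eta_squared:.2f} ({size} effect)")  # 0.51 (large effect)
```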
Assumptions of the One-Way Between-Subjects ANOVA
As with most statistical tests, certain assumptions must be met to ensure that the statistic is being used properly. The assumptions for the between-subjects one-way ANOVA are similar to those for the t test for independent groups:
•The data are on an interval-ratio scale.
•The underlying distribution is normally distributed.
•The variances among the populations being compared are homogeneous.
Because ANOVA is a robust statistical test, violations of some of these assumptions do not necessarily affect the results. Specifically, if the distributions are slightly skewed rather than normally distributed, it does not affect the results of the ANOVA. In addition, if the sample sizes are equal, the assumption of homogeneity of variances can be violated. However, it is not acceptable to violate the assumption of interval-ratio data. If the data collected in a study are ordinal or nominal in scale, other statistical procedures must be used. These procedures are discussed in later modules.
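When the raw scores are available, the normality and homogeneity-of-variance assumptions can be screened quickly. The sketch below applies the Shapiro-Wilk and Levene tests from SciPy to the Table 13.1 data; this is one reasonable screening approach, not a required part of the ANOVA procedure.

```python
from scipy import stats

rote    = [2, 4, 3, 5, 2, 7, 6, 3]
imagery = [4, 5, 7, 6, 5, 4, 8, 5]
story   = [6, 5, 9, 10, 8, 7, 10, 9]

# Shapiro-Wilk test of normality for each condition.
for name, g in [("rote", rote), ("imagery", imagery), ("story", story)]:
    w, p = stats.shapiro(g)
    print(f"{name:8s}: W = {w:.3f}, p = {p:.3f}")

# Levene's test for homogeneity of variances across the three conditions.
stat, p = stats.levene(rote, imagery, story)
print(f"Levene: statistic = {stat:.3f}, p = {p:.3f}")
# Non-significant p-values here are consistent with the assumptions being met.
```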
Tukey’s Post Hoc Test
post hoc test When used with an ANOVA, a means of comparing all possible pairs of groups to determine which ones differ significantly from each other.
Because the results from our ANOVA indicate that at least one of the sample means differs significantly from the others (represents a different population from the others), we must now compute a post hoc test (a test conducted after the fact—in this case, after the ANOVA). A post hoc test involves comparing each of the groups in the study to each of the other groups to determine which ones differ significantly from each other. This may sound familiar to you. In fact, you may be thinking, isn’t that what a t test does? In a sense, you are correct. However, remember that a series of multiple t tests inflates the probability of a Type I error. A post hoc test is designed to permit multiple comparisons and still maintain alpha (the probability of a Type I error) at .05.
Tukey’s honestly significant difference (HSD) A post hoc test used with ANOVAs for making all pairwise comparisons when conditions have equal n.
The post hoc test presented here is Tukey’s honestly significant difference (HSD) . This test allows a researcher to make all pairwise comparisons among the sample means in a study while maintaining an acceptable alpha (usually .05, but possibly .01) when the conditions have equal n. If there is not an equal number of subjects in each condition, then another post hoc test, such as Fisher’s protected t test, would be appropriate. If you need to use Fisher’s protected t test, the formula is provided in the Computational Supplement in Appendix D.
Tukey’s test identifies the smallest difference between any two means that is significant with α = .05 or α = .01. The formula for Tukey’s HSD is
HSD.05 = Q(k, dfWithin) √(MSWithin/n)
Using this formula, we can determine the HSD for the .05 alpha level. This involves using Table A.5 in Appendix A to look up the value for Q. To look up Q, we need k (the number of means being compared—in our study on memory, this is 3) and dfWithin (found in the ANOVA summary table, Table 14.5). Referring to Table A.5 for k = 3 and dfWithin = 21 (because there is no 21 in Table A.5, we use 20 here as we did with Table A.4), we find that at the .05 level, Q = 3.58. In addition, we need MSWithin from Table 14.5 and n (the number of participants in each group). Using these numbers, we calculate the HSD as follows:
HSD.05 = (3.58)√(2.95/8) = (3.58)√.369 = (3.58)(.607) = 2.17
This tells us that a difference of 2.17 or more for any pair of means is significant at the .05 level. In other words, the difference between the means is large enough that it is greater than what would be expected based on chance. Table 14.6 summarizes the differences between the means for each pairwise comparison. Can you identify which comparisons are significant using Tukey’s HSD?
TABLE 14.6 Differences between each pair of means in the memory study
 | ROTE REHEARSAL | IMAGERY | STORY
Rote Rehearsal | — | 1.5 | 4.0
Imagery | | — | 2.5
Story | | | —
If you identified the differences between the Story condition and the Rote Rehearsal condition and between the Story condition and the Imagery condition as the two honestly significant differences, you were correct. Because the F-ratio was significant at α = .01, we should also check the HSD01. To do this, we use the same formula, but we use Q for the .01 alpha level (from Table A.5). The calculations are as follows:
HSD.01 = (4.64)√(2.95/8) = (4.64)(.607) = 2.82
The only difference significant at this level is between the Rote Rehearsal condition and the Story condition. Thus, based on these data, those in the Story condition recalled significantly more words than those in the Imagery condition (p < .05) and those in the Rote Rehearsal condition (p < .01).
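The HSD arithmetic and the pairwise comparisons can also be scripted once Q has been looked up. The sketch below reuses the Q values quoted above (3.58 and 4.64 from Table A.5) and labels each pairwise difference accordingly.

```python
from itertools import combinations
from math import sqrt

means = {"Rote": 4.0, "Imagery": 5.5, "Story": 8.0}
ms_within, n = 2.95, 8

# Q values from Table A.5 for k = 3 and df_Within of about 20, as used in the text.
hsd_05 = 3.58 * sqrt(ms_within / n)   # about 2.17
hsd_01 = 4.64 * sqrt(ms_within / n)   # about 2.82

print(f"HSD(.05) = {hsd_05:.2f}, HSD(.01) = {hsd_01:.2f}")
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    level = ".01" if diff >= hsd_01 else ".05" if diff >= hsd_05 else "ns"
    print(f"{a} vs {b}: difference = {diff:.1f} (significant at {level})"
          if level != "ns" else f"{a} vs {b}: difference = {diff:.1f} (not significant)")
```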
ONE-WAY BETWEEN-SUBJECTS ANOVA
Concept | Description
F-ratio | The ratio formed when the between-groups variance is divided by the within-groups variance
Between-groups variance | An estimate of the variance of the group means about the grand mean; includes both systematic variance and error variance
Within-groups variance | An estimate of the variance within each condition in the experiment; also known as error variance, or variance due to chance factors
Eta-squared | A measure of effect size—the variability in the dependent variable attributable to the independent variable
Tukey’s post hoc test | A test conducted to determine which conditions in a study with more than two groups differ significantly from each other
1.Of the following four F-ratios, which appears to indicate that the independent variable had an effect on the dependent variable?
a.1.25/1.11
b.0.91/1.25
c.1.95/0.26
d.0.52/1.01
2.The following ANOVA summary table represents the results from a study on the effects of exercise on stress. There were three conditions in the study: a control group, a moderate exercise group, and a high exercise group. Each group had 10 subjects and the mean stress levels for each group were Control = 75.0, Moderate Exercise = 44.7, and High Exercise = 63.7. Stress was measured using a 100-item stress scale, with 100 representing the highest level of stress. Complete the ANOVA summary table, and determine whether the F-ratio is significant. In addition, calculate eta-squared and Tukey’s HSD if necessary.
ANOVA Summary Table
Source | df | SS | MS | F
Between | | 4,689.27 | |
Within | | 82,604.20 | |
Total | | | |
REVIEW OF KEY TERMS
between-groups sum of squares (p. 229)
eta-squared (p. 233)
mean square (p. 230)
post hoc test (p. 235)
total sum of squares (p. 228)
Tukey’s honestly significant difference (HSD) (p. 235)
within-groups sum of squares (p. 228)
MODULE EXERCISES
(Answers to odd-numbered questions appear in Appendix B.)
1.A researcher conducts a study on the effects of amount of sleep on creativity. The creativity scores for four levels of sleep (2 hours, 4 hours, 6 hours, and 8 hours) for n = 5 subjects (in each group) are presented next.
Amount of Sleep (in hours)
2 | 4 | 6 | 8
3 | 4 | 10 | 10
5 | 7 | 11 | 13
6 | 8 | 13 | 10
4 | 3 | 9 | 9
2 | 2 | 10 | 10
Source | df | SS | MS | F
Between-groups | | 187.75 | |
Within-groups | | 55.20 | |
Total | | 242.95 | |
a.Complete the ANOVA summary table. (If your instructor wants you to calculate the sums of squares, use the preceding data to do so.)
b.Is Fobt significant at α = .05? at α = .01?
c.Perform post hoc comparisons if necessary.
d.What conclusions can be drawn from the F-ratio and the post hoc comparisons?
e.What is the effect size, and what does this mean?
f.Graph the means.
2.In a study on the effects of stress on illness, a researcher tallied the number of colds people contracted during a 6-month period as a function of the amount of stress they reported during the same time period. There were three stress levels: minimal, moderate, and high stress. The sums of squares appear in the following ANOVA summary table. The mean for each condition and the number of participants per condition are also noted.
Source | df | SS | MS | F
Between-groups | | 22.167 | |
Within-groups | | 14.750 | |
Total | | 36.917 | |
Stress Level | Mean | n
Minimal | 3 | 4
Moderate | 4 | 4
High | 6 | 4
a.Complete the ANOVA summary table.
b.Is Fobt significant at α = .05? at α = .01?
c.Perform post hoc comparisons if necessary.
d.What conclusions can be drawn from the F-ratio and the post hoc comparisons?
e.What is the effect size, and what does this mean?
f.Graph the means.
3.A researcher interested in the effects of exercise on stress had subjects exercise for 30, 60, or 90 minutes per day. The mean stress level on a 100-point stress scale (with 100 indicating high stress) for each condition appears next, along with the ANOVA summary table with the sums of squares indicated.
Source | df | SS | MS | F
Between-groups | | 4,689.27 | |
Within-groups | | 82,604.20 | |
Total | | 87,293.47 | |
Exercise Level | Mean | n
30 Minutes | 75.0 | 10
60 Minutes | 44.7 | 10
90 Minutes | 63.7 | 10
a.Complete the ANOVA summary table.
b.Is Fobt significant at α = .05? at α = .01?
c.Perform post hoc comparisons if necessary.
d.What conclusions can be drawn from the F-ratio and the post hoc comparisons?
e.What is the effect size, and what does this mean?
f.Graph the means.
4.A researcher conducted an experiment on the effects of a new “drug” on depression. The control group received nothing and the placebo group received a placebo pill. An experimental group received the “drug.” A depression inventory that provided a measure of depression on a 50-point scale was used (50 indicates that an individual is very high on the depression variable). The ANOVA summary table appears next, along with the mean depression score for each condition.
Source | df | SS | MS | F
Between-groups | | 1,202.313 | |
Within-groups | | 2,118.00 | |
Total | | 3,320.313 | |
“Drug” Condition | Mean | n
Control | 36.26 | 15
Placebo | 33.33 | 15
“Drug” | 24.13 | 15
a.Complete the ANOVA summary table.
b.Is Fobt significant at α = .05? at α = .01?
c.Perform post hoc comparisons if necessary.
d.What conclusions can be drawn from the F-ratio and the post hoc comparisons?
e.What is the effect size, and what does this mean?
f.Graph the means.
5.When should post hoc tests be performed?
6.What information does eta-squared (η2) provide?
CRITICAL THINKING CHECK ANSWERS
Critical Thinking Check 14.1
1.The F-ratio 1.95/0.26 = 7.5 suggests that the independent variable had an effect on the dependent variable.
2.
ANOVA Summary Table
Source | df | SS | MS | F
Between | 2 | 4,689.27 | 2,344.64 | .766
Within | 27 | 82,604.20 | 3,059.41 |
Total | 29 | 87,293.47 | |
The resulting F-ratio is less than 1 and thus not significant. Although stress levels differed across some of the groups, the difference was not large enough to be significant.
CHAPTER SUMMARY
In this chapter we discussed designs using more than two levels of an independent variable. Advantages to such designs include being able to compare more than two kinds of treatment, using fewer subjects, comparing all treatments to a control group, and using placebo groups. When interval-ratio data are collected using such a design, the parametric statistical analysis most appropriate for use is the ANOVA (analysis of variance). A one-way between-subjects ANOVA would be used for between-subjects designs. Appropriate post hoc tests (Tukey’s HSD) and measures of effect size (eta-squared) were also discussed.
CHAPTER 7 REVIEW EXERCISES
(Answers to exercises appear in Appendix B.)
Fill-in Self-Test
Answer the following questions. If you have trouble answering any of the questions, restudy the relevant material before going on to the multiple-choice self-test.
1.The______________provides a means of setting a more stringent alpha level for multiple tests in order to minimize Type I errors.
2.A(n) ______________ is an inferential statistical test for comparing the means of three or more groups.
3.The mean performance across all subjects is represented by the ______________.
4.The ______________ variance is an estimate of the effect of the independent variable, confounds, and error variance.
5.The sum of squared deviations of each score from the grand mean is the ______________
6.When we divide an SS score by its degrees of freedom, we have calculated a ______________
7.______________ is an inferential statistic for measuring effect size with an ANOVA.
8.For an ANOVA, we use ______________ to compare all possible pairs of groups to determine which ones differ significantly from each other.
Multiple-Choice Self-Test
Select the single best answer for each of the following questions. If you have trouble answering any of the questions, restudy the relevant material.
1.The F-ratio is determined by dividing ______________ by ______________.
a.error variance; systematic variance
b.between-groups variance; within-groups variance
c.within-groups variance; between-groups variance
d.systematic variance; error variance
2.If between-groups variance is large, then we have observed
a.experimenter effects.
b.large systematic variance.
c.large differences due to confounds.
d.possibly both large systematic variance and large differences due to confounds.
3.The larger the F-ratio, the greater the chance that
a.a mistake has been made in the computation.
b.there are large systematic effects present.
c.the experimental manipulation probably did not have the predicted effects.
d.the between-groups variation is no larger than would be expected by chance and no larger than the within-groups variance.
4.One reason to use an ANOVA over a t test is to reduce the risk of
a.a Type II error.
b.a Type I error.
c.confounds.
d.error variance.
5.If the null hypothesis for an ANOVA is false, then the F-ratio should be
a.greater than 1.00.
b.a negative number.
c.0.00.
d.1.00.
6.If in a between-subjects ANOVA, there are four groups with 15 participants in each group, then the df for the F-ratio is equal to
a.60.
b.59.
c.3, 56.
d.3, 57.
7.For an F-ratio with df = (3, 20), the Fcv for α = .05 would be
a.3.10.
b.4.94.
c.8.66.
d.5.53.
8.If a researcher reported an F-ratio with df = (2, 21) for a between-subjects one-way ANOVA, then there were ______________ conditions in the experiment and ______________ total subjects.
a.2; 21
b.3; 23
c.2; 24
d.3; 24
9.Systematic variance and error variance comprise the ______________ variance.
a.within-groups
b.total
c.between-groups
d.subject
10.If a between-subjects one-way ANOVA produced MSBetween = 25 and MSWithin = 5, then the F-ratio would be
a.25/5 = 5.
b.5/25 = .20.
c.25/30 = .83.
d.30/5 = 6.
Self-Test Problems
Calculate Tukey’s HSD and eta-squared for the following ANOVA.
ANOVA Summary Table
Source | df | SS | MS | F
Between | 2 | 150 | |
Error | 18 | 100 | |
Total | 20 | | |
If you need help getting started with Excel or SPSS, please see Appendix C: Getting Started with Excel and SPSS.
MODULES 13 AND 14 One-Way Between-Subjects ANOVA
The problem we’ll be using to illustrate how to calculate the one-way between-subjects ANOVA appears in Modules 13 and 14.
Let’s use the example from Modules 13 and 14 in which a researcher wants to study the effects of rehearsal type on memory performance. Three types of rehearsal are used (Rote, Imagery, and Story) by three different groups of subjects. The dependent variable is the subjects’ scores on a 10-item test of the material. These scores are listed in Table 13.1 in Module 13.
Using Excel
We’ll use the data from Table 13.1 (in Module 13) to illustrate the use of Excel to compute a one-way between-subjects ANOVA. In this study, we had participants use one of three different types of rehearsal (rote, imagery, or story) and then had them perform a recall task. Thus we manipulated rehearsal and measured memory for the 10 words subjects studied. Because there were different participants in each condition, we use a between-subjects ANOVA. We begin by entering the data into Excel, with the data from each condition appearing in a different column. This can be seen next.
Next, with the Data ribbon highlighted, click on the Data Analysis tab in the top right corner. You should receive the following dialog box:
Select Anova: Single Factor, as in the preceding box, and click OK. The following dialog box will appear:
With the cursor in the Input Range box, highlight the three columns of data so that they are entered into the Input Range box as they are in the preceding box (highlight only the data, not the column headings). Then click OK. The output from the ANOVA will appear on a new Worksheet, as seen next.
You can see from the ANOVA Summary Table provided by Excel that F(2, 21) = 11.06, p = .000523. In addition to the full ANOVA Summary Table, Excel also provides the mean and variance for each condition.
Using SPSS
We’ll again use the data from Table 13.1 (in Module 13) to illustrate the use of SPSS to compute a one-way between-subjects ANOVA. In this study, we had subjects use one of three different types of rehearsal (rote, imagery, or story) and then had them perform a recall task. Thus we manipulated rehearsal and measured memory for the 10 words subjects studied. Because there were different participants in each condition, we use a between-subjects ANOVA. We begin by entering the data into SPSS. The first column is labeled Rehearsaltype and indicates which type of rehearsal the subjects used (1 for rote, 2 for imagery, and 3 for story). The recall data for each of the three conditions appear in the second column, labeled Recall.
Next, click on Analyze, followed by Compare Means, and then One-Way ANOVA, as illustrated next.
You should receive the following dialog box:
Enter Rehearsaltype into the Factor box by highlighting it and using the appropriate arrow. Do the same for Recall by entering it into the Dependent List box. After doing this, the dialog box should appear as follows:
Next click on the Options button and select Descriptive and Continue. Then click on the Post Hoc button and select Tukey and then Continue. Then click on OK. The output from the ANOVA will appear in a new Output window as seen next.
You can see that the descriptive statistics for each condition are provided, followed by the ANOVA Summary Table in which F(2, 21) = 11.065, p = .001. In addition to the full ANOVA Summary Table, SPSS also calculated Tukey’s HSD and provides all pairwise comparisons between the three conditions along with whether or not the comparison was significant.
Using the TI-84
Let’s use the data from Table 13.1 (in Module 13) to conduct the analysis.
1.With the calculator on, press the STAT key.
2.EDIT will be highlighted. Press the ENTER key.
3.Under L1 enter the data from Table 13.1 for the rote group.
4.Under L2 enter the data from Table 13.1 for the imagery group.
5.Under L3 enter the data from Table 13.1 for the story group.
6.Press the STAT key once again and highlight TESTS.
7.Scroll down to ANOVA. Press the ENTER key.
8.Next to “ANOVA” enter (L1,L2,L3) using the 2nd function key with the appropriate number keys. Make sure that you use commas. The finished line should read “ANOVA(L1,L2,L3).”
9.Press ENTER.
The F score of 11.065 should be displayed followed by the significance level of .0005.
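For readers working outside Excel, SPSS, or the TI-84, the same analysis can be reproduced in Python; the sketch below assumes the SciPy and statsmodels libraries are installed and uses their f_oneway and pairwise_tukeyhsd functions as an optional alternative.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rote    = [2, 4, 3, 5, 2, 7, 6, 3]
imagery = [4, 5, 7, 6, 5, 4, 8, 5]
story   = [6, 5, 9, 10, 8, 7, 10, 9]

# One-way between-subjects ANOVA; should match the module's F(2, 21) = 11.07, p = .0005.
f, p = stats.f_oneway(rote, imagery, story)
print(f"F = {f:.3f}, p = {p:.4f}")

# Tukey's HSD for all pairwise comparisons, as SPSS reports.
scores = np.array(rote + imagery + story)
groups = np.array(["rote"] * 8 + ["imagery"] * 8 + ["story"] * 8)
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))
```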