Analysis of Variance

Learning Outcomes
Outcome 1. Understand the basic logic of analysis of variance.
Outcome 2. Perform a hypothesis test for a single-factor design using analysis of variance manually and with the aid of Excel software.
Outcome 3. Conduct and interpret post-analysis of variance pairwise comparison procedures.
Outcome 4. Recognize when randomized block analysis of variance is useful and be able to perform analysis of variance on a randomized block design.
Outcome 5. Perform analysis of variance on a two-factor design of experiments with replications using Excel and interpret the output.

12.1 One-Way Analysis of Variance
A common problem is determining whether three or more populations have equal means.
ANOVA: Analysis of variance.
Completely Randomized Design: An experiment that consists of the independent random selection of observations representing each level of one factor.

One-Way Analysis of Variance
An analysis of variance design in which independent samples are obtained from two or more levels of a single factor for the purpose of testing whether the levels have equal means.
Examples:
Accident rates for 1st, 2nd, and 3rd shifts
Expected mileage for five brands of tires

Introduction to One-Way ANOVA
Factor: A quantity under examination in an experiment as a possible cause of variation in the response variable.
Levels: The categories, measurements, or strata of a factor of interest in the current experiment.
Balanced Design: An experiment has a balanced design if the factor levels have equal sample sizes.

One-Way ANOVA Assumptions
All populations are normally distributed.
The population variances are equal.
The observations are independent; that is, the occurrence of any one individual value does not affect the probability that any other observation will occur.
The data are interval or ratio level.

Hypotheses of One-Way ANOVA
If the null hypothesis is true, the populations have identical distributions, so the sample means for random samples from each population should be close in value.
The null hypothesis should be rejected only if the sample means are substantially different. Rejection does not require every pair to differ; some pairs of means may be the same.

Hypotheses of One-Way ANOVA
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
$$H_A: \text{At least two population means are different}$$

Partitioning the Sum of Squares
Total Variation (SST): The aggregate dispersion of the individual data values across the various factor levels.
Within-Sample Variation (SSW): The dispersion that exists among the data values within a particular factor level.
Between-Sample Variation (SSB): The dispersion among the factor sample means.

Partitioned Sum of Squares
$$SST = SSB + SSW$$
SST: Total sum of squares
SSB: Sum of squares between
SSW: Sum of squares within

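Total Sum of Squares
$$SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{\bar{x}})^2$$
where $k$ is the number of populations (factor levels), $n_i$ is the sample size from population $i$, $x_{ij}$ is the $j$-th measurement from population $i$, and $\bar{\bar{x}}$ is the grand mean of all data values.

Sum of Squares Between
Variation due to differences among groups:
$$SSB = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2$$
where $\bar{x}_i$ is the sample mean from population $i$.

Sum of Squares Within
Variation due to differences within groups:
$$SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$
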
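Mean Squares
Mean Square Between Samples:
$$MSB = \frac{SSB}{k - 1}$$
Mean Square Within Samples:
$$MSW = \frac{SSW}{n_T - k}$$
where $k$ is the number of factor levels and $n_T$ is the total number of observations.
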
One-Way ANOVA Table
Source of Variation   SS    df        MS    F-Ratio
Between Samples       SSB   k - 1     MSB   MSB / MSW
Within Samples        SSW   n_T - k   MSW
Total                 SST   n_T - 1

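The entries of the one-way ANOVA table can be computed directly from raw data. The following is a minimal sketch, assuming the usual definitions (SSB weights each squared deviation of a sample mean from the grand mean by its sample size, SSW sums squared deviations from each sample's own mean). The function name `one_way_anova` and the data are hypothetical and chosen only for illustration.

```python
from statistics import mean

def one_way_anova(samples):
    """Return one-way ANOVA quantities; `samples` holds one list per factor level."""
    k = len(samples)                          # number of factor levels
    n_total = sum(len(s) for s in samples)    # total number of observations
    grand_mean = sum(sum(s) for s in samples) / n_total

    # Between-sample variation: sample means versus the grand mean
    ssb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within-sample variation: each value versus its own sample mean
    ssw = sum((x - mean(s)) ** 2 for s in samples for x in s)
    # Total variation, computed directly as a check on SST = SSB + SSW
    sst = sum((x - grand_mean) ** 2 for s in samples for x in s)

    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return {"SSB": ssb, "SSW": ssw, "SST": sst,
            "MSB": msb, "MSW": msw, "F": msb / msw}

# Hypothetical data: three factor levels, four observations each
data = [[4, 5, 6, 5], [7, 8, 9, 8], [5, 6, 7, 6]]
table = one_way_anova(data)
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

Computing SST separately rather than as SSB + SSW gives a quick arithmetic check on the partition.
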
One-Way ANOVA - Example
A company does business in four locations. The VP of sales is interested in knowing whether the dollar value of orders made by individual customers differs, on average, among the four locations.
Business Location   1       2       3       4       Grand Mean
Mean                7.00    9.00    8.00    9.00    8.25
Variance            7.341   8.423   7.632   5.016
n                   8       8       8       8

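One-Way ANOVA - Example
Source of Variation   SS       df   MS     F-Ratio
Between Samples       22.00    3    7.33   1.03
Within Samples        198.88   28   7.10
Total                 220.88   31
Draw a conclusion: The F-ratio of 1.03 is well below the critical F value for 3 and 28 degrees of freedom at, for example, α = 0.05 (approximately 2.95), so the null hypothesis of equal mean order values across the four locations is not rejected.
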
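The table values for this example can be reproduced from the summary statistics alone (means, variances, and sample sizes for the four locations). This is a sketch under the assumption that the reported variances are sample variances, so each sample contributes (n - 1) times its variance to SSW; the variable names are illustrative.

```python
# Summary statistics from the four-location example
means = [7.00, 9.00, 8.00, 9.00]
variances = [7.341, 8.423, 7.632, 5.016]   # assumed to be sample variances
sizes = [8, 8, 8, 8]

k = len(means)
n_total = sum(sizes)
grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

# SSB from the sample means, SSW from the sample variances
ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
ssw = sum((n - 1) * v for n, v in zip(sizes, variances))
sst = ssb + ssw

msb = ssb / (k - 1)
msw = ssw / (n_total - k)
f_ratio = msb / msw

print(f"SSB = {ssb:.2f}, SSW = {ssw:.2f}, SST = {sst:.2f}")
print(f"MSB = {msb:.2f}, MSW = {msw:.2f}, F = {f_ratio:.2f}")
```
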
One-Way Analysis of Variance
Step 5: Determine the decision rule.
Step 6: Compute the total sum of squares, sum of squares between, and sum of squares within, and complete the ANOVA table.
Step 7: Reach a decision.
Step 8: Draw a conclusion.

How to Do It in Excel?
1. Open file.
2. Select Data > Data Analysis.
3. Select ANOVA: Single Factor.
4. Define data range.
5. Specify Alpha.
6. Indicate output choice.
7. Click OK.

The Tukey-Kramer Procedure for Multiple Comparisons
A method for testing which populations have different means after the one-way ANOVA has rejected the null hypothesis.
Experiment-Wide Error Rate: The proportion of experiments in which at least one of the set of confidence intervals constructed does not contain the true value of the population parameter being estimated.

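To see why the experiment-wide error rate matters, consider how quickly it grows with the number of pairwise comparisons. The sketch below assumes the comparisons are independent, each made at level alpha, so that the rate is 1 - (1 - alpha)^c for c comparisons. Pairwise comparisons that share samples are not truly independent, so this is only an illustration of the inflation, not the exact rate; the function names are hypothetical.

```python
def pairwise_count(k):
    """Number of distinct pairs among k populations."""
    return k * (k - 1) // 2

def experimentwide_error_rate(alpha, comparisons):
    """Chance of at least one error across independent comparisons at level alpha."""
    return 1 - (1 - alpha) ** comparisons

for k in (3, 4, 5):
    c = pairwise_count(k)
    rate = experimentwide_error_rate(0.05, c)
    print(f"k = {k}: {c} comparisons, experiment-wide rate about {rate:.3f}")
```
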
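Tukey-Kramer Critical Range
$$\text{Critical range} = q_{1-\alpha} \sqrt{\frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$
where $q_{1-\alpha}$ is the value from the studentized range table with $D_1 = k$ and $D_2 = n_T - k$ degrees of freedom, $MSW$ is the mean square within, and $n_i$ and $n_j$ are the sample sizes from populations $i$ and $j$.
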
The Tukey-Kramer Procedure for Multiple Comparisons - Example
Step 1: Specify the parameter(s) of interest.
Step 2: Formulate the appropriate null and alternative hypotheses.
Step 3: Specify the significance level for the test.
Step 4: Select independent simple random samples from each population.
Step 5: Check to see that the normality and equal-variance assumptions have been satisfied.

The Tukey-Kramer Procedure for Multiple Comparisons - Example
Step 6: Determine the decision rule.
Step 7: Use Excel to construct the ANOVA table.
Step 8: Reach a decision.
Step 9: Draw a conclusion.
Step 10: Use the Tukey-Kramer test to determine which populations have different means.

The Tukey-Kramer Procedure for Multiple Comparisons - Example
A company distributes weight-loss-enhancing products. A total of 263 people were studied: 89 received a placebo, 91 received product 1, and 83 received product 2. At the end of six weeks, each subject's weight loss was recorded. The company was hoping to find statistical evidence that at least one of the products is an effective weight-loss aid. The following absolute differences of sample means were obtained:

The Tukey-Kramer Procedure for Multiple Comparisons - Example
Comparison                Absolute Difference   Critical Range   Significant?
Placebo vs. Product 1     4.20                  1.785            Yes
Placebo vs. Product 2     4.33                  1.827            Yes
Product 1 vs. Product 2   0.13                  1.818            No

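The decision in the table is a simple comparison: a pair of means differs significantly when the absolute difference exceeds the critical range for that pair. The sketch below applies that rule to the values in the example. The `critical_range` helper assumes the usual Tukey-Kramer form, q times the square root of (MSW / 2)(1/n_i + 1/n_j); the q and MSW values passed to it at the end are hypothetical placeholders, since the example does not report them.

```python
from math import sqrt

def critical_range(q, msw, n_i, n_j):
    """Tukey-Kramer critical range for one pair (q comes from a studentized range table)."""
    return q * sqrt((msw / 2) * (1 / n_i + 1 / n_j))

# (absolute difference, critical range) as reported in the example
comparisons = {
    "Placebo vs. Product 1": (4.20, 1.785),
    "Placebo vs. Product 2": (4.33, 1.827),
    "Product 1 vs. Product 2": (0.13, 1.818),
}

# A difference is significant when it exceeds its critical range
significant = {pair: diff > cr for pair, (diff, cr) in comparisons.items()}
for pair, flag in significant.items():
    print(f"{pair}: {'Yes' if flag else 'No'}")

# Hypothetical inputs, only to show how the helper is called
example_range = critical_range(q=3.5, msw=10.0, n_i=10, n_j=12)
```

Because the sample sizes are unequal, each pair gets its own critical range, which is why the three ranges in the table differ slightly.
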
12.2 Randomized Complete Block Analysis of Variance
The one-way ANOVA method is appropriate as long as we are interested in analyzing one factor at a time.
There are situations in which another factor may affect the observed response in a one-way design.
When an additional factor with two or more levels is involved, a design technique called blocking can be used to eliminate the additional factor's effect on the statistical analysis of the main factor of interest.

Randomized Complete Block ANOVA
Assumptions:
The populations are normally distributed.
The populations have equal variances.
The observations within samples are independent.
The data measurement must be interval or ratio level.
Examples:
Testing 5 routes to a destination through 3 different cab companies to see if differences exist
Determining the best training program (out of 4 choices) for various departments within a company

Randomized Complete Block ANOVA
Sum of Squares Partitioning for Randomized Complete Block Design:
$$SST = SSB + SSBL + SSW$$
SST: Total sum of squares
SSB: Sum of squares between factor levels
SSBL: Sum of squares between blocks
SSW: Sum of squares within levels

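Sum of Squares for Blocking
$$SSBL = \sum_{j=1}^{b} k (\bar{x}_j - \bar{\bar{x}})^2$$
where $k$ is the number of factor levels, $b$ is the number of blocks, $\bar{x}_j$ is the mean of block $j$, and $\bar{\bar{x}}$ is the grand mean.
Note: If the corresponding variation in the blocks is significant, the variation within the factor levels will be significantly reduced. This can make it easier to detect a difference in the population means if such a difference actually exists.
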
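Mean Squares and F-Ratios
Mean Square Blocking:
$$MSBL = \frac{SSBL}{b - 1}$$
Mean Square Between Samples:
$$MSB = \frac{SSB}{k - 1}$$
Mean Square Within Samples:
$$MSW = \frac{SSW}{(k - 1)(b - 1)}$$
F-Ratios:
$$F_{\text{blocks}} = \frac{MSBL}{MSW}, \qquad F_{\text{main}} = \frac{MSB}{MSW}$$
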
Randomized Block ANOVA Table
Source of Variation   SS     df               MS     F-Ratio
Between Blocks        SSBL   b - 1            MSBL   MSBL / MSW
Between Samples       SSB    k - 1            MSB    MSB / MSW
Within Samples        SSW    (k - 1)(b - 1)   MSW
Total                 SST    n_T - 1

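Primary and Secondary Tests
Primary or Main Factor Test:
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
$$H_A: \text{At least two population means are different}$$
Secondary or Blocking Factor Test:
$$H_0: \mu_{b1} = \mu_{b2} = \cdots = \mu_{bb}$$
$$H_A: \text{Not all block means are equal}$$
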
Randomized Complete Block ANOVA
Step 1: Specify the parameter of interest and formulate the appropriate null and alternative hypotheses.
Step 2: Specify the level of significance for conducting the tests.
Step 3: Select simple random samples from each population, and compute treatment means, block means, and the grand mean.
Step 4: Compute the sums of squares and complete the ANOVA table.
Step 5: Test to determine whether blocking is effective.
Step 6: Conduct the main hypothesis test to determine whether the populations have equal means.

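Steps 3 and 4 above can be sketched in a few lines. This is a minimal illustration, assuming one observation per block and factor level and obtaining SSW as the remainder SST - (SSB + SSBL); the function name `randomized_block_anova` and the data are hypothetical.

```python
def randomized_block_anova(data):
    """data[j][i] is the observation for block j (row) and factor level i (column)."""
    b = len(data)        # number of blocks
    k = len(data[0])     # number of factor levels
    n_total = b * k

    # Step 3: treatment (level) means, block means, and the grand mean
    grand_mean = sum(sum(row) for row in data) / n_total
    level_means = [sum(row[i] for row in data) / b for i in range(k)]
    block_means = [sum(row) / k for row in data]

    # Step 4: sums of squares
    sst = sum((x - grand_mean) ** 2 for row in data for x in row)
    ssb = sum(b * (m - grand_mean) ** 2 for m in level_means)
    ssbl = sum(k * (m - grand_mean) ** 2 for m in block_means)
    ssw = sst - (ssb + ssbl)

    # Mean squares and the two F-ratios (blocking test and main factor test)
    msbl = ssbl / (b - 1)
    msb = ssb / (k - 1)
    msw = ssw / ((k - 1) * (b - 1))
    return {"SST": sst, "SSB": ssb, "SSBL": ssbl, "SSW": ssw,
            "F_blocks": msbl / msw, "F_main": msb / msw}

# Hypothetical data: 4 blocks (rows) by 3 factor levels (columns)
data = [[10, 12, 14],
        [11, 14, 15],
        [9, 11, 16],
        [12, 13, 17]]
result = randomized_block_anova(data)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The two F-ratios are then compared with their critical values, first for the blocking test (Step 5) and then for the main factor test (Step 6).
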
Randomized Complete Block ANOVA - Example
A professor has developed three different midterm exams that are to be graded on a 1,000-point scale. Before she uses the exams in a live class, she wants to determine whether the tests will yield the same mean scores. To test this, a random sample of fourteen students is selected. Each student will take each test, so the students serve as the blocks and the three exams are the factor levels.

Randomized Complete Block ANOVA - Example
Source of Variation   SS          df   MS          F-Ratio   F-Critical
Between Blocks        116,605.0   13   8,969.6     0.9105    2.15
Between Samples       241,912.7   2    120,956.4   12.2787   3.40
Within Samples        256,123.9   26   9,850.9
Total                 614,641.6   41

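The mean squares and F-ratios in this table follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them and applies the two decision rules using the F-critical values reported in the table; the variable names are illustrative.

```python
# Sums of squares from the midterm exam example
ss_blocks, ss_between, ss_within = 116605.0, 241912.7, 256123.9
b, k = 14, 3   # 14 students (blocks), 3 exams (factor levels)

df_blocks = b - 1
df_between = k - 1
df_within = (b - 1) * (k - 1)

ms_blocks = ss_blocks / df_blocks
ms_between = ss_between / df_between
ms_within = ss_within / df_within

f_blocks = ms_blocks / ms_within    # blocking test
f_main = ms_between / ms_within     # main factor test

# Critical values as reported in the table
f_crit_blocks, f_crit_main = 2.15, 3.40
blocking_effective = f_blocks > f_crit_blocks
reject_equal_means = f_main > f_crit_main

print(f"F (blocks) = {f_blocks:.4f}, blocking effective: {blocking_effective}")
print(f"F (main)   = {f_main:.4f}, reject equal means: {reject_equal_means}")
```

With these values the blocking F-ratio falls below its critical value, so blocking was not effective here, while the main factor F-ratio exceeds its critical value, so the hypothesis that the three exams have equal mean scores is rejected.
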
How to Do It in Excel?
1. Open file.
2. Select Data > Data Analysis.
3. Select ANOVA: Two Factor Without Replication.
4. Define data range.
5. Specify Alpha.
6. Indicate output choice.
7. Click OK.
The output reports both the blocking test and the main factor test.

Fisher's Least Significant Difference Test
Even if the null hypothesis of equal population means is rejected, the ANOVA does not specify which population means are different.
Fisher's test is one test for multiple comparisons that can be used for a randomized block ANOVA design.

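Fisher's Least Significant Difference Test
$$LSD = t_{\alpha/2} \sqrt{MSW} \sqrt{\frac{2}{b}}$$
where $t_{\alpha/2}$ is the one-tailed value from Student's t-distribution for $\alpha/2$ with $(k - 1)(b - 1)$ degrees of freedom, $MSW$ is the mean square within from the ANOVA table, $b$ is the number of blocks, and $k$ is the number of factor levels.
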
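Fisher's Least Significant Difference Test - Example
Note: The previous example is used.
Step 1: Compute the LSD statistic. With MSW = 9,850.9, b = 14 blocks, and t = 2.0555 (α = 0.05, 26 degrees of freedom):
$$LSD = 2.0555 \sqrt{9{,}850.9} \sqrt{\frac{2}{14}} = 77.11$$
Step 2: Compute the sample means from each population.
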
Fisher's Least Significant Difference Test - Example
Step 3: Form all possible contrasts by finding the absolute differences between all pairs of sample means. Compare these to the LSD value.
Absolute Difference   Comparison       Significant?
55.93                 55.93 < 77.11    No
125.57                125.57 > 77.11   Yes
181.50                181.50 > 77.11   Yes

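The LSD value of 77.11 can be reproduced from the randomized block ANOVA table of the previous example. The sketch below assumes the usual randomized-block form of the statistic, t times the square root of MSW times the square root of 2/b, and a significance level of 0.05, for which the t-table value with 26 degrees of freedom is 2.0555. Both the formula and the alpha level are assumptions here, although together they match the value used in the comparisons above.

```python
from math import sqrt

def fisher_lsd(t_crit, msw, b):
    """Least significant difference for a randomized block design with b blocks."""
    return t_crit * sqrt(msw) * sqrt(2 / b)

t_crit = 2.0555   # t-table value for alpha/2 = 0.025 and 26 df (alpha = 0.05 assumed)
msw = 9850.9      # mean square within from the ANOVA table
b = 14            # number of blocks (students)

lsd = fisher_lsd(t_crit, msw, b)

# Absolute differences between pairs of sample means, as listed in Step 3
differences = [55.93, 125.57, 181.50]
significant = [d > lsd for d in differences]

print(f"LSD = {lsd:.2f}")
for d, flag in zip(differences, significant):
    print(f"{d:.2f}: {'Yes' if flag else 'No'}")
```
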
12.3 Two-Factor Analysis of Variance with Replication
There are many situations in which two or more factors are of interest in the same study.
Two-factor ANOVA follows the same logic as one-way ANOVA.
All combinations of the levels of the two factors (A and B) are considered.
The multiple measurements taken at each combination are called replications.
Example: miles redemption method (A) and age group (B)

Partitioning the Total Sum of Squares
One part is due to differences in the levels of factor A ($SS_A$).
Another part is due to the levels of factor B ($SS_B$).
Another part is due to the interaction between factor A and factor B ($SS_{AB}$).
The final component making up the total sum of squares is the sum of squares due to the inherent random variation in the data ($SSE$).
$$SST = SS_A + SS_B + SS_{AB} + SSE$$

Assumptions for Two-Factor ANOVA
The population values for each combination of pairwise factor levels are normally distributed.
The variances for each population are equal.
The samples are independent.
The data measurement is interval or ratio level.

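Hypotheses in Two-Factor ANOVA
Factor A:
$$H_0: \mu_{A1} = \mu_{A2} = \cdots = \mu_{Aa}$$
$$H_A: \text{Not all factor A means are equal}$$
Factor B:
$$H_0: \mu_{B1} = \mu_{B2} = \cdots = \mu_{Bb}$$
$$H_A: \text{Not all factor B means are equal}$$
Interaction:
$$H_0: \text{Factors A and B do not interact to affect the mean response}$$
$$H_A: \text{Factors A and B do interact}$$
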
Two-Factor ANOVA Table
Source of Variation   SS      df               MS      F-Ratio
Factor A              SS_A    a - 1            MS_A    MS_A / MSE
Factor B              SS_B    b - 1            MS_B    MS_B / MSE
AB Interaction        SS_AB   (a - 1)(b - 1)   MS_AB   MS_AB / MSE
Error                 SSE     n_T - ab         MSE
Total                 SST     n_T - 1

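Two-Factor ANOVA Equations
Here $a$ is the number of levels of factor A, $b$ is the number of levels of factor B, $n$ is the number of replications in each cell, and $n_T = abn$ is the total number of observations.
Total Sum of Squares:
$$SST = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (x_{ijk} - \bar{\bar{x}})^2$$
Sum of Squares Factor A:
$$SS_A = bn \sum_{i=1}^{a} (\bar{x}_{i\cdot\cdot} - \bar{\bar{x}})^2$$
Sum of Squares Factor B:
$$SS_B = an \sum_{j=1}^{b} (\bar{x}_{\cdot j \cdot} - \bar{\bar{x}})^2$$
Sum of Squares Interaction:
$$SS_{AB} = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{x}_{ij\cdot} - \bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot j \cdot} + \bar{\bar{x}})^2$$
Sum of Squares Error:
$$SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (x_{ijk} - \bar{x}_{ij\cdot})^2$$

Two-Factor ANOVA Equations
Grand mean:
$$\bar{\bar{x}} = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} x_{ijk}}{abn}$$
Mean of each level of factor A:
$$\bar{x}_{i\cdot\cdot} = \frac{\sum_{j=1}^{b} \sum_{k=1}^{n} x_{ijk}}{bn}$$
Mean of each level of factor B:
$$\bar{x}_{\cdot j \cdot} = \frac{\sum_{i=1}^{a} \sum_{k=1}^{n} x_{ijk}}{an}$$
Mean of each cell:
$$\bar{x}_{ij\cdot} = \frac{\sum_{k=1}^{n} x_{ijk}}{n}$$

Two-Factor ANOVA Equations
Mean square factor A:
$$MS_A = \frac{SS_A}{a - 1}$$
Mean square factor B:
$$MS_B = \frac{SS_B}{b - 1}$$
Mean square interaction:
$$MS_{AB} = \frac{SS_{AB}}{(a - 1)(b - 1)}$$
Mean square error:
$$MSE = \frac{SSE}{n_T - ab}$$

Two-Way ANOVA: The F-Test Statistic
Factor A main effect:
$$F = \frac{MS_A}{MSE}, \quad D_1 = a - 1, \; D_2 = n_T - ab$$
Factor B main effect:
$$F = \frac{MS_B}{MSE}, \quad D_1 = b - 1, \; D_2 = n_T - ab$$
Interaction effect:
$$F = \frac{MS_{AB}}{MSE}, \quad D_1 = (a - 1)(b - 1), \; D_2 = n_T - ab$$
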
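The two-factor computations can be sketched for a balanced design as follows. This is an illustration only, assuming the standard definitions: factor sums of squares from the level means, interaction from the cell means, error from deviations within each cell, and all three F-ratios using MSE as the denominator. The function name `two_factor_anova` and the data are hypothetical.

```python
def two_factor_anova(cells):
    """cells[i][j] holds the n replications for level i of factor A and level j of factor B."""
    a = len(cells)          # levels of factor A
    b = len(cells[0])       # levels of factor B
    n = len(cells[0][0])    # replications per cell (balanced design assumed)
    n_total = a * b * n

    grand = sum(x for row in cells for cell in row for x in cell) / n_total
    mean_a = [sum(x for cell in cells[i] for x in cell) / (b * n) for i in range(a)]
    mean_b = [sum(x for i in range(a) for x in cells[i][j]) / (a * n) for j in range(b)]
    mean_cell = [[sum(cells[i][j]) / n for j in range(b)] for i in range(a)]

    sst = sum((x - grand) ** 2 for row in cells for cell in row for x in cell)
    ss_a = b * n * sum((m - grand) ** 2 for m in mean_a)
    ss_b = a * n * sum((m - grand) ** 2 for m in mean_b)
    ss_ab = n * sum((mean_cell[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    sse = sum((x - mean_cell[i][j]) ** 2
              for i in range(a) for j in range(b) for x in cells[i][j])

    mse = sse / (n_total - a * b)
    return {"SST": sst, "SS_A": ss_a, "SS_B": ss_b, "SS_AB": ss_ab, "SSE": sse,
            "F_A": (ss_a / (a - 1)) / mse,
            "F_B": (ss_b / (b - 1)) / mse,
            "F_AB": (ss_ab / ((a - 1) * (b - 1))) / mse}

# Hypothetical data: 2 levels of A, 2 levels of B, 2 replications per cell
cells = [[[1, 3], [5, 7]],
         [[2, 4], [10, 12]]]
result = two_factor_anova(cells)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Each F-ratio is then compared with the critical F value for its own numerator degrees of freedom and the error degrees of freedom.
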
Two-Factor ANOVA - Example
An airline company is concerned because many of its frequent flier program members have accumulated large quantities of free miles. The company conducted an experiment in which each of three methods for redeeming frequent flier miles was offered to a sample of 16 customers divided into four age groups.
Factor A is the redemption offer type, with three levels.
Factor B is the age group of each customer, with four levels.

Two-Factor ANOVA - Example
Summary in Excel
1. Open file.
2. Select Data > Data Analysis.
3. Select ANOVA: Two Factor With Replication.
4. Define data range (include factor A and B labels).
5. Specify the number of rows per sample: 4.
6. Specify Alpha.
7. Indicate output range.
8. Click OK.

Two-Factor ANOVA Table - Example Results

Interaction
[Figure: two panels, "No Interaction" and "Interaction Is Present"]

Interaction
Test for interaction first.
If interaction is present, conduct a one-way ANOVA to test the levels of one factor using only one level of the other factor.
If there is no interaction, test Factor A and Factor B.
