Analysis of Variance. Copyright 2014 Pearson Education, Inc.

Learning Outcomes
Outcome 1. Understand the basic logic of analysis of variance.
Outcome 2. Perform a hypothesis test for a single-factor design using analysis of variance manually and with the aid of Excel software.
Outcome 3. Conduct and interpret post-analysis of variance pairwise comparison procedures.
Outcome 4. Recognize when randomized block analysis of variance is useful and be able to perform analysis of variance on a randomized block design.
Outcome 5. Perform analysis of variance on a two-factor design of experiments with replications using Excel and interpret the output.

12.1 One-Way Analysis of Variance
A common situation is the need to determine whether three or more populations have equal means. Analysis of variance (ANOVA) is the method used to test this.
Completely Randomized Design: An experiment that consists of the independent random selection of observations representing each level of one factor.

One-Way Analysis of Variance
An analysis of variance design in which independent samples are obtained from two or more levels of a single factor for the purpose of testing whether the levels have equal means.
Examples:
Accident rates for 1st, 2nd, and 3rd shift.
Expected mileage for five brands of tires.

Introduction to One-Way ANOVA
Factor: A quantity under examination in an experiment as a possible cause of variation in the response variable.
Levels: The categories, measurements, or strata of a factor of interest in the current experiment.
Balanced Design: An experiment has a balanced design if the factor levels have equal sample sizes.

One-Way ANOVA Assumptions
All populations are normally distributed.
The population variances are equal.
The observations are independent; that is, the occurrence of any one individual value does not affect the probability that any other observation will occur.
The data are interval or ratio level.

Hypotheses of One-Way ANOVA
If the null hypothesis is true, the populations have identical distributions, so the sample means for random samples from each population should be close in value.
The null hypothesis should be rejected only if the sample means are substantially different (some pairs may still be the same).

Hypotheses of One-Way ANOVA
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
$$H_A: \text{at least two population means are different}$$

Partitioning the Sum of Squares
Total Variation (SST): The aggregate dispersion of the individual data values across the various factor levels.
Within-Sample Variation (SSW): The dispersion that exists among the data values within a particular factor level.
Between-Sample Variation (SSB): The dispersion among the factor sample means.

Partitioned Sum of Squares
$$SST = SSB + SSW$$
SST: Total sum of squares.
SSB: Sum of squares between.
SSW: Sum of squares within.

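Total Sum of Squares
$$SST = \sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{\bar{x}}\right)^2$$
where $k$ is the number of populations (factor levels), $n_i$ is the sample size from population $i$, $x_{ij}$ is the $j$th observation from population $i$, and $\bar{\bar{x}}$ is the grand mean of all observations.

Sum of Squares Between
$$SSB = \sum_{i=1}^{k} n_i\left(\bar{x}_i-\bar{\bar{x}}\right)^2$$
where $\bar{x}_i$ is the sample mean from population $i$. This is the variation due to differences among groups.

Sum of Squares Within
$$SSW = \sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}_i\right)^2$$
This is the variation due to differences within groups.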
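The three sums of squares above can be checked numerically. The following is a minimal Python sketch, assuming made-up observations for three factor levels (the data and the variable names are illustrative and do not come from the slides). It computes SST, SSB, and SSW directly from their definitions and shows that SST equals SSB plus SSW.

```python
# Hypothetical observations for three factor levels (not from the slides).
samples = [
    [4.0, 5.0, 6.0],
    [7.0, 9.0, 8.0, 8.0],
    [5.0, 6.0, 7.0],
]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


all_values = [x for group in samples for x in group]
grand_mean = mean(all_values)

# SST: squared deviations of every observation from the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)

# SSB: squared deviations of each sample mean from the grand mean,
# weighted by that sample's size n_i.
ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)

# SSW: squared deviations of each observation from its own sample mean.
ssw = sum((x - mean(g)) ** 2 for g in samples for x in g)

print(f"SST = {sst:.3f}, SSB = {ssb:.3f}, SSW = {ssw:.3f}")
print(f"SSB + SSW = {ssb + ssw:.3f}")
```

The samples here have unequal sizes on purpose, to show that the partition does not require a balanced design.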

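Mean Squares
Mean Square Between Samples:
$$MSB = \frac{SSB}{k-1}$$
Mean Square Within Samples:
$$MSW = \frac{SSW}{n_T-k}$$
where $k$ is the number of populations and $n_T$ is the total number of observations across all samples.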

One-Way ANOVA Table
Source of Variation   SS    df        MS    F-Ratio
Between Samples       SSB   k - 1     MSB   MSB/MSW
Within Samples        SSW   n_T - k   MSW
Total                 SST   n_T - 1

One-Way ANOVA - Example
A company runs its business in several locations. The VP of sales for the company is interested in knowing whether the dollar value for orders made by individual customers differs, on average, among the four locations.
Business Location   1       2       3       4
Mean                7.00    9.00    8.00    9.00
Variance            7.341   8.423   7.632   5.016
n                   8       8       8       8
Grand mean: 8.25

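One-Way ANOVA - Example
Source of Variation   SS       df   MS     F-Ratio
Between Samples       22.00    3    7.33   1.03
Within Samples        198.88   28   7.10
Total                 220.88   31
The degrees of freedom follow from k = 4 locations and n_T = 32 observations, and the F-ratio is MSB/MSW = 7.33/7.10.
Draw a conclusion: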
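The table entries for this example can be reproduced from the summary statistics alone. The sketch below uses the means, variances, and sample sizes given for the four business locations. The variable names are illustrative, and the approach relies on the fact that each sample variance times (n - 1) gives that sample's within-group sum of squares.

```python
# Summary statistics for the four business locations, as given in the example.
means = [7.00, 9.00, 8.00, 9.00]
variances = [7.341, 8.423, 7.632, 5.016]
sizes = [8, 8, 8, 8]

k = len(means)        # number of factor levels
n_total = sum(sizes)  # total number of observations

# Grand mean: the size-weighted average of the sample means.
grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

# SSB from the sample means; SSW from the sample variances,
# because (n_i - 1) * s_i^2 is the sum of squares within sample i.
ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
ssw = sum((n - 1) * v for n, v in zip(sizes, variances))
sst = ssb + ssw

df_between = k - 1
df_within = n_total - k

msb = ssb / df_between
msw = ssw / df_within
f_ratio = msb / msw

print(f"SSB = {ssb:.2f}, df = {df_between}, MSB = {msb:.2f}")
print(f"SSW = {ssw:.2f}, df = {df_within}, MSW = {msw:.2f}")
print(f"SST = {sst:.2f}, F = {f_ratio:.2f}")
```

To draw the conclusion, the F-ratio would be compared with the critical F value for 3 and 28 degrees of freedom at the chosen significance level, which the slide does not state.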

One-Way Analysis of Variance
Step 5: Determine the decision rule.
Step 6: Compute the total sum of squares, sum of squares between, and sum of squares within, and complete the ANOVA table.
Step 7: Reach a decision.
Step 8: Draw a conclusion.

How to Do It in Excel?
1. Open file.
2. Select Data > Data Analysis.
3. Select ANOVA: Single Factor.
4. Define data range.
5. Specify Alpha.
6. Indicate output choice.
7. Click OK.

The Tukey-Kramer Procedure for Multiple Comparisons
A method for testing which populations have different means after the one-way ANOVA has rejected the null hypothesis.
Experiment-Wide Error Rate: The proportion of experiments in which at least one of the set of confidence intervals constructed does not contain the true value of the population parameter being estimated.

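Tukey-Kramer Critical Range
$$\text{Critical range} = q_{1-\alpha}\sqrt{\frac{MSW}{2}\left(\frac{1}{n_i}+\frac{1}{n_j}\right)}$$
where $q_{1-\alpha}$ is the value from the studentized range distribution with $k$ and $n_T-k$ degrees of freedom, $MSW$ is the mean square within, and $n_i$ and $n_j$ are the sample sizes of the two populations being compared.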

The Tukey-Kramer Procedure for Multiple Comparisons - Example
Step 1: Specify the parameter(s) of interest.
Step 2: Formulate the appropriate null and alternative hypotheses.
Step 3: Specify the significance level for the test.
Step 4: Select independent simple random samples from each population.
Step 5: Check to see that the normality and equal-variance assumptions have been satisfied.

The Tukey-Kramer Procedure for Multiple Comparisons - Example
Step 6: Determine the decision rule.
Step 7: Use Excel to construct the ANOVA table.
Step 8: Reach a decision.
Step 9: Draw a conclusion.
Step 10: Use the Tukey-Kramer test to determine which populations have different means.

The Tukey-Kramer Procedure for Multiple Comparisons - Example
A company distributes weight-loss-enhancing products. A total of 263 people were studied: 89 received a placebo, 91 received product 1, and 83 received product 2. At the end of six weeks, the subjects' weight loss was recorded. The company was hoping to find statistical evidence that at least one of the products is an effective weight-loss aid. The following absolute differences of sample means were obtained:

The Tukey-Kramer Procedure for Multiple Comparisons - Example
Comparison                Absolute Difference   Critical Range   Significant?
Placebo vs. Product 1     4.20                  1.785            Yes
Placebo vs. Product 2     4.33                  1.827            Yes
Product 1 vs. Product 2   0.13                  1.818            No
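The critical ranges differ from pair to pair only because the sample sizes differ. The sketch below illustrates this under one stated assumption: the example does not give the studentized range value q or MSW, so their combined factor q * sqrt(MSW / 2) is backed out from the first reported critical range (1.785) and then reused for the other two pairs. The function name `critical_range` and the variable `scale` are illustrative.

```python
import math

# Sample sizes from the example.
sizes = {"Placebo": 89, "Product 1": 91, "Product 2": 83}

# Reported (absolute difference, critical range) for each pair.
reported = {
    ("Placebo", "Product 1"): (4.20, 1.785),
    ("Placebo", "Product 2"): (4.33, 1.827),
    ("Product 1", "Product 2"): (0.13, 1.818),
}


def critical_range(scale, n_i, n_j):
    """Tukey-Kramer critical range, where scale = q * sqrt(MSW / 2)."""
    return scale * math.sqrt(1 / n_i + 1 / n_j)


# Assumption: q and MSW are not given, so recover their combined factor
# from the first pair's reported critical range.
first = ("Placebo", "Product 1")
scale = reported[first][1] / math.sqrt(1 / sizes[first[0]] + 1 / sizes[first[1]])

results = {}
for (g1, g2), (diff, reported_range) in reported.items():
    implied = critical_range(scale, sizes[g1], sizes[g2])
    results[(g1, g2)] = (implied, diff > implied)
    print(f"{g1} vs. {g2}: implied range {implied:.3f} "
          f"(reported {reported_range}), significant: {diff > implied}")
```

The implied ranges agree with the reported ones to within rounding, and the significance decisions match the table: both products differ from the placebo, but not from each other.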

12.2 Randomized Complete Block Analysis of Variance
The one-way ANOVA method is appropriate as long as we are interested in analyzing one factor at a time.
There are situations in which another factor may affect the observed response in a one-way design.
When an additional factor with two or more levels is involved, a design technique called blocking can be used to eliminate the additional factor's effect on the statistical analysis of the main factor of interest.

Randomized Complete Block ANOVA
Assumptions:
The populations are normally distributed.
The populations have equal variances.
The observations within samples are independent.
The data measurement must be interval or ratio level.
Examples:
Testing 5 routes to a destination through 3 different cab companies to see if differences exist.
Determining the best training program (out of 4 choices) for various departments within a company.

Randomized Complete Block ANOVA
Sum of Squares Partitioning for Randomized Complete Block Design:
$$SST = SSB + SSBL + SSW$$
SST: Total sum of squares.
SSB: Sum of squares between factor levels.
SSBL: Sum of squares between blocks.
SSW: Sum of squares within levels.

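Sum of Squares for Blocking
$$SSBL = \sum_{j=1}^{b} k\left(\bar{x}_j-\bar{\bar{x}}\right)^2$$
where $k$ is the number of levels of the factor of interest, $b$ is the number of blocks, $\bar{x}_j$ is the mean of block $j$, and $\bar{\bar{x}}$ is the grand mean.
Note: If the corresponding variation in the blocks is significant, the variation within the factor levels will be significantly reduced. This can make it easier to detect a difference in the population means if such a difference actually exists.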

Mean Squares
Mean Square Blocking:
$$MSBL = \frac{SSBL}{b-1}$$
Mean Square Between Samples:
$$MSB = \frac{SSB}{k-1}$$
Mean Square Within Samples:
$$MSW = \frac{SSW}{(k-1)(b-1)}$$

Randomized Block ANOVA Table
Source of Variation   SS     df               MS     F-Ratio
Between Blocks        SSBL   b - 1            MSBL   MSBL/MSW
Between Samples       SSB    k - 1            MSB    MSB/MSW
Within Samples        SSW    (k - 1)(b - 1)   MSW
Total                 SST    n_T - 1

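Primary and Secondary Tests
Primary or Main Factor Test:
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k \qquad H_A: \text{not all population means are equal}$$
Secondary or Blocking Factor Test:
$$H_0: \mu_{b1} = \mu_{b2} = \cdots = \mu_{bb} \qquad H_A: \text{not all block means are equal}$$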

Randomized Complete Block ANOVA
Step 1: Specify the parameter of interest and formulate the appropriate null and alternative hypotheses.
Step 2: Specify the level of significance for conducting the tests.
Step 3: Select simple random samples from each population, and compute treatment means, block means, and the grand mean.
Step 4: Compute the sums of squares and complete the ANOVA table.
Step 5: Test to determine whether blocking is effective.
Step 6: Conduct the main hypothesis test to determine whether the populations have equal means.

Randomized Complete Block ANOVA - Example
A professor has developed three different midterm exams that are to be graded on a 1,000-point scale. Before she uses the exams in a live class, she wants to determine whether the tests will yield the same mean scores. To test this, a random sample of fourteen students is selected. Each student will take each test, so the students serve as the blocks.

Randomized Complete Block ANOVA - Example
Source of Variation   SS          df   MS          F-Ratio   F-Critical
Between Blocks        116,605.0   13   8,969.6     0.9105    2.15
Between Samples       241,912.7   2    120,956.4   12.2787   3.40
Within Samples        256,123.9   26   9,850.9
Total                 614,641.6   41
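The mean squares and F-ratios in this table follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them and applies the two decision rules using the F-critical values shown in the table. The variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the example table.
rows = {
    "Between Blocks": (116605.0, 13),
    "Between Samples": (241912.7, 2),
    "Within Samples": (256123.9, 26),
}

# Mean square = sum of squares / degrees of freedom.
ms = {name: ss / df for name, (ss, df) in rows.items()}

# Both F-ratios use the within-samples mean square as the denominator.
f_blocks = ms["Between Blocks"] / ms["Within Samples"]
f_main = ms["Between Samples"] / ms["Within Samples"]

# F-critical values as reported in the table.
f_crit_blocks = 2.15
f_crit_main = 3.40

blocking_effective = f_blocks > f_crit_blocks
reject_equal_means = f_main > f_crit_main

print(f"Blocking test:    F = {f_blocks:.4f}, effective: {blocking_effective}")
print(f"Main factor test: F = {f_main:.4f}, reject H0: {reject_equal_means}")
```

On these numbers the blocking F-ratio falls below its critical value, so blocking on student is not shown to be effective, while the main factor F-ratio exceeds its critical value, so the hypothesis that the three exams have equal mean scores is rejected.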

How to Do It in Excel?
1. Open file.
2. Select Data > Data Analysis.
3. Select ANOVA: Two Factor Without Replication.
4. Define data range.
5. Specify Alpha.
6. Indicate output choice.
7. Click OK.
The output contains both the blocking test and the main factor test.

Fisher's Least Significant Difference Test
Even if the null hypothesis of equal population means is rejected, the ANOVA does not specify which population means are different.
Fisher's test is one test for multiple comparisons that can be used for a randomized block ANOVA design.

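Fisher's Least Significant Difference Test
$$LSD = t_{\alpha/2}\sqrt{MSW}\sqrt{\frac{2}{b}}$$
where $t_{\alpha/2}$ is the two-tailed value from the t-distribution with $(k-1)(b-1)$ degrees of freedom, $MSW$ is the mean square within, $b$ is the number of blocks, and $k$ is the number of levels of the main factor.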

Fisher's Least Significant Difference Test - Example
Note: The previous example is used.
Step 1: Compute the LSD statistic.
Step 2: Compute the sample means from each population.

Fisher's Least Significant Difference Test - Example
Step 3: Form all possible contrasts by finding the absolute differences between all pairs of sample means. Compare these to the LSD value.
Absolute Difference   Comparison with LSD   Significant?
55.93                 55.93 < 77.11         No
125.57                125.57 > 77.11        Yes
181.50                181.50 > 77.11        Yes
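The LSD value of 77.11 used in these comparisons can be reproduced from the randomized block ANOVA table of the midterm exam example (MSW = 9,850.9 with 26 degrees of freedom and b = 14 blocks). The sketch below assumes a significance level of 0.05, which the slides do not state explicitly, and therefore uses the tabulated two-tailed t value of 2.0555 for 26 degrees of freedom.

```python
import math

# Values from the randomized block ANOVA example.
msw = 9850.9  # mean square within
b = 14        # number of blocks (students)

# Assumption: alpha = 0.05, so t_{alpha/2} with 26 df is about 2.0555.
t_crit = 2.0555

# Fisher's least significant difference for a randomized block design.
lsd = t_crit * math.sqrt(msw) * math.sqrt(2 / b)

# Absolute differences between pairs of sample means, as reported.
diffs = [55.93, 125.57, 181.50]
significant = [d > lsd for d in diffs]

print(f"LSD = {lsd:.2f}")
for d, sig in zip(diffs, significant):
    print(f"|difference| = {d:7.2f} -> significant: {sig}")
```

Only the smallest difference falls below the LSD, which matches the No / Yes / Yes pattern in the comparison table.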

12.3 Two-Factor Analysis of Variance with Replication
There are many situations in which there are actually two or more factors of interest in the same study.
Two-factor ANOVA follows the same logic as one-way ANOVA.
The multiple measurements taken for each combination of factor levels are called replications.
All combinations of the two factors (A and B) are considered.
Example: miles redemption method (A) and age group (B).

Partitioning the Total Sum of Squares
One part is due to differences in the levels of factor A ($SS_A$).
Another part is due to the levels of factor B ($SS_B$).
Another part is due to the interaction between factor A and factor B ($SS_{AB}$).
The final component making up the total sum of squares is the sum of squares due to the inherent random variation in the data ($SSE$).
$$SST = SS_A + SS_B + SS_{AB} + SSE$$

Assumptions for Two-Factor ANOVA
The population values for each combination of pairwise factor levels are normally distributed.
The variances for each population are equal.
The samples are independent.
The data measurement is interval or ratio level.

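Hypotheses in Two-Factor ANOVA
Factor A:
$$H_0: \mu_{A1} = \mu_{A2} = \cdots = \mu_{Aa} \qquad H_A: \text{not all factor A means are equal}$$
Factor B:
$$H_0: \mu_{B1} = \mu_{B2} = \cdots = \mu_{Bb} \qquad H_A: \text{not all factor B means are equal}$$
Interaction:
$H_0$: Factors A and B do not interact to affect the mean response. $H_A$: Factors A and B do interact.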

Two-Factor ANOVA Table
Source of Variation   SS      df               MS      F-Ratio
Factor A              SS_A    a - 1            MS_A    MS_A/MSE
Factor B              SS_B    b - 1            MS_B    MS_B/MSE
AB Interaction        SS_AB   (a - 1)(b - 1)   MS_AB   MS_AB/MSE
Error                 SSE     n_T - ab         MSE
Total                 SST     n_T - 1

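Two-Factor ANOVA Equations
Here $a$ is the number of levels of factor A, $b$ is the number of levels of factor B, $n$ is the number of replications in each cell, and $n_T = abn$ is the total number of observations.
Total Sum of Squares:
$$SST = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}\left(x_{ijk}-\bar{\bar{x}}\right)^2$$
Sum of Squares Factor A:
$$SS_A = bn\sum_{i=1}^{a}\left(\bar{x}_{i\cdot\cdot}-\bar{\bar{x}}\right)^2$$
Sum of Squares Factor B:
$$SS_B = an\sum_{j=1}^{b}\left(\bar{x}_{\cdot j\cdot}-\bar{\bar{x}}\right)^2$$
Sum of Squares Interaction:
$$SS_{AB} = n\sum_{i=1}^{a}\sum_{j=1}^{b}\left(\bar{x}_{ij\cdot}-\bar{x}_{i\cdot\cdot}-\bar{x}_{\cdot j\cdot}+\bar{\bar{x}}\right)^2$$
Sum of Squares Error:
$$SSE = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}\left(x_{ijk}-\bar{x}_{ij\cdot}\right)^2$$

Two-Factor ANOVA Equations
Grand mean:
$$\bar{\bar{x}} = \frac{\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}x_{ijk}}{abn}$$
Mean of each level of factor A:
$$\bar{x}_{i\cdot\cdot} = \frac{\sum_{j=1}^{b}\sum_{k=1}^{n}x_{ijk}}{bn}$$
Mean of each level of factor B:
$$\bar{x}_{\cdot j\cdot} = \frac{\sum_{i=1}^{a}\sum_{k=1}^{n}x_{ijk}}{an}$$
Mean of each cell:
$$\bar{x}_{ij\cdot} = \frac{\sum_{k=1}^{n}x_{ijk}}{n}$$

Two-Factor ANOVA Equations
Mean square factor A:
$$MS_A = \frac{SS_A}{a-1}$$
Mean square factor B:
$$MS_B = \frac{SS_B}{b-1}$$
Mean square interaction:
$$MS_{AB} = \frac{SS_{AB}}{(a-1)(b-1)}$$
Mean square error:
$$MSE = \frac{SSE}{n_T-ab}$$

Two-Way ANOVA: The F-Test Statistic
Factor A main effect:
$$F = \frac{MS_A}{MSE}$$
Factor B main effect:
$$F = \frac{MS_B}{MSE}$$
Interaction effect:
$$F = \frac{MS_{AB}}{MSE}$$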
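The two-factor computations can be traced end to end on a small data set. The following is a minimal Python sketch, assuming made-up data with a = 2 levels of factor A, b = 3 levels of factor B, and n = 2 replications per cell (the numbers and variable names are illustrative and are not the airline data). It computes the means, the four sums of squares, the mean squares, and the three F-ratios, and shows that the components add up to the total sum of squares.

```python
# Hypothetical data: data[i][j] holds the n replications for
# level i of factor A and level j of factor B (not from the slides).
data = [
    [[10.0, 12.0], [14.0, 16.0], [9.0, 11.0]],
    [[13.0, 15.0], [12.0, 14.0], [16.0, 18.0]],
]
a, b, n = len(data), len(data[0]), len(data[0][0])
n_total = a * b * n


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Grand mean, factor-level means, and cell means.
grand = mean([x for row in data for cell in row for x in cell])
a_means = [mean([x for cell in row for x in cell]) for row in data]
b_means = [mean([x for row in data for x in row[j]]) for j in range(b)]
cell_means = [[mean(cell) for cell in row] for row in data]

# Sums of squares.
sst = sum((x - grand) ** 2 for row in data for cell in row for x in cell)
ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
ss_ab = n * sum(
    (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
    for i in range(a) for j in range(b)
)
sse = sum(
    (x - cell_means[i][j]) ** 2
    for i in range(a) for j in range(b) for x in data[i][j]
)

# Mean squares and F-ratios.
ms_a = ss_a / (a - 1)
ms_b = ss_b / (b - 1)
ms_ab = ss_ab / ((a - 1) * (b - 1))
mse = sse / (n_total - a * b)
f_a, f_b, f_ab = ms_a / mse, ms_b / mse, ms_ab / mse

print(f"SST = {sst:.3f}, SS_A + SS_B + SS_AB + SSE = {ss_a + ss_b + ss_ab + sse:.3f}")
print(f"F(A) = {f_a:.3f}, F(B) = {f_b:.3f}, F(AB) = {f_ab:.3f}")
```

Each F-ratio would then be compared with the critical F value for its own numerator degrees of freedom and the error degrees of freedom, starting with the interaction test.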

Two-Factor ANOVA - Example
An airline company is concerned because many of its frequent flier program members have accumulated large quantities of free miles. The company conducted an experiment in which each of three methods for redeeming frequent flier miles was offered to a sample of 16 customers divided into four age groups.
Factor A is the redemption offer type, with three levels.
Factor B is the age group of each customer, with four levels.

Two-Factor ANOVA - Example
Summary in Excel
1. Open file.
2. Select Data > Data Analysis.
3. Select ANOVA: Two Factor With Replication.
4. Define data range (include factor A and B labels).
5. Specify the number of rows per sample: 4.
6. Specify Alpha.
7. Indicate output range.
8. Click OK.

Two-Factor ANOVA Table - Example Results

Interaction
Two plots: No Interaction; Interaction Is Present.

Interaction
Test for interaction first.
If interaction is present, conduct a one-way ANOVA to test the levels of one factor using only one level of the other factor.
If there is no interaction, test Factor A and Factor B.
