A few things to remember about ANOVA

1) The F-test that is performed is always one-tailed. The alternative hypothesis is always that the between-group variation is greater than the within-group variation, so only large values of F count as evidence against the null hypothesis (an F below 1 can occur, but it never supports a difference in means).
2) Interactions: these are hard to conceptualize; the best example is a synergistic or antagonistic drug-drug interaction.
3) An ANOVA technically cannot tell you where the differences are (although often they are quite obvious). It is always best to follow up with a post-hoc test.
4) The biggest problem encountered with ANOVAs in Excel is setting up the rows and columns correctly.

Chemometrics Lecture 2.3: ANOVA Jonathan Benskin

Learning Objectives
1) Understand when and why to use ANOVA (including assumptions).
2) Understand the difference between one-way and two-way ANOVA (with and without replication).
3) Be able to calculate a one-way ANOVA by hand and using Excel's Data Analysis add-on.
4) Be aware of the various post-hoc tests which follow ANOVA.

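Introduction
Analysis of variance (ANOVA) is used to test the hypothesis that there is no difference between two or more group means. Recall that in the case of two group means, we use the two-sample t-test (shown in this case for equal variances, using the pooled standard deviation $s_p$):
$$t_{calc} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$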

Why can't we compare multiple groups with the t-test?
Every time you conduct a t-test there is a chance of making a Type I error (5% assuming α = 0.05). Conducting multiple t-tests increases the overall chance of making at least one Type I error. An ANOVA tests all of the group means in a single test, so the Type I error rate remains at 5%.
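As a rough illustration of how quickly the error rate grows, the sketch below computes the chance of at least one false positive when every pair of groups is compared with its own t-test. It assumes the tests are independent, which is only an approximation (pairwise tests share groups and are therefore correlated), and the function name `familywise_error` is illustrative.

```python
from math import comb


def familywise_error(n_groups, alpha=0.05):
    """Approximate P(at least one Type I error) over all pairwise t-tests.

    Assumption: the individual tests are treated as independent.
    """
    n_tests = comb(n_groups, 2)  # number of pairwise comparisons
    return 1 - (1 - alpha) ** n_tests


for n_groups in (2, 3, 4, 5):
    print(n_groups, "groups:", comb(n_groups, 2), "tests,",
          "approx. error rate", round(familywise_error(n_groups), 3))
```

Under this assumption, three groups (three pairwise tests) already give an overall error rate of about 0.14 rather than 0.05.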

Overview of ANOVA
ANOVAs compare the variation between groups with the variation within groups to assess whether there are differences in the means. Important: an ANOVA cannot tell you which groups are significantly different, only that a significant difference exists.

One-way ANOVA: assumptions
1) The dependent variable should be continuous; the independent variable should consist of 2 or more independent groups.
2) All observations should be independent.
3) Groups should have equal variance (homoscedasticity; rule of thumb: the ratio of the largest to the smallest sample standard deviation should be less than 2:1).
4) Groups should be approximately normally distributed.
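The equal-variance rule of thumb is easy to check before running the test. The sketch below applies it to the serum uric acid data used in Example 1 of this lecture; the variable names are illustrative, and the 2:1 threshold is the rule of thumb quoted above rather than a formal test.

```python
from statistics import stdev

# Serum uric acid data from Example 1 of this lecture
groups = {
    "Group 1": [1.2, 0.8, 1.1, 0.7, 0.9, 1.1, 1.5, 0.8, 1.6, 0.9],
    "Group 2": [1.7, 1.5, 2.0, 2.1, 1.1, 0.9, 2.2, 1.8, 1.3, 1.5],
    "Group 3": [1.3, 1.5, 1.4, 1.0, 1.8, 1.4, 1.9, 0.9, 1.9, 1.8],
}

# Sample standard deviation (n - 1 in the denominator) for each group
sds = {name: stdev(values) for name, values in groups.items()}
sd_ratio = max(sds.values()) / min(sds.values())

for name, sd in sds.items():
    print(name, "SD =", round(sd, 3))
print("largest/smallest SD ratio =", round(sd_ratio, 2))
print("rule of thumb satisfied:", sd_ratio < 2)
```

For these data the ratio is about 1.4, so the rule of thumb gives no reason to doubt the equal-variance assumption.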

Hypotheses: one-way ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_c$ (all population means are equal)
$H_1$: not all of the population means are the same (at least one population mean is different). This does not mean that all population means are different.

One-way ANOVA
1) Start by determining the variation within and between each group.
BETWEEN groups variation: for each data value, look at the difference between its group mean and the overall mean:
$$(\bar{x}_i - \bar{x})^2$$
where $\bar{x}_i$ is the mean for group $i$ and $\bar{x}$ is the mean for the entire dataset.
WITHIN groups variation: for each data value, look at the difference between that value and the mean of its group:
$$(x_{ij} - \bar{x}_i)^2$$
where $x_{ij}$ is the value for individual $j$ in group $i$ and $\bar{x}_i$ is the mean for group $i$.

One-way ANOVA
2) Determine the sum of squares.
Sum of squares TOTAL:
$$SST = \sum_{obs} (x_{ij} - \bar{x})^2 = s^2 (DFT)$$
where $s^2$ is the total variance and $DFT$ is the total degrees of freedom (DF = total # observations - 1).
Sum of squares WITHIN groups:
$$SSW = \sum_{obs} (x_{ij} - \bar{x}_i)^2 = \sum_{groups} s_i^2 (df_i)$$
where $s_i^2$ is the variance of group $i$ and $df_i$ is the degrees of freedom of group $i$ (DF = total # observations - # groups).
Sum of squares BETWEEN groups:
$$SSB = \sum_{groups} n_i (\bar{x}_i - \bar{x})^2$$
(DF = total # groups - 1).
Note that: SST = SSW + SSB

One-way ANOVA
3) Calculate mean squares.
$$MSB = \frac{SSB}{DFB} \qquad MSW = \frac{SSW}{DFW}$$
where $MSB$ is the mean square between, $MSW$ is the mean square within, and $DFB$ and $DFW$ are the degrees of freedom between and within.
The ANOVA F-statistic is the ratio of the between-group variation to the within-group variation:
$$F_{calc} = \frac{\text{Between}}{\text{Within}} = \frac{MSB}{MSW}$$
A large $F_{calc}$ is evidence against $H_0$, since it indicates that there is more difference between groups than within groups. $F_{crit}$ (which we get from F-tables) is determined using DF between and DF within. If $F_{calc} > F_{crit}$, the null hypothesis is rejected.

One-way ANOVA
Note that the equations are different when dealing with different sample sizes.
$MSW = s_w^2$ = within groups mean square = within groups variance
$MSB = s_b^2$ = between groups mean square = between groups variance
Note: Excel and Unscrambler automatically adjust the equations depending on whether you are doing a one-way ANOVA with equivalent or different sample sizes.

Example 1-Excel Workbook One-way ANOVA
The following data show serum uric acid levels for 3 populations. Test whether the means are significantly different.
Group 1   Group 2   Group 3
1.2       1.7       1.3
0.8       1.5       1.5
1.1       2.0       1.4
0.7       2.1       1.0
0.9       1.1       1.8
1.1       0.9       1.4
1.5       2.2       1.9
0.8       1.8       0.9
1.6       1.3       1.9
0.9       1.5       1.8

Example 1-Excel Workbook One-way ANOVA
1. The hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_1$: not all of $\mu_1, \mu_2, \mu_3$ are equal
2. The assumptions: independent random samples, normal distributions, $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$
3. The α-level: α = 0.05
4. The test statistic: ANOVA

Example 1-Excel Workbook One-way ANOVA
Calculating sum of squares within (SSW)
        Group 1  x - mean  (x-mean)^2   Group 2  x - mean  (x-mean)^2   Group 3  x - mean  (x-mean)^2
        1.2       0.14     0.0196       1.7       0.09     0.0081       1.3      -0.19     0.0361
        0.8      -0.26     0.0676       1.5      -0.11     0.0121       1.5       0.01     0.0001
        1.1       0.04     0.0016       2.0       0.39     0.1521       1.4      -0.09     0.0081
        0.7      -0.36     0.1296       2.1       0.49     0.2401       1.0      -0.49     0.2401
        0.9      -0.16     0.0256       1.1      -0.51     0.2601       1.8       0.31     0.0961
        1.1       0.04     0.0016       0.9      -0.71     0.5041       1.4      -0.09     0.0081
        1.5       0.44     0.1936       2.2       0.59     0.3481       1.9       0.41     0.1681
        0.8      -0.26     0.0676       1.8       0.19     0.0361       0.9      -0.59     0.3481
        1.6       0.54     0.2916       1.3      -0.31     0.0961       1.9       0.41     0.1681
        0.9      -0.16     0.0256       1.5      -0.11     0.0121       1.8       0.31     0.0961
Sum     10.6     -2.2E-16  0.824        16.1     -5.6E-16  1.669        14.9      0        1.169
Mean    1.06     -2.2E-17  0.0824       1.61     -5.6E-17  0.1669       1.49      0        0.1169
(The sums of the deviations are zero up to floating-point rounding.)
$$SSW = \sum_{obs} (x_{ij} - \bar{x}_i)^2 = \sum_{groups} s_i^2 (df_i)$$
SSW = 0.824 + 1.669 + 1.169 = 3.662
DF = total # observations - # groups = 27

Example 1-Excel Workbook One-way ANOVA
Calculating sum of squares total (SST)
$$SST = \sum_{obs} (x_{ij} - \bar{x})^2 = s^2 (DFT)$$
All 30 observations are compared with the overall mean (1.39); they are listed here by group:
Group 1  x - mean  (x-mean)^2   Group 2  x - mean  (x-mean)^2   Group 3  x - mean  (x-mean)^2
1.2      -0.19     0.03         1.7       0.31     0.10         1.3      -0.09     0.01
0.8      -0.59     0.34         1.5       0.11     0.01         1.5       0.11     0.01
1.1      -0.29     0.08         2.0       0.61     0.38         1.4       0.01     0.00
0.7      -0.69     0.47         2.1       0.71     0.51         1.0      -0.39     0.15
0.9      -0.49     0.24         1.1      -0.29     0.08         1.8       0.41     0.17
1.1      -0.29     0.08         0.9      -0.49     0.24         1.4       0.01     0.00
1.5       0.11     0.01         2.2       0.81     0.66         1.9       0.51     0.26
0.8      -0.59     0.34         1.8       0.41     0.17         0.9      -0.49     0.24
1.6       0.21     0.05         1.3      -0.09     0.01         1.9       0.51     0.26
0.9      -0.49     0.24         1.5       0.11     0.01         1.8       0.41     0.17
Sum of observations = 41.6; overall mean = 1.39; sum of deviations = 5.22E-15 (zero up to floating-point rounding); sum of squared deviations = 5.3346667
SST = 5.3347
DF = total # observations - 1 = 29

Example 1-Excel Workbook One-way ANOVA
Calculating sum of squares between (SSB): SST = SSW + SSB
$$SSB = \sum_{groups} n_i (\bar{x}_i - \bar{x})^2$$
          Mean   x - mean   (x-mean)^2   n(x-mean)^2
Group 1   1.06   -0.33      0.11         1.067111
Group 2   1.61    0.22      0.05         0.498778
Group 3   1.49    0.10      0.01         0.106778
Sum       4.16    6.66E-16  0.17         1.67
Mean      1.39    0.00      0.06         0.56
SSB = 1.67
DF = total # groups - 1 = 2

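Example 1-Excel Workbook One-way ANOVA
Calculating mean squares
$$MSB = \frac{SSB}{DFB} = \frac{1.6727}{2} = 0.8363 \qquad MSW = \frac{SSW}{DFW} = \frac{3.662}{27} = 0.1356$$
$$F_{calc} = \frac{\text{Between}}{\text{Within}} = \frac{MSB}{MSW} = \frac{0.8363}{0.1356} = 6.17$$
$F_{crit}(2, 27) = 3.35$
$F_{calc} > F_{crit}$, therefore the null hypothesis is rejected.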
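The hand calculation for Example 1 can be reproduced in a few lines. The following is a minimal sketch in plain Python that follows the same three steps (sums of squares, mean squares, F ratio); the function name `one_way_anova` is illustrative, and the critical value 3.35 is simply the F-table value quoted in the example.

```python
def one_way_anova(groups):
    """Return (SSB, SSW, DFB, DFW, F) for a list of groups of observations."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Between groups: n_i * (group mean - overall mean)^2, summed over groups
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within groups: (observation - its group mean)^2, summed over observations
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    dfb = len(groups) - 1                # number of groups - 1
    dfw = len(all_values) - len(groups)  # observations - number of groups
    f_calc = (ssb / dfb) / (ssw / dfw)   # MSB / MSW
    return ssb, ssw, dfb, dfw, f_calc


# Serum uric acid data from Example 1
data = [
    [1.2, 0.8, 1.1, 0.7, 0.9, 1.1, 1.5, 0.8, 1.6, 0.9],
    [1.7, 1.5, 2.0, 2.1, 1.1, 0.9, 2.2, 1.8, 1.3, 1.5],
    [1.3, 1.5, 1.4, 1.0, 1.8, 1.4, 1.9, 0.9, 1.9, 1.8],
]

ssb, ssw, dfb, dfw, f_calc = one_way_anova(data)
F_CRIT = 3.35  # F-table value for (2, 27) degrees of freedom at alpha = 0.05

print("SSB =", round(ssb, 4), " SSW =", round(ssw, 4), " SST =", round(ssb + ssw, 4))
print("F_calc =", round(f_calc, 2), " reject H0:", f_calc > F_CRIT)
```

Because the function weights each group by its own size, the same sketch also works when the groups have different numbers of observations.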

Problem 1-Excel Workbook
1) Solve Example 2 in the ANOVA Excel workbook by hand.
2) Use Excel's Data Analysis add-on and compare your result.

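Example 2-Excel Workbook
Anova: Single Factor
SUMMARY
Groups    Count   Sum   Average   Variance
Group 1   5       20    4         5.5
Group 2   5       40    8         4.5
Group 3   5       65    13        3.5
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        203.3333   2    101.6667   22.59259   8.54E-05   3.885294
Within Groups         54         12   4.5
Total                 257.3333   14
In the SS column, the Between Groups value is SSB, the Within Groups value is SSW and the Total is SST. In the MS column, the Between Groups value is MSB and the Within Groups value is MSW. The F column gives F calc.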
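The ANOVA table can also be cross-checked from the SUMMARY block alone, using SSW as the sum of each group variance times its degrees of freedom and SSB from the group means. This is a small sketch under that approach; the function name `anova_from_summary` and the dictionary keys are illustrative.

```python
def anova_from_summary(counts, means, variances):
    """One-way ANOVA computed from group counts, means and sample variances."""
    n_total = sum(counts)
    grand_mean = sum(n * m for n, m in zip(counts, means)) / n_total

    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(counts, means))
    ssw = sum((n - 1) * v for n, v in zip(counts, variances))  # s_i^2 * df_i

    dfb = len(counts) - 1
    dfw = n_total - len(counts)
    msb, msw = ssb / dfb, ssw / dfw
    return {"SSB": ssb, "SSW": ssw, "SST": ssb + ssw,
            "MSB": msb, "MSW": msw, "F": msb / msw}


# Count, Average and Variance columns from the Example 2 SUMMARY table
result = anova_from_summary(counts=[5, 5, 5],
                            means=[4, 8, 13],
                            variances=[5.5, 4.5, 3.5])

for name, value in result.items():
    print(name, "=", round(value, 4))
```

The values should agree with the SS, MS and F columns of the Excel output above.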

Two-way ANOVA
When two factors may affect the results of an experiment, two-way (two-factor) ANOVA must be used to study their effects. The most common form is with replication. Two-way ANOVA is commonly used for analyzing data generated from a repeated measures study (i.e. where an observation has been made on the same individual more than once).

Two-way ANOVA-without replication
We use a two-way ANOVA without replication when there is a single observation for each combination of the nominal variables. The hypotheses are that:
1) the means of observations grouped by one factor are the same, and
2) the means of observations grouped by the other factor are the same.
Example 7.3.1 (M&M): Here we are testing (i) whether the different chelating agents have significantly different efficiencies, and (ii) whether the day-to-day variation is significantly greater than the variation due to the random error of measurement.

Two-way ANOVA-without replication
Anova: Two-Factor Without Replication
SUMMARY   Count   Sum   Average    Variance
Day 1     4       326   81.5       5.666667
Day 2     4       315   78.75      1.583333
Day 3     4       319   79.75      5.583333
A         3       246   82         7
B         3       235   78.33333   2.333333
C         3       243   81         3
D         3       236   78.66667   0.333333
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Rows                  15.5       2    7.75       4.728814   0.058482   5.143253
Columns               28.66667   3    9.555556   5.830508   0.032756   4.757063
Error                 9.833333   6    1.638889
Total                 54         11
There are now two values of F calc. The Rows value tells you whether there are differences between days, and the Columns value tells you whether there are differences between chelating agents. Here F calc for days (4.73) is below its F crit (5.14), so the day-to-day differences are not significant at α = 0.05, while F calc for chelating agents (5.83) exceeds its F crit (4.76), so the agents differ significantly.

Two-way ANOVA-with replication
A two-way ANOVA with replication tests 3 null hypotheses:
1) that the means of observations grouped by one factor are the same;
2) that the means of observations grouped by the other factor are the same;
3) that there is no interaction between the two factors (i.e. effects of one factor which depend on the other).
Modified chelating agent example
        Chelating Agents
Day     A    B    C    D
Day 1   84   80   83   79
Day 1   82   81   82   80
Day 1   83   80   81   79
Day 2   84   79   80   79
Day 2   81   70   81   77
Day 2   81   81   84   78
Day 3   83   78   80   78
Day 3   80   78   81   77
Day 3   82   80   81   79

Two-way ANOVA-with replication
Anova: Two-Factor With Replication
SUMMARY    A          B          C          D          Total
Day 1
Count      3          3          3          3          12
Sum        249        241        246        238        974
Average    83         80.33333   82         79.33333   81.16667
Variance   1          0.333333   1          0.333333   2.69697
Day 2
Count      3          3          3          3          12
Sum        246        230        245        234        955
Average    82         76.66667   81.66667   78         79.58333
Variance   3          34.33333   4.333333   1          13.53788
Day 3
Count      3          3          3          3          12
Sum        245        236        242        234        957
Average    81.66667   78.66667   80.66667   78         79.75
Variance   2.333333   1.333333   0.333333   1          3.295455
Total
Count      9          9          9          9
Sum        740        707        733        706
Average    82.22222   78.55556   81.44444   78.44444
Variance   1.944444   11.52778   1.777778   1.027778
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Sample                18.16667   2    9.083333   2.165563   0.136574   3.402826
Columns               102.7778   3    34.25926   8.16777    0.000639   3.008787
Interaction           11.38889   6    1.898148   0.452539   0.835998   2.508189
Within                100.6667   24   4.194444
Total                 233        35
Now we have a third value of F calc (with its own F crit), which tests the interaction between days and chelating agents.
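The four sums of squares in this table can be reproduced directly from the modified chelating agent data. The sketch below is a plain-Python illustration for a balanced design (the same number of replicates in every cell); the function name `two_way_anova` and the dictionary keys are illustrative, with "rows" corresponding to Excel's Sample line.

```python
def mean(values):
    return sum(values) / len(values)


def two_way_anova(cells):
    """Balanced two-way ANOVA with replication.

    cells[i][j] holds the replicates for row level i and column level j;
    every cell is assumed to contain the same number of replicates.
    """
    r, c, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean([x for row in cells for cell in row for x in cell])
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[j]]) for j in range(c)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss = {
        "rows": c * n * sum((m - grand) ** 2 for m in row_means),
        "columns": r * n * sum((m - grand) ** 2 for m in col_means),
        "interaction": n * sum(
            (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
            for i in range(r) for j in range(c)),
        "within": sum(
            (x - cell_means[i][j]) ** 2
            for i in range(r) for j in range(c) for x in cells[i][j]),
    }
    df = {"rows": r - 1, "columns": c - 1,
          "interaction": (r - 1) * (c - 1), "within": r * c * (n - 1)}
    ms = {key: ss[key] / df[key] for key in ss}
    f = {key: ms[key] / ms["within"] for key in ("rows", "columns", "interaction")}
    return ss, df, ms, f


# cells[day][agent] = three replicates (days 1-3, agents A-D)
cells = [
    [[84, 82, 83], [80, 81, 80], [83, 82, 81], [79, 80, 79]],
    [[84, 81, 81], [79, 70, 81], [80, 81, 84], [79, 77, 78]],
    [[83, 80, 82], [78, 78, 80], [80, 81, 81], [78, 77, 79]],
]

ss, df, ms, f = two_way_anova(cells)
for key in ss:
    print(key, "SS =", round(ss[key], 4), "df =", df[key], "F =", round(f.get(key, float("nan")), 4))
```

Each F value is the corresponding mean square divided by the within-cell mean square, which is why the Within line has no F of its own.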

Problems 2 and 3-Excel Workbook
1) Solve Problem 2 using the Data Analysis tool.
2) Open Problem 2 by hand and fill in the missing data in the yellow boxes. Compare the results to those generated by the Data Analysis tool (i.e. results from step 1).
3) Complete Problem 3.

Post-hoc tests
ANOVA tests whether you have an overall difference between your groups, but it does not tell you which specific groups were different. This is where post-hoc tests come in. The most common are Fisher's LSD, Tukey, and Scheffe. The procedures differ in the amount and kind of adjustment to alpha provided.
Scheffe: most likely to lead to Type II errors / least likely to lead to Type I errors.
Tukey: moderate chance of Type I and Type II errors.
Fisher's LSD: least likely to lead to Type II errors / most likely to lead to Type I errors.

Tukey's HSD
Also known as Tukey's range test, Tukey's test, or the Tukey-Kramer method. This assumes you have already performed an ANOVA and found that there is a statistically significant difference among your groups.
Step 1. Select two means and note the relevant variables (means, mean square within, and number per condition/group).
Step 2. Calculate Tukey's test for each mean comparison using the following equation, where $\bar{x}_a$ and $\bar{x}_b$ are the larger and smaller of the two means and $n$ is the number per group:
$$q = \frac{\bar{x}_a - \bar{x}_b}{\sqrt{MSW / n}}$$
Step 3. Check whether Tukey's score is statistically significant using Tukey's probability/critical value table, taking into account the appropriate df within and number of treatments.
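As an illustration, the sketch below computes a Tukey score for every pair of groups in Example 1, using the usual studentized range form q = |difference in means| / sqrt(MSW / n) for equal group sizes. The group means, SSW and degrees of freedom are taken from Example 1; the function name `tukey_q` is illustrative, and the critical value is deliberately left out because it has to be read from a studentized range table for 3 treatments and 27 df within.

```python
from itertools import combinations
from math import sqrt


def tukey_q(mean_a, mean_b, ms_within, n_per_group):
    """Studentized range statistic for one pair of means (equal group sizes)."""
    return abs(mean_a - mean_b) / sqrt(ms_within / n_per_group)


# Values from Example 1 of this lecture
means = {"Group 1": 1.06, "Group 2": 1.61, "Group 3": 1.49}
ms_within = 3.662 / 27  # MSW = SSW / DFW
n_per_group = 10

q_values = {
    (a, b): tukey_q(means[a], means[b], ms_within, n_per_group)
    for a, b in combinations(means, 2)
}

for (a, b), q in q_values.items():
    print(a, "vs", b, ": q =", round(q, 2))
```

Each q is then compared with the tabulated critical value; pairs whose q exceeds it are declared significantly different.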

Review of what we covered
1) When and why to use ANOVA (including assumptions).
2) The difference between one-way and two-way ANOVA (with and without replication).
3) We calculated a one-way ANOVA by hand and using Excel's Data Analysis add-on.
4) We reviewed various post-hoc tests which follow ANOVA.