Unit 4: Inference for numerical variables Lecture 3: ANOVA

Size: px
Start display at page:

Download "Unit 4: Inference for numerical variables Lecture 3: ANOVA"

Transcription

1 Unit 4: Inference for numerical variables Lecture 3: ANOVA Statistics 101 Thomas Leininger June 10, 2013

2 Announcements Announcements Proposals due tomorrow. Will be returned to you by Wednesday. You MUST complete the proposal process. A few things to watch out for: Data is plural, data set is singular. Avoid using population data - if you have population data, you might consider taking a random sample. Exploratory analysis: should include some summary statistics and some graphics AND interpretations. If using existing data, find out how your data were collected, and discuss the sampling method as well as any possible biases. Scope of inference: generalizability & causality. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

3 Aldrin in the Wolf River The Wolf River in Tennessee flows past an abandoned site once used by the pesticide industry for dumping wastes, including chlordane (pesticide), aldrin, and dieldrin (both insecticides). These highly toxic organic compounds can cause various cancers and birth defects. The standard methods to test whether these substances are present in a river is to take samples at six-tenths depth. But since these compounds are denser than water and their molecules tend to stick to particles of sediment, they are more likely to be found in higher concentrations near the bottom than near mid-depth. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

4 Aldrin in the Wolf River Data Aldrin concentration (nanograms per liter) at three levels of depth. aldrin depth bottom bottom bottom middepth middepth middepth surface surface surface Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

5 Aldrin in the Wolf River Exploratory analysis Aldrin concentration (nanograms per liter) at three levels of depth. surface middepth bottom n mean sd bottom middepth surface overall Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

6 Aldrin in the Wolf River Research question Is there a difference between the mean aldrin concentrations among the three levels? To compare means of 2 groups we use a Z or a T statistic. To compare means of 3+ groups we use a new test called ANOVA and a new statistic called F. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

7 Aldrin in the Wolf River Recap: 2-sample CIs and HTs n mean sd bottom middepth surface overall HT: T df = ( x 1 x 2 ) null value SE where SE = df = min(n 1 1, n 2 1) CI: ( x 1 x 2 ) ± t df SE s 2 1 n 1 + s2 2 n 2 Application exercise: Perform a HT and construct a CI for each difference. and Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

8 Aldrin in the Wolf River ANOVA ANOVA is used to assess whether the mean of the outcome variable is different for different levels of a categorical variable. H 0 : The mean outcome is the same across all categories, µ 1 = µ 2 = = µ k, where µ i represents the mean of the outcome for observations in category i. H A : At least one mean is different than others. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

9 Aldrin in the Wolf River Conditions 1 The observations should be independent within and between groups If the data are a simple random sample, this condition is satisfied. Carefully consider whether the between-group data is independent (e.g. no pairing). Always important, but sometimes difficult to check. 2 The observations within each group should be nearly normal. Especially important when the sample sizes are small. How do we check for normality? 3 The variability across the groups should be about equal. Especially important when the sample sizes differ between groups. How can we check this condition? Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

10 Aldrin in the Wolf River z/t test vs. ANOVA - Purpose z/t test ANOVA Compare means from two groups to see whether they are so far apart that the observed difference cannot reasonably be attributed to sampling variability. H 0 : µ 1 = µ 2 H A : µ 1 µ 2 H A : µ 1 < µ 2 Compare the means from two or more groups to see whether they are so far apart that the observed differences cannot all reasonably be attributed to sampling variability. H 0 : µ 1 = µ 2 = = µ k H A : At least one mean is different H A : µ 1 > µ 2 Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

11 Aldrin in the Wolf River z/t test vs. ANOVA - Method z/t test Compute a test statistic (a ratio). z/t = ( x 1 x 2 ) (µ 1 µ 2 ) SE( x 1 x 2 ) ANOVA Compute a test statistic (a ratio). F = variability bet. groups variability w/in groups Large test statistics lead to small p-values. If the p-value is small enough H 0 is rejected, and we conclude that the population means are not equal. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

12 Aldrin in the Wolf River z/t test vs. ANOVA With only two groups t-test and ANOVA are equivalent, but only if we use a pooled standard variance in the denominator of the test statistic. With more than two groups, ANOVA compares the sample means to an overall grand mean. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

13 Aldrin in the Wolf River Hypotheses Question What are the correct hypotheses for testing for a difference between the mean aldrin concentrations among the three levels? (a) H 0 : µ B = µ M = µ S H A : µ B µ M µ S (b) H 0 : µ B µ M µ S H A : µ B = µ M = µ S (c) H 0 : µ B = µ M = µ S H A : At least one mean is different. (d) H 0 : µ B = µ M = µ S = 0 H A : At least one mean is different. (e) H 0 : µ B = µ M = µ S H A : µ B > µ M > µ S Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

14 ANOVA and the F test Test statistic Does there appear to be a lot of variability within groups? How about between groups? F = variability bet. groups variability w/in groups Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

15 ANOVA and the F test F distribution and p-value F = variability bet. groups variability w/in groups In order to be able to reject H 0, we need a small p-value, which requires a large F statistic. In order to obtain a large F statistic, variability between sample means needs to be greater than variability within sample means. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

16 ANOVA output, deconstructed Df Sum Sq Mean Sq F value Pr(>F) (Group) depth (Error) Residuals Total Degrees of freedom associated with ANOVA groups: df G = k 1, where k is the number of groups total: df T = n 1, where n is the total sample size error: df E = df T df G df G = k 1 = 3 1 = 2 df T = n 1 = 30 1 = 29 df E = 29 2 = 27 Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

17 ANOVA output, deconstructed Df Sum Sq Mean Sq F value Pr(>F) (Group) depth (Error) Residuals Total Sum of squares between groups, SSG Measures the variability between groups k SSG = n i ( x i x) 2 i=1 where n i is each group size, x i is the average for each group, x is the overall (grand) mean. n mean bottom middepth surface overall SSG = ( 10 ( ) 2) + ( 10 ( ) 2) + ( 10 ( ) 2) = Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

18 ANOVA output, deconstructed Df Sum Sq Mean Sq F value Pr(>F) (Group) depth (Error) Residuals Total Sum of squares total, SST Measures the total variability n SST = (x i x) i=1 where x i represents each observation in the dataset. SST = ( ) 2 + ( ) 2 + ( ) ( ) 2 = ( 1.3) 2 + ( 0.3) 2 + ( 0.2) (0.1) 2 = = Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

19 ANOVA output, deconstructed Df Sum Sq Mean Sq F value Pr(>F) (Group) depth (Error) Residuals Total Sum of squares error, SSE Measures the variability within groups: SSE = SST SSG SSE = = Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

20 ANOVA output, deconstructed Df Sum Sq Mean Sq F value Pr(>F) (Group) depth (Error) Residuals Total Mean square error Mean square error is calculated as sum of squares divided by the degrees of freedom. MSG = 16.96/2 = 8.48 MSE = 37.33/27 = 1.38 Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

21 ANOVA output, deconstructed Df Sum Sq Mean Sq F value Pr(>F) (Group) depth (Error) Residuals Total Test statistic, F value The F statistic is the ratio of the between group and within group variability. F = MSG MSE F = = 6.14 Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

22 ANOVA output, deconstructed p-value Df Sum Sq Mean Sq F value Pr(>F) (Group) depth (Error) Residuals Total p-value is the probability of at least as large a ratio between the between group and within group variability, if in fact the means of all groups are equal. It s calculated as the area under the F curve, with degrees of freedom df G and df E, above the observed F statistic. dfg = 2 ; dfe = Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

23 ANOVA output, deconstructed Conclusion If p-value is small (less than α), reject H 0. The data provide convincing evidence that at least one mean is different from (but we can t tell which one). If p-value is large, fail to reject H 0. The data do not provide convincing evidence that at least one pair of means are different from each other, the observed differences in sample means are attributable to sampling variability (or chance). Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

24 ANOVA output, deconstructed Conclusion - in context Question What is the conclusion of the hypothesis test for α = 0.05? (p-value = ) The data provide convincing evidence that the average aldrin concentration (a) is different for all groups. (b) on the surface is lower than the other levels. (c) is different for at least one group. (d) is the same for all groups. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

25 Checking conditions (1) independence Does this condition appear to be satisfied? Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

26 Checking conditions (2) approximately normal Does this condition appear to be satisfied? 9 8 bottom middepth 5.0 surface Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

27 Checking conditions (3) constant variance Does this condition appear to be satisfied? bottom sd=1.58 middepth sd=1.10 surface sd=0.66 Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

28 Multiple comparisons & Type 1 error rate Which means differ? Earlier we concluded that at least one pair of means differ. The natural question that follows is which ones? We can do two sample t tests for differences in each possible pair of groups. Can you see any pitfalls with this approach? When we run too many tests, the Type 1 Error rate increases. This issue is resolved by using a modified significance level. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

29 Multiple comparisons & Type 1 error rate Multiple comparisons The scenario of testing many pairs of groups is called multiple comparisons. The Bonferroni correction suggests that a more stringent significance level is more appropriate for these tests: α = α/k where K is the number of comparisons being considered. If there are k groups, then usually all possible pairs are compared and K = k(k 1) 2. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

30 Multiple comparisons & Type 1 error rate Determining the modified α Question In the aldrin data set depth has 3 levels: bottom, mid-depth, and surface. If α = 0.05, what should be the modified significance level for two sample t tests for determining which pairs of groups have significantly different means? (a) α = 0.05 (b) α = 0.05/2 = (c) α = 0.05/3 = (d) α = 0.05/6 = Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

31 Multiple comparisons & Type 1 error rate Which means differ? Question Based on the box plots below, which means would you expect to be significantly different? 9 (a) bottom & surface 8 (b) bottom & mid-depth 7 (c) mid-depth & surface 6 5 (d) bottom & mid-depth; mid-depth & surface 4 3 bottom sd=1.58 middepth sd=1.10 surface sd=0.66 (e) bottom & mid-depth; bottom & surface; mid-depth & surface Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

32 Multiple comparisons & Type 1 error rate Which means differ? (cont.) If the ANOVA assumption of equal variability across groups is satisfied, we can use the data from all groups to estimate variability: Estimate any within-group standard deviation with MSE, which is s pooled Use the error degrees of freedom, n k, for t-distributions Difference in two means: after ANOVA σ 2 1 SE = + σ2 2 MSE n 1 n 2 n 1 + MSE n 2 Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

33 Multiple comparisons & Type 1 error rate Is there a difference between the average aldrin concentration at the bottom and at mid depth? n mean sd bottom middepth surface overall Df Sum Sq Mean Sq F value Pr(>F) depth Residuals Total T dfe = ( x bottom x middepth ) MSE n bottom + MSE n middepth T 27 = ( ) = = < p value < 0.10 (two-sided) α = 0.05/3 = Fail to reject H 0, the data do not provide convincing evidence of a difference between the average aldrin concentrations at bottom and mid depth. Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

34 Multiple comparisons & Type 1 error rate Application exercise: Post-hoc comparison Is there evidence of a difference between the average aldrin concentration at the bottom and at surface? (a) yes (b) no Statistics 101 (Thomas Leininger) U4 - L3: ANOVA June 10, / 34

Unit4: Inferencefornumericaldata 4. ANOVA. Sta Spring Duke University, Department of Statistical Science

Unit4: Inferencefornumericaldata 4. ANOVA. Sta Spring Duke University, Department of Statistical Science Unit4: Inferencefornumericaldata 4. ANOVA Sta 101 - Spring 2016 Duke University, Department of Statistical Science Dr. Çetinkaya-Rundel Slides posted at http://bit.ly/sta101_s16 Outline 1. Housekeeping

More information

Analysis of Variance. Copyright 2014 Pearson Education, Inc.

Analysis of Variance. Copyright 2014 Pearson Education, Inc. Analysis of Variance 12-1 Learning Outcomes Outcome 1. Understand the basic logic of analysis of variance. Outcome 2. Perform a hypothesis test for a single-factor design using analysis of variance manually

More information

PLANNED ORTHOGONAL CONTRASTS

PLANNED ORTHOGONAL CONTRASTS PLANNED ORTHOGONAL CONTRASTS Please note: This handout is useful background for the workshop, not what s covered in it. Basic principles for contrasts are the same in repeated measures. Planned orthogonal

More information

A few things to remember about ANOVA

A few things to remember about ANOVA A few things to remember about ANOVA 1) The F-test that is performed is always 1-tailed. This is because your alternative hypothesis is always that the between group variation is greater than the within

More information

Week 7 One-way ANOVA

Week 7 One-way ANOVA Week 7 One-way ANOVA Objectives By the end of this lecture, you should be able to: Understand the shortcomings of comparing multiple means as pairs of hypotheses. Understand the steps of the ANOVA method

More information

ANOVA - Implementation.

ANOVA - Implementation. ANOVA - Implementation http://www.pelagicos.net/classes_biometry_fa17.htm Doing an ANOVA With RCmdr Categorical Variable One-Way ANOVA Testing a single Factor dose with 3 treatments (low, mid, high) Doing

More information

Stat 139 Homework 3 Solutions, Spring 2015

Stat 139 Homework 3 Solutions, Spring 2015 Stat 39 Homework 3 Solutions, Spring 05 Problem. Let i Nµ, σ ) for i,..., n, and j Nµ, σ ) for j,..., n. Also, assume that all observations are independent from each other. In Unit 4, we learned that the

More information

Chapter 12 Practice Test

Chapter 12 Practice Test Chapter 12 Practice Test 1. Which of the following is not one of the conditions that must be satisfied in order to perform inference about the slope of a least-squares regression line? (a) For each value

More information

Section I: Multiple Choice Select the best answer for each problem.

Section I: Multiple Choice Select the best answer for each problem. Inference for Linear Regression Review Section I: Multiple Choice Select the best answer for each problem. 1. Which of the following is NOT one of the conditions that must be satisfied in order to perform

More information

Announcements. Lecture 19: Inference for SLR & Transformations. Online quiz 7 - commonly missed questions

Announcements. Lecture 19: Inference for SLR & Transformations. Online quiz 7 - commonly missed questions Announcements Announcements Lecture 19: Inference for SLR & Statistics 101 Mine Çetinkaya-Rundel April 3, 2012 HW 7 due Thursday. Correlation guessing game - ends on April 12 at noon. Winner will be announced

More information

Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA

Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

More information

MGB 203B Homework # LSD = 1 1

MGB 203B Homework # LSD = 1 1 MGB 0B Homework # 4.4 a α =.05: t = =.05 LSD = α /,n k t.05, 7 t α /,n k MSE + =.05 700 + = 4.8 n i n j 0 0 i =, j = 8.7 0.4 7. i =, j = 8.7.7 5.0 i =, j = 0.4.7. Conclusion: µ differs from µ and µ. b

More information

One-factor ANOVA by example

One-factor ANOVA by example ANOVA One-factor ANOVA by example 2 One-factor ANOVA by visual inspection 3 4 One-factor ANOVA H 0 H 0 : µ 1 = µ 2 = µ 3 = H A : not all means are equal 5 One-factor ANOVA but why not t-tests t-tests?

More information

Experimental Design and Data Analysis Part 2

Experimental Design and Data Analysis Part 2 Experimental Design and Data Analysis Part 2 Assump@ons for Parametric Tests t- test and ANOVA Independence Variance Normality t-test Yes Yes Yes ANOVA Yes Yes Yes Lecture 7 AEC 460 Assume homogeneity

More information

Factorial Analysis of Variance

Factorial Analysis of Variance Factorial Analysis of Variance Overview of the Factorial ANOVA Factorial ANOVA (Two-Way) In the context of ANOVA, an independent variable (or a quasiindependent variable) is called a factor, and research

More information

Select Boxplot -> Multiple Y's (simple) and select all variable names.

Select Boxplot -> Multiple Y's (simple) and select all variable names. One Factor ANOVA in Minitab As an example, we will use the data below. A study looked at the days spent in the hospital for different regions of the United States. Can the company reject the claim the

More information

AP 11.1 Notes WEB.notebook March 25, 2014

AP 11.1 Notes WEB.notebook March 25, 2014 11.1 Chi Square Tests (Day 1) vocab *new seats* examples Objectives Comparing Observed & Expected Counts measurements of a categorical variable (ex/ color of M&Ms) Use Chi Square Goodness of Fit Test Must

More information

Announcements. % College graduate vs. % Hispanic in LA. % College educated vs. % Hispanic in LA. Problem Set 10 Due Wednesday.

Announcements. % College graduate vs. % Hispanic in LA. % College educated vs. % Hispanic in LA. Problem Set 10 Due Wednesday. Announcements Announcements UNIT 7: MULTIPLE LINEAR REGRESSION LECTURE 1: INTRODUCTION TO MLR STATISTICS 101 Problem Set 10 Due Wednesday Nicole Dalzell June 15, 2015 Statistics 101 (Nicole Dalzell) U7

More information

Data Set 7: Bioerosion by Parrotfish Background volume of bites The question:

Data Set 7: Bioerosion by Parrotfish Background volume of bites The question: Data Set 7: Bioerosion by Parrotfish Background Bioerosion of coral reefs results from animals taking bites out of the calcium-carbonate skeleton of the reef. Parrotfishes are major bioerosion agents,

More information

Stats 2002: Probabilities for Wins and Losses of Online Gambling

Stats 2002: Probabilities for Wins and Losses of Online Gambling Abstract: Jennifer Mateja Andrea Scisinger Lindsay Lacher Stats 2002: Probabilities for Wins and Losses of Online Gambling The objective of this experiment is to determine whether online gambling is a

More information

Name May 3, 2007 Math Probability and Statistics

Name May 3, 2007 Math Probability and Statistics Name May 3, 2007 Math 341 - Probability and Statistics Long Exam IV Instructions: Please include all relevant work to get full credit. Encircle your final answers. 1. An article in Professional Geographer

More information

Legendre et al Appendices and Supplements, p. 1

Legendre et al Appendices and Supplements, p. 1 Legendre et al. 2010 Appendices and Supplements, p. 1 Appendices and Supplement to: Legendre, P., M. De Cáceres, and D. Borcard. 2010. Community surveys through space and time: testing the space-time interaction

More information

One-way ANOVA: round, narrow, wide

One-way ANOVA: round, narrow, wide 5/4/2009 9:19:18 AM Retrieving project from file: 'C:\DOCUMENTS AND SETTINGS\BOB S\DESKTOP\RJS\COURSES\MTAB\FIRSTBASE.MPJ' ========================================================================== This

More information

Navigate to the golf data folder and make it your working directory. Load the data by typing

Navigate to the golf data folder and make it your working directory. Load the data by typing Golf Analysis 1.1 Introduction In a round, golfers have a number of choices to make. For a particular shot, is it better to use the longest club available to try to reach the green, or would it be better

More information

Statistical Analysis of PGA Tour Skill Rankings USGA Research and Test Center June 1, 2007

Statistical Analysis of PGA Tour Skill Rankings USGA Research and Test Center June 1, 2007 Statistical Analysis of PGA Tour Skill Rankings 198-26 USGA Research and Test Center June 1, 27 1. Introduction The PGA Tour has recorded and published Tour Player performance statistics since 198. All

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 256 Introduction This procedure computes summary statistics and common non-parametric, single-sample runs tests for a series of n numeric, binary, or categorical data values. For numeric data,

More information

5.1 Introduction. Learning Objectives

5.1 Introduction. Learning Objectives Learning Objectives 5.1 Introduction Statistical Process Control (SPC): SPC is a powerful collection of problem-solving tools useful in achieving process stability and improving capability through the

More information

Chapter 13. Factorial ANOVA. Patrick Mair 2015 Psych Factorial ANOVA 0 / 19

Chapter 13. Factorial ANOVA. Patrick Mair 2015 Psych Factorial ANOVA 0 / 19 Chapter 13 Factorial ANOVA Patrick Mair 2015 Psych 1950 13 Factorial ANOVA 0 / 19 Today s Menu Now we extend our one-way ANOVA approach to two (or more) factors. Factorial ANOVA: two-way ANOVA, SS decomposition,

More information

Guide to Computing Minitab commands used in labs (mtbcode.out)

Guide to Computing Minitab commands used in labs (mtbcode.out) Guide to Computing Minitab commands used in labs (mtbcode.out) A full listing of Minitab commands can be found by invoking the HELP command while running Minitab. A reference card, with listing of available

More information

Running head: DATA ANALYSIS AND INTERPRETATION 1

Running head: DATA ANALYSIS AND INTERPRETATION 1 Running head: DATA ANALYSIS AND INTERPRETATION 1 Data Analysis and Interpretation Final Project Vernon Tilly Jr. University of Central Oklahoma DATA ANALYSIS AND INTERPRETATION 2 Owners of the various

More information

Biostatistics & SAS programming

Biostatistics & SAS programming Biostatistics & SAS programming Kevin Zhang March 6, 2017 ANOVA 1 Two groups only Independent groups T test Comparison One subject belongs to only one groups and observed only once Thus the observations

More information

Example 1: One Way ANOVA in MINITAB

Example 1: One Way ANOVA in MINITAB Example : One Way ANOVA in MINITAB A consumer group wants to compare a new brand of wax (Brand-X) to two leading brands (Sureglow and Microsheen) in terms of Effectiveness of wax. Following data is collected

More information

Factorial ANOVA Problems

Factorial ANOVA Problems Factorial ANOVA Problems Q.1. In a 2-Factor ANOVA, measuring the effects of 2 factors (A and B) on a response (y), there are 3 levels each for factors A and B, and 4 replications per treatment combination.

More information

Confidence Interval Notes Calculating Confidence Intervals

Confidence Interval Notes Calculating Confidence Intervals Confidence Interval Notes Calculating Confidence Intervals Calculating One-Population Mean Confidence Intervals for Quantitative Data It is always best to use a computer program to make these calculations,

More information

Distancei = BrandAi + 2 BrandBi + 3 BrandCi + i

Distancei = BrandAi + 2 BrandBi + 3 BrandCi + i . Suppose that the United States Golf Associate (USGA) wants to compare the mean distances traveled by four brands of golf balls when struck by a driver. A completely randomized design is employed with

More information

The Simple Linear Regression Model ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

The Simple Linear Regression Model ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD The Simple Linear Regression Model ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD Outline Definition. Deriving the Estimates. Properties of the Estimates. Units of Measurement and Functional Form. Expected

More information

Driv e accu racy. Green s in regul ation

Driv e accu racy. Green s in regul ation LEARNING ACTIVITIES FOR PART II COMPILED Statistical and Measurement Concepts We are providing a database from selected characteristics of golfers on the PGA Tour. Data are for 3 of the players, based

More information

Setting up group models Part 1 NITP, 2011

Setting up group models Part 1 NITP, 2011 Setting up group models Part 1 NITP, 2011 What is coming up Crash course in setting up models 1-sample and 2-sample t-tests Paired t-tests ANOVA! Mean centering covariates Identifying rank deficient matrices

More information

Analysis of recent swim performances at the 2013 FINA World Championship: Counsilman Center, Dept. Kinesiology, Indiana University

Analysis of recent swim performances at the 2013 FINA World Championship: Counsilman Center, Dept. Kinesiology, Indiana University Analysis of recent swim performances at the 2013 FINA World Championship: initial confirmation of the rumored current. Joel M. Stager 1, Andrew Cornett 2, Chris Brammer 1 1 Counsilman Center, Dept. Kinesiology,

More information

Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present)

Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present) Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present) Jonathan Tung University of California, Riverside tung.jonathanee@gmail.com Abstract In Major League Baseball, there

More information

Building an NFL performance metric

Building an NFL performance metric Building an NFL performance metric Seonghyun Paik (spaik1@stanford.edu) December 16, 2016 I. Introduction In current pro sports, many statistical methods are applied to evaluate player s performance and

More information

y ) s x x )(y i (x i r = 1 n 1 s y Statistics Lecture 7 Exploring Data , y 2 ,y n (x 1 ),,(x n ),(x 2 ,y 1 How two variables vary together

y ) s x x )(y i (x i r = 1 n 1 s y Statistics Lecture 7 Exploring Data , y 2 ,y n (x 1 ),,(x n ),(x 2 ,y 1 How two variables vary together Statistics 111 - Lecture 7 Exploring Data Numerical Summaries for Relationships between Variables Administrative Notes Homework 1 due in recitation: Friday, Feb. 5 Homework 2 now posted on course website:

More information

Math 121 Test Questions Spring 2010 Chapters 13 and 14

Math 121 Test Questions Spring 2010 Chapters 13 and 14 Math 121 Test Questions Spring 2010 Chapters 13 and 14 1. (10 pts) The first-semester enrollment at HSC was 1120 students. If all entering freshmen classes were the same size and there were no attrition,

More information

1. Answer this student s question: Is a random sample of 5% of the students at my school large enough, or should I use 10%?

1. Answer this student s question: Is a random sample of 5% of the students at my school large enough, or should I use 10%? Econ 57 Gary Smith Fall 2011 Final Examination (150 minutes) No calculators allowed. Just set up your answers, for example, P = 49/52. BE SURE TO EXPLAIN YOUR REASONING. If you want extra time, you can

More information

Safety at Intersections in Oregon A Preliminary Update of Statewide Intersection Crash Rates

Safety at Intersections in Oregon A Preliminary Update of Statewide Intersection Crash Rates Portland State University PDXScholar Civil and Environmental Engineering Master's Project Reports Civil and Environmental Engineering 2015 Safety at Intersections in Oregon A Preliminary Update of Statewide

More information

Confidence Intervals with proportions

Confidence Intervals with proportions Confidence Intervals with proportions a.k.a., 1-proportion z-intervals AP Statistics Chapter 19 1-proportion z-interval Statistic + Critical value Standard deviation of the statistic POINT ESTIMATE STANDARD

More information

Lab 11: Introduction to Linear Regression

Lab 11: Introduction to Linear Regression Lab 11: Introduction to Linear Regression Batter up The movie Moneyball focuses on the quest for the secret of success in baseball. It follows a low-budget team, the Oakland Athletics, who believed that

More information

One Way ANOVA (Analysis of Variance)

One Way ANOVA (Analysis of Variance) One Wa ANOVA (Analsis of Variance) The one-wa analsis of variance (ANOVA) is used to determine whether there are an significant differences between the means of two or more independent (unrelated) groups

More information

CHAPTER ANALYSIS AND INTERPRETATION Average total number of collisions for a try to be scored

CHAPTER ANALYSIS AND INTERPRETATION Average total number of collisions for a try to be scored CHAPTER 8 8.1 ANALYSIS AND INTERPRETATION As mentioned in the previous chapter, four key components have been identified as indicators of the level of significance of dominant collisions when evaluating

More information

Lesson 22: Average Rate of Change

Lesson 22: Average Rate of Change Student Outcomes Students know how to compute the average rate of change in the height of water level when water is poured into a conical container at a constant rate. MP.1 Lesson Notes This lesson focuses

More information

Chapter 10 Gases. Characteristics of Gases. Pressure. The Gas Laws. The Ideal-Gas Equation. Applications of the Ideal-Gas Equation

Chapter 10 Gases. Characteristics of Gases. Pressure. The Gas Laws. The Ideal-Gas Equation. Applications of the Ideal-Gas Equation Characteristics of Gases Chapter 10 Gases Pressure The Gas Laws The Ideal-Gas Equation Applications of the Ideal-Gas Equation Gas mixtures and partial pressures Kinetic-Molecular Theory Real Gases: Deviations

More information

Is lung capacity affected by smoking, sport, height or gender. Table of contents

Is lung capacity affected by smoking, sport, height or gender. Table of contents Sample project This Maths Studies project has been graded by a moderator. As you read through it, you will see comments from the moderator in boxes like this: At the end of the sample project is a summary

More information

Warm-up. Make a bar graph to display these data. What additional information do you need to make a pie chart?

Warm-up. Make a bar graph to display these data. What additional information do you need to make a pie chart? Warm-up The number of deaths among persons aged 15 to 24 years in the United States in 1997 due to the seven leading causes of death for this age group were accidents, 12,958; homicide, 5,793; suicide,

More information

Psychology - Mr. Callaway/Mundy s Mill HS Unit Research Methods - Statistics

Psychology - Mr. Callaway/Mundy s Mill HS Unit Research Methods - Statistics Psychology - Mr. Callaway/Mundy s Mill HS Unit 2.3 - Research Methods - Statistics How do psychologists ask & answer questions? Last time we asked that we were discussing Research Methods. This time we

More information

Lesson 4: Describing the Center of a Distribution

Lesson 4: Describing the Center of a Distribution : Describing the Center of a Distribution In previous work with data distributions, you learned how to derive the mean and the median (the center) of a data distribution. In this lesson, we ll compare

More information

BBS Fall Conference, 16 September Use of modeling & simulation to support the design and analysis of a new dose and regimen finding study

BBS Fall Conference, 16 September Use of modeling & simulation to support the design and analysis of a new dose and regimen finding study BBS Fall Conference, 16 September 211 Use of modeling & simulation to support the design and analysis of a new dose and regimen finding study Didier Renard Background (1) Small molecule delivered by lung

More information

1. In a hypothesis test involving two-samples, the hypothesized difference in means must be 0. True. False

1. In a hypothesis test involving two-samples, the hypothesized difference in means must be 0. True. False STAT 350 (Spring 2016) Homework 9 Online 1 1. In a hypothesis test involving two-samples, the hypothesized difference in means must be 0. 2. The two-sample Z test can be used only if both population variances

More information

Empirical Example II of Chapter 7

Empirical Example II of Chapter 7 Empirical Example II of Chapter 7 1. We use NBA data. The description of variables is --- --- --- storage display value variable name type format label variable label marr byte %9.2f =1 if married wage

More information

PGA Tour Scores as a Gaussian Random Variable

PGA Tour Scores as a Gaussian Random Variable PGA Tour Scores as a Gaussian Random Variable Robert D. Grober Departments of Applied Physics and Physics Yale University, New Haven, CT 06520 Abstract In this paper it is demonstrated that the scoring

More information

The Reliability of Intrinsic Batted Ball Statistics Appendix

The Reliability of Intrinsic Batted Ball Statistics Appendix The Reliability of ntrinsic Batted Ball Statistics Appendix Glenn Healey, EECS Department University of California, rvine, CA 92617 Given information about batted balls for a set of players, we review

More information

STANDARD SCORES AND THE NORMAL DISTRIBUTION

STANDARD SCORES AND THE NORMAL DISTRIBUTION STANDARD SCORES AND THE NORMAL DISTRIBUTION REVIEW 1.MEASURES OF CENTRAL TENDENCY A.MEAN B.MEDIAN C.MODE 2.MEASURES OF DISPERSIONS OR VARIABILITY A.RANGE B.DEVIATION FROM THE MEAN C.VARIANCE D.STANDARD

More information

Statistical Theory, Conditional Probability Vaibhav Walvekar

Statistical Theory, Conditional Probability Vaibhav Walvekar Statistical Theory, Conditional Probability Vaibhav Walvekar # Load some helpful libraries library(tidyverse) If a baseball team scores X runs, what is the probability it will win the game? Baseball is

More information

Taking Your Class for a Walk, Randomly

Taking Your Class for a Walk, Randomly Taking Your Class for a Walk, Randomly Daniel Kaplan Macalester College Oct. 27, 2009 Overview of the Activity You are going to turn your students into an ensemble of random walkers. They will start at

More information

Chapter 1: Why is my evil lecturer forcing me to learn statistics?

Chapter 1: Why is my evil lecturer forcing me to learn statistics? Chapter 1: Why is my evil lecturer forcing me to learn statistics? Labcoat Leni s Real Research Is Friday the 13th Unlucky? Problem Scanlon, T. J., et al. (1993). British Medical Journal, 307, 1584 158.

More information

Chapter 1: Why is my evil lecturer forcing me to learn statistics?

Chapter 1: Why is my evil lecturer forcing me to learn statistics? Chapter : Why is my evil lecturer forcing me to learn statistics? Labcoat Leni s Real Research Is Friday the 3th Unlucky? Problem Scanlon, T. J., et al. (3). British Medical Journal, 3, 8 8. Many of us

More information

Measuring Relative Achievements: Percentile rank and Percentile point

Measuring Relative Achievements: Percentile rank and Percentile point Measuring Relative Achievements: Percentile rank and Percentile point Consider an example where you receive the same grade on a test in two different classes. In which class did you do better? Why do we

More information

C R I TFC. Columbia River Inter-Tribal Fish Commission

C R I TFC. Columbia River Inter-Tribal Fish Commission Columbia River Inter-Tribal Fish Commission 700 NE Multnomah, Suite 1200 503.238.0667 Portland, OR 97232 www.critfc.org C R I TFC T E CHNI C AL R E P O R T 13-07 Analyses for Effect of Survey Week and

More information

Chapter 5: Methods and Philosophy of Statistical Process Control

Chapter 5: Methods and Philosophy of Statistical Process Control Chapter 5: Methods and Philosophy of Statistical Process Control Learning Outcomes After careful study of this chapter You should be able to: Understand chance and assignable causes of variation, Explain

More information

Hitting with Runners in Scoring Position

Hitting with Runners in Scoring Position Hitting with Runners in Scoring Position Jim Albert Department of Mathematics and Statistics Bowling Green State University November 25, 2001 Abstract Sportscasters typically tell us about the batting

More information

DOCUMENT RESUME. A Comparison of Type I Error Rates of Alpha-Max with Established Multiple Comparison Procedures. PUB DATE NOTE

DOCUMENT RESUME. A Comparison of Type I Error Rates of Alpha-Max with Established Multiple Comparison Procedures. PUB DATE NOTE DOCUMENT RESUME ED 415 284 TM 028 030 AUTHOR Barnette, J. Jackson; McLean, James E. TITLE A Comparison of Type I Error Rates of Alpha-Max with Established Multiple Comparison Procedures. PUB DATE 1997-11-13

More information

Political Science 30: Political Inquiry Section 5

Political Science 30: Political Inquiry Section 5 Political Science 30: Political Inquiry Section 5 Taylor Carlson tncarlson@ucsd.edu Link to Stats Motivation of the Week They ve done studies, you know. 60% of the time, it works every time. Brian, Anchorman

More information

POTENTIAL ENERGY BOUNCE BALL LAB

POTENTIAL ENERGY BOUNCE BALL LAB Energy cannot be created or destroyed. Stored energy is called potential energy, and the energy of motion is called kinetic energy. Potential energy changes as the height of an object changes due to gravity;

More information

Case Processing Summary. Cases Valid Missing Total N Percent N Percent N Percent % 0 0.0% % % 0 0.0%

Case Processing Summary. Cases Valid Missing Total N Percent N Percent N Percent % 0 0.0% % % 0 0.0% GET FILE='C:\Users\acantrell\Desktop\demo5.sav'. DATASET NAME DataSet1 WINDOW=FRONT. EXAMINE VARIABLES=PASSYDSPG RUSHYDSPG /PLOT BOXPLOT HISTOGRAM /COMPARE GROUPS /STATISTICS DESCRIPTIVES /CINTERVAL 95

More information

ISyE 6414 Regression Analysis

ISyE 6414 Regression Analysis ISyE 6414 Regression Analysis Lecture 2: More Simple linear Regression: R-squared (coefficient of variation/determination) Correlation analysis: Pearson s correlation Spearman s rank correlation Variable

More information

How Effective is Change of Pace Bowling in Cricket?

How Effective is Change of Pace Bowling in Cricket? How Effective is Change of Pace Bowling in Cricket? SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries.

More information

WATER OIL RELATIVE PERMEABILITY COMPARATIVE STUDY: STEADY VERSUS UNSTEADY STATE

WATER OIL RELATIVE PERMEABILITY COMPARATIVE STUDY: STEADY VERSUS UNSTEADY STATE SCA2005-77 1/7 WATER OIL RELATIVE PERMEABILITY COMPARATIVE STUDY: STEADY VERSUS UNSTEADY STATE 1 Marcelo M. Kikuchi, 1 Celso C.M. Branco, 2 Euclides J. Bonet, 2 Rosângela M.Zanoni, 1 Carlos M. Paiva 1

More information

A Statistical Analysis of the Factors that Potentially Affect the Price of A Horse

A Statistical Analysis of the Factors that Potentially Affect the Price of A Horse The College at Brockport: State University of New York Digital Commons @Brockport Senior Honors Theses Master's Theses and Honors Projects 3-26-2013 A Statistical Analysis of the Factors that Potentially

More information

CHAPTER 2 Modeling Distributions of Data

CHAPTER 2 Modeling Distributions of Data CHAPTER 2 Modeling Distributions of Data 2.2 Density Curves and Normal Distributions The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Density Curves

More information

A COMPARATIVE STUDY OF PHYSICAL DIFFERENCES BETWEEN ATHLETES OF SELECTED EVENTS IN TRACK AND FIELD

A COMPARATIVE STUDY OF PHYSICAL DIFFERENCES BETWEEN ATHLETES OF SELECTED EVENTS IN TRACK AND FIELD A COMPARATIVE STUDY OF PHYSICAL DIFFERENCES BETWEEN ATHLETES OF SELECTED EVENTS IN TRACK AND FIELD Singh Somanpreet Research Scholar, Center For Advanced Studies, LNUPE, Gwalior,M.P, India soman53preet17@gmail.com

More information

GENETICS OF RACING PERFORMANCE IN THE AMERICAN QUARTER HORSE: II. ADJUSTMENT FACTORS AND CONTEMPORARY GROUPS 1'2

GENETICS OF RACING PERFORMANCE IN THE AMERICAN QUARTER HORSE: II. ADJUSTMENT FACTORS AND CONTEMPORARY GROUPS 1'2 GENETICS OF RACING PERFORMANCE IN THE AMERICAN QUARTER HORSE: II. ADJUSTMENT FACTORS AND CONTEMPORARY GROUPS 1'2 S. T. Buttram 3, R. L. Willham 4 and D. E. Wilson 4 Iowa State University, Ames 50011 ABSTRACT

More information

Comparison of recruitment tile materials for monitoring coralline algae responses to a changing climate

Comparison of recruitment tile materials for monitoring coralline algae responses to a changing climate The following supplement accompanies the article Comparison of recruitment tile materials for monitoring coralline algae responses to a changing climate Emma V. Kennedy*, Alexandra Ordoñez, Bonnie E. Lewis,

More information

COMPLETING THE RESULTS OF THE 2013 BOSTON MARATHON

COMPLETING THE RESULTS OF THE 2013 BOSTON MARATHON COMPLETING THE RESULTS OF THE 2013 BOSTON MARATHON Dorit Hammerling 1, Matthew Cefalu 2, Jessi Cisewski 3, Francesca Dominici 2, Giovanni Parmigiani 2,4, Charles Paulson 5, Richard Smith 1,6 1 Statistical

More information

Class 23: Chapter 14 & Nested ANOVA NOTES: NOTES: NOTES:

Class 23: Chapter 14 & Nested ANOVA NOTES: NOTES: NOTES: Slide 1 Chapter 13: ANOVA for 2-way classifications (2 of 2) Fixed and Random factors, Model I, Model II, and Model III (mixed model) ANOVA Chapter 14: Unreplicated Factorial & Nested Designs Slide 2 HW

More information

Midterm Exam 1, section 2. Thursday, September hour, 15 minutes

Midterm Exam 1, section 2. Thursday, September hour, 15 minutes San Francisco State University Michael Bar ECON 312 Fall 2018 Midterm Exam 1, section 2 Thursday, September 27 1 hour, 15 minutes Name: Instructions 1. This is closed book, closed notes exam. 2. You can

More information

Lecture 16: Chapter 7, Section 2 Binomial Random Variables

Lecture 16: Chapter 7, Section 2 Binomial Random Variables Lecture 16: Chapter 7, Section 2 Binomial Random Variables!Definition!What if Events are Dependent?!Center, Spread, Shape of Counts, Proportions!Normal Approximation Cengage Learning Elementary Statistics:

More information

Averages. October 19, Discussion item: When we talk about an average, what exactly do we mean? When are they useful?

Averages. October 19, Discussion item: When we talk about an average, what exactly do we mean? When are they useful? Averages October 19, 2005 Discussion item: When we talk about an average, what exactly do we mean? When are they useful? 1 The Arithmetic Mean When we talk about an average, we can mean different things

More information

Finding your feet: modelling the batting abilities of cricketers using Gaussian processes

Finding your feet: modelling the batting abilities of cricketers using Gaussian processes Finding your feet: modelling the batting abilities of cricketers using Gaussian processes Oliver Stevenson & Brendon Brewer PhD candidate, Department of Statistics, University of Auckland o.stevenson@auckland.ac.nz

More information

Citation for published version (APA): Canudas Romo, V. (2003). Decomposition Methods in Demography Groningen: s.n.

Citation for published version (APA): Canudas Romo, V. (2003). Decomposition Methods in Demography Groningen: s.n. University of Groningen Decomposition Methods in Demography Canudas Romo, Vladimir IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please

More information

Sample Final Exam MAT 128/SOC 251, Spring 2018

Sample Final Exam MAT 128/SOC 251, Spring 2018 Sample Final Exam MAT 128/SOC 251, Spring 2018 Name: Each question is worth 10 points. You are allowed one 8 1/2 x 11 sheet of paper with hand-written notes on both sides. 1. The CSV file citieshistpop.csv

More information

PradiptaArdiPrastowo Sport Science. SebelasMaret University. Indonesia

PradiptaArdiPrastowo Sport Science. SebelasMaret University. Indonesia The Influence of the Volley Ball Serve Training Methods to the Overhand Serve Skills from Gender Consideration (An Experiment Research using the Near Target to Far Target and the Far Target to Nearer Target

More information

Transportation Research Forum

Transportation Research Forum Transportation Research Forum Modeling through Traffic Speed at Roundabouts along Urban and Suburban Street Arterials Author(s): Bashar H. Al-Omari, Khalid A. Ghuzlan, and Lina B. Al-Helo Source: Journal

More information

Youngs Creek Hydroelectric Project

Youngs Creek Hydroelectric Project Youngs Creek Hydroelectric Project (FERC No. 10359) Resident Trout Monitoring Plan Annual Report 2014 Survey Prepared by: Everett, WA November 2014 Final This document has been prepared for the District.

More information

Modelling and Simulation of Environmental Disturbances

Modelling and Simulation of Environmental Disturbances Modelling and Simulation of Environmental Disturbances (Module 5) Dr Tristan Perez Centre for Complex Dynamic Systems and Control (CDSC) Prof. Thor I Fossen Department of Engineering Cybernetics 18/09/2007

More information

CHAPTER 10 TOTAL RECREATIONAL FISHING DAMAGES AND CONCLUSIONS

CHAPTER 10 TOTAL RECREATIONAL FISHING DAMAGES AND CONCLUSIONS CHAPTER 10 TOTAL RECREATIONAL FISHING DAMAGES AND CONCLUSIONS 10.1 INTRODUCTION This chapter provides the computation of the total value of recreational fishing service flow losses (damages) through time

More information

Multilevel Models for Other Non-Normal Outcomes in Mplus v. 7.11

Multilevel Models for Other Non-Normal Outcomes in Mplus v. 7.11 Multilevel Models for Other Non-Normal Outcomes in Mplus v. 7.11 Study Overview: These data come from a daily diary study that followed 41 male and female college students over a six-week period to examine

More information

An Empirical Comparison of Regression Analysis Strategies with Discrete Ordinal Variables

An Empirical Comparison of Regression Analysis Strategies with Discrete Ordinal Variables Kromrey & Rendina-Gobioff An Empirical Comparison of Regression Analysis Strategies with Discrete Ordinal Variables Jeffrey D. Kromrey Gianna Rendina-Gobioff University of South Florida The Type I error

More information

Naïve Bayes. Robot Image Credit: Viktoriya Sukhanova 123RF.com

Naïve Bayes. Robot Image Credit: Viktoriya Sukhanova 123RF.com Naïve Bayes These slides were assembled by Eric Eaton, with grateful acknowledgement of the many others who made their course materials freely available online. Feel free to reuse or adapt these slides

More information

Lesson 14: Modeling Relationships with a Line

Lesson 14: Modeling Relationships with a Line Exploratory Activity: Line of Best Fit Revisited 1. Use the link http://illuminations.nctm.org/activity.aspx?id=4186 to explore how the line of best fit changes depending on your data set. A. Enter any

More information

Novel empirical correlations for estimation of bubble point pressure, saturated viscosity and gas solubility of crude oils

Novel empirical correlations for estimation of bubble point pressure, saturated viscosity and gas solubility of crude oils 86 Pet.Sci.(29)6:86-9 DOI 1.17/s12182-9-16-x Novel empirical correlations for estimation of bubble point pressure, saturated viscosity and gas solubility of crude oils Ehsan Khamehchi 1, Fariborz Rashidi

More information

Competitive Performance of Elite Olympic-Distance Triathletes: Reliability and Smallest Worthwhile Enhancement

Competitive Performance of Elite Olympic-Distance Triathletes: Reliability and Smallest Worthwhile Enhancement SPORTSCIENCE sportsci.org Original Research / Performance Competitive Performance of Elite Olympic-Distance Triathletes: Reliability and Smallest Worthwhile Enhancement Carl D Paton, Will G Hopkins Sportscience

More information