The Normal Distribution, Margin of Error, and Hypothesis Testing. Additional Resources


The Normal Distribution and Central Limit Theorem

Explanations and Visuals
- http://www.statisticshowto.com/central-limit-theorem-examples/
- https://www.cliffsnotes.com/study-guides/statistics/sampling/central-limit-theorem
- http://sphweb.bumc.bu.edu/otlt/mph-Modules/BS/BS704_Probability/BS704_Probability12.html

Videos
- https://www.youtube.com/watch?v=xgqheffoxrm
- https://www.youtube.com/watch?v=c11d3vvm5v8
- https://www.youtube.com/watch?v=pm7z_03o_kk
- https://www.youtube.com/watch?v=jvoxeymqhnm
- https://www.youtube.com/watch?v=lvfc2f9khq4
- https://www.youtube.com/watch?v=hqimchqlz4s

Margin of Error and Confidence Intervals

Explanations and Visuals
- http://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/margin-of-error/
- http://www.dummies.com/education/math/statistics/how-to-calculate-the-margin-of-error-for-a-sample-proportion/

Videos
- https://www.youtube.com/watch?v=owqng8-42la
- https://www.youtube.com/watch?v=vw7-hz9g8gs
- https://www.youtube.com/watch?v=uogojhgjdqs

Hypothesis Testing

Explanations and Visuals
- http://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/ (ignore Bayesian hypothesis testing for now; it's beyond this class)
- http://stattrek.com/hypothesis-test/hypothesis-testing.aspx
- https://onlinecourses.science.psu.edu/stat502/node/139
- http://onlinestatbook.com/2/logic_of_hypothesis_testing/steps.html

Videos
- https://www.youtube.com/watch?v=k1at8vukibw
- https://www.youtube.com/watch?v=w3l_uw3vdjm
- https://www.youtube.com/watch?v=beknkjoxybq
- https://www.youtube.com/watch?v=xay9swflvys

Practice Problems (the answer key follows the problems)

1. Surf's Up!

a. Suppose I conducted a survey of a simple random sample of 1,000 San Diego residents to find out what percentage of San Diegans know how to surf. The results from my survey indicate that 35% of my sample of San Diego residents know how to surf. What is the margin of error for this estimate? What is the 95 percent confidence interval?

b. Suppose I ran out of time and could only interview a simple random sample of 300 San Diego residents. I found that 35% of my sample knew how to surf.
   i. What is the margin of error of this statistic?
   ii. Is the margin of error bigger in (a) or (b)? Why do you think that is?

c. Suppose that instead of conducting a simple random sample, I asked 200 people at the beach whether they know how to surf. I found that 80% of my sample knew how to surf. What problems might arise because of my sampling procedure?

2. Significance Tests for Proportion

a. Suppose I conducted a survey on a simple random sample of 1,000 residents of Los Angeles and asked them if they knew how to surf. I want to compare the proportion of LA residents to the proportion of San Diego residents (from the simple random sample of 1,000 described earlier) who know how to surf. Below is the data from my survey:

City          Knows how to surf   Doesn't know how to surf   Total
San Diego     350                 650                        1000
Los Angeles   150                 850                        1000
Total         500                 1500                       2000

b. What is the proportion of respondents in San Diego who know how to surf?

c. What is the proportion of respondents in LA who know how to surf?

d. What is the standard error of the proportion of respondents in San Diego who know how to surf?

e. What is the standard error of the proportion of respondents in LA who know how to surf?

f. What is the 95% confidence interval around the difference of the two proportions?

g. What is the null hypothesis?

h. What is the alternate hypothesis?

i. Do we reject or fail to reject the null hypothesis? Why?

3. Significance Tests for Difference of Means

a. Are the Warriors significantly taller than the Lakers? Below is some data on the heights of the players in the starting lineup for the Golden State Warriors and the Lakers:

Warriors                        Lakers
Stephen Curry    75 inches      Lonzo Ball                 78 inches
Klay Thompson    79 inches      Kentavious Caldwell-Pope   77 inches
Kevin Durant     81 inches      Julius Randle              81 inches
Draymond Green   79 inches      Brook Lopez                84 inches
Zaza Pachulia    83 inches      Brandon Ingram             81 inches

   i. What is the mean height of the Warriors starting lineup?
   ii. What is the mean height of the Lakers starting lineup?
   iii. What is the standard deviation for the Warriors?
   iv. What is the standard deviation for the Lakers?
   v. What is the standard error for the Warriors?
   vi. What is the standard error for the Lakers?
   vii. What is the 95% confidence interval for the difference in means between the Lakers and the Warriors?
   viii. What is the null hypothesis?
   ix. What is the alternative hypothesis?
   x. Do we reject or fail to reject the null hypothesis? Why?

Practice Problems Answer Key

1. Surf's Up!

a. Suppose I conducted a survey of a simple random sample of 1,000 San Diego residents to find out what percentage of San Diegans know how to surf. The results from my survey indicate that 35% of my sample of San Diego residents know how to surf. What is the margin of error for this estimate? What is the 95 percent confidence interval?

$$\hat{p} \pm 2\sqrt{\frac{\hat{p}(1-\hat{p})}{N}}$$

$$0.35 \pm 2\sqrt{\frac{(.35)(1-.35)}{1000}}$$

$$0.35 \pm 2\sqrt{\frac{(.35)(.65)}{1000}}$$

$$0.35 \pm 2\left(\frac{0.48}{31.6}\right)$$

$$0.35 \pm 2(0.02)$$

0.35 + .04 = 0.39
0.35 - .04 = 0.31

Margin of error: .04
95% Confidence Interval: (.31, .39)

Interpretation: We can be 95% confident that the true value of the population proportion of San Diegans who know how to surf is between .31 and .39. A more precise interpretation is that if we drew many samples of this size and built an interval from each one in the same way, about 95% of those intervals would contain the true population proportion.

b. Suppose I ran out of time and could only interview a simple random sample of 300 San Diego residents. I found that 35% of my sample knew how to surf.

   i. What is the margin of error of this statistic?

$$0.35 \pm 2\sqrt{\frac{(.35)(1-.35)}{300}}$$

$$0.35 \pm 2\sqrt{\frac{(.35)(.65)}{300}}$$

$$0.35 \pm 2\left(\frac{0.48}{17.3}\right)$$

$$0.35 \pm 2(0.03)$$

0.35 + .06 = 0.41
0.35 - .06 = 0.29

Margin of error: .06
95% Confidence Interval: (.29, .41)

Interpretation: We can be 95% confident that the true value of the population proportion of San Diegans who know how to surf is between .29 and .41. A more precise interpretation is that if we drew many samples of this size and built an interval from each one in the same way, about 95% of those intervals would contain the true population proportion.

   ii. Is the margin of error bigger in (a) or (b)? Why do you think that is?

The margin of error is bigger in (b), when we only had a sample of 300 respondents. This is because our sample size is smaller: N sits in the denominator of the standard error, so a smaller N gives a larger standard error and a wider interval.

c. Suppose that instead of conducting a simple random sample, I asked 200 people at the beach whether they know how to surf. I found that 80% of my sample knew how to surf. What problems might arise because of my sampling procedure?

2. Significance Tests for Proportion

a. Suppose I conducted a survey on a simple random sample of 1,000 residents of Los Angeles and asked them if they knew how to surf. I want to compare the proportion of LA residents to the proportion of San Diego residents (from the simple random sample of 1,000 described earlier) who know how to surf. Below is the data from my survey:

City          Knows how to surf   Doesn't know how to surf   Total
San Diego     350                 650                        1000
Los Angeles   150                 850                        1000
Total         500                 1500                       2000

b. What is the proportion of respondents in San Diego who know how to surf?

350/1000 = 0.35

c. What is the proportion of respondents in LA who know how to surf?

150/1000 = 0.15

d. What is the standard error of the proportion of respondents in San Diego who know how to surf?

$$\text{std. error}_{\text{San Diego}} = \sqrt{\frac{(.35)(1-.35)}{1000}} = 0.02$$

e. What is the standard error of the proportion of respondents in LA who know how to surf?

$$\text{std. error}_{\text{Los Angeles}} = \sqrt{\frac{(.15)(1-.15)}{1000}} = 0.01$$

f. What is the 95% confidence interval around the difference of the two proportions?

$$.15 - .35 \pm 2\sqrt{.02^2 + .01^2}$$

$$-.2 \pm 2\sqrt{.0004 + .0001}$$

$$-.2 \pm 2\sqrt{.0005}$$

$$-.2 \pm 2(.022)$$

$$-.2 \pm .044$$

-.2 + .044 = -0.156
-.2 - .044 = -0.244

95% Confidence Interval: (-0.244, -0.156)

g. What is the null hypothesis?

There is no difference in the proportion of residents who know how to surf in LA and the proportion of residents who know how to surf in San Diego. Written differently:

$$p_{\text{LA}} - p_{\text{San Diego}} = 0$$

h. What is the alternate hypothesis?

$$p_{\text{LA}} - p_{\text{San Diego}} \neq 0$$

i. Do we reject or fail to reject the null hypothesis? Why?

We reject the null hypothesis. This is because 0 is not inside our 95% confidence interval. In other words, a difference of 0 between the two population proportions is not among the values that are plausible given our samples at the 95% confidence level.

3. Significance Tests for Difference of Means

a. Are the Warriors significantly taller than the Lakers? Below is some data on the heights of the players in the starting lineup for the Golden State Warriors and the Lakers:

Warriors                        Lakers
Stephen Curry    75 inches      Lonzo Ball                 78 inches
Klay Thompson    79 inches      Kentavious Caldwell-Pope   77 inches
Kevin Durant     81 inches      Julius Randle              81 inches
Draymond Green   79 inches      Brook Lopez                84 inches
Zaza Pachulia    83 inches      Brandon Ingram             81 inches

   i. What is the mean height of the Warriors starting lineup?

(75+79+81+79+83)/5 = 79.4 inches

   ii. What is the mean height of the Lakers starting lineup?

(78+77+81+84+81)/5 = 80.2 inches

   iii. What is the standard deviation for the Warriors?

X_i   Mean   X_i - Mean   (X_i - Mean)^2
75    79.4   -4.4         19.36
79    79.4   -0.4         0.16
81    79.4   1.6          2.56
79    79.4   -0.4         0.16
83    79.4   3.6          12.96

19.36 + 0.16 + 2.56 + 0.16 + 12.96 = 35.2
35.2/(5-1) = 35.2/4 = 8.8
$$\sqrt{8.8} = 2.97$$

   iv. What is the standard deviation for the Lakers?

X_i   Mean   X_i - Mean   (X_i - Mean)^2
78    80.2   -2.2         4.84
77    80.2   -3.2         10.24
81    80.2   0.8          0.64
84    80.2   3.8          14.44
81    80.2   0.8          0.64

4.84 + 10.24 + 0.64 + 14.44 + 0.64 = 30.8
30.8/(5-1) = 30.8/4 = 7.7
$$\sqrt{7.7} = 2.77$$

   v. What is the standard error for the Warriors?

$$\text{std. error}_1 = \frac{\hat{\sigma}_1}{\sqrt{N_1}} = \frac{2.97}{\sqrt{5}} = 1.33$$

   vi. What is the standard error for the Lakers?

$$\text{std. error}_2 = \frac{\hat{\sigma}_2}{\sqrt{N_2}} = \frac{2.77}{\sqrt{5}} = 1.24$$
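The hand calculations for parts i through vi can be checked with a short script. This is only a sketch: the function name summarize and the variable names are made up for illustration, and it relies on Python's statistics.stdev, which divides by N - 1 just as the tables above do.

```python
import math
import statistics

# Heights in inches, copied from the starting-lineup table.
warriors = [75, 79, 81, 79, 83]
lakers = [78, 77, 81, 84, 81]


def summarize(heights):
    """Return (mean, sample standard deviation, standard error)."""
    n = len(heights)
    mean = statistics.mean(heights)
    sd = statistics.stdev(heights)  # sample SD: divides by n - 1
    se = sd / math.sqrt(n)          # standard error of the mean
    return mean, sd, se


for team, heights in (("Warriors", warriors), ("Lakers", lakers)):
    mean, sd, se = summarize(heights)
    print(f"{team}: mean = {mean:.1f}, sd = {sd:.2f}, std. error = {se:.2f}")
```

Rounded to two decimals, the script should agree with the answers above; any small differences in intermediate steps come from rounding by hand.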

   vii. What is the 95% confidence interval for the difference in means between the Lakers and the Warriors?

$$(\bar{X}_2 - \bar{X}_1) \pm 2\sqrt{(\text{std. error}_1)^2 + (\text{std. error}_2)^2}$$

$$80.2 - 79.4 \pm 2\sqrt{1.33^2 + 1.24^2}$$

$$0.8 \pm 2\sqrt{1.77 + 1.54}$$

$$0.8 \pm 2\sqrt{3.31}$$

$$0.8 \pm 2(1.82)$$

$$0.8 \pm 3.64$$

0.8 + 3.64 = 4.44
0.8 - 3.64 = -2.84

95% Confidence Interval of the Difference of Means: (-2.84, 4.44)

   viii. What is the null hypothesis?

There is no difference in height between the Lakers and the Warriors.

$$\bar{X}_{\text{Lakers}} = \bar{X}_{\text{Warriors}}$$

OR, written differently:

$$\bar{X}_{\text{Lakers}} - \bar{X}_{\text{Warriors}} = 0$$

   ix. What is the alternative hypothesis?

$$\bar{X}_{\text{Lakers}} \neq \bar{X}_{\text{Warriors}}$$

OR, written differently:

$$\bar{X}_{\text{Lakers}} - \bar{X}_{\text{Warriors}} \neq 0$$

   x. Do we reject or fail to reject the null hypothesis? Why?

We fail to reject the null hypothesis that the difference in mean height of the Lakers and Warriors is equal to zero. This is because 0 lies within our 95% confidence interval of the difference of the means.
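The interval recipe used throughout the answer key (estimate plus or minus 2 standard errors, then check whether 0 falls inside) can be sketched in Python. The helper names below are hypothetical and not part of the handout. The multiplier 2 follows the handout's convention; 1.96 is the more exact normal value. Because the script does not round intermediate steps, the proportion intervals come out slightly narrower than the hand-rounded ones in the key.

```python
import math

Z = 2  # the handout's rounded multiplier for 95% confidence


def proportion_se(p_hat, n):
    """Standard error of a sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def interval(estimate, se, z=Z):
    """Return (low, high) for estimate +/- z * se."""
    margin = z * se
    return estimate - margin, estimate + margin


def difference_interval(est_1, se_1, est_2, se_2, z=Z):
    """Interval for (est_2 - est_1) from two independent samples."""
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2)
    return interval(est_2 - est_1, se_diff, z)


def rejects_zero(ci):
    """True when 0 lies outside the interval, i.e. reject the null."""
    low, high = ci
    return not (low <= 0 <= high)


# Problem 1: 35% surfers with N = 1000 and with N = 300.
surf_1000 = interval(0.35, proportion_se(0.35, 1000))
surf_300 = interval(0.35, proportion_se(0.35, 300))

# Problem 2: LA (0.15) minus San Diego (0.35), N = 1000 each.
surf_diff = difference_interval(
    0.35, proportion_se(0.35, 1000),
    0.15, proportion_se(0.15, 1000),
)

# Problem 3: Lakers minus Warriors, using the key's rounded standard errors.
height_diff = difference_interval(79.4, 1.33, 80.2, 1.24)

for label, ci in [("1a", surf_1000), ("1b", surf_300),
                  ("2f", surf_diff), ("3vii", height_diff)]:
    print(f"{label}: ({ci[0]:.3f}, {ci[1]:.3f})")

print("Problem 2 rejects the null:", rejects_zero(surf_diff))
print("Problem 3 rejects the null:", rejects_zero(height_diff))
```

Keeping the decision rule in its own function makes the logic of parts i and x explicit: the null hypothesis of no difference is rejected only when 0 lies outside the interval.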