Announcements. % College graduate vs. % Hispanic in LA. % College educated vs. % Hispanic in LA. Problem Set 10 Due Wednesday.
- Alice Boone
Transcription
Unit 7: Multiple Linear Regression
Lecture 1: Introduction to MLR
Statistics 101 (Nicole Dalzell), June 15, 2015

Announcements
Problem Set 10 due Wednesday.

Recap: % College graduate vs. % Hispanic in LA
What can you say about the relationship between % college graduate and % Hispanic in a sample of 100 zip code areas in LA?
[Figures: map panels "Education: College graduate" and "Race/Ethnicity: Hispanic" (legend: Freeways, No data); plot of % College graduate against % Hispanic.]
Recap: % College educated vs. % Hispanic in LA - linear model
[Regression output: rows (Intercept), %Hispanic]

Participation question
Which of the below is the best interpretation of the slope?
(a) A 1% increase in Hispanic residents in a zip code area in LA is associated with a 75% decrease in % of college grads.
(b) A 1% increase in Hispanic residents in a zip code area in LA is associated with a 0.75% decrease in % of college grads.
(c) An additional 1% of Hispanic residents decreases the % of college graduates in a zip code area in LA by 0.75%.
(d) In zip code areas with no Hispanic residents, % of college graduates is expected to be 75%.

Do these data provide convincing evidence that there is a statistically significant relationship between % Hispanic and % college graduates in zip code areas in LA?
[Regression output: rows (Intercept), hispanic]
How reliable is this p-value if these zip code areas are not randomly selected?

Recap: Inference for the slope for a SLR model (only one explanatory variable)
Hypothesis test: $T = \frac{b_1 - \text{null value}}{SE_{b_1}}$, with $df = n - 2$
Confidence interval: $b_1 \pm t^\star_{df = n - 2} \times SE_{b_1}$
The null value is often 0 since we are usually checking for any relationship between the explanatory and the response variable.
The regression output gives $b_1$, $SE_{b_1}$, and the two-tailed p-value for the t-test for the slope where the null value is 0.
We rarely do inference on the intercept, so we'll be focusing on the estimates and inference for the slope.
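The recap formulas for the slope can be sketched in R as follows. This is only an illustration: the estimate, standard error, and sample size below are made-up placeholder values, not numbers from the lecture's regression output.

```r
# Hypothetical inputs (NOT taken from the lecture's output)
b1 <- -0.5          # slope estimate
se_b1 <- 0.1        # standard error of the slope
n <- 30             # number of observations
null_value <- 0     # usual null value for the slope

# Test statistic and degrees of freedom for SLR
df <- n - 2
t_stat <- (b1 - null_value) / se_b1

# Two-tailed p-value from the t distribution
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# 95% confidence interval: b1 +/- t* x SE
t_star <- qt(0.975, df = df)
ci <- b1 + c(-1, 1) * t_star * se_b1

print(c(t = t_stat, df = df, p = p_value))
print(ci)
```

In practice `summary(lm(...))` reports the same t statistic and two-tailed p-value for the null value 0, so the hand calculation is mainly useful for a different null value or confidence level.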
SLR: Categorical Predictors - Dinosaur Weight
What relationship do you see between the weight of dinosaurs and the type of dinosaur?
[Figure: Dinosaur Weight by Type; weight (kg) for Ornithischian and Saurischian dinosaurs.]
SLR: Categorical Predictors - Dinosaurs!
What relationship do you see between the weight of dinosaurs and the type of dinosaur?
[Regression output: rows (Intercept), dino$typesaurischian]
$\widehat{\text{Weight}} = b_0 + b_1 \times \text{TypeSaurischian}$
Type of dinosaur is a categorical variable with two levels: Ornithischian and Saurischian.
For Ornithischian dinosaurs: plug in 0 for TypeSaurischian.
For Saurischian dinosaurs: plug in 1 for TypeSaurischian.
Slope $b_1$: We expect that Saurischian dinosaurs weighed on average 13,652 kilograms more than Ornithischian dinosaurs.

Return to the scene of the crime: Murder Rates by Country
Last class, we used the poverty rate in a district to help predict the number of annual murders per million in that district.
$\widehat{\text{murders}} = b_0 + b_1 \times \text{poverty}$
country-info/ stats/ Crime/ Murders/ Per-capita
[Regression output: rows (Intercept), percpov]
Do we think that the poverty rate is the only thing that influences the number of annual murders per million in a district?
Data from the ACS
A random sample of 783 observations from the 2012 ACS.
1. income: Yearly income (wages and salaries)
2. employment: Employment status: not in labor force, unemployed, or employed
3. hrs_work: Weekly hours worked
4. race: Race: White, Black, Asian, or other
5. age: Age
6. gender: Gender: male or female
7. citizens: Whether respondent is a US citizen or not
8. time_to_work: Travel time to work
9. lang: Language spoken at home: English or other
10. married: Whether respondent is married or not
11. edu: Education level: hs or lower, college, or grad
12. disability: Whether respondent is disabled or not
13. birth_qrtr: Quarter in which respondent is born: jan thru mar, apr thru jun, jul thru sep, or oct thru dec

We have 1 response variable (income) and 12 potential explanatory variables.
How do we fit a model and interpret the coefficients with so many predictors?
What do we do with a mix of categorical and numerical explanatory variables?
How do we determine which (if any) of them are important in our model?
How do we determine if our model is any good?

Everybody in the Pool
How do we interpret all of this?

Examples
How would we interpret the coefficient for hrs_work?
In MLR everything is conditional on all other variables in the model

Examples
How would we interpret the coefficient for hrs_work?

MLR (Multiple Linear Regression)
In MLR everything is conditional on all other variables in the model. All estimates in a MLR for a given variable are conditional on all other variables being in the model.
Slope:
Numerical x: All else held constant, for a one unit increase in $x_i$, y is expected to be higher / lower on average by $b_i$ units.
Categorical x: All else held constant, the predicted difference in y between the baseline and the given level of $x_i$ is $b_i$.

Examples
How would we interpret the coefficient for genderfemale?
Categorical Predictors with multiple levels
We have several categorical variables in this study. Some are binary (i.e., have only two levels) but others are not.
employment: Employment status: not in labor force, unemployed, or employed
race: Race: White, Black, Asian, or other
gender: Gender: male or female
citizens: Whether respondent is a US citizen or not
lang: Language spoken at home: English or other
married: Whether respondent is married or not
edu: Education level: hs or lower, college, or grad
disability: Whether respondent is disabled or not
birth_qrtr: Quarter in which respondent is born: jan thru mar, apr thru jun, jul thru sep, or oct thru dec

Birth Quarter Coefficients
birth_qrtr: Quarter in which respondent is born: jan thru mar, apr thru jun, jul thru sep, or oct thru dec
How many coefficients do we see for birth_qrtr?

Race Coefficients

Categorical predictors and slopes for (almost) each level
How many coefficients do we see for birth_qrtr?
When we are working with a categorical variable with k levels, we only see k - 1 parameters being estimated in the model. This happens because one of the levels of the variable is consumed by the intercept of the model. This level of the categorical variable is called the baseline.
So, what happened to the folks born in January through March?
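To see the baseline level being absorbed by the intercept, here is a small sketch in R. The five observations are invented for illustration; only the level names come from the lecture. It relies on R's default treatment coding, in which the first factor level becomes the baseline.

```r
# Toy data (hypothetical): five respondents' birth quarters
birth_qrtr <- factor(
  c("jan thru mar", "apr thru jun", "jul thru sep",
    "oct thru dec", "apr thru jun"),
  levels = c("jan thru mar", "apr thru jun",
             "jul thru sep", "oct thru dec")
)

# Design matrix that lm() would build for this predictor
X <- model.matrix(~ birth_qrtr)

k <- nlevels(birth_qrtr)   # k = 4 levels
print(colnames(X))         # intercept plus k - 1 indicator columns
print(X)                   # a "jan thru mar" row has 0 in every indicator
```

A respondent born Jan thru Mar has all three indicators equal to 0, so their prediction uses the intercept alone; each remaining coefficient is a difference from that baseline.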
Categorical predictors and slopes for (almost) each level

Gender: male / female (k = 2)
Respondent    gender:female
Female        1
Male          0

[Regression output: columns Estimate, Std. Error, t value, Pr(>|t|)]

Birth Quarter (k = 4), baseline: Jan thru Mar
Respondent      birth_qrtr:apr thru jun   birth_qrtr:jul thru sep   birth_qrtr:oct thru dec
jan thru mar    0                         0                         0
apr thru jun    1                         0                         0
jul thru sep    0                         1                         0
oct thru dec    0                         0                         1

Participation question
All else held constant, how do incomes of those born January thru March compare to those born April thru June?
All else held constant, those born Jan thru Mar make, on average,
(a) $2, less
(b) $2, more
(c) $ less
(d) $ more
than those born Apr thru Jun.

Prediction with MLR: Return to the scene of the crime
Predict the annual murders per million in a district with a poverty rate of 24%.
murders = b_0 + b_1 × percpov
[Regression output: rows (Intercept), percpov]
Prediction with MLR: Weights of books
[Table: weight (g), volume (cm$^3$), and cover (hc = hardcover, pb = paperback) for 7 hardcover and 8 paperback books; diagram of a book with dimensions l, w, h.]

Predicting for MLR
Write down the linear model for book weight based on volume and cover type.
[Regression output: rows (Intercept), volume, cover:pb]
Interpret the coefficients for volume and cover:pb.

Interpretation of the regression coefficients
[Regression output: rows (Intercept), volume, cover:pb]
Slope of volume: All else held constant, for each 1 cm$^3$ increase in volume we would expect weight to increase on average by 0.72 grams.
Slope of cover: All else held constant, the model predicts that paperback books weigh on average 184 grams less than hardcover books.
Intercept: Hardcover books with no volume are expected on average to weigh 198 grams. Obviously, the intercept does not make sense in context. It only serves to adjust the height of the line.

Prediction
Participation question
Which of the following is the correct calculation for the predicted weight of a paperback book that is 600 cm$^3$?
[Regression output: rows (Intercept), volume, cover:pb]
(a) (b) (c) (d)
Examining our Model: How do we determine if the model is significant?
We can now interpret the coefficients, but how do we know if the model is any good? To be more specific, how can we show that the model is significant?

Inference for the model as a whole: F-test
Degrees of freedom: $df_1 = k$, $df_2 = n - k - 1$
$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$
$H_A:$ At least one of the $\beta_i \neq 0$

Model output
[Coefficient table, with significance codes as printed: (Intercept); hrs_work ***; raceblack; raceasian **; raceother; age ***; genderfemale ***; citizenyes; time_to_work; langother; marriedyes; educollege ***; edugrad *** (p < 2e-16); disabilityyes *; birth_qrtrapr thru jun; birth_qrtrjul thru sep; birth_qrtroct thru dec]
Residual standard error on 766 degrees of freedom (60 observations deleted due to missingness)
F-statistic on 16 and 766 DF, p-value: < 2.2e-16

Participation question
True / False: The F test yielding a significant result means the model fits the data well.
(a) True
(b) False
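As a quick consistency check on the degrees of freedom, the following worked sketch reads k = 16 slope coefficients and 766 residual degrees of freedom off the model output above and solves for the number of observations used in the fit:

```latex
\begin{aligned}
df_1 &= k = 16 \\
df_2 &= n - k - 1 = 766 \;\Longrightarrow\; n = 766 + 16 + 1 = 783
\end{aligned}
```

The 16 slopes are one each for hrs_work, age, time_to_work, genderfemale, citizenyes, langother, marriedyes, and disabilityyes, plus three for race, two for edu, and three for birth_qrtr. The implied n = 783 agrees with the ACS sample size quoted earlier in the lecture.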
Significance also depends on what else is in the model
[Model 1 output: (Intercept), hrs_work, raceblack, raceasian, raceother, age, genderfemale, citizenyes, time_to_work, langother, marriedyes (highlighted), educollege, edugrad, disabilityyes, birth_qrtrapr thru jun, birth_qrtrjul thru sep, birth_qrtroct thru dec]
[Model 2 output: (Intercept), hrs_work, raceblack, raceasian, raceother, age, genderfemale, marriedyes (highlighted)]

Weights of books
[Table: weight (g), volume (cm$^3$), and cover (hc = hardcover, pb = paperback) for 7 hardcover and 8 paperback books; diagram of a book with dimensions l, w, h.]

Weights of hard cover and paperback books
Can you identify a trend in the relationship between volume and weight of hardcover and paperback books?
[Figure: weight (g) against volume (cm$^3$) for hardcover and paperback books.]

Modeling weights of books using volume and cover type
    # load data (the allbacks data set ships with the DAAG package)
    library(DAAG)
    data(allbacks)
    # fit model: weight as a function of volume and cover type
    book_mlr = lm(weight ~ volume + cover, data = allbacks)
    summary(book_mlr)
[Coefficient table, with significance codes as printed: (Intercept) **; volume ***; cover:pb ***]
Residual standard error: 78.2 on 12 degrees of freedom
F-statistic on 2 and 12 DF, p-value: 1.455e-07
Linear model
[Regression output: rows (Intercept), volume, cover:pb]
With the coefficients rounded to the values quoted in the interpretation below:
$\widehat{\text{weight}} = 198 + 0.72 \times \text{volume} - 184 \times \text{cover:pb}$
1. For hardcover books: plug in 0 for cover
$\widehat{\text{weight}} = 198 + 0.72 \times \text{volume} - 184 \times 0 = 198 + 0.72 \times \text{volume}$
2. For paperback books: plug in 1 for cover
$\widehat{\text{weight}} = 198 + 0.72 \times \text{volume} - 184 \times 1 = 14 + 0.72 \times \text{volume}$

Visualising the linear model
[Figure: weight (g) against volume (cm$^3$) with fitted lines for hardcover and paperback books.]

Interpretation of the regression coefficients
[Regression output: rows (Intercept), volume, cover:pb]
Slope of volume: All else held constant, for each 1 cm$^3$ increase in volume we would expect weight to increase on average by 0.72 grams.
Slope of cover: All else held constant, the model predicts that paperback books weigh 184 grams less than hardcover books, on average.
Intercept: Hardcover books with no volume are expected on average to weigh 198 grams. Obviously, the intercept does not make sense in context. It only serves to adjust the height of the line.

Prediction
Participation question
Which of the following is the correct calculation for the predicted weight of a paperback book that is 600 cm$^3$?
[Regression output: rows (Intercept), volume, cover:pb]
(a) (b) (c) (d)
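For the paperback prediction asked about above, a worked sketch using the rounded coefficients quoted in the interpretation (intercept 198, volume slope 0.72, cover:pb slope -184) would be the following. The unrounded coefficients from the R output would give a slightly different figure.

```latex
\widehat{\text{weight}} = 198 + 0.72 \times 600 - 184 \times 1
                        = 198 + 432 - 184
                        = 446 \text{ grams}
```

The indicator cover:pb is set to 1 because the book is a paperback; for a hardcover book of the same volume the last term drops out, giving 198 + 432 = 630 grams.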
A note on interaction variables
$\widehat{\text{weight}} = 198 + 0.72 \times \text{volume} - 184 \times \text{cover:pb}$ (coefficients rounded)
[Figure: weight (g) against volume (cm$^3$) with fitted lines for hardcover and paperback books.]
This model assumes that hardcover and paperback books have the same slope for the relationship between their volume and weight. If this isn't reasonable, then we would include an interaction variable in the model (beyond the scope of this course).

Revisit: Modeling poverty
[Figure: pairwise relationships among poverty, metro_res, white, hs_grad, female_house.]

Predicting poverty using % female householder
    # load data
    poverty = read.csv(" mc301/data/poverty.csv")
    # fit model: % in poverty as a function of % female householder
    pov_slr = lm(poverty ~ female_house, data = poverty)
    summary(pov_slr)
[Linear model output: rows (Intercept), female_house]
[Figure: % in poverty against % female householder.]
$R = 0.53$, $R^2 = 0.53^2 = 0.28$

Another look at $R^2$ - from last time
    anova(pov_slr)
[ANOVA table: columns Df, Sum Sq, Mean Sq, F value, Pr(>F); rows female_house, Residuals, Total]
SS of y: $SS_{Tot} = \sum (y - \bar{y})^2$ = total variability
SS of residuals: $SS_{Err} = \sum e_i^2$ = unexplained variability
SS of regression: $SS_{Reg} = SS_{Tot} - SS_{Err}$ = explained variability
$R^2 = \frac{\text{explained variability}}{\text{total variability}} = \frac{SS_{Reg}}{SS_{Tot}} = 0.28$
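The two routes to $R^2$ used on these slides can be written side by side. This is a sketch using only the values quoted above; the sums of squares themselves are not reproduced here.

```latex
\begin{aligned}
R^2 &= \frac{SS_{Reg}}{SS_{Tot}} = \frac{SS_{Tot} - SS_{Err}}{SS_{Tot}} = 1 - \frac{SS_{Err}}{SS_{Tot}} \\
R^2 &= R \times R = 0.53 \times 0.53 = 0.2809 \approx 0.28
\end{aligned}
```

The second line, squaring the correlation coefficient, applies only to simple linear regression with a single predictor; the sums-of-squares form in the first line carries over unchanged to multiple regression.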
Predicting poverty using % female hh + % white
    pov_mlr = lm(poverty ~ female_house + white, data = poverty)
    summary(pov_mlr)
    anova(pov_mlr)
[Linear model output: rows (Intercept), female_house, white]
[ANOVA table: columns Df, Sum Sq, Mean Sq, F value, Pr(>F); rows female_house, white, Residuals, Total]
$R^2 = \frac{\text{explained variability}}{\text{total variability}} = 0.29$

Adjusted $R^2$
$R^2_{adj} = 1 - \left( \frac{SS_{Err}}{SS_{Tot}} \times \frac{n - 1}{n - k - 1} \right)$
where n is the number of cases and k is the number of predictors (explanatory variables) in the model.

Application exercise
Calculate adjusted $R^2$ for the multiple linear regression model predicting % living in poverty from % female householders and % white. Remember n = 51, 50 states + DC.
[ANOVA table: columns Df, Sum Sq, Mean Sq, F value, Pr(>F); rows female_house, white, Residuals, Total]
(a) 0.26
(b) 0.29
(c) 0.32
(d) 0.71

$R^2$ vs. adjusted $R^2$
[Table: $R^2$ and adjusted $R^2$ for Model 1 (poverty vs. female_house) and Model 2 (poverty vs. female_house + white); Model 2 has $R^2 = 0.29$.]
When any variable is added to the model $R^2$ increases. But if the added variable doesn't really provide any new information, or is completely unrelated, adjusted $R^2$ does not increase.
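The application exercise can be worked from the quantities given on the slides. This sketch assumes $SS_{Err}/SS_{Tot} = 1 - R^2 = 1 - 0.29 = 0.71$, using the rounded $R^2$ rather than the sums of squares from the ANOVA table, with n = 51 and k = 2 predictors:

```latex
R^2_{adj} = 1 - \left( \frac{SS_{Err}}{SS_{Tot}} \times \frac{n - 1}{n - k - 1} \right)
          = 1 - \left( 0.71 \times \frac{50}{48} \right)
          \approx 1 - 0.74
          = 0.26
```

The factor 50/48 is slightly greater than 1, so the unexplained share is inflated from 0.71 to about 0.74. That inflation is the penalty for using two predictors.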
Adjusted $R^2$ - properties
$R^2_{adj} = 1 - \left( \frac{SS_{Err}}{SS_{Tot}} \times \frac{n - 1}{n - k - 1} \right)$
Because k is never negative, $R^2_{adj}$ is never larger than $R^2$, and it is strictly smaller whenever the model has at least one predictor and $R^2 < 1$.
$R^2_{adj}$ applies a penalty for the number of predictors included in the model.
Therefore, we choose models with higher $R^2_{adj}$ over others.

Participation question
True or false: Adjusted $R^2$ tells us the percentage of variability in the response variable explained by the model.
(a) True
(b) False

Collinearity and parsimony
We saw that adding the variable white to the model did not increase adjusted $R^2$, i.e. did not add any valuable information to the model. Why?
[Figure: pairwise relationships among poverty, metro_res, white, hs_grad, female_house.]

Collinearity between explanatory variables (cont.)
Two predictor variables are said to be collinear when they are correlated, and this collinearity (also called multicollinearity) complicates model estimation.
Remember: Predictors are also called explanatory or independent variables, so they should be independent of each other.
We don't like adding predictors that are associated with each other to the model, because oftentimes the addition of such a variable brings nothing to the table. Instead, we prefer the simplest best model, i.e. the parsimonious model.
In addition, including collinear variables can make the estimates of the slope parameters unstable, with inflated standard errors.
While it's impossible to avoid collinearity arising in observational data, experiments are usually designed to control for correlated predictors.
GENETICS OF RACING PERFORMANCE IN THE AMERICAN QUARTER HORSE: II. ADJUSTMENT FACTORS AND CONTEMPORARY GROUPS 1'2 S. T. Buttram 3, R. L. Willham 4 and D. E. Wilson 4 Iowa State University, Ames 50011 ABSTRACT
More informationLongitudinal analysis of young Danes travel pattern.
Aalborg trafikdage 24.08.2010 Longitudinal analysis of young Danes travel pattern. Sigrun Birna Sigurdardottir PhD student, DTU Transport. Background Limited literature regarding models or factors influencing
More informationBiostatistics & SAS programming
Biostatistics & SAS programming Kevin Zhang March 6, 2017 ANOVA 1 Two groups only Independent groups T test Comparison One subject belongs to only one groups and observed only once Thus the observations
More informationSurvey of Wave Riders at Trestles (continued from 1st survey page)
Survey of Wave Riders at Trestles (continued from 1st survey page) *** = required 1. Why did you choose to surf Trestles today? Trestles has better surf conditions than the other places This spot was the
More informationGENDER INEQUALITY IN THE LABOR MARKET
Table 1.1 Four Measures of Gender Equality, Country Rankings, Mid-1990s Full-Time Occupational Wage Employment Work Integration Equality (1 to 21) (1 to 15) (1 to 18) (1 to 12) Sweden 1 14 6 8 Finland
More informationB. Single Taxpayers (122,401 obs.) A. Married Taxpayers Filing jointly (266,272 obs.) Density Distribution. Density Distribution
A. Married Taxpayers Filing jointly (266,272 obs.) Kink 15/28% at $43,850 B. Single Taxpayers (122,401 obs.) Kink 15/28% at $26,250 Density Distribution Density Distribution $0 $20,000 $40,000 $60,000
More informationBabson Capital/UNC Charlotte Economic Forecast. May 13, 2014
Babson Capital/UNC Charlotte Economic Forecast May 13, 2014 Outline for Today Myths and Realities of this Recovery Positive Economic Signs Negative Economic Signs Outlook for 2014 The Employment Picture
More informationFactors Associated with the Bicycle Commute Use of Newcomers: An analysis of the 70 largest U.S. Cities
: An analysis of the 70 largest U.S. Cities Ryan J. Dann PhD Student, Urban Studies Portland State University May 2014 Newcomers and Bicycles Photo Credit: Daveena Tauber 2 Presentation Outline Introduction
More informationMnROAD Mainline IRI Data and Lane Ride Quality MnROAD Lessons Learned December 2006
MnROAD Mainline IRI Data and Lane Ride Quality December 2006 Derek Tompkins, John Tweet, Prof. Lev Khazanovich University of Minnesota MnDOT Contacts: Bernard Izevbekhai, Tim Clyne 1 Abstract Since 1994,
More informationDriv e accu racy. Green s in regul ation
LEARNING ACTIVITIES FOR PART II COMPILED Statistical and Measurement Concepts We are providing a database from selected characteristics of golfers on the PGA Tour. Data are for 3 of the players, based
More informationHow has the profile of student loan
Two decades of debt: How has the profile of student loan borrowers changed from 1992 to 2012? Nicholas Hillman University of Wisconsin-Madison AIR 2016 @n_hillman This material is based upon work supported
More informationAP 11.1 Notes WEB.notebook March 25, 2014
11.1 Chi Square Tests (Day 1) vocab *new seats* examples Objectives Comparing Observed & Expected Counts measurements of a categorical variable (ex/ color of M&Ms) Use Chi Square Goodness of Fit Test Must
More informationSelect Boxplot -> Multiple Y's (simple) and select all variable names.
One Factor ANOVA in Minitab As an example, we will use the data below. A study looked at the days spent in the hospital for different regions of the United States. Can the company reject the claim the
More informationMath SL Internal Assessment What is the relationship between free throw shooting percentage and 3 point shooting percentages?
Math SL Internal Assessment What is the relationship between free throw shooting percentage and 3 point shooting percentages? fts6 Introduction : Basketball is a sport where the players have to be adept
More informationFOR LEASE HARMS ROAD INDUSTRIAL PARK Harms Road, Houston Texas 77041
FOR LEASE HARMS ROAD INDUSTRIAL PARK 7206-7214 Harms Road, Houston Texas 77041 Property Statistics: Single-tenant Industrial Park Tilt wall construction with stone facade Crane ready Heavy power, grade
More information2017 Nebraska Profile
2017 Nebraska Profile State, 9 NEW Regions, 93 Counties, plus 31 Cities Three Volumes Demographic Change in the State Economic Influences at Work Housing Statistics and Trends Summary of Findings Discuss
More informationCopy of my report. Why am I giving this talk. Overview. State highway network
Road Surface characteristics and traffic accident rates on New Zealand s state highway network Robert Davies Statistics Research Associates http://www.statsresearch.co.nz Copy of my report There is a copy
More information2018 HR & PAYROLL Deadlines
th (by payment date) EPAF 3rd PARTY FEEDS WTE Approval 2018 HR & PAYROLL s Normal Payroll day s 2017 B1-26 3 * 13-Dec-17 15-Dec-17 n/a n/a n/a 28-Dec-17 29-Dec-17 11:00 AM 16-Dec-2017 29-Dec-2017 JAN 2018
More informationBuilding an NFL performance metric
Building an NFL performance metric Seonghyun Paik (spaik1@stanford.edu) December 16, 2016 I. Introduction In current pro sports, many statistical methods are applied to evaluate player s performance and
More informationJournal of Human Sport and Exercise E-ISSN: Universidad de Alicante España
Journal of Human Sport and Exercise E-ISSN: 1988-5202 jhse@ua.es Universidad de Alicante España SOÓS, ISTVÁN; FLORES MARTÍNEZ, JOSÉ CARLOS; SZABO, ATTILA Before the Rio Games: A retrospective evaluation
More informationReport to the Benjamin Hair-Just Swim For Life Foundation on JACS4 The Jefferson Area Community Survey
Report to the Benjamin Hair-Just Swim For Life Foundation on JACS4 The Jefferson Area Community Survey Prepared by: Kara Fitzgibbon, M.A. Research Analyst Matthew Braswell, M.A. Research Analyst Yuliya
More informationWHERE ARE ARIZONA DEMOGRAPHICS TAKING US? HOW GROWING SLOWER, OLDER AND MORE DIVERSE AFFECTS REAL ESTATE
WHERE ARE ARIZONA DEMOGRAPHICS TAKING US? HOW GROWING SLOWER, OLDER AND MORE DIVERSE AFFECTS REAL ESTATE March 2017 Tom Rex Office of the University Economist and Center for Competitiveness and Prosperity
More informationU.S. and Colorado Economic Outlook National Association of Industrial and Office Parks. Business Research Division Leeds School of Business
U.S. and Colorado Economic Outlook National Association of Industrial and Office Parks Presented by the Business Research Division Leeds School of Business University of Colorado at Boulder U.S. Economic
More informationLegendre et al Appendices and Supplements, p. 1
Legendre et al. 2010 Appendices and Supplements, p. 1 Appendices and Supplement to: Legendre, P., M. De Cáceres, and D. Borcard. 2010. Community surveys through space and time: testing the space-time interaction
More informationEconomic Overview. Melissa K. Peralta Senior Economist April 27, 2017
Economic Overview Melissa K. Peralta Senior Economist April 27, 2017 TTX Overview TTX functions as the industry s railcar cooperative, operating under pooling authority granted by the Surface Transportation
More informationPersistence racial difference in socioeconomic outcomes. Are Emily and Greg More Employable than Lakisha and Jamal?
Are Emily and Greg More Employable than Lakisha and Jamal? Bertrand and Mullainathan Persistence racial difference in socioeconomic outcomes Large difference in outcomes between similarly defined blacks
More informationPitching Performance and Age
Pitching Performance and Age By: Jaime Craig, Avery Heilbron, Kasey Kirschner, Luke Rector, Will Kunin Introduction April 13, 2016 Many of the oldest players and players with the most longevity of the
More informationECO 745: Theory of International Economics. Jack Rossbach Fall Lecture 6
ECO 745: Theory of International Economics Jack Rossbach Fall 2015 - Lecture 6 Review We ve covered several models of trade, but the empirics have been mixed Difficulties identifying goods with a technological
More informationEffects of Incentives: Evidence from Major League Baseball. Guy Stevens April 27, 2013
Effects of Incentives: Evidence from Major League Baseball Guy Stevens April 27, 2013 1 Contents 1 Introduction 2 2 Data 3 3 Models and Results 4 3.1 Total Offense................................... 4
More informationGALLUP NEWS SERVICE 2018 MIDTERM ELECTION
GALLUP NEWS SERVICE 2018 MIDTERM ELECTION Results are based on telephone interviews with a random sample of 1,508 -- national adults, aged 18+, living in all 50 states and the District of Columbia, conducted
More informationBusiness Cycles. Chris Edmond NYU Stern. Spring 2007
Business Cycles Chris Edmond NYU Stern Spring 2007 1 Overview Business cycle properties GDP does not grow smoothly: booms and recessions categorize other variables relative to GDP look at correlation,
More informationChapter 13. Factorial ANOVA. Patrick Mair 2015 Psych Factorial ANOVA 0 / 19
Chapter 13 Factorial ANOVA Patrick Mair 2015 Psych 1950 13 Factorial ANOVA 0 / 19 Today s Menu Now we extend our one-way ANOVA approach to two (or more) factors. Factorial ANOVA: two-way ANOVA, SS decomposition,
More informationZoning for a Healthy Baltimore
Zoning for a Healthy Baltimore Results from a Health Impact Assessment of the April 2010 draft zoning code Center for Child and Community Research, Johns Hopkins University in collaboration with the Baltimore
More informationStatistical Modeling of Consumers Participation in Gambling Markets and Frequency of Gambling
Statistical Modeling of Consumers Participation in Gambling Markets and Frequency of Gambling Brad R. Humphreys University of Alberta Department of Economics Yang Seung Lee University of Alberta Department
More informationSEDAR52-WP November 2017
Using a Censored Regression Modeling Approach to Standardize Red Snapper Catch per Unit Effort Using Recreational Fishery Data Affected by a Bag Limit Skyler Sagarese and Adyan Rios SEDAR52-WP-13 15 November
More informationJanuary 2019 FY Key Performance Report
January 2019 FY 2019 - Key Performance Report Management Notes: The information in this report is based on the FY 2019 Operating Budget, adopted by the Board on June 11, 20. RT s farebox recovery ratio
More informationIDENTIFYING SUBJECTIVE VALUE IN WOMEN S COLLEGE GOLF RECRUITING REGARDLESS OF SOCIO-ECONOMIC CLASS. Victoria Allred
IDENTIFYING SUBJECTIVE VALUE IN WOMEN S COLLEGE GOLF RECRUITING REGARDLESS OF SOCIO-ECONOMIC CLASS by Victoria Allred A Senior Honors Project Presented to the Honors College East Carolina University In
More informationNational Association of REALTORS National Smart Growth Frequencies
September 520, 2017 3,000 Weighted Online Respondents National Association of REALTORS National Smart Growth Frequencies Q.2 The first question is about the quality of life in your community. How satisfied
More informationAnalysis of Variance. Copyright 2014 Pearson Education, Inc.
Analysis of Variance 12-1 Learning Outcomes Outcome 1. Understand the basic logic of analysis of variance. Outcome 2. Perform a hypothesis test for a single-factor design using analysis of variance manually
More informationDoes Gun Control Reduce Criminal Violence? An Econometric Evaluation of Canadian Firearm Laws.
Table 1. Variables Included in this Model Independent Variables 1 - IMM - percentage of the population that immigrated to Canada and settled in a province over the past three years; 2 - IPM - inter-provincial
More informationImpacts of climate change on the distribution of blue marlin (Makaira. nigricans) ) as inferred from data for longline fisheries in the Pacific Ocean
Impacts of climate change on the distribution of blue marlin (Makaira nigricans) ) as inferred from data for longline fisheries in the Pacific Ocean Nan-Jay Su 1*, Chi-Lu Sun 1, Andre Punt 2, Su-Zan Yeh
More informationTraffic Safety Barriers to Walking and Bicycling Analysis of CA Add-On Responses to the 2009 NHTS
Traffic Safety Barriers to Walking and Bicycling Analysis of CA Add-On Responses to the 2009 NHTS NHTS Users Conference June 2011 Robert Schneider, Swati Pande, & John Bigham, University of California
More informationTaking Your Class for a Walk, Randomly
Taking Your Class for a Walk, Randomly Daniel Kaplan Macalester College Oct. 27, 2009 Overview of the Activity You are going to turn your students into an ensemble of random walkers. They will start at
More informationPitching Performance and Age
Pitching Performance and Age Jaime Craig, Avery Heilbron, Kasey Kirschner, Luke Rector and Will Kunin Introduction April 13, 2016 Many of the oldest and most long- term players of the game are pitchers.
More informationSeptember 2018 FY Key Performance Report
September 20 FY 2019 - Key Performance Report Management Notes: The information in this report is based on the FY 2019 Operating Budget, adopted by the Board on June 11, 20. RT s farebox recovery ratio
More informationState of American Trucking
State of American Trucking October 11, 2018 Rod Suarez Economic Analyst American Trucking Associations rsuarez@trucking.org Business Cycles U.S. Expansions Duration October 1949 - July 1953 May 1954 -
More informationStats 2002: Probabilities for Wins and Losses of Online Gambling
Abstract: Jennifer Mateja Andrea Scisinger Lindsay Lacher Stats 2002: Probabilities for Wins and Losses of Online Gambling The objective of this experiment is to determine whether online gambling is a
More informationInternational Discrimination in NBA
International Discrimination in NBA Sports Economics Drew Zhong Date: May 7 2017 Advisory: Prof. Benjamin Anderson JEL Classification: L83; J31; J42 Abstract Entering the 21st century, the National Basketball
More informationDecember 5-8, 2013 Total N= December 4-15, 2013 Uninsured N = 702
POLL December 5-8, 2013 Total N= 1000 December 4-15, 2013 Uninsured N = 702 All trends are from New York Times/CBS News polls unless otherwise noted. Asterisk indicates registered respondents only. 1.
More informationConfidence Intervals with proportions
Confidence Intervals with proportions a.k.a., 1-proportion z-intervals AP Statistics Chapter 19 1-proportion z-interval Statistic + Critical value Standard deviation of the statistic POINT ESTIMATE STANDARD
More informationA Study of Olympic Winning Times
Connecting Algebra 1 to Advanced Placement* Mathematics A Resource and Strategy Guide Updated: 05/15/ A Study of Olympic Winning Times Objective: Students will graph data, determine a line that models
More informationFactorial Analysis of Variance
Factorial Analysis of Variance Overview of the Factorial ANOVA Factorial ANOVA (Two-Way) In the context of ANOVA, an independent variable (or a quasiindependent variable) is called a factor, and research
More information