Midterm Exam 1, section 2. Thursday, September 27. 1 hour, 15 minutes

San Francisco State University
Michael Bar
ECON 312, Fall 2018

Midterm Exam 1, section 2
Thursday, September 27
1 hour, 15 minutes

Name:

Instructions
1. This is a closed-book, closed-notes exam.
2. You can use one double-sided sheet of paper, letter size (8½ × 11 in or 215.9 × 279.4 mm), with any content you want.
3. No calculators of any kind are allowed.
4. Show all the calculations, and explain your steps.
5. If you need more space, use the back of the page.
6. Fully label all graphs.

Good Luck

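1. (10 points). Let $X$ be a random variable with mean $\mu$ and variance $\sigma^2$, and let $Y = \frac{X - \mu}{\sigma}$ be the standardized transformation of $X$.

a. Using the rules of expected values, show that the mean of $Y$ is 0.

$$
\begin{aligned}
E(Y) &= E\left(\frac{X - \mu}{\sigma}\right) && \text{given form of } Y \\
&= \frac{1}{\sigma} E(X - \mu) && \text{constants factor out of } E \\
&= \frac{1}{\sigma} \left(E(X) - \mu\right) && E \text{ of sum = sum of } E \\
&= \frac{1}{\sigma} (\mu - \mu) && \text{it is given that } E(X) = \mu \\
&= 0
\end{aligned}
$$

b. Using the rules of variances, show that the variance of $Y$ is 1.

$$
\begin{aligned}
var(Y) &= var\left(\frac{X - \mu}{\sigma}\right) && \text{given the form of } Y \\
&= \frac{1}{\sigma^2} var(X - \mu) && \text{constants factor out of } var \text{ squared} \\
&= \frac{1}{\sigma^2} var(X) && \text{adding a constant does not affect } var \\
&= \frac{1}{\sigma^2} \sigma^2 = 1 && \text{given } var(X) = \sigma^2
\end{aligned}
$$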
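The result in problem 1 can be checked numerically. The sketch below is not part of the exam: it treats a small list of made-up values as an equally likely population, standardizes it with the population mean and standard deviation, and confirms that the standardized values have mean 0 and variance 1. The data and the helper name `standardize` are assumptions chosen for illustration.

```python
import math
import statistics


def standardize(values):
    """Return (x - mu) / sigma for each x, using the population mean and std. deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return [(x - mu) / sigma for x in values]


# Hypothetical values, treated as an equally likely population (assumption).
x = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0]
y = standardize(x)

mean_y = statistics.fmean(y)     # should be 0 up to rounding error
var_y = statistics.pvariance(y)  # should be 1 up to rounding error
print(mean_y, var_y)
```

Any other list with at least two distinct values should give the same two numbers, up to floating-point rounding.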

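2. (10 points). Let $X$ and $Y$ be two random variables. $X$ has mean $\mu_X$ and variance $\sigma_X^2$, and $Y$ has mean $\mu_Y$ and variance $\sigma_Y^2$. Prove that

$$
cov\left(\frac{X - \mu_X}{\sigma_X}, \frac{Y - \mu_Y}{\sigma_Y}\right) = corr(X, Y)
$$

$$
\begin{aligned}
cov\left(\frac{X - \mu_X}{\sigma_X}, \frac{Y - \mu_Y}{\sigma_Y}\right) &= \frac{1}{\sigma_X} \cdot \frac{1}{\sigma_Y} \, cov(X - \mu_X, Y - \mu_Y) && \text{constants factor out of } cov \text{ as a product} \\
&= \frac{cov(X, Y)}{\sigma_X \sigma_Y} && \text{adding (or subtracting) constants does not affect } cov \\
&= corr(X, Y) && \text{definition of correlation}
\end{aligned}
$$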
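As a numerical illustration of problem 2 (not part of the exam), the sketch below computes the covariance of two standardized variables and compares it with the correlation of the original variables. The data are made up, all moments are population moments over equally likely pairs, and the helper names `cov`, `corr` and `standardize` are assumptions.

```python
import math


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Population covariance of two equally long lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def corr(a, b):
    """Correlation: covariance divided by the product of standard deviations."""
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))


def standardize(v):
    m, s = mean(v), math.sqrt(cov(v, v))
    return [(vi - m) / s for vi in v]


# Hypothetical paired observations (assumption).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

cov_standardized = cov(standardize(x), standardize(y))
correlation = corr(x, y)
print(cov_standardized, correlation)
```

The two printed numbers should agree up to floating-point rounding, which is the identity proved above.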

3. (20 points). Let $X_1, X_2, \ldots, X_n$ be a random sample from population $X$, with population mean $\mu$ and variance $\sigma^2$.

a. Prove that $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$, where $\bar{X}$ is the sample average. Your answer must start with a definition of sample average.

Sample average is defined as follows:

$$
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i
$$

Thus,

$$
\sum_{i=1}^{n} (X_i - \bar{X}) = \sum_{i=1}^{n} X_i - n\bar{X} = n\bar{X} - n\bar{X} = 0
$$

b. Suppose that $var(X_1) = 7$. Find $var(X_5 + X_{10})$.

Since $X_1, X_2, \ldots, X_n$ is a random sample, all observations must be independent random variables, which means they are uncorrelated. Thus,

$$
\begin{aligned}
var(X_5 + X_{10}) &= var(X_5) + var(X_{10}) + 2 \underbrace{cov(X_5, X_{10})}_{=0} \\
&= var(X_5) + var(X_{10}) \\
&= 7 + 7 = 14 && \text{all } X_i\text{'s have the same distribution}
\end{aligned}
$$
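Both parts of problem 3 can be verified exactly with rational arithmetic. The sketch below is an illustration, not the exam's method: the sample values and the three-point distribution are assumptions, with the distribution chosen only because its variance is exactly 7. The variance of the sum is obtained by enumerating the joint distribution of two independent draws.

```python
from fractions import Fraction
from itertools import product

# Part a: deviations from the sample average sum to zero (hypothetical sample).
sample = [3, 8, 1, 6]
x_bar = Fraction(sum(sample), len(sample))
deviation_sum = sum(x - x_bar for x in sample)

# Part b: a hypothetical distribution with variance exactly 7 (assumption).
pmf = {-3: Fraction(7, 18), 0: Fraction(4, 18), 3: Fraction(7, 18)}


def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    m = sum(v * p for v, p in dist.items())
    return sum((v - m) ** 2 * p for v, p in dist.items())


# Distribution of the sum of two independent draws: probabilities multiply.
sum_pmf = {}
for (a, pa), (b, pb) in product(pmf.items(), repeat=2):
    sum_pmf[a + b] = sum_pmf.get(a + b, Fraction(0)) + pa * pb

var_x = variance(pmf)
var_sum = variance(sum_pmf)
print(deviation_sum, var_x, var_sum)
```

Using `Fraction` avoids rounding error, so the equalities hold exactly instead of approximately.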

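4. (20 points). In order to estimate the population mean, a random sample of observations $X_1, X_2, \ldots, X_n$ was collected, and the sample average $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$ is proposed as an estimator.

a. Prove that $\bar{X}$ is an unbiased estimator of the population mean $\mu$.

$$
E(\bar{X}) = E\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n} \sum_{i=1}^{n} \mu = \frac{1}{n} n\mu = \mu
$$

b. Let $\sigma^2$ denote the population variance. Prove that $\bar{X}$ is a consistent estimator of the population mean $\mu$.

Since we proved that $\bar{X}$ is unbiased, we only need to prove that $\lim_{n \to \infty} var(\bar{X}) = 0$. The observations in a random sample are independent, so the variance of their sum is the sum of their variances:

$$
var(\bar{X}) = var\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} var(X_i) = \frac{1}{n^2} n\sigma^2 = \frac{\sigma^2}{n}
$$

$$
\lim_{n \to \infty} var(\bar{X}) = \lim_{n \to \infty} \frac{\sigma^2}{n} = 0
$$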
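The two results in problem 4 can be illustrated by exact enumeration. The sketch below is an assumption-laden example, not part of the exam: the population is a fair six-sided die (mean 7/2, variance 35/12), and the function `sample_mean_moments` lists every possible sample of size n to compute the mean and variance of the sample average.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: a fair six-sided die (assumption).
values = range(1, 7)
mu = Fraction(7, 2)
sigma2 = Fraction(35, 12)


def sample_mean_moments(n):
    """Exact mean and variance of the sample average over all equally likely samples of size n."""
    outcomes = [Fraction(sum(draw), n) for draw in product(values, repeat=n)]
    count = len(outcomes)
    mean = sum(outcomes) / count
    var = sum((m - mean) ** 2 for m in outcomes) / count
    return mean, var


for n in range(1, 5):
    mean_n, var_n = sample_mean_moments(n)
    # The mean stays at mu, and the variance equals sigma^2 / n.
    print(n, mean_n, var_n, sigma2 / n)
```

The variance column shrinks in proportion to 1/n, which is the pattern the consistency argument relies on as n grows.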

5. (20 points). Consider the simple regression model $Y_i = \beta_1 + \beta_2 X_i + u_i$.

a. Suppose that $Y_i$ is the crime rate in state $i$ (number of crimes per 100,000 population), and $X_i$ is the poverty rate in state $i$ (% of population below the poverty line). What is the interpretation of the error term $u_i$? Your interpretation must contain one relevant example.

The error term $u_i$ represents all the factors, other than poverty rate, which affect crime rate. For example, $u_i$ may include characteristics of the law enforcement and judicial system in state $i$.

b. Define the OLS estimators of the unknown parameters $\beta_1, \beta_2$ and denote them by $b_1^{OLS}, b_2^{OLS}$.

Let the fitted model be $\hat{Y}_i = b_1 + b_2 X_i$, where $b_1$ and $b_2$ are some estimates of $\beta_1$ and $\beta_2$. The residual of observation $i$ (or prediction error) is $e_i = Y_i - \hat{Y}_i$. The OLS estimators $b_1^{OLS}, b_2^{OLS}$ are the values of $b_1, b_2$ which minimize the Residual Sum of Squares, i.e. solve the following problem:

$$
\min_{b_1, b_2} RSS = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - b_1 - b_2 X_i)^2
$$

c. Suppose that Rihaa estimated $b_1 = 3400$ and $b_2 = 100$. What is the predicted crime rate of a state with a poverty rate of 16%? (In the data, 16% appears as 16.)

Substituting the given values into the fitted equation:

$$
\hat{Y}_i = b_1 + b_2 X_i = 3400 + 100 \cdot 16 = 3400 + 1600 = 5000
$$

d. If the average crime rate in the sample is 4000, what is the average poverty rate in the sample?

Using the fact that the fitted equation must pass through the point of sample averages,

$$
\begin{aligned}
\bar{Y} &= b_1 + b_2 \bar{X} \\
4000 &= 3400 + 100 \bar{X} \\
100 \bar{X} &= 600 \\
\bar{X} &= 6 \text{ percent}
\end{aligned}
$$
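The OLS definition and the two calculations in problem 5 can be sketched in code. The exam defines the estimators only as the minimizers of RSS. The closed-form expressions used in `ols_fit` below (slope equal to the sum of cross-deviations over the sum of squared deviations, intercept equal to the mean of Y minus slope times the mean of X) are the standard solution of that problem and are not stated in the exam. The poverty and crime figures are made up for illustration.

```python
import math


def ols_fit(x, y):
    """Closed-form OLS estimates (b1, b2) for the fitted line y_hat = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    return b1, b2


def rss(b1, b2, x, y):
    """Residual Sum of Squares for candidate estimates b1, b2."""
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical state-level data (assumption, not from the exam).
poverty = [8.0, 10.0, 12.0, 15.0, 18.0]
crime = [4100.0, 4500.0, 4400.0, 5100.0, 5000.0]
b1_hat, b2_hat = ols_fit(poverty, crime)
x_bar = sum(poverty) / len(poverty)
y_bar = sum(crime) / len(crime)

# The estimates given in the exam.
b1, b2 = 3400, 100
predicted_crime = b1 + b2 * 16    # fitted value at a poverty rate of 16
mean_poverty = (4000 - b1) / b2   # solve Y_bar = b1 + b2 * X_bar for X_bar
print(predicted_crime, mean_poverty)
```

On the made-up data, moving either estimate away from `b1_hat` or `b2_hat` raises the RSS, and the fitted line evaluated at `x_bar` returns `y_bar`, which is the property used in the last part of the problem.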

6. (20 points). Veronica is studying the standard of living (defined as real GDP per capita) and health across countries. She collected data on 183 countries with the following key variables:

LE: life expectancy at birth (in years), a common indicator of health
GDP: real GDP per capita (in thousands of $)

Veronica's R output is presented below:

    lm(LE ~ GDP, data = HDI)

    Residuals:
        Min      1Q  Median      3Q     Max
    -19.369  -3.621   1.426   4.402   8.954

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 67.38380    0.58728  114.74   <2e-16 ***
    GDP          0.25899    0.02209   11.72   <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 5.772 on 181 degrees of freedom
    Multiple R-squared:  0.4315,    Adjusted R-squared:  0.4284
    F-statistic: 137.4 on 1 and 181 DF,  p-value: < 2.2e-16

a. What is the dependent variable in the above regression model?

LE

b. What is the independent variable (regressor) in the above regression model?

GDP

c. Interpret the estimated regression coefficients.

b₂ = 0.26 means that each additional $1,000 of standard of living (real GDP per capita) is predicted to increase life expectancy in a country by 0.26 years. So a difference of $10,000 in standard of living translates to a difference of 2.6 years in life expectancy.

b₁ = 67.4 years is the predicted life expectancy in a country with zero GDP per capita. The number 67.4 is not unreasonable, but a country with zero GDP is not a possibility.

d. Explain the meaning of the reported R², and comment on its magnitude. Your comment must contain at least one relevant example.

R² = 0.4315 means that 43.15% of the variation in life expectancy across countries is explained by this model, with standard of living as the only regressor. This means that how rich the country is on average is very important for the average health of people in that country (e.g., rich countries can afford better nutrition, better healthcare services, etc.). Nevertheless, 57% of the variation in LE in the sample is due to factors other than standard of living. For example, the type of health insurance system that countries have can also be important in affecting health outcomes (more people having access to basic care can improve overall health). Also, inequality of income distribution may determine whether only a few rich people have access to basic care, or most people have access to basic care.
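The figures quoted in problem 6 can be cross-checked against one another. The sketch below is an illustration, not part of the exam: it hard-codes the numbers reported in the regression output and recomputes the t values, residual degrees of freedom, adjusted R², and F-statistic from the standard formulas for a regression with one regressor. It also reproduces the arithmetic behind the interpretations in parts c and d. The variable names are assumptions, and small discrepancies are expected because the inputs are rounded.

```python
import math

# Values copied from the reported regression output.
n = 183                          # number of countries
k = 1                            # number of regressors (GDP)
b1, se_b1 = 67.38380, 0.58728    # intercept and its standard error
b2, se_b2 = 0.25899, 0.02209     # GDP slope and its standard error
r_squared = 0.4315

# t value = estimate / standard error.
t_b1 = b1 / se_b1
t_b2 = b2 / se_b2

# Residual degrees of freedom and the statistics derived from R-squared.
df_resid = n - k - 1
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_resid
f_stat = (r_squared / k) / ((1 - r_squared) / df_resid)

# Interpretation arithmetic from parts c and d.
le_gap_10k = 10 * b2                 # GDP is in thousands of $, so $10,000 is 10 units
unexplained_share = 1 - r_squared    # share of variation not explained by GDP

print(round(t_b1, 2), round(t_b2, 2), df_resid)
print(round(adj_r_squared, 4), round(f_stat, 1))
print(round(le_gap_10k, 2), round(unexplained_share, 2))
```

The recomputed values should land close to the reported 114.74, 11.72, 181, 0.4284 and 137.4, with the gap of about 2.6 years and the unexplained share of about 57% matching the written answers.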