San Francisco State University
Michael Bar
ECON 312, Fall 2018

Midterm Exam 1, section 2
Thursday, September 27
1 hour, 15 minutes

Name:

Instructions
1. This is a closed-book, closed-notes exam.
2. You can use one double-sided sheet of paper, letter size (8½ × 11 in, or 215.9 × 279.4 mm), with any content you want.
3. No calculators of any kind are allowed.
4. Show all the calculations, and explain your steps.
5. If you need more space, use the back of the page.
6. Fully label all graphs.

Good Luck

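1. (10 points). Let \(X\) be a random variable with mean \(\mu\) and variance \(\sigma^2\), and let \(Y = \frac{X - \mu}{\sigma}\) be the standardized transformation of \(X\).

a. Using the rules of expected values, show that the mean of \(Y\) is 0.

\begin{align*}
E(Y) &= E\left(\frac{X - \mu}{\sigma}\right) && \text{given form of } Y \\
&= \frac{1}{\sigma} E(X - \mu) && \text{constants factor out of } E \\
&= \frac{1}{\sigma} \left(E(X) - \mu\right) && E \text{ of sum = sum of } E \\
&= \frac{1}{\sigma} (\mu - \mu) && \text{it is given that } E(X) = \mu \\
&= 0
\end{align*}

b. Using the rules of variances, show that the variance of \(Y\) is 1.

\begin{align*}
\operatorname{var}(Y) &= \operatorname{var}\left(\frac{X - \mu}{\sigma}\right) && \text{given the form of } Y \\
&= \frac{1}{\sigma^2} \operatorname{var}(X - \mu) && \text{constants factor out of var squared} \\
&= \frac{1}{\sigma^2} \operatorname{var}(X) && \text{adding constant does not affect var} \\
&= \frac{1}{\sigma^2} \sigma^2 && \text{given } \operatorname{var}(X) = \sigma^2 \\
&= 1
\end{align*}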
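The two results above can be checked numerically. The following is a minimal Python sketch, not part of the exam solution. It assumes hypothetical data (1,000 normal draws with mean 50 and standard deviation 8) and treats that data set as the whole population, so \(\mu\) and \(\sigma\) are computed from the data with the divide-by-\(n\) formulas.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical data (an assumption, not from the exam).
x = [random.gauss(50, 8) for _ in range(1000)]

mu = statistics.fmean(x)      # mean of the data
sigma = statistics.pstdev(x)  # standard deviation, dividing by n

# Standardize: subtract the mean, then divide by the standard deviation.
y = [(xi - mu) / sigma for xi in x]

mean_y = statistics.fmean(y)
var_y = statistics.pvariance(y)

print(f"mean of Y = {mean_y:.6f}, variance of Y = {var_y:.6f}")
```

Up to floating-point rounding, the printed mean is 0 and the printed variance is 1, whatever values are chosen for the mean and standard deviation of the generated data.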

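2. (10 points). Let \(X\) and \(Y\) be two random variables. \(X\) has mean \(\mu_X\) and variance \(\sigma_X^2\), and \(Y\) has mean \(\mu_Y\) and variance \(\sigma_Y^2\). Prove that

\[
\operatorname{cov}\left(\frac{X - \mu_X}{\sigma_X}, \frac{Y - \mu_Y}{\sigma_Y}\right) = \operatorname{corr}(X, Y)
\]

\begin{align*}
\operatorname{cov}\left(\frac{X - \mu_X}{\sigma_X}, \frac{Y - \mu_Y}{\sigma_Y}\right) &= \frac{1}{\sigma_X} \cdot \frac{1}{\sigma_Y} \operatorname{cov}(X - \mu_X, Y - \mu_Y) && \text{const. factor out of cov as product} \\
&= \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} && \text{adding (or subtracting) const. does not affect cov} \\
&= \operatorname{corr}(X, Y) && \text{definition of correlation}
\end{align*}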
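As an illustration of this identity, here is a short Python sketch, offered as a numerical check rather than a proof. The data are hypothetical, and the helper names `mean`, `cov` and `standardize` are introduced here for illustration only. Covariances use the divide-by-\(n\) formula, so the data set plays the role of the population.

```python
import math
import random

random.seed(3)  # fixed seed so the run is reproducible

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Population covariance: average product of deviations from the means."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def standardize(v):
    """Subtract the mean and divide by the standard deviation."""
    m = mean(v)
    s = math.sqrt(cov(v, v))  # cov(v, v) is the variance of v
    return [(vi - m) / s for vi in v]

# Hypothetical, positively related data (an assumption for this sketch).
x = [random.gauss(5, 2) for _ in range(500)]
y = [3 + 1.5 * xi + random.gauss(0, 4) for xi in x]

corr_xy = cov(x, y) / math.sqrt(cov(x, x) * cov(y, y))
cov_standardized = cov(standardize(x), standardize(y))

print(f"corr(X, Y)                  = {corr_xy:.6f}")
print(f"cov of standardized X and Y = {cov_standardized:.6f}")
```

The two printed numbers agree up to floating-point rounding.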

3. (20 points). Let \(X_1, X_2, \ldots, X_n\) be a random sample from population \(X\), with population mean \(\mu\) and variance \(\sigma^2\).

a. Prove that \(\sum_{i=1}^{n} (X_i - \bar{X}) = 0\), where \(\bar{X}\) is the sample average. Your answer must start with a definition of sample average.

Sample average is defined as follows:

\[
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i
\]

Thus,

\[
\sum_{i=1}^{n} (X_i - \bar{X}) = \sum_{i=1}^{n} X_i - n\bar{X} = n\bar{X} - n\bar{X} = 0
\]

b. Suppose that \(\operatorname{var}(X_1) = 7\). Find \(\operatorname{var}(X_5 + X_{10})\).

Since \(X_1, X_2, \ldots, X_n\) is a random sample, all observations must be independent random variables, which means they are uncorrelated. Thus,

\begin{align*}
\operatorname{var}(X_5 + X_{10}) &= \operatorname{var}(X_5) + \operatorname{var}(X_{10}) + 2 \underbrace{\operatorname{cov}(X_5, X_{10})}_{=0} \\
&= \operatorname{var}(X_5) + \operatorname{var}(X_{10}) \\
&= 7 + 7 = 14 && \text{all } X_i\text{'s have the same distribution}
\end{align*}
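Both answers can be illustrated with a small simulation. The Python sketch below is a numerical check under stated assumptions, not part of the exam solution: the sample in the first part is hypothetical, and the second part assumes the observations are normal with mean 0, although the exam only specifies that the variance is 7.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is reproducible

# First part: deviations from the sample average sum to zero.
sample = [random.uniform(0, 100) for _ in range(25)]  # hypothetical sample
x_bar = sum(sample) / len(sample)
sum_dev = sum(xi - x_bar for xi in sample)

# Second part: the sum of two independent draws, each with variance 7.
sd = 7 ** 0.5
sums = [random.gauss(0, sd) + random.gauss(0, sd) for _ in range(100_000)]
var_sum = statistics.pvariance(sums)

print(f"sum of deviations        = {sum_dev:.2e}")
print(f"simulated var(X5 + X10)  = {var_sum:.3f}")
```

The sum of deviations is zero up to floating-point rounding, and the simulated variance of the sum is close to 14, with the gap shrinking as the number of simulated pairs grows.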

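4. (20 points). In order to estimate the population mean, a random sample of \(n\) observations was collected, \(X_1, X_2, \ldots, X_n\), and the sample average \(\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i\) is proposed as an estimator.

a. Prove that \(\bar{X}\) is an unbiased estimator of the population mean \(\mu\).

\[
E(\bar{X}) = E\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n} \sum_{i=1}^{n} \mu = \frac{1}{n} n\mu = \mu
\]

b. Let \(\sigma^2\) denote the population variance. Prove that \(\bar{X}\) is a consistent estimator of the population mean \(\mu\).

Since we proved that \(\bar{X}\) is unbiased, we only need to prove that \(\lim_{n \to \infty} \operatorname{var}(\bar{X}) = \lim_{n \to \infty} \frac{\sigma^2}{n} = 0\).

\[
\operatorname{var}(\bar{X}) = \operatorname{var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{var}(X_i) = \frac{1}{n^2} n\sigma^2 = \frac{\sigma^2}{n}
\]

\[
\lim_{n \to \infty} \operatorname{var}(\bar{X}) = \lim_{n \to \infty} \frac{\sigma^2}{n} = 0
\]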
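A simulation can make unbiasedness and the \(\sigma^2/n\) variance formula concrete. The following Python sketch is an illustration only. It assumes a hypothetical normal population with mean 10 and standard deviation 2 (so \(\sigma^2 = 4\)), and the names `MU`, `SIGMA`, `REPS` and `sample_mean` are introduced here and do not come from the exam.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is reproducible

MU, SIGMA = 10.0, 2.0  # hypothetical population mean and standard deviation
REPS = 1000            # number of simulated samples per sample size

def sample_mean(n):
    """Draw one random sample of size n and return its average."""
    return statistics.fmean([random.gauss(MU, SIGMA) for _ in range(n)])

results = {}
for n in (10, 100, 400):
    means = [sample_mean(n) for _ in range(REPS)]
    avg_of_means = statistics.fmean(means)     # should be near MU
    var_of_means = statistics.pvariance(means)  # should be near SIGMA**2 / n
    results[n] = (avg_of_means, var_of_means)
    print(f"n={n:4d}  average of sample means={avg_of_means:.3f}  "
          f"variance={var_of_means:.4f}  sigma^2/n={SIGMA**2 / n:.4f}")
```

The average of the simulated sample means stays near the population mean for every \(n\), while their variance tracks \(\sigma^2/n\) and shrinks as \(n\) grows.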

5. (20 points). Consider the simple regression model \(Y_i = \beta_1 + \beta_2 X_i + u_i\).

a. Suppose that \(Y_i\) is the crime rate in state \(i\) (number of crimes per 100,000 population), and \(X_i\) is the poverty rate in state \(i\) (% of population below the poverty line). What is the interpretation of the error term \(u_i\)? Your interpretation must contain one relevant example.

The error term \(u_i\) represents all the factors, other than the poverty rate, which affect the crime rate. For example, \(u_i\) may include characteristics of the law enforcement and judicial system in state \(i\).

b. Define the OLS estimators of the unknown parameters \(\beta_1, \beta_2\) and denote them by \(b_1^{OLS}, b_2^{OLS}\).

Let the fitted model be \(\hat{Y}_i = b_1 + b_2 X_i\), where \(b_1\) and \(b_2\) are some estimates of \(\beta_1\) and \(\beta_2\). The residual of observation \(i\) (or prediction error) is \(e_i = Y_i - \hat{Y}_i\). The OLS estimators \(b_1^{OLS}, b_2^{OLS}\) are the values of \(b_1, b_2\) which minimize the Residual Sum of Squares, i.e. solve the following problem:

\[
\min_{b_1, b_2} RSS = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - b_1 - b_2 X_i)^2
\]

c. Suppose that Rihaa estimated \(b_1 = 3400\) and \(b_2 = 100\). What is the predicted crime rate of a state with a poverty rate of 16%? (In the data 16% appears as 16).

Substituting the given values into the fitted equation:

\[
\hat{Y}_i = b_1 + b_2 X_i = 3400 + 100 \cdot 16 = 3400 + 1600 = 5000
\]

d. If the average crime rate in the sample is 4000, what is the average poverty rate in the sample?

Using the fact that the fitted equation must pass through the point of sample averages,

\begin{align*}
\bar{Y} &= b_1 + b_2 \bar{X} \\
4000 &= 3400 + 100 \bar{X} \\
100 \bar{X} &= 600 \\
\bar{X} &= 6 \text{ percent}
\end{align*}
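The minimization problem and the two numerical answers above can be sketched in Python. This is an illustration under stated assumptions, not the exam's own method: `ols_fit` uses the standard closed-form solution of the least-squares problem, \(b_2 = \sum (X_i - \bar{X})(Y_i - \bar{Y}) / \sum (X_i - \bar{X})^2\) and \(b_1 = \bar{Y} - b_2 \bar{X}\), which the exam does not state, and the `poverty` and `crime` figures are made up for the example. The function names are hypothetical.

```python
def ols_fit(x, y):
    """Closed-form OLS estimates (b1, b2) for the model y = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2

def rss(b1, b2, x, y):
    """Residual Sum of Squares for candidate estimates b1, b2."""
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))

def predict(b1, b2, x_value):
    """Fitted value b1 + b2 * x_value."""
    return b1 + b2 * x_value

# Hypothetical data: poverty rate (%) and crimes per 100,000 population.
poverty = [8, 10, 12, 14, 16, 18]
crime = [4150, 4420, 4580, 4790, 5050, 5160]

b1, b2 = ols_fit(poverty, crime)
x_bar = sum(poverty) / len(poverty)
y_bar = sum(crime) / len(crime)
print(f"b1 = {b1:.2f}, b2 = {b2:.2f}, RSS = {rss(b1, b2, poverty, crime):.2f}")
print(f"fitted value at the average poverty rate = {predict(b1, b2, x_bar):.2f}, "
      f"average crime rate = {y_bar:.2f}")

# The exam's own estimates: prediction at 16%, then the implied sample average.
exam_prediction = predict(3400, 100, 16)
exam_mean_poverty = (4000 - 3400) / 100
print(exam_prediction, exam_mean_poverty)
```

Moving `b1` or `b2` away from the fitted values raises the RSS, and the fitted value at the average poverty rate equals the average crime rate, which is the property used to recover the sample average from the estimates.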

6. (20 points). Veronica is studying the standard of living (defined as real GDP per capita) and health across countries. She collected data on 183 countries with the following key variables:

LE: life expectancy at birth (in years), a common indicator of health
GDP: real GDP per capita (in thousands of $)

Veronica's R output is presented below:

    lm(LE ~ GDP, data = HDI)

    Residuals:
        Min      1Q  Median      3Q     Max
    -19.369  -3.621   1.426   4.402   8.954

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 67.38380    0.58728  114.74   <2e-16 ***
    GDP          0.25899    0.02209   11.72   <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 5.772 on 181 degrees of freedom
    Multiple R-squared:  0.4315, Adjusted R-squared:  0.4284
    F-statistic: 137.4 on 1 and 181 DF,  p-value: < 2.2e-16

a. What is the dependent variable in the above regression model?

LE

b. What is the independent variable (regressor) in the above regression model?

GDP
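For readers who want to see how the columns of the output relate to each other, here is a brief Python sketch. It is a consistency check on the printed figures, not something the exam asks for. It relies on two standard facts: each t value is the estimate divided by its standard error, and in a regression with one regressor the F-statistic equals the square of the slope's t value. Small gaps are expected because the printed figures are rounded.

```python
# Figures copied from the regression output above.
est_intercept, se_intercept = 67.38380, 0.58728
est_gdp, se_gdp = 0.25899, 0.02209

# t value = estimate / standard error.
t_intercept = est_intercept / se_intercept
t_gdp = est_gdp / se_gdp

# With a single regressor, F equals the squared t value of the slope.
f_stat = t_gdp ** 2

print(f"t (Intercept) = {t_intercept:.2f}")
print(f"t (GDP)       = {t_gdp:.2f}")
print(f"F from t^2    = {f_stat:.1f}")
```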

c. Interpret the estimated regression coefficients.

\(b_2 = 0.26\) means that each additional $1,000 of standard of living (real GDP per capita) is predicted to increase life expectancy in a country by 0.26 years. So a difference of $10,000 in standard of living translates to a difference of 2.6 years in life expectancy.

\(b_1 = 67.4\) years is the predicted life expectancy in a country with zero GDP per capita. The number 67.4 is not unreasonable, but a country with zero GDP is not a possibility.

d. Explain the meaning of the reported \(R^2\), and comment on its magnitude. Your comment must contain at least one relevant example.

\(R^2 = 0.4315\) means that 43.15% of the variation in life expectancy across countries is explained by this model, with standard of living as the only regressor. This means that how rich the country is on average is very important for the average health of people in that country (e.g., rich countries can afford better nutrition, better healthcare services, etc.). Nevertheless, about 57% of the variation in LE in the sample is due to factors other than standard of living. For example, the type of health insurance system that a country has can also be important in affecting health outcomes (more people having access to basic care can improve overall health). Also, inequality of income distribution may determine whether only a few rich people have access to basic care, or most people do.
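The arithmetic behind these interpretations can be written out as a short Python sketch. It uses only figures from the regression output (coefficients, \(R^2\), 183 countries), plus one assumption that is not in the exam: a hypothetical country with GDP per capita of $20,000 (entered as 20, since GDP is in thousands of dollars). The adjusted \(R^2\) line uses the standard formula \(1 - (1 - R^2)\frac{n - 1}{n - k - 1}\) with \(k = 1\) regressor.

```python
# Figures copied from the regression output.
b1, b2 = 67.38380, 0.25899
r_squared = 0.4315
n, k = 183, 1  # number of countries, number of regressors

# Slope interpretation: effect of $1,000 and of $10,000 more GDP per capita.
effect_1k = b2 * 1
effect_10k = b2 * 10

# Predicted life expectancy for a hypothetical country with GDP = 20.
predicted_le = b1 + b2 * 20

# Share of the variation in LE not explained by GDP.
unexplained_share = 1 - r_squared

# Adjusted R-squared, to compare with the reported 0.4284.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"effect of $1,000:  {effect_1k:.2f} years")
print(f"effect of $10,000: {effect_10k:.2f} years")
print(f"predicted LE at GDP = 20: {predicted_le:.2f} years")
print(f"unexplained share: {unexplained_share:.4f}")
print(f"adjusted R-squared: {adj_r_squared:.4f}")
```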