# Introduction to Econometric Methods – GradSchoolPapers.com

For Prof. Vogelsang’s house in Haslett, $n = 51$ hourly radon measurements were taken in the winter of 2006. The sample average of those measurements is $\bar{y} = 1.68$ and the sample variance is $s^2 = 0.851$. For his former house in Ithaca, NY, $n = 44$ hourly radon measurements were taken in the winter of 1999. The sample average of those measurements is $\bar{y} = 4.19$ and the sample variance is $s^2 = 0.3098$.

1. Assuming that radon levels in the Ithaca basement follow a normal random variable with population mean $\beta_0 = 4$ and population variance $\sigma^2 = 0.42$, compute the probabilities of the following events for Prof. Vogelsang’s house in Ithaca:

(a) The radon level goes below 2.

(b) The radon level goes above 6.

(c) The radon level is within 0.4 of the EPA cutoff of 4.

(d) The radon level does NOT exceed the cutoff of 4.

(e) The radon level is NOT within 0.4 of the EPA cutoff of 4.

2. Suppose we have a data set with only two observations, $y_1$ and $y_2$. The general formula for the sample average is $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}(y_1 + y_2 + y_3 + \cdots + y_n)$, which simplifies to $\bar{y} = \frac{1}{2}\sum_{i=1}^{2} y_i = \frac{1}{2}(y_1 + y_2)$ when $n = 2$.

(a) Assuming that $y_1$ and $y_2$ are randomly sampled from a population with mean $\beta_0$ and variance $\sigma^2$, derive a formula for the expectation of $\bar{y}$. In other words, derive a formula for $E(\bar{y})$. Your answer should depend on $\beta_0$.

(b) Is $\bar{y}$ an unbiased estimator of $\beta_0$? Why or why not?

(c) Compute the variance of $\bar{y}$, i.e. compute $\mathrm{var}(\bar{y})$. Your answer should be a formula that depends on $\sigma^2$.

(d) Suppose $y_1$ and $y_2$ are sampled from the population in such a way that they are correlated with each other. In particular, suppose $\mathrm{cov}(y_1, y_2) = \gamma \neq 0$. Does this change whether or not $\bar{y}$ is unbiased? Why or why not?

(e) Continuing (d), compute $\mathrm{var}(\bar{y})$. Don’t forget the cheese! Your answer should be a formula that depends on $\sigma^2$ and $\gamma$.

(f) Continuing (e), suppose $\sigma^2 = 1.5$. Consider three possible values of $\gamma$: $-0.4$, $0$, $0.4$. For which value of $\gamma$ will $\bar{y}$ be the most efficient estimator of $\beta_0$? In other words, for what value of $\gamma$ will $\bar{y}$ have the smallest variance?

(g) (Really hard conceptual question.) Provide intuition for your finding in part (f). In other words, explain intuitively how the value of $\gamma$ affects the variance of $\bar{y}$.

3. Consider the null hypothesis that the population mean, $\beta_0$, of the radon in the Haslett house is equal to the EPA cutoff of 4.

(a) Write the null hypothesis as a mathematical statement about $\beta_0$.

(b) Write the alternative hypothesis as a mathematical statement about $\beta_0$.

(c) When testing this null hypothesis, are you doing a left-tail, right-tail or two-tailed test? Why or why not?

(d) What estimator of $\beta_0$ (not the number for the estimate itself) will you need to use to test the null hypothesis? What is the formula for the variance of this estimator? (Don’t derive it, just write it down.) How can you estimate this variance formula? How can you use the estimated variance to obtain a standard error for your estimator of $\beta_0$?

4. Test the null hypothesis from Question 3 using a t-test. Assume you do not know the population distribution of radon. You will have to rely on the central limit theorem and approximate the null distribution of your t-statistic using the $N(0, 1)$ distribution. Carry out your test at the 5% significance level ($\alpha = 0.05$). Clearly explain how you compute the t-statistic. Clearly state the rejection rule you are using and how you obtained your critical value. What is the result of your test?
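As a reference for the mechanics of Question 4, the standard large-sample forms of the t-statistic and the 5% rejection rules under the $N(0, 1)$ approximation can be written out as follows. This is a sketch under the assumption of random sampling (the standard error formula below is exactly what Question 8 asks you to reconsider), and which of the three rules applies depends on your answer to Question 3(c).

```latex
\[
  t = \frac{\bar{y} - 4}{\operatorname{se}(\bar{y})},
  \qquad
  \operatorname{se}(\bar{y}) = \sqrt{\frac{s^2}{n}}
\]
\[
  \text{left-tail: reject if } t < -1.645, \qquad
  \text{right-tail: reject if } t > 1.645, \qquad
  \text{two-tailed: reject if } |t| > 1.96
\]
```

Here $\bar{y}$, $s^2$ and $n$ are the Haslett sample average, sample variance and sample size given at the top of the problem set, and 4 is the value of $\beta_0$ under the null hypothesis.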

5. Compute the non-rejection region (the 95% confidence interval) for your test in Question 4. Interpret this non-rejection region. Does it suggest my house is relatively safe, at least according to the EPA?

6. Compute the p-value for your test in Question 4. Based on the p-value, can the null hypothesis be rejected at the 1% level? How about the 10% level? What is the largest significance level such that you don’t reject the null hypothesis?

7. How would your calculations in Questions 4-6 have to be modified if you knew that radon is normally distributed in the population? You don’t have to actually redo the calculations, just point out the places where you would have to do things differently and explain why.

8. Suppose radon observations are not a random sample and are correlated over time. How would this cause problems for your t-test? Hint: how would the formula for $\mathrm{var}(\bar{y})$ be affected if the data are correlated, and would your t-test in Question 4 be based on the correct formula for $\mathrm{var}(\bar{y})$? If you answered Question 2(e) correctly, that should give you some insight here.

9. This question is optional. If you want more practice with hypothesis testing, carry out t-tests, compute non-rejection regions (confidence intervals) and compute p-values for the following hypotheses for both the Ithaca and Haslett radon data. Use a 5% significance level for the t-tests and non-rejection regions.

(a) $H_0: \beta_0 \geq 8$, $H_1: \beta_0 < 8$. “The null is that the radon level is very high (8 or higher).”

(b) $H_0: \beta_0 \leq 0.5$, $H_1: \beta_0 > 0.5$. “The null is that the radon level is very low (0.5 or lower).”
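For the practice in Question 9, the arithmetic can be checked with a short script. The following is a minimal sketch, not a prescribed method: the function name `large_sample_test`, its arguments and the returned dictionary keys are all hypothetical, it relies on the $N(0, 1)$ approximation from Question 4, and it assumes random sampling so that the standard error is $\sqrt{s^2/n}$. It reports a two-sided 95% interval for convenience, which is also an assumption; a one-sided non-rejection region would use the one-sided critical value instead.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1), used as the approximate null distribution

# Summary statistics taken from the problem set.
HOUSES = {
    "Haslett": {"ybar": 1.68, "s2": 0.851, "n": 51},
    "Ithaca": {"ybar": 4.19, "s2": 0.3098, "n": 44},
}


def large_sample_test(ybar, s2, n, null_value, tail, alpha=0.05):
    """Large-sample t-test from summary statistics (hypothetical helper).

    tail is "left", "right" or "two" and names the direction of H1.
    Assumes a random sample, so se(ybar) = sqrt(s2 / n).
    """
    se = sqrt(s2 / n)
    t = (ybar - null_value) / se

    if tail == "left":
        crit = STD_NORMAL.inv_cdf(alpha)          # about -1.645 at 5%
        reject = t < crit
        p_value = STD_NORMAL.cdf(t)
    elif tail == "right":
        crit = STD_NORMAL.inv_cdf(1 - alpha)      # about 1.645 at 5%
        reject = t > crit
        p_value = 1 - STD_NORMAL.cdf(t)
    elif tail == "two":
        crit = STD_NORMAL.inv_cdf(1 - alpha / 2)  # about 1.96 at 5%
        reject = abs(t) > crit
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(t)))
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")

    # Two-sided interval centered on the sample average (an assumption here).
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    interval = (ybar - z * se, ybar + z * se)

    return {"se": se, "t": t, "crit": crit, "reject": reject,
            "p_value": p_value, "interval": interval}


if __name__ == "__main__":
    for name, stats in HOUSES.items():
        part_a = large_sample_test(null_value=8, tail="left", **stats)
        part_b = large_sample_test(null_value=0.5, tail="right", **stats)
        print(name, "9(a):", part_a)
        print(name, "9(b):", part_b)
```

Passing `tail="two"` with `null_value=4` gives the corresponding numbers for a two-tailed version of the test in Question 4, should that be the form you settle on in Question 3(c).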