An analyst has an independent and identically distributed sample of n = 100 observations of the pair (Yi, Xi). The analyst regresses Y on X to obtain the OLS slope coefficient estimator.
i. Suppose the analyst discovers that the 1st observation contains a coding error in how Y was recorded. In particular, the recorded value Y1 was much higher than it should have been. The analyst knows that X1 − X̄ = 0, where X̄ is the sample mean of X. Should the analyst be worried that this recording error has had a large impact on the OLS estimator? If so, can you determine whether the coding error makes the OLS estimator too large or too small, relative to what it would have been if there were no coding error? (In other words, how is this coding error affecting the OLS estimator?)
ii. Suppose instead that the recording error was not for the 1st observation, but for the 2nd one. In particular, the recorded value of Y2 was much higher than it should have been. In addition, the analyst knows that X2 − X̄ is positive and large, where X̄ is the sample mean of X. Should the analyst be worried that this recording error has had a large impact on the OLS estimator? If so, can you determine whether the coding error makes the OLS estimator too large or too small, relative to what it would have been if there were no coding error? (In other words, how is this coding error affecting the OLS estimator?)
Solution:
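One way to approach both parts is through the standard identity for the OLS slope, slope = Σ(Xi − X̄)Yi / Σ(Xi − X̄)². Under this identity, recording Yk as Yk + d shifts the slope by (Xk − X̄)·d / Σ(Xi − X̄)². The shift is therefore zero when Xk − X̄ = 0 (part i), and positive and potentially large when Xk − X̄ is positive and large and d > 0 (part ii), making the slope estimate too large. The sketch below checks this numerically on simulated data. The data-generating process, the seed, the error size `delta`, and the helper name `ols_slope` are all assumptions made for illustration; none of them comes from the question.

```python
import numpy as np


def ols_slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    dx = x - x.mean()
    return float(np.sum(dx * y) / np.sum(dx ** 2))


# Simulated data (assumed for illustration only).
rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
x[1] = 5.0               # observation 2: X2 far above the sample mean
x[0] = x[1:].mean()      # observation 1: forces X1 - mean(X) = 0
y = 1.0 + 2.0 * x + rng.normal(size=n)

delta = 50.0             # hypothetical size of the upward coding error
clean_slope = ols_slope(x, y)

# Part (i): error in Y1, where X1 equals the sample mean of X.
y_err1 = y.copy()
y_err1[0] += delta
shift1 = ols_slope(x, y_err1) - clean_slope

# Part (ii): error in Y2, where X2 is far above the sample mean of X.
y_err2 = y.copy()
y_err2[1] += delta
shift2 = ols_slope(x, y_err2) - clean_slope

# Shift implied by the identity: (Xk - mean(X)) * delta / sum((Xi - mean(X))^2)
predicted2 = (x[1] - x.mean()) * delta / np.sum((x - x.mean()) ** 2)

print("slope shift, error in Y1:", shift1)
print("slope shift, error in Y2:", shift2)
print("shift predicted by the identity for Y2:", predicted2)
```

In this sketch the error in Y1 leaves the slope essentially unchanged (it only raises the sample mean of Y, and hence the intercept, by delta / n), while the error in Y2 pushes the slope upward by the amount the identity predicts.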