Describe the sampling distribution of the OLS estimator
Let us consider the simple linear regression model, which describes the relation between a continuous variable $y$ and a variable $x$ as $y=\alpha+\beta x+\epsilon$.
This implies
$E[y|x]=\alpha+\beta x$, under the hypothesis that $E[\epsilon]=0$.
We want to estimate the unknown parameters $\alpha$ and $\beta$ using a sample of $n$ observations.
Let $a$ and $b$ be the estimators of $\alpha$ and $\beta$, so that the fitted line is $\hat{y}=a+bx$.
One way to obtain $a$ and $b$ is by minimizing $\sum_{i=1}^{n}(y_i-a-bx_i)^2$.
Here, we minimize the sum of the squared vertical distances between the line $\hat{y}=a+bx$ and the points $(x_i,y_i)$ with respect to $a$ and $b$. Such a minimization is called OLS (Ordinary Least Squares).
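For readers who want the intermediate step, the minimization can be sketched as follows. The symbol $S(a,b)$ for the sum of squares is introduced here for convenience and is not part of the original answer. Setting both partial derivatives to zero gives the two normal equations:

```latex
\begin{align*}
S(a,b) &= \sum_{i=1}^{n} (y_i - a - b x_i)^2, \\
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{n} (y_i - a - b x_i) = 0, \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} x_i \,(y_i - a - b x_i) = 0.
\end{align*}
```

Solving this pair of linear equations in $a$ and $b$ yields the closed-form estimators.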
The estimators $a$ and $b$ are then
$$b=\mathrm{cov}(x,y)/v(x),\qquad a=\bar{y}-b\bar{x},$$
where $\mathrm{cov}(x,y)$ is the sample covariance between the $x_i$ and the $y_i$, $v(x)$ is the sample variance of the $x_i$, and $\bar{x}$ and $\bar{y}$ are the sample means of the $x_i$ and $y_i$ respectively. Both $\mathrm{cov}(x,y)$ and $v(x)$ are assumed to use the denominator $n$.
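As a rough illustration, the closed-form estimators can be computed directly from a sample. The sketch below uses only the Python standard library; the function name `ols_fit` and the toy data are hypothetical and chosen only for this example. The intercept is computed as $\bar{y}-b\bar{x}$, so the fitted line passes through the point $(\bar{x},\bar{y})$.

```python
def ols_fit(xs, ys):
    """Return (a, b) for the fitted line a + b*x, using denominators n."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample covariance and variance with denominator n, as in the text.
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    v_x = sum((x - x_bar) ** 2 for x in xs) / n
    b = cov_xy / v_x          # slope
    a = y_bar - b * x_bar     # intercept
    return a, b


# Hypothetical toy data lying exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
a, b = ols_fit(xs, ys)
print("intercept a =", a, "slope b =", b)
```

Because the toy points lie exactly on a line, the estimates recover that line's intercept and slope; with noisy data they would vary from sample to sample.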
The estimators $a$ and $b$ depend on the sample observations, so they vary from sample to sample. To make inference on the unknown parameters $\alpha$ and $\beta$, one needs the sampling distribution of $a$ and $b$, because in practice only one sample is observed. Under some regularity conditions, $a$ and $b$ are the Best Linear Unbiased Estimators (BLUE) of $\alpha$ and $\beta$.
Their sampling distribution is
$$b\sim N\left(\beta,\ \sigma^2\,(n\,v(x))^{-1}\right)$$
$$a\sim N\left(\alpha,\ \sigma^2\,(v(x)+\bar{x}^2)\,(n\,v(x))^{-1}\right),$$
where $\sigma^2$ is the variance of the error term $\epsilon$, i.e. $E[(\epsilon-E[\epsilon])^2]=E[\epsilon^2]=\sigma^2$.
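One way to make the idea of a sampling distribution concrete is a small Monte Carlo experiment: keep the $x_i$ fixed, draw many samples from the model, refit each time, and compare the spread of the estimates with the variances $\sigma^2/(n\,v(x))$ for $b$ and $\sigma^2(v(x)+\bar{x}^2)/(n\,v(x))$ for $a$. The sketch below assumes normally distributed errors; the true values `alpha`, `beta`, `sigma`, the design points, the seed and the number of replications are all hypothetical choices made for illustration.

```python
import random
import statistics


def ols_fit(xs, ys):
    """Return (a, b) for the fitted line a + b*x, using denominators n."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    v_x = sum((x - x_bar) ** 2 for x in xs) / n
    b = cov_xy / v_x
    return y_bar - b * x_bar, b


random.seed(0)  # fixed seed so the run is reproducible

# Assumed true parameters and a fixed design (illustrative values only).
alpha, beta, sigma = 1.0, 2.0, 0.5
xs = [0.5 * i for i in range(20)]
n = len(xs)
x_bar = sum(xs) / n
v_x = sum((x - x_bar) ** 2 for x in xs) / n

# Draw many samples from y = alpha + beta*x + eps and refit each one.
a_draws, b_draws = [], []
for _ in range(5000):
    ys = [alpha + beta * x + random.gauss(0.0, sigma) for x in xs]
    a_hat, b_hat = ols_fit(xs, ys)
    a_draws.append(a_hat)
    b_draws.append(b_hat)

# Variances of the sampling distributions according to the formulas.
var_b_theory = sigma ** 2 / (n * v_x)
var_a_theory = sigma ** 2 * (v_x + x_bar ** 2) / (n * v_x)

mean_a = statistics.fmean(a_draws)
mean_b = statistics.fmean(b_draws)
var_a_emp = statistics.pvariance(a_draws)
var_b_emp = statistics.pvariance(b_draws)

print("b: mean", mean_b, "variance", var_b_emp, "theory", var_b_theory)
print("a: mean", mean_a, "variance", var_a_emp, "theory", var_a_theory)
```

The empirical means should sit close to the assumed true values (unbiasedness), and the empirical variances should be close to the theoretical ones, up to simulation noise.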
The value $\sigma^2$ is unknown and must be estimated. One possible estimator is $s^2=(n-2)^{-1}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2$, where $\hat{y}_i=a+bx_i$.
Finally, it is possible to show that the sampling distribution of $a$ and $b$, when $\sigma^2$ is replaced by its estimator $s^2$, is
$$\frac{b-\beta}{s\sqrt{(n\,v(x))^{-1}}}\sim t_{n-2}$$
$$\frac{a-\alpha}{s\sqrt{(v(x)+\bar{x}^2)\,(n\,v(x))^{-1}}}\sim t_{n-2}.$$
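In practice these two ratios are what one computes to test hypotheses about $\alpha$ and $\beta$. A minimal sketch follows, again using only the standard library. The function name `ols_t_stats`, the null values `alpha0` and `beta0`, and the data are hypothetical. The resulting statistics would then be compared with quantiles of the $t_{n-2}$ distribution taken from a table or a statistics package.

```python
import math


def ols_t_stats(xs, ys, alpha0=0.0, beta0=0.0):
    """Fit y = a + b*x by OLS and return estimates, s^2, standard errors
    and t statistics for H0: alpha = alpha0 and H0: beta = beta0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    v_x = sum((x - x_bar) ** 2 for x in xs) / n                       # denominator n
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    b = cov_xy / v_x
    a = y_bar - b * x_bar
    # Residual sum of squares divided by n - 2 estimates sigma^2.
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)
    se_b = math.sqrt(s2 / (n * v_x))
    se_a = math.sqrt(s2 * (v_x + x_bar ** 2) / (n * v_x))
    return {
        "a": a, "b": b, "s2": s2,
        "se_a": se_a, "se_b": se_b,
        "t_a": (a - alpha0) / se_a,
        "t_b": (b - beta0) / se_b,
        "df": n - 2,
    }


# Hypothetical noisy data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
res = ols_t_stats(xs, ys)
for key, value in res.items():
    print(key, "=", value)
```

A large absolute value of `t_b` relative to the $t_{n-2}$ quantile is evidence against $\beta=\beta_0$; the same reading applies to `t_a` and $\alpha_0$.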
The regularity conditions are
Pay attention to the first regularity condition, which implies centered, independent and homoskedastic errors. If any one of these is violated, the sampling distribution of the OLS estimators changes.