5. This problem illustrates why it is better to use the squared errors rather than the absolute errors to assess a potential regression line. The Least Absolute Errors criterion says that we are looking for the regression line that minimizes the sum of the absolute errors, i.e., minimizes $\sum |y_i - \hat{y}_i|$. Consider the data:
x | 0 | 0 | 1 | 1 |
y | 0 | 1 | 0 | 1 |
Here are three potential regression lines:
(i) y = x
(ii) y = 0.5
(iii) y = 1/3 + (1/3)x
(a) Calculate the sum of the absolute errors for each of these lines, and rank them from best to worst. What is your finding?
(b) Calculate the SSE for each of these lines, and rank them from best to worst by the SSE criterion.
(c) Find the Least-Squares regression line (by using the formulas). Is it one of the above three lines?
(d) What is your conclusion regarding the Least Absolute Errors criterion? i.e., if someone asks you why don’t we use the absolute errors rather than the squared errors, what would you tell them?
a) and b)
x | y | predicted y = x | absolute error | squared error |
0 | 0 | 0 | 0 | 0 |
0 | 1 | 0 | 1 | 1 |
1 | 0 | 1 | 1 | 1 |
1 | 1 | 1 | 0 | 0 |
Sum | | | 2 | 2 |
x | y | predicted y = 0.5 | absolute error | squared error |
0 | 0 | 0.5 | 0.5 | 0.25 |
0 | 1 | 0.5 | 0.5 | 0.25 |
1 | 0 | 0.5 | 0.5 | 0.25 |
1 | 1 | 0.5 | 0.5 | 0.25 |
Sum | | | 2 | 1 |
x | y | predicted y = 1/3 + x/3 | absolute error | squared error |
0 | 0 | 0.333333333 | 0.333333333 | 0.111111111 |
0 | 1 | 0.333333333 | 0.666666667 | 0.444444444 |
1 | 0 | 0.666666667 | 0.666666667 | 0.444444444 |
1 | 1 | 0.666666667 | 0.333333333 | 0.111111111 |
Sum | | | 2 | 1.111111111 |
Using the absolute errors, the sum is 2 for all three models, so they tie and this criterion cannot rank them.
Using SSE, lower is better, so the ranking from best to worst is:
Model 2 (SSE = 1) > Model 3 (SSE ≈ 1.111) > Model 1 (SSE = 2)
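The column sums in the tables for a) and b) can be double-checked with a few lines of Python. This is only a sketch: the dictionary `lines`, the helper names `sum_abs_errors` and `sum_sq_errors`, and the (intercept, slope) encoding of each model are choices made here for illustration, not part of the original problem.

```python
# Data from the problem statement.
xs = [0, 0, 1, 1]
ys = [0, 1, 0, 1]

# The three candidate lines, stored as (intercept, slope).
# The labels and this encoding are illustrative assumptions.
lines = {
    "model 1: y = x": (0.0, 1.0),
    "model 2: y = 0.5": (0.5, 0.0),
    "model 3: y = 1/3 + x/3": (1 / 3, 1 / 3),
}


def sum_abs_errors(intercept, slope):
    """Sum of |y - y_hat| over the four data points."""
    return sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))


def sum_sq_errors(intercept, slope):
    """Sum of (y - y_hat)^2 over the four data points (the SSE)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Map each model label to (sum of absolute errors, SSE).
results = {
    name: (sum_abs_errors(b0, b1), sum_sq_errors(b0, b1))
    for name, (b0, b1) in lines.items()
}

for name, (sae, sse) in results.items():
    print(f"{name}: sum of absolute errors = {sae:.4f}, SSE = {sse:.4f}")
```

Up to floating-point rounding, the absolute-error sums should agree across the three models while the SSE values differ, which matches the totals in the tables.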
c)
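Using the least-squares formulas: the mean of x is 0.5 and the mean of y is 0.5. The sum of (x − 0.5)(y − 0.5) over the four points is 0.25 − 0.25 − 0.25 + 0.25 = 0, so the slope is 0 and the intercept equals the mean of y.
y^ = 0.5
Yes, it is one of the three lines: it is model 2.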
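As a check on part c), the textbook least-squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x) can be evaluated directly. The variable names below (`s_xy`, `s_xx`, `slope`, `intercept`) are illustrative and do not come from the original answer.

```python
# Data from the problem statement.
xs = [0, 0, 1, 1]
ys = [0, 1, 0, 1]

n = len(xs)
x_bar = sum(xs) / n  # mean of x
y_bar = sum(ys) / n  # mean of y

# Sxy: sum of cross-deviations; Sxx: sum of squared x-deviations.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)

# Least-squares slope and intercept.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

print(f"y_hat = {intercept} + {slope} * x")
```

With this data the cross-deviations cancel, so the fitted line is horizontal at the mean of y, in agreement with y^ = 0.5.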