We have provided the R code that calculates the information for the 7-number summary,
histogram and boxplot for all observations and for the data with four outliers removed.
Do you think that Age is a useful variable? Yes or no and explain.
(all observations)
Min: 2.00
1st Qu: 19.00
Median: 20.00
Mean: 19.87
3rd Qu: 21.00
Max: 69.00
NA: 2
(after outliers removed)
Min: 17.00
1st Qu: 19.00
Median: 20.00
Mean: 19.68
3rd Qu: 21.00
Max: 23.99
NA: 2
From the 7-number summary alone, one cannot determine whether a variable (Age in this case) is significant in a model. We need to look at scatterplots of the dependent variable against the independent variable, or at correlation coefficients, and at the model output summary, both before and after the outliers are treated.
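A minimal sketch of such a check in R follows. The data are simulated purely for illustration and are not the assignment's data set; the names `dat`, `y`, and `trimmed`, and the rule of keeping ages from 17 to 24, are assumptions made for this example.

```r
# Simulated data for illustration only; NOT the assignment's data set.
set.seed(1)
n   <- 200
age <- sample(18:22, n, replace = TRUE)   # typical ages (hypothetical)
age[1:4] <- c(2, 3, 60, 69)               # plant four extreme ages
y   <- 50 + rnorm(n, sd = 5)              # response generated independently of age
dat <- data.frame(age = age, y = y)

# Hypothetical outlier rule: keep ages in a plausible range.
trimmed <- subset(dat, age >= 17 & age <= 24)

# Scatterplot of the dependent variable against the independent variable.
plot(dat$age, dat$y, xlab = "Age", ylab = "Response")

# Correlation coefficients before and after removing the outliers.
cor_all     <- cor(dat$age, dat$y)
cor_trimmed <- cor(trimmed$age, trimmed$y)

# Model output: the coefficient table includes the t-test for age.
fit_all     <- lm(y ~ age, data = dat)
fit_trimmed <- lm(y ~ age, data = trimmed)
summary(fit_all)$coefficients
summary(fit_trimmed)$coefficients
```

If the p-value for `age` in the coefficient table is large in both fits and the correlations are close to zero, that would suggest Age adds little to the model, with or without the outliers.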
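From the given summary, we can conclude that removing the four outliers shifts the mean slightly (from 19.87 to 19.68), while the median (20.00) and both quartiles (19.00 and 21.00) remain the same. This is probably because the outliers lie on both sides of the centre (the original minimum is 2.00 and the maximum is 69.00), and the median is not sensitive to a few extreme values.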
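The effect can be seen on a small made-up vector. The values below are hypothetical and chosen only to show that extreme points on both sides move the mean but leave the median where it is:

```r
# Hypothetical ages with one low and one high outlier (not the real data).
ages <- c(2, 18, 19, 19, 20, 20, 20, 21, 21, 22, 69)

# Keep only ages in a plausible range (assumed rule for this example).
kept <- ages[ages >= 17 & ages <= 24]

mean_all    <- mean(ages)     # pulled upward by the value 69
median_all  <- median(ages)   # middle value, unaffected by the extremes
mean_kept   <- mean(kept)
median_kept <- median(kept)

cat("All observations: mean =", round(mean_all, 2), " median =", median_all, "\n")
cat("Outliers removed: mean =", round(mean_kept, 2), " median =", median_kept, "\n")
```

Here the median is 20 in both cases, while the mean changes once the two extreme values are dropped.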