In 1990 the number of driver deaths per 100,000 for the different age groups was as follows (Source: The National Highway Traffic Safety Administration's National Center for Statistics and Analysis):
Age | Number of Driver Deaths per 100,000
15-24 | 28
25-39 | 15
40-69 | 10
70-79 | 15
80+ | 25
Complete the following using the table above:
a. For each age group, pick the midpoint of the interval for the x value. (For the 80+ group, use 85.)
b. Using ages as the independent variable and Number of driver deaths per 100,000 as the dependent variable, make a scatter plot of the data.
c. Calculate the least squares (best-fit) line. Put the equation in the form: ŷ = a + bx
d. Find the correlation coefficient. Is it significant?
e. Pick two ages and find the estimated fatality rates.
f. Use the two points in (e) to plot the least squares line on your graph from (b).
g. Based on the above data, is there a linear relationship between age of a driver and driver fatality rate?
h. What is the slope of the least squares (best-fit) line? Interpret the slope.
I solved the problem in R. Please let me know in the comment section if you have any questions. Please upvote my answer if you like it.
R code:
# Part a
# midpoint = (lower endpoint + upper endpoint) / 2; 85 is used for the 80+ group
X=c(19.5,32,54.5,74.5,85)
# Driver deaths per 100,000 for each age group, in the same order
Y=c(28,15,10,15,25)
#partB
plot(X,Y)
# Part c
lm(Y~X)
# From the output: Y-hat = 20.7715 - 0.0409 * X
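# Part d
cor(X,Y)
# Output: r = -0.1492
# With n = 5 (df = n - 2 = 3) the 5% critical value for r is 0.878.
# Since |r| = 0.1492 < 0.878, the correlation is not significant.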
#partE
A=30
B1=20.7715+(-0.0409)*A
B1
C=80
B2=20.7715+(-0.0409)*C
B2
#For X=30 we get Y=19.5445 and X=80, Y=17.4995
#partF
g=c(30,80)
H=c(B1,B2)
lines(g,H,type="l")
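# Part g
# No, there is no linear relationship. The points form a U shape (the rate is
# highest for the youngest and oldest drivers), and r = -0.1492 is close to 0.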
# Part h
# Slope = -0.0409
# Interpretation: for each additional year of age, the estimated driver fatality
# rate decreases by 0.0409 deaths per 100,000.
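Parts (d) and (e) can also be worked from the fitted model instead of retyping rounded coefficients. The following is only a minimal sketch, assuming the same X and Y vectors as above; the names fit, coefs, r_test, new_ages and preds are my own and do not appear in the original answer. It uses cor.test() for a two-sided Pearson test of zero correlation and predict() for the estimated rates at ages 30 and 80.

```r
# Interval midpoints and driver deaths per 100,000 (same data as above)
X <- c(19.5, 32, 54.5, 74.5, 85)
Y <- c(28, 15, 10, 15, 25)

# Store the fitted model so later steps can reuse it
fit <- lm(Y ~ X)
coefs <- coef(fit)   # named vector: "(Intercept)" is a, "X" is b
print(coefs)

# Part d: Pearson correlation with a two-sided test of H0: rho = 0
r_test <- cor.test(X, Y)
print(r_test$estimate)   # sample correlation r
print(r_test$p.value)    # compare against the chosen level, e.g. 0.05

# Part e: estimated fatality rates at two illustrative ages
new_ages <- data.frame(X = c(30, 80))
preds <- predict(fit, newdata = new_ages)
print(preds)
```

Comparing the p-value with 0.05 leads to the same conclusion as comparing |r| with the critical value from a table, and predict() avoids the small rounding error that comes from using the coefficients rounded to four decimals.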
[Figure: scatter plot with the least squares line]
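The plot can be regenerated with something like the sketch below, assuming the midpoints and rates from the table. The axis labels, title and colour are my own choices, and abline() is used here as an alternative to joining two predicted points with lines(); it draws the fitted line across the full plotting range.

```r
# Interval midpoints and driver deaths per 100,000 (from the table)
X <- c(19.5, 32, 54.5, 74.5, 85)
Y <- c(28, 15, 10, 15, 25)

# Least squares fit of deaths on age
fit <- lm(Y ~ X)

# Scatter plot of the five age groups
plot(X, Y,
     xlab = "Age (interval midpoint)",
     ylab = "Driver deaths per 100,000",
     main = "Driver deaths by age group, 1990",
     pch = 19)

# Overlay the fitted line using the model's intercept and slope
abline(fit, col = "blue")
```

The U shape of the points around the nearly flat line is what motivates the answer to part (g).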