A psychology instructor wants to find a suitable predictor of his students' Final examination marks. He thinks that either the Assignment marks or the Mid-term test marks can be used for this purpose. However, he is not sure which of the two is more suitable. The following table shows the Assignment marks (out of 20), Mid-term test marks (out of 20) and Final examination marks (out of 40) of 5 randomly selected students from his psychology class last year. The data in a given row relate to the same student.
Student number | Assignment marks | Mid-term test marks | Final examination marks |
1 | 14 | 11 | 23 |
2 | 17 | 15 | 40 |
3 | 20 | 20 | 40 |
4 | 10 | 11 | 29 |
5 | 16 | 13 | 35 |
Assuming that there is a linear relationship between the Assignment marks and the Final examination marks, calculate Pearson’s correlation coefficient. Round your answer to 3 decimal places.
Assuming that there is a linear relationship between the Mid-term test marks and the Final examination marks,
i. Derive the least squares prediction line to predict the Final examination marks based on the Mid-term test marks. (9 points)
ii. Calculate the coefficient of determination for the least squares prediction line. (3 points)
iii. Interpret the value of the coefficient of determination in relation to this situation. (2 points)
Out of the two variables ‘Assignment marks’ and ‘Mid-term test marks’, which variable is more suitable to use as the independent variable in a least squares prediction line to predict the Final examination marks? Explain the reason for your answer.
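The formula for Pearson’s correlation coefficient is
r = [nΣxy − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}
where n is the sample size and x, y are the variables (here x = Assignment marks and y = Final examination marks). Hence the calculation is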
Student number | Assignment marks (x) | Mid-term test marks | Final exam marks (y) | xy | x² | y² |
1 | 14 | 11 | 23 | 322 | 196 | 529 |
2 | 17 | 15 | 40 | 680 | 289 | 1,600 |
3 | 20 | 20 | 40 | 800 | 400 | 1,600 |
4 | 10 | 11 | 29 | 290 | 100 | 841 |
5 | 16 | 13 | 35 | 560 | 256 | 1,225 |
Total | 77 | 70 | 167 | 2,652 | 1,241 | 5,795 |
r = [5(2,652) − (77)(167)] / √{[5(1,241) − 77²][5(5,795) − 167²]}
r = (13,260 − 12,859) / √(276 × 1,086)
r = 401 / 547.4815
r = 0.732
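The arithmetic above can be cross-checked with a short Python sketch that applies the raw-sums form of Pearson's formula to the Assignment and Final examination marks from the table. The function name `pearson_r` and the variable names are illustrative choices, not part of the original answer.

```python
from math import sqrt

# Marks copied from the table: Assignment (x) and Final examination (y).
assignment = [14, 17, 20, 10, 16]
final = [23, 40, 40, 29, 35]


def pearson_r(x, y):
    """Pearson's r computed from raw sums (hypothetical helper)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(p * q for p, q in zip(x, y))
    sum_x2 = sum(p * p for p in x)
    sum_y2 = sum(q * q for q in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r_assignment = pearson_r(assignment, final)
print(round(r_assignment, 3))
```

Passing the Mid-term test marks as `x` instead gives the correlation for the other candidate predictor.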
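i) The least squares prediction line is Y = a + bX, where Y = Final examination marks, X = Mid-term test marks, a is the intercept and b is the slope coefficient.
b = SSxy/SSxx and a = ȳ − b·x̄
where
SSxx = Σx² − (1/n)(Σx)², SSxy = Σxy − (1/n)(Σx)(Σy)
x̄ is the mean of the x-values, ȳ is the mean of the y-values, and n is the number of observations in the data set.
With x taken as the Mid-term test marks: Σx = 70, Σy = 167, Σx² = 121 + 225 + 400 + 121 + 169 = 1,036 and Σxy = 253 + 600 + 800 + 319 + 455 = 2,427.
SSxy = 2,427 − (70 × 167)/5
= 89
SSxx = 1,036 − 4,900/5
= 56
So b = 89/56
b = 1.5893
mean of x = 70/5 = 14, mean of y = 167/5 = 33.4
a = ȳ − b·x̄
a = 33.4 − 1.5893 × 14
a = 11.15
Hence the least squares prediction line is Y = a + bX
Y = 11.15 + 1.59X
ii) Coefficient of determination = b × SSxy/SSyy, where SSyy = Σy² − (1/n)(Σy)² = 5,795 − 27,889/5 = 217.2
= 1.5893 × 89/217.2
= 0.651
iii) About 65% of the variability in the Final examination marks can be explained by the linear relationship with the Mid-term test marks.
The Mid-term test marks are the more suitable independent variable: they explain about 65% of the variability in the Final examination marks, whereas the Assignment marks explain only r² = 0.732² ≈ 0.536, i.e. about 54%.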
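As a cross-check, here is a minimal Python sketch that fits the least squares line and the coefficient of determination for each candidate predictor, using the SSxy, SSxx and SSyy formulas and the marks from the question's table. The helper name `least_squares` and the dictionary `fits` are illustrative assumptions, not part of the original answer.

```python
# Marks copied from the question's table.
assignment = [14, 17, 20, 10, 16]
midterm = [11, 15, 20, 11, 13]
final = [23, 40, 40, 29, 35]


def least_squares(x, y):
    """Return (a, b, r_squared) for the line y = a + b*x (hypothetical helper)."""
    n = len(x)
    ss_xy = sum(p * q for p, q in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(p * p for p in x) - sum(x) ** 2 / n
    ss_yy = sum(q * q for q in y) - sum(y) ** 2 / n
    b = ss_xy / ss_xx                # slope
    a = sum(y) / n - b * sum(x) / n  # intercept: mean(y) - b * mean(x)
    r_squared = b * ss_xy / ss_yy    # coefficient of determination
    return a, b, r_squared


fits = {
    "Mid-term test marks": least_squares(midterm, final),
    "Assignment marks": least_squares(assignment, final),
}
for name, (a, b, r2) in fits.items():
    print(f"{name}: Y = {a:.2f} + {b:.2f}X, r^2 = {r2:.3f}")
```

The predictor with the larger coefficient of determination explains more of the variation in the Final examination marks, which is the basis for choosing between the two.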