A car insurance company performed a study to determine whether an association exists between age and the frequency of car accidents. They obtained the following sample data.
accidents in past 3 years | under 25 | 25-45 | over 45 | total |
0 | 74 | 90 | 84 | 248 |
1 | 19 | 8 | 12 | 39 |
>1 | 7 | 2 | 4 | 13 |
total | 100 | 100 | 100 | 300 |
Conduct a test, at the 5% significance level, to determine whether the data provide sufficient evidence to conclude that an association exists between age and frequency of car accidents.
(a) State the null and alternative hypotheses for this test.
(b) Calculate:
(i) the expected frequency, E31, for the cell where the number of accidents is > 1 and age is under 25 years.
(ii) the chi-square contribution, χ2, for the cell where the number of accidents is > 1 and age is under 25 years.
(iii) the p-value for this test.
(c) Given that the chi-square test statistic, χ2, is 9.273, state the conclusion for the test.
(d) The sample data were entered into statistical software for analysis. The output generated included the following warning: Warning: 3 cells (33.3%) have expected count less than 5. The minimum expected count is 4.33.
(i) Why was this warning displayed?
(ii) How does this affect the conclusion stated in part (c)?
(iii) Suggest how this problem could be resolved?
Given table data is as below
MATRIX | col1 | col2 | col3 | TOTALS |
row 1 | 74 | 90 | 84 | 248 |
row 2 | 19 | 8 | 12 | 39 |
row 3 | 7 | 2 | 4 | 13 |
TOTALS | 100 | 100 | 100 | N = 300 |
------------------------------------------------------------------
calculation formula for E table matrix
E-TABLE | col1 | col2 | col3 |
row 1 | row1*col1/N | row1*col2/N | row1*col3/N |
row 2 | row2*col1/N | row2*col2/N | row2*col3/N |
row 3 | row3*col1/N | row3*col2/N | row3*col3/N |
------------------------------------------------------------------
expected frequencies calculated by applying the E-table matrix formulae
E-TABLE | col1 | col2 | col3 |
row 1 | 82.667 | 82.667 | 82.667 |
row 2 | 13 | 13 | 13 |
row 3 | 4.333 | 4.333 | 4.333 |
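The E-table above can be reproduced with a few lines of plain Python. This is only a sketch of the row total * column total / N formula; the names `observed` and `expected_counts` are made up for illustration.

```python
# Observed counts: rows = accidents (0, 1, >1),
# columns = age group (under 25, 25-45, over 45).
observed = [
    [74, 90, 84],
    [19, 8, 12],
    [7, 2, 4],
]


def expected_counts(table):
    """Return the expected count of every cell, assuming independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # E_ij = (row i total) * (column j total) / N
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print([round(e, 3) for e in row])
```

Because every column total is 100, each row of the E-table holds the same value three times.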
------------------------------------------------------------------
calculate chisquare test statistic using given observed frequencies, calculated expected frequencies from above
Oi | Ei | Oi-Ei | (Oi-Ei)^2 | (Oi-Ei)^2/Ei |
74 | 82.667 | -8.667 | 75.117 | 0.909 |
90 | 82.667 | 7.333 | 53.773 | 0.65 |
84 | 82.667 | 1.333 | 1.777 | 0.021 |
19 | 13 | 6 | 36 | 2.769 |
8 | 13 | -5 | 25 | 1.923 |
12 | 13 | -1 | 1 | 0.077 |
7 | 4.333 | 2.667 | 7.113 | 1.642 |
2 | 4.333 | -2.333 | 5.443 | 1.256 |
4 | 4.333 | -0.333 | 0.111 | 0.026 |
chisqr^2 o = 9.273 |
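As a cross-check of the table above, the per-cell contributions (Oi-Ei)^2/Ei and their sum can be computed directly. The sketch below uses unrounded expected counts, so individual values may differ from the table in the third decimal place; the variable names are illustrative.

```python
# Observed counts: rows = accidents (0, 1, >1),
# columns = age group (under 25, 25-45, over 45).
observed = [
    [74, 90, 84],
    [19, 8, 12],
    [7, 2, 4],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Contribution of each cell: (O - E)^2 / E
contributions = [
    [(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
    for o_row, e_row in zip(observed, expected)
]

# The test statistic is the sum of all nine contributions.
chi_sq = sum(sum(row) for row in contributions)

print(round(contributions[2][0], 3))  # cell: >1 accidents, under 25
print(round(chi_sq, 3))
```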
------------------------------------------------------------------
set up null vs alternative as
null, Ho: no association exists between age and frequency of car accidents (the two variables are independent)
alternative, H1: an association exists between age and frequency of car accidents
level of significance, alpha = 0.05
from the chi-square table, the right-tailed critical value with 4 degrees of freedom is chisqr^2 alpha = 9.488
since our test is right tailed, reject Ho when chisqr^2 o > 9.488
we use test statistic chisqr^2 o = Σ(Oi-Ei)^2/Ei
from the table , chisqr^2 o = 9.273
critical value
the value of chisqr^2 alpha at los 0.05 with d.f = (r-1)(c-1) = ( 3 - 1 ) * ( 3 - 1 ) = 2 * 2 = 4 is 9.488
we got chisqr^2 o = 9.273 and chisqr^2 alpha = 9.488
make decision
since chisqr^2 o < chisqr^2 alpha, we do not reject Ho
p-value = 0.055, which is greater than alpha = 0.05
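The critical value and p-value can be checked without a table. The sketch below relies on the fact that, for an even number of degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2) * Σ (x/2)^k / k! for k = 0 .. df/2 - 1. The function names are invented for this example, and the critical value is found by simple bisection rather than a library routine.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


def critical_value(alpha, df, lo=0.0, hi=100.0):
    """Find x with P(X > x) = alpha by bisection (the tail decreases in x)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


df = (3 - 1) * (3 - 1)   # (r-1)(c-1) = 4
chi_sq = 9.273           # test statistic from the table above
alpha = 0.05

p_value = chi2_sf_even_df(chi_sq, df)
crit = critical_value(alpha, df)
reject = chi_sq > crit

print(round(p_value, 3), round(crit, 3), reject)
```

Comparing the statistic with the critical value and comparing the p-value with alpha are equivalent ways of reaching the same decision.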
---------------
A.
null, Ho: no association exists between age and frequency of car accidents
alternative, H1: an association exists between age and frequency of car accidents
B.
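(i) the expected frequency for number of accidents > 1 and age under 25 years, E31 = row3*col1/N = 13*100/300 = 4.333
(ii) the chi-square contribution for that cell = (O31-E31)^2/E31 = (7-4.333)^2/4.333 = 1.642
(iii) p-value: 0.055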
C.
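test statistic: 9.273 < critical value 9.488 (p-value 0.055 > 0.05), decision: do not reject Ho.
At the 5% significance level, the data do not provide sufficient evidence to conclude that an association exists between age and frequency of car accidents.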
D.
i)
The warning was displayed because 3 of the 9 cells (33.3%), the three cells in the >1 accidents row,
have an expected count of 4.33, which is less than 5. The chi-square test relies on the rule that
every cell should have an expected count (not an observed count) of at least 5.
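ii)
In the chi-square test we compare the observed frequency (that we measure directly from the data)
with the expected frequency, which we calculate for each cell under independence. The test statistic
only approximately follows a chi-square distribution, and the approximation assumes that no cell
(or, under the commonly used rule, no more than 20% of cells) has an expected frequency of less than 5.
Here 33.3% of the cells do, so the test may not be valid and the conclusion in part (c) should be
treated with caution, especially as the p-value of 0.055 is so close to 0.05.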
iii)
It is possible to 'pool' or 'collapse' categories into fewer categories, but this must only be
done if it is meaningful to group the data in this way. Otherwise, use Fisher's exact test.
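As an illustration of pooling, the sketch below merges the "1" and ">1" accident rows into a single ">=1" row and reruns the test. This particular grouping is an assumption made for the example and is not part of the original question; whether it is meaningful is a judgment call. With only 2 degrees of freedom the upper-tail probability is exactly exp(-x/2), so no table is needed.

```python
import math

# Observed counts: rows = accidents (0, 1, >1),
# columns = age group (under 25, 25-45, over 45).
observed = [
    [74, 90, 84],
    [19, 8, 12],
    [7, 2, 4],
]

# Hypothetical pooling: combine the "1" and ">1" rows into ">=1".
pooled = [
    observed[0],
    [a + b for a, b in zip(observed[1], observed[2])],
]

row_totals = [sum(row) for row in pooled]
col_totals = [sum(col) for col in zip(*pooled)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]
min_expected = min(min(row) for row in expected)

chi_sq = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(pooled, expected)
    for o, e in zip(o_row, e_row)
)
df = (len(pooled) - 1) * (len(pooled[0]) - 1)

# For df = 2 the chi-square upper-tail probability is exp(-x/2).
p_value = math.exp(-chi_sq / 2)

print(round(min_expected, 3), round(chi_sq, 3), df, round(p_value, 4))
```

After this pooling every expected count is well above 5, so the warning no longer applies. The statistic comes out at about 9.12 on 2 degrees of freedom, with a p-value of about 0.010, so under this grouping the decision at the 5% level would differ from the one in part (c).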