Two surveys are conducted by a university to obtain information on the time (in minutes) that students wait in line at the bookstore to pay for their textbooks, one at the beginning of the semester and one at the end of the semester. In each survey, a sample of 20 students is to be obtained. For the beginning of the semester, a sample B is generated while for the end of the semester, a sample E is generated.
B = 3.1 7.5 6.1 6.8 5.8 2.9 11.0 8.6 6.1 3.1 4.7 5.7 12.9 5.8 1.9 5.9 3.0 6.6 8.5 3.9
E = 0.9 2.7 2.1 2.4 2.0 0.8 4.2 3.1 2.1 0.9 1.5 1.9 5.0 2.0 0.5 2.0 4.4 3.1 1.2 2.1
a) Find the mean and median for the data obtained at the beginning of the semester. Based on these values, would you describe the distribution of the data as symmetric or skewed? Explain your answer.
b) Briefly explain the meaning of an outlier. Is the mean or the median a better measure of central tendency for a data set that contains outliers? Why?
c) Construct parallel (side-by-side) box-and-whisker plots for the two data sets.
(i) Compare the distribution shape, center and spread of the students’ waiting times at the beginning and at the end of the semester.
(ii) Do the data sets contain any outliers? Explain.
(iii) Do the students’ waiting times tend to be longer at the beginning of the semester compared to at the end of the semester? Explain.
d) Find the standard deviation for the data obtained at the beginning and at the end of the semester. Based on these values, which data set has the higher spread? Compare this answer with the answer you obtained in part (c)(i). Is there any difference in your conclusion? Explain.
a)
Descriptive Statistics: B
Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
B         20   0  5.995    0.623  2.785    1.900  3.300   5.850  7.325   12.900
Descriptive Statistics: E
Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
E         20   0  2.245    0.274  1.225    0.500  1.275   2.050  3.000    5.000
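The mean and median for B can be checked by hand or with a few lines of code. The sketch below is only an illustration in Python (the original output was not produced this way); the names mean_B and median_B are made up for the example. For B the mean (5.995) sits slightly above the median (5.850), which points to a mild right (positive) skew rather than a perfectly symmetric distribution.

```python
import statistics

# Waiting times (minutes) at the beginning of the semester, copied from the question.
B = [3.1, 7.5, 6.1, 6.8, 5.8, 2.9, 11.0, 8.6, 6.1, 3.1,
     4.7, 5.7, 12.9, 5.8, 1.9, 5.9, 3.0, 6.6, 8.5, 3.9]

mean_B = statistics.mean(B)      # arithmetic average
median_B = statistics.median(B)  # average of the 10th and 11th ordered values

print(f"mean = {mean_B:.3f}, median = {median_B:.3f}")

# A mean above the median suggests a longer right tail.
if mean_B > median_B:
    print("mean > median: data lean toward a right (positive) skew")
```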
b)
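Outlier: A value that lies outside (is much smaller or larger than) most of the other values in a set of data is called an outlier.
There are several ways to detect outliers, such as the boxplot (1.5 × IQR) rule, Dixon's test, and kurtosis statistics.
The median is the better measure of central tendency for a data set that contains outliers, because it depends only on the middle values of the ordered data, whereas the mean is pulled toward extreme values.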
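To see how differently the two measures react to an extreme value, here is a small illustrative sketch. The 60-minute wait is a made-up value added only for the demonstration; it is not part of the survey data.

```python
import statistics

# Waiting times (minutes) at the beginning of the semester, copied from the question.
B = [3.1, 7.5, 6.1, 6.8, 5.8, 2.9, 11.0, 8.6, 6.1, 3.1,
     4.7, 5.7, 12.9, 5.8, 1.9, 5.9, 3.0, 6.6, 8.5, 3.9]

# Hypothetical extreme wait of 60 minutes, NOT in the original sample.
B_with_outlier = B + [60.0]

mean_before = statistics.mean(B)
mean_after = statistics.mean(B_with_outlier)
median_before = statistics.median(B)
median_after = statistics.median(B_with_outlier)

print(f"mean:   {mean_before:.3f} -> {mean_after:.3f}")
print(f"median: {median_before:.3f} -> {median_after:.3f}")
```

The mean moves by more than two minutes, while the median barely changes.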
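c)
(ii) Neither data set contains an outlier. By the boxplot (1.5 × IQR) rule, the upper fences are 7.325 + 1.5(4.025) = 13.36 for B and 3.000 + 1.5(1.725) = 5.59 for E; the maxima (12.9 and 5.0) lie below them, and both lower fences are negative, so no value falls outside.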
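The outlier check can be reproduced with the 1.5 × IQR rule. The sketch below is one possible way to do it in Python 3.8+, not the method used to produce the answer; the helper names iqr_fences and outliers are made up for this example. It assumes quartiles are interpolated at position p(n + 1), which matches the Q1 and Q3 values in the descriptive statistics in part a).

```python
import statistics

# Waiting times (minutes), copied from the question.
B = [3.1, 7.5, 6.1, 6.8, 5.8, 2.9, 11.0, 8.6, 6.1, 3.1,
     4.7, 5.7, 12.9, 5.8, 1.9, 5.9, 3.0, 6.6, 8.5, 3.9]
E = [0.9, 2.7, 2.1, 2.4, 2.0, 0.8, 4.2, 3.1, 2.1, 0.9,
     1.5, 1.9, 5.0, 2.0, 0.5, 2.0, 4.4, 3.1, 1.2, 2.1]

def iqr_fences(data):
    """Return (lower_fence, upper_fence) using the 1.5 * IQR rule."""
    # method="exclusive" interpolates at position p * (n + 1).
    q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def outliers(data):
    """Return the values that fall outside the fences."""
    low, high = iqr_fences(data)
    return [x for x in data if x < low or x > high]

for name, data in (("B", B), ("E", E)):
    low, high = iqr_fences(data)
    print(f"{name}: fences = ({low:.4f}, {high:.4f}), outliers = {outliers(data)}")
```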
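d)
Standard deviation of B = 2.785
Standard deviation of E = 1.225
So data set B has the higher spread. This agrees with the boxplot comparison in part (c)(i): B also has the larger interquartile range (7.325 − 3.300 = 4.025 versus 3.000 − 1.275 = 1.725 for E), so the conclusion does not change.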
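As a cross-check, the sample standard deviations (n − 1 in the denominator) can be recomputed directly from the data. This is only an illustrative sketch; the variable names sd_B and sd_E are made up for the example.

```python
import statistics

# Waiting times (minutes), copied from the question.
B = [3.1, 7.5, 6.1, 6.8, 5.8, 2.9, 11.0, 8.6, 6.1, 3.1,
     4.7, 5.7, 12.9, 5.8, 1.9, 5.9, 3.0, 6.6, 8.5, 3.9]
E = [0.9, 2.7, 2.1, 2.4, 2.0, 0.8, 4.2, 3.1, 2.1, 0.9,
     1.5, 1.9, 5.0, 2.0, 0.5, 2.0, 4.4, 3.1, 1.2, 2.1]

# statistics.stdev computes the sample standard deviation (divides by n - 1).
sd_B = statistics.stdev(B)
sd_E = statistics.stdev(E)

print(f"StDev B = {sd_B:.3f}, StDev E = {sd_E:.3f}")
print("B has the higher spread" if sd_B > sd_E else "E has the higher spread")
```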