Question

# 1. For a pair of sample x- and y-values, what is the difference between the observed...

1. For a pair of sample x- and y-values, what is the difference between the observed value of y and the predicted value of y? a) An outlier b) The explanatory variable c) A residual d) The response variable

2. Which of the following statements is false:

a) The correlation coefficient is unitless. b) A correlation coefficient of 0.62 suggests a stronger correlation than a correlation coefficient of -0.82. c) The correlation coefficient, r, is always between -1 and 1. d) The correlation coefficient should only be used when the association is linear.

3. A study is done to predict the age of a certain type of tree based on the tree’s diameter. The response variable in this study is

a) the diameter b) the tree c) the type of tree d) the age

4. (3) The correlation coefficient measures

a) whether or not a scatter diagram shows an interesting pattern. b) whether a cause and effect relationship exists between two variables. c) the strength of a straight line relationship between two variables. d) whether there is a relationship between two variables.

5. For a particular regression analysis, the following regression equation is obtained: y = -32 + 8.3x, where x represents the number of hours studied for a test and y represents the score on the test. Which of the following is the only possible value for the correlation coefficient? a) 32 b) 8.3 c) -0.8 d) 0.9 e) -32 f) -8.3

6. The equation below is used to predict the cost of a hospital stay (y) from the number of days a patient is in the hospital (x): y = 102.40 + 302.4x. If a person increases his stay by an additional 4 days, how much would the hospital bill increase?

a) \$302.40 b) \$1209.60 c) \$1312.00 d) \$102.40 e) \$409.60

7. Suppose that a 90% confidence interval for the difference between two proportions, p1-p2, is ( -0.013, 0.245). If we were testing the following hypotheses: H0: p1=p2 versus H1: p1≠p2 at the 0.10 significance level we would

a) Reject H0 b) Fail to reject H0

9. Results appear below for comparing average housing prices for New York and California. The researcher is interested in testing whether California houses cost more than New York houses, on average. Let µ1 equal the mean California price and µ2 be the mean New York price.

| Difference | Sample Mean | Std. Err. | DF | T-Stat | P-value |
| --- | --- | --- | --- | --- | --- |
| μ1 - μ2 | 149.5 | 239.71213 | 48.757023 | 0.62366474 | 0.3178 |

a) State the null and alternative hypotheses.

b) (4) Based on the computer output above, state a conclusion in context of the problem. Use a significance level of 0.05 (α=0.05).

c) (3) Based on the results of the hypothesis test, our confidence interval would (circle one): a) Include the value of 0 b) Include only positive numbers c) Include only negative numbers

10. A researcher wanted to determine if carpeted or uncarpeted rooms contain more bacteria. Let µ1 be the average number of bacteria per cubic foot for uncarpeted rooms and µ2 be the average number of bacteria per cubic foot for carpeted rooms. A 95% confidence interval for µ1- µ2 is (-6.872, -0.728) bacteria per cubic foot. Based on the confidence interval, we can conclude with 95% confidence that:

a) Carpeted rooms will have more bacteria on average than uncarpeted rooms. b) Uncarpeted rooms will have more bacteria on average than carpeted rooms. c) There is no significant difference, on average, between carpeted and uncarpeted rooms. d) The confidence interval does not provide any information that compares carpeted and uncarpeted rooms.

11. The General Social Survey asked this question: “Have you attended religious services in the last week?”. Here are the responses for those whose highest degree was high school or above. The expected values under the null hypothesis are given in parentheses.

Highest Degree Earned

| | High School | Junior College | Bachelor’s | Graduate |
| --- | --- | --- | --- | --- |
| Attended Services | 400 (437) | 62 (56) | 146 (129) | 76 (62) |
| Did not attend Services | 880 (843) | 101 (107) | 232 (249) | 105 (119) |

a) State the appropriate null and alternative hypotheses for the chi-square test of independence.

b) The p-value was 0.0027. Using a significance level of 0.05, state a conclusion in context of the problem.

c) In the table below are the residuals (observed-expected) for each cell.

Highest Degree Earned

| | High School | Junior College | Bachelor’s | Graduate |
| --- | --- | --- | --- | --- |
| Attended Services | -37.3 | 6.3 | 16.9 | 14.2 |
| Did not attend Services | 37.3 | -6.3 | -16.9 | -14.2 |

Write a few sentences describing the above table. Be sure to use complete sentences and write in context of the problem.

12. The dean of the Business School at a small Florida college wishes to determine whether the grade-point average (GPA) of a graduate student can be used to predict the graduate's starting salary. Records for 23 of last year’s Business School graduates were selected, and a least-squares regression line was calculated to predict starting salary (y) from GPA (x), with $R^2 = 0.78$. Circle the correct interpretation of $R^2$.

a) 78% of the variation in GPA can be explained by the regression line. b) 78% of the variation in starting salary can be explained by the regression line. c) 78% of the observations will fall on the regression line. d) The regression line will be correct in predicting salary 78% of the time.

13. An insurance company wants to relate the amount of fire damage in major residential fires to distance to the nearest fire station. The insurance company collected data on fire damage (in thousands of dollars) and distance to the nearest fire station (in miles). The least squares regression equation for predicting fire damage from distance is y=13.783+3.646 x.

i) The correct interpretation of the slope is (circle one): a) For every 3.646 miles increase in distance, the average fire damage increases by \$1000. b) For every additional mile traveled, the average fire damage increases by \$3646. c) For every additional 3.646 miles traveled, the average fire damage increases by \$13,783. d) For every additional thousand dollars of damage, the distance traveled increases by 3.646 miles.

ii) What is the predicted fire damage for a fire that occurred 6 miles away? Show work.

iii) Suppose the observed fire damage for a fire that occurred 6 miles away was \$40,000. Calculate the residual for this observation.

iv) If you were a homeowner included in this study, would it be better to have a negative or positive residual? Circle one: Negative Positive

14. Subjects in a study were categorized in terms of whether or not they were obese and their relationship status (single, dating, or married). The table below summarizes the information:

| | single | dating | married | Total |
| --- | --- | --- | --- | --- |
| obese | 81 | 103 | 147 | 331 |
| not obese | 359 | 326 | 277 | 962 |
| Total | 440 | 429 | 424 | 1293 |

a) Give the marginal distribution of the relationship status variable.

b) What percent of obese individuals were married?

c) What percent of the individuals in this study were dating and not obese?

d) Give the conditional distribution of relationship status for the obese group.

e) Give the conditional distribution of relationship status for the not obese group.

f) Write a few sentences comparing the relationship status for obese and not obese individuals in this study.

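| Parameter | Method | Confidence Interval | Hypothesis Test |
| --- | --- | --- | --- |
| 2 Proportions, p1-p2 | Formula | $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$, where $\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{q}_2 = 1 - \hat{p}_2$ | $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{pq}{n_1} + \dfrac{pq}{n_2}}}$, where $p = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $q = 1 - p$ |
| | Calculator | B:2-PropZInt | 6:2-PropZTest |
| 2 Means, µ1-µ2 | Formula | $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ | $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$ |
| | Calculator | 10:2-SampTInt | 4: 2-SampTTest |

$\chi^2 = \sum \dfrac{(O - E)^2}{E}$, where $E = \dfrac{\text{Row Total} \times \text{Column Total}}{\text{Total}}$

Equation of a line: $\hat{y} = b_0 + b_1 x$, where $b_0$ is the y-intercept and $b_1$ is the slope.

Resid = $y - \hat{y}$

| Confidence Levels | 80% | 90% | 95% | 98% | 99% |
| --- | --- | --- | --- | --- | --- |
| Value for $z_{\alpha/2}$ | 1.282 | 1.645 | 1.96 | 2.326 | 2.576 |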
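As an illustration of the chi-square formulas on the sheet, here is a minimal Python sketch that applies them to the observed counts from question 11. The function names `expected_counts` and `chi_square_statistic` are made up for this example, and the sketch computes only the expected counts and the test statistic, not the p-value.

```python
# Observed counts from question 11.
# Rows: attended services, did not attend services.
# Columns: High School, Junior College, Bachelor's, Graduate.
observed = [
    [400, 62, 146, 76],
    [880, 101, 232, 105],
]


def expected_counts(table):
    """E = (row total * column total) / grand total, for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (O - E)^2 / E over all cells of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
chi2 = chi_square_statistic(observed)

# Rounded expected counts, for comparison with the values in parentheses
# in the question 11 table.
print([[round(e) for e in row] for row in expected])
print(round(chi2, 2))
```

Rounded to whole numbers, the expected counts should agree with the values given in parentheses in question 11, which is a quick check that the formula for E was applied correctly.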

Solution:

$y$ = the observed sample value of y,

and $\hat{y}$ = the value of y predicted by the regression equation.

The difference between the observed sample value of y and the y-value predicted by the regression equation is $y - \hat{y}$, which is the residual. So the correct option is (c), a residual.
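To make the definition concrete, here is a small Python sketch of a residual calculation. It borrows the regression equation from question 13 (y = 13.783 + 3.646x, damage in thousands of dollars) and the observed value of \$40,000 at 6 miles purely as an illustration; the helper names `predict` and `residual` are made up for this example.

```python
def predict(x, b0=13.783, b1=3.646):
    """Predicted y from the line y-hat = b0 + b1 * x (question 13's equation)."""
    return b0 + b1 * x


def residual(y_observed, y_predicted):
    """Residual = observed y minus predicted y."""
    return y_observed - y_predicted


# A fire 6 miles from the station with observed damage of 40 (thousand dollars).
y_hat = predict(6)
resid = residual(40.0, y_hat)

print(round(y_hat, 3))  # predicted damage, in thousands of dollars
print(round(resid, 3))  # positive: the observed damage is above the line
```

A positive residual means the observed value lies above the regression line, and a negative residual means it lies below.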

This post contains more than one question, so as per the Q&A guidelines I am answering only the first. For answers to the remaining parts, please post them as a new question.
