Use the dependent variable (labeled Y) and the independent variables (labeled X1, X2, and X3) in the data file. Use Excel to perform the regression and correlation analysis to answer the following.
1. Generate a scatterplot for the specified dependent variable (Y) and the X1 independent variable, including the graph of the "best fit" line. Interpret.
2. Determine the equation of the "best fit" line, which describes the relationship between the dependent variable and the selected independent variable.
3. Determine the coefficient of correlation. Interpret.
4. Determine the coefficient of determination. Interpret.
5. Test the utility of this regression model. Interpret results, including the p-value.
6. Based on the findings in Steps 1-5, analyze the ability of the independent variable to predict the designated dependent variable.
7. Compute the confidence interval for β1 (the population slope) using a 95% confidence level. Interpret this interval.
8. Using an interval, estimate the average for the dependent variable for a selected value of the independent variable. Interpret this interval.
9. Using an interval, predict the particular value of the dependent variable for a selected value of the independent variable. Interpret this interval.
10. What can be said about the value of the dependent variable for values of the independent variable that are outside the range of the sample values? Explain.
In an attempt to improve the model, use a multiple regression model to predict the dependent variable, Y, based on all of the independent variables: X1, X2, and X3.
11. Using Excel, run the multiple regression analysis using the designated dependent and three independent variables. State the equation for this multiple regression model.
12. Perform the Global Test for Utility (F-Test). Explain the conclusion.
13. Perform the t-test on each independent variable. Explain the conclusions and clearly state how the analysis should proceed. In particular, state which independent variables should be kept and which should be discarded. If any independent variables are to be discarded, re-run the multiple regression, including only the significant independent variables, and summarize the results with a discussion of the analysis.
14. Is this multiple regression model better than the linear model generated in parts 1-10? Explain.
Please use the data below. Thank you in advance for helping me with this; I really appreciate it.
Sales (Y) | Calls (X1) | Time (X2) | Years (X3) | Type |
51 | 167 | 12.6 | 5 | ONLINE |
34 | 133 | 15.2 | 4 | GROUP |
49 | 161 | 16.1 | 3 | NONE |
45 | 185 | 13.3 | 1 | ONLINE |
47 | 176 | 14.1 | 2 | ONLINE |
47 | 183 | 12.8 | 2 | ONLINE |
38 | 122 | 19.3 | 3 | GROUP |
44 | 171 | 13.6 | 3 | GROUP |
47 | 157 | 14.3 | 1 | GROUP |
37 | 148 | 15.7 | 3 | GROUP |
51 | 177 | 11.4 | 4 | NONE |
40 | 144 | 17.4 | 0 | NONE |
48 | 136 | 13.3 | 2 | ONLINE |
52 | 197 | 14 | 2 | ONLINE |
46 | 145 | 16.8 | 0 | ONLINE |
42 | 167 | 17.7 | 3 | ONLINE |
37 | 120 | 12 | 2 | NONE |
42 | 148 | 16.9 | 1 | NONE |
43 | 131 | 18.5 | 1 | NONE |
49 | 184 | 16.7 | 2 | ONLINE |
44 | 150 | 18.4 | 1 | NONE |
43 | 148 | 15.9 | 1 | ONLINE |
55 | 189 | 12 | 1 | ONLINE |
37 | 152 | 19.8 | 0 | GROUP |
44 | 148 | 13.5 | 3 | GROUP |
43 | 169 | 13.3 | 4 | NONE |
49 | 188 | 20.4 | 1 | NONE |
45 | 164 | 16.7 | 3 | NONE |
45 | 146 | 12 | 3 | GROUP |
43 | 173 | 19.8 | 2 | ONLINE |
47 | 164 | 15.3 | 0 | ONLINE |
48 | 177 | 13.9 | 3 | ONLINE |
49 | 160 | 13.6 | 3 | GROUP |
51 | 190 | 11.3 | 1 | ONLINE |
42 | 135 | 16.1 | 0 | NONE |
37 | 137 | 18.1 | 1 | ONLINE |
51 | 167 | 16.2 | 1 | ONLINE |
44 | 169 | 8.9 | 0 | ONLINE |
46 | 149 | 17.8 | 3 | NONE |
42 | 153 | 15.5 | 2 | GROUP |
45 | 140 | 11 | 3 | GROUP |
37 | 133 | 19.8 | 2 | NONE |
52 | 173 | 18.6 | 0 | ONLINE |
39 | 156 | 13.3 | 4 | NONE |
45 | 130 | 20.6 | 3 | GROUP |
37 | 130 | 15.6 | 1 | GROUP |
40 | 125 | 12.2 | 4 | NONE |
44 | 182 | 15.5 | 4 | NONE |
48 | 165 | 19.8 | 5 | ONLINE |
42 | 154 | 14.8 | 2 | ONLINE |
53 | 178 | 13.2 | 2 | ONLINE |
37 | 142 | 18.5 | 1 | NONE |
46 | 153 | 14.1 | 1 | ONLINE |
43 | 166 | 17.6 | 3 | ONLINE |
45 | 138 | 18.9 | 2 | NONE |
42 | 167 | 18 | 2 | NONE |
48 | 171 | 13 | 2 | GROUP |
39 | 149 | 18.8 | 1 | GROUP |
46 | 151 | 16 | 1 | GROUP |
46 | 162 | 16.2 | 2 | ONLINE |
45 | 158 | 13.9 | 1 | ONLINE |
44 | 188 | 12.9 | 3 | GROUP |
49 | 149 | 21.1 | 2 | GROUP |
41 | 157 | 11.5 | 3 | ONLINE |
48 | 156 | 15.1 | 4 | ONLINE |
46 | 172 | 12.5 | 1 | ONLINE |
48 | 174 | 18.6 | 2 | GROUP |
47 | 188 | 16.3 | 1 | NONE |
54 | 180 | 11.8 | 4 | GROUP |
45 | 173 | 17.6 | 2 | ONLINE |
53 | 184 | 15.2 | 0 | ONLINE |
37 | 148 | 16.2 | 1 | GROUP |
45 | 155 | 18.9 | 2 | GROUP |
44 | 159 | 18.1 | 2 | ONLINE |
46 | 162 | 12.1 | 1 | GROUP |
52 | 177 | 14.5 | 1 | ONLINE |
54 | 174 | 10.8 | 2 | NONE |
48 | 175 | 13.7 | 1 | ONLINE |
44 | 139 | 15.2 | 2 | NONE |
41 | 158 | 19.3 | 2 | ONLINE |
43 | 145 | 18.6 | 2 | NONE |
40 | 150 | 10.8 | 1 | GROUP |
53 | 182 | 10.5 | 1 | ONLINE |
47 | 193 | 13.5 | 2 | ONLINE |
43 | 148 | 14.5 | 4 | ONLINE |
38 | 145 | 17.1 | 2 | NONE |
50 | 184 | 15.6 | 2 | ONLINE |
39 | 138 | 17.7 | 3 | GROUP |
54 | 197 | 11.8 | 1 | ONLINE |
41 | 155 | 13.6 | 3 | GROUP |
41 | 128 | 15.5 | 2 | NONE |
42 | 160 | 10.6 | 3 | NONE |
46 | 148 | 13.1 | 1 | GROUP |
45 | 177 | 14.2 | 2 | GROUP |
43 | 153 | 15.2 | 3 | GROUP |
41 | 153 | 14.7 | 1 | GROUP |
49 | 152 | 22.3 | 0 | ONLINE |
44 | 169 | 13.6 | 1 | ONLINE |
49 | 166 | 16.2 | 0 | ONLINE |
37 | 145 | 18 | 3 | NONE |
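For the simple regression of Sales (Y) on Calls (X1) asked for in the first parts of the question, the slope, intercept, coefficient of correlation, and coefficient of determination can be sketched as follows. This is only an illustration in Python, not the Excel workflow the question asks for. The function name `simple_regression` is made up for this example, and the lists hold only the first ten rows of the table, so the printed numbers will not match the full 100-row analysis.

```python
from math import sqrt

# First ten rows of the table above (Calls = X1, Sales = Y).
# Illustration only: pass all 100 rows to get the assignment's values.
calls = [167, 133, 161, 185, 176, 183, 122, 171, 157, 148]
sales = [51, 34, 49, 45, 47, 47, 38, 44, 47, 37]


def simple_regression(x, y):
    """Return slope, intercept, r and r^2 for the least-squares line y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # coefficient of correlation
    return slope, intercept, r, r * r


slope, intercept, r, r_squared = simple_regression(calls, sales)
print(f"Sales = {intercept:.3f} + {slope:.3f} * Calls")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

With the full columns in place of the ten-row lists, the same function gives the "best fit" equation, r, and r² for the first parts of the question.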
Data
Type is coded as a categorical variable: 1 for ONLINE, 2 for GROUP, 0 for NONE.
Result for multiple regression:
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.693545406 |
R Square | 0.48100523 |
Adjusted R Square | 0.459152819 |
Standard Error | 3.480531964 |
Observations | 100 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 4 | 1066.600238 | 266.6500595 | 22.01154018 | 7.04189E-13 |
Residual | 95 | 1150.839762 | 12.11410276 | | |
Total | 99 | 2217.44 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% |
Intercept | 20.07981678 | 4.515261657 | 4.447099261 | 2.36175E-05 | 11.11588904 | 29.04374451 | 11.11588904 |
TYPE | -0.085965801 | 0.463839992 | -0.185335036 | 0.853361379 | -1.006804611 | 0.834873008 | -1.006804611 |
Calls (X1) | 0.171766931 | 0.020281269 | 8.469239868 | 3.06065E-13 | 0.131503522 | 0.21203034 | 0.131503522 |
Time (X2) | -0.133904863 | 0.131953162 | -1.014790864 | 0.312783101 | -0.395865011 | 0.128055284 | -0.395865011 |
Years (X3) | -0.257125411 | 0.294252939 | -0.873824446 | 0.384417446 | -0.841291353 | 0.327040531 | -0.841291353 |
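The summary statistics in this output all follow from the ANOVA sums of squares, and a short check can make that relationship visible. The sketch below recomputes R Square, Adjusted R Square, the F statistic, and the Standard Error from the numbers in the ANOVA table; the variable names are chosen for this example only.

```python
from math import sqrt

# Values copied from the ANOVA table above.
n = 100                      # Observations
k = 4                        # Regression df (number of predictors)
ss_regression = 1066.600238
ss_residual = 1150.839762
ss_total = 2217.44

r_squared = ss_regression / ss_total                      # share of variation explained
multiple_r = sqrt(r_squared)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
ms_regression = ss_regression / k                         # mean square, regression
ms_residual = ss_residual / (n - k - 1)                   # mean square, residual
f_stat = ms_regression / ms_residual                      # global F statistic
std_error = sqrt(ms_residual)                             # standard error of the estimate

print(f"Multiple R        = {multiple_r:.6f}")
print(f"R Square          = {r_squared:.6f}")
print(f"Adjusted R Square = {adj_r_squared:.6f}")
print(f"F                 = {f_stat:.4f}")
print(f"Standard Error    = {std_error:.6f}")
```

Each printed value should agree with the corresponding entry in the output above to the displayed precision.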
Significance F = 7.04189E-13, which is far below 0.05, so the model as a whole is significant.
A variable is significant if its p-value is below 0.05.
Here only Calls (X1) has a p-value (3.06065E-13) below 0.05.
So we keep X1 and remove X2, X3, and Type.
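The keep-or-drop rule used in this answer can be sketched in a few lines. The coefficients, standard errors, and p-values below are copied from the coefficients table; the dictionary layout and the names `keep` and `drop` are just for this example. The t statistic is recomputed as coefficient divided by standard error as a cross-check against the t Stat column.

```python
ALPHA = 0.05  # significance level used in the answer

# (coefficient, standard error, p-value) copied from the coefficients table.
predictors = {
    "TYPE": (-0.085965801, 0.463839992, 0.853361379),
    "Calls (X1)": (0.171766931, 0.020281269, 3.06065e-13),
    "Time (X2)": (-0.133904863, 0.131953162, 0.312783101),
    "Years (X3)": (-0.257125411, 0.294252939, 0.384417446),
}

keep, drop = [], []
t_stats = {}
for name, (coef, std_err, p_value) in predictors.items():
    t_stats[name] = coef / std_err  # should match the t Stat column
    if p_value < ALPHA:
        keep.append(name)
    else:
        drop.append(name)
    print(f"{name:11s} t = {t_stats[name]:8.4f}  p = {p_value:.3g}")

print("keep:", keep)
print("drop:", drop)
```

The next step the question asks for would be to re-run the regression with only the variables in `keep`.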