Question

Part II

We will use the “Twins Data” tab in the workbook.

1) Single Variable

a) Create a Scatterplot of “Wins” and “Runs” (You might need to rescale the axis for each)

b) Run a Regression with “Wins” as y and “Runs” as x

c) What is your model? Slope t-value? F-Value? R squared?

2) Multivariable

a) Traditional Stats

Run a regression with “Wins” as the y variable and both “Batting Average” and “ERA” as the two x variables

What is your model? Slope t-values? F-Value? R squared?

b) Moneyball Stats

Run a regression with “Wins” as the y variable and “OPS” and “WHIP” as the x variables

What is your model? Slope t-value? F-Value? R squared?

3) Of the 3 options which model do you feel works the best? Explain.

Column groups: Base Data (Year, Wins, Runs) | Traditional (Batting Average, ERA) | Sabermetric (OPS, WHIP)

Year | Wins | Runs | Batting Average | ERA | OPS | WHIP
2000 | 69 | 748 | 0.270 | 5.140 | 0.744 | 1.501
2001 | 85 | 771 | 0.272 | 4.510 | 0.770 | 1.345
2002 | 94 | 768 | 0.272 | 4.120 | 0.769 | 1.310
2003 | 90 | 801 | 0.277 | 4.410 | 0.772 | 1.319
2004 | 92 | 780 | 0.266 | 4.030 | 0.763 | 1.324
2005 | 83 | 688 | 0.259 | 3.710 | 0.714 | 1.233
2006 | 96 | 801 | 0.287 | 3.950 | 0.771 | 1.283
2007 | 79 | 718 | 0.264 | 4.150 | 0.721 | 1.340
2008 | 88 | 829 | 0.279 | 4.160 | 0.748 | 1.353
2009 | 87 | 817 | 0.274 | 4.500 | 0.774 | 1.382
2010 | 94 | 781 | 0.273 | 3.950 | 0.762 | 1.291
2011 | 63 | 619 | 0.247 | 4.580 | 0.666 | 1.438
2012 | 66 | 701 | 0.260 | 4.770 | 0.715 | 1.391
2013 | 66 | 614 | 0.242 | 4.550 | 0.692 | 1.413
2014 | 70 | 715 | 0.254 | 4.570 | 0.713 | 1.391
2015 | 83 | 696 | 0.247 | 4.070 | 0.704 | 1.330
2016 | 59 | 722 | 0.251 | 5.080 | 0.738 | 1.453

Answer #1

1) Regression analysis with Wins as the response and Runs as the predictor:

The model is Wins = -23.889 + 0.1408*Runs, slope t-value = 4.153, F-value = 17.25, R-squared = 53.48%
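As a cross-check, the single-variable fit can be reproduced without a spreadsheet. The sketch below is a minimal plain-Python version of simple least squares, assuming the Runs and Wins columns are copied exactly from the table in the question; the function name `fit_simple` is illustrative and not part of the original answer.

```python
from math import sqrt

# Runs and Wins for 2000-2016, copied from the Twins data table.
runs = [748, 771, 768, 801, 780, 688, 801, 718, 829,
        817, 781, 619, 701, 614, 715, 696, 722]
wins = [69, 85, 94, 90, 92, 83, 96, 79, 88,
        87, 94, 63, 66, 66, 70, 83, 59]


def fit_simple(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    ss_res = syy - slope * sxy        # residual sum of squares
    mse = ss_res / (n - 2)            # residual variance, n - 2 degrees of freedom
    t_slope = slope / sqrt(mse / sxx)
    return {
        "intercept": intercept,
        "slope": slope,
        "t_slope": t_slope,
        "f": t_slope ** 2,            # with one predictor, F equals t squared
        "r2": 1 - ss_res / syy,
    }


fit = fit_simple(runs, wins)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

With a single predictor the F-value is the square of the slope t-value, which is why the two reported figures carry the same information here.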

2) a) Traditional Stats: regression analysis with Wins as the response and Batting Average and ERA as the predictors.

The model is Wins = 14.3375 + 549.75*Batting Average - 18.186*ERA, slope t-values = 6.131 (Batting Average), -6.315 (ERA), F-value = 54.37, R-squared = 88.59%

b) Moneyball Stats:

The model is Wins = 90.1849 + 190.828*OPS - 110.896*WHIP, slope t-values = 4.784 (OPS), -5.732 (WHIP), F-value = 41.97, R-squared = 85.71%
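Both two-predictor fits can be checked the same way. The sketch below solves the least-squares equations for exactly two predictors on mean-centered data, assuming the columns are copied from the table in the question; the helper name `fit_two_predictors` is hypothetical and chosen only for this illustration.

```python
from math import sqrt

# Columns for 2000-2016, copied from the Twins data table.
wins = [69, 85, 94, 90, 92, 83, 96, 79, 88,
        87, 94, 63, 66, 66, 70, 83, 59]
ba = [0.270, 0.272, 0.272, 0.277, 0.266, 0.259, 0.287, 0.264, 0.279,
      0.274, 0.273, 0.247, 0.260, 0.242, 0.254, 0.247, 0.251]
era = [5.14, 4.51, 4.12, 4.41, 4.03, 3.71, 3.95, 4.15, 4.16,
       4.50, 3.95, 4.58, 4.77, 4.55, 4.57, 4.07, 5.08]
ops = [0.744, 0.770, 0.769, 0.772, 0.763, 0.714, 0.771, 0.721, 0.748,
       0.774, 0.762, 0.666, 0.715, 0.692, 0.713, 0.704, 0.738]
whip = [1.501, 1.345, 1.310, 1.319, 1.324, 1.233, 1.283, 1.340, 1.353,
        1.382, 1.291, 1.438, 1.391, 1.413, 1.391, 1.330, 1.453]


def fit_two_predictors(y, x1, x2):
    """Ordinary least squares for y = b0 + b1*x1 + b2*x2."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    dy = [v - my for v in y]
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]

    # Centered sums of squares and cross-products.
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    syy = sum(a * a for a in dy)

    # Solve the 2x2 normal equations with Cramer's rule.
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2

    ss_reg = b1 * s1y + b2 * s2y
    mse = (syy - ss_reg) / (n - 3)    # three estimated coefficients
    return {
        "b0": b0, "b1": b1, "b2": b2,
        "t1": b1 / sqrt(mse * s22 / det),
        "t2": b2 / sqrt(mse * s11 / det),
        "f": (ss_reg / 2) / mse,
        "r2": ss_reg / syy,
    }


traditional = fit_two_predictors(wins, ba, era)
moneyball = fit_two_predictors(wins, ops, whip)
for label, fit in (("Traditional", traditional), ("Moneyball", moneyball)):
    print(label, {k: round(v, 4) for k, v in fit.items()})
```

Centering the data before solving keeps the arithmetic well conditioned, since Batting Average and OPS vary only in the second and third decimal places.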

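3) Based on R-squared, I think the Traditional model (Batting Average and ERA) is the most useful. It explains 88.59% of the variation in Wins, compared with 85.71% for the Moneyball model (OPS and WHIP) and 53.48% for the Runs-only model. The two multivariable models use the same number of predictors, so comparing their R-squared values directly is fair.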
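Because the Runs-only model has one predictor and the other two have two, adjusted R-squared gives a fairer three-way comparison. The sketch below is only an illustration: it applies the standard adjustment to the R-squared values reported in this answer, assuming n = 17 seasons (2000 through 2016); the dictionary labels are made up for the example.

```python
# Reported R-squared and number of predictors for each model in this answer.
n = 17  # seasons 2000-2016
models = {
    "Runs only": (0.5348, 1),
    "Traditional (BA, ERA)": (0.8859, 2),
    "Moneyball (OPS, WHIP)": (0.8571, 2),
}


def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a model with p predictors fit to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


adjusted = {name: adjusted_r2(r2, n, p) for name, (r2, p) in models.items()}
for name, value in adjusted.items():
    print(f"{name}: adjusted R-squared = {value:.4f}")

best = max(adjusted, key=adjusted.get)
print("Highest adjusted R-squared:", best)
```

The ranking is unchanged after the adjustment, so the conclusion does not depend on which of the two measures is used.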

