Question

1) When we fit a model to data, which is typically larger?

a) Test error

b) Training error

2) What are reasons why test error could be LESS than training error? (Pick all that apply)

a) By chance, the test set has easier cases than the training set.

b) The model is highly complex, so training error systematically overestimates test error

c) The model is not very complex, so training error systematically overestimates test error

3) Suppose we want to use cross-validation to estimate the error of the following procedure:

Step 1: Find the k variables most correlated with y

Step 2: Fit a linear regression using those variables as predictors

We will estimate the error for each k from 1 to p, and then choose the best k.

True or false: a correct cross-validation procedure will possibly choose a different set of k variables for every fold.
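A minimal Python sketch (NumPy only) of the two-step procedure above, with Step 1 redone inside each training fold, which is what question 4 below describes as the right way. The function names, the fold count, and the simulated noise data are illustrative assumptions and not part of the original problem.

```python
import numpy as np

def top_k_by_correlation(X, y, k):
    """Indices of the k columns of X with the largest |correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.sort(np.argsort(-np.abs(corr))[:k])

def cv_error(X, y, k, n_folds=5, seed=0):
    """Cross-validated MSE of 'screen k variables, then fit least squares'."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    chosen, fold_mse = [], []
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        # Step 1: screening sees the training fold only
        cols = top_k_by_correlation(X[train], y[train], k)
        # Step 2: least squares with an intercept on the screened columns
        A = np.column_stack([np.ones(len(train)), X[train][:, cols]])
        beta, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        B = np.column_stack([np.ones(len(held_out)), X[held_out][:, cols]])
        fold_mse.append(np.mean((y[held_out] - B @ beta) ** 2))
        chosen.append(cols.tolist())
    return float(np.mean(fold_mse)), chosen

# Simulated data (assumption): 60 observations, 20 predictors, y unrelated to X
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(60, 20))
y = data_rng.normal(size=60)

mse, chosen = cv_error(X, y, k=3)
print("CV estimate of MSE:", mse)
for i, cols in enumerate(chosen):
    print(f"fold {i}: screened variables {cols}")
```

Because the screening runs on a different training subset each time, nothing forces the printed variable sets to agree across folds.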

4) Suppose that we perform forward stepwise regression and use cross-validation to choose the best model size.

Using the full data set to choose the sequence of models is the WRONG way to do cross-validation (we need to redo the model selection step within each training fold). If we do cross-validation the WRONG way, which of the following is true?

a) The selected model will probably be too complex

b) The selected model will probably be too simple

5) One way of carrying out the bootstrap is to average equally over all possible bootstrap samples from the original data set (where two bootstrap data sets are different if they have the same data points but in different order). Unlike the usual implementation of the bootstrap, this method has the advantage of not introducing extra noise due to resampling randomly. (You can use "^" to denote power, as in "n^2")

To carry out this implementation on a data set with n data points, how many bootstrap data sets would we need to average over?

6) If we have n data points, what is the probability that a given data point does not appear in a bootstrap sample?
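For a small data set, the quantities in questions 5 and 6 can be checked by brute force. The sketch below assumes n = 4 (an arbitrary illustrative choice), enumerates every ordered sample of size n drawn with replacement, and counts how many of them omit one particular point. It is a sanity check on the counting argument, not a practical way to run the bootstrap.

```python
from itertools import product

n = 4  # assumed small size so that full enumeration is feasible

# Every ordered bootstrap sample: n draws with replacement from n points
samples = list(product(range(n), repeat=n))
total = len(samples)

# Samples in which data point 0 never appears
missing_first = sum(1 for s in samples if 0 not in s)
prob_missing = missing_first / total

print("number of ordered bootstrap samples:", total)
print("n^n:", n ** n)
print("fraction of samples omitting point 0:", prob_missing)
print("(1 - 1/n)^n:", (1 - 1 / n) ** n)
```

The enumeration matches n^n ordered samples, and the omitted fraction matches (1 - 1/n)^n, since each of the n independent draws misses the given point with probability 1 - 1/n.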

7) If we use ten-fold cross-validation as a means of model selection, the cross-validation estimate of test error is:

a) biased upward

b) biased downward

c) unbiased

d) potentially any of the above

8) Why can't we use the standard bootstrap for some time series data? (Pick all that applies)

a) The data points in most time series aren't i.i.d.

b) Some points will be used twice in the same sample

c) The standard bootstrap doesn't accurately mimic the real-world data-generating mechanism

Homework Answers

Answer #1

The answer to question 7 is (d).

If we use ten-fold cross-validation as a means of model selection, the cross-validation estimate of test error is potentially biased upward, downward or unbiased.

There are competing biases: on one hand, the cross-validated estimate is based on models trained on smaller training sets than the full model, which means we will tend to overestimate test error for the full model.

On the other hand, cross-validation gives a noisy estimate of test error for each candidate model, and we select the model with the best estimate. This means we are more likely to choose a model whose estimate is smaller than its true test error rate, hence, we may underestimate test error. In any given case, either source of bias may dominate the other.
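The second (downward) bias can be illustrated with a toy simulation. The sketch below assumes that every candidate model has the same true test error and that each cross-validation estimate is that error plus independent Gaussian noise. These values are assumptions chosen for illustration, and the upward bias from smaller training sets is not modeled here.

```python
import numpy as np

rng = np.random.default_rng(0)

true_error = 1.0   # assumed: identical true test error for every candidate
noise_sd = 0.1     # assumed noise level of each CV estimate
n_models = 20      # number of candidate models compared
n_reps = 2000      # number of simulated model-selection runs

# Each row is one run: a noisy CV estimate for each candidate model
estimates = true_error + noise_sd * rng.normal(size=(n_reps, n_models))

# Model selection keeps the candidate with the smallest estimate
selected = estimates.min(axis=1)

mean_single = estimates[:, 0].mean()   # estimate for one fixed model
mean_selected = selected.mean()        # estimate reported after selection

print("true test error:           ", true_error)
print("mean estimate, fixed model:", mean_single)
print("mean estimate, selected:   ", mean_selected)
```

The estimate for a single fixed model averages out close to the true error, while the estimate reported for the selected model sits below it, because taking the minimum favors candidates whose noise happened to be negative.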
