Using R (include all code and packages used):
(a) Generate 25 variables, each of which consists of 25 random samples from a standard normal. Store these variables in a data frame (call it df.train) and randomly select one variable to be the response (rename it y). The end result should be a data frame with 25 observations on 25 variables but with no relationships between any of the variables.
(b) Repeat step (a) to create a test set called df.test.
(c) Write a loop that will successively linearly regress y on one additional predictor each time through. That is, the first time through the loop you should build a linear model with only one predictor (the first one in your data frame). The ith time through the loop, you should build a linear model where y is regressed on the first i predictors. Record the training and test error each time so that at the end of the procedure you have two vectors (call them MSE.train and MSE.test) that contain the MSEs from each model.
Solution:
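The two error measures asked for in the problem can be written out as follows. The notation is an assumption introduced here, not part of the question: n = 25 observations, x_{jk} is the kth predictor for observation j, and the coefficients \hat\beta^{(i)} come from the model with the first i predictors, always estimated on the training set.

```latex
\begin{align*}
\mathrm{MSE.train}[i] &= \frac{1}{n}\sum_{j=1}^{n}
  \Bigl(y_j^{\text{train}} - \hat\beta_0^{(i)} - \sum_{k=1}^{i}\hat\beta_k^{(i)}\,x_{jk}^{\text{train}}\Bigr)^{2},\\
\mathrm{MSE.test}[i] &= \frac{1}{n}\sum_{j=1}^{n}
  \Bigl(y_j^{\text{test}} - \hat\beta_0^{(i)} - \sum_{k=1}^{i}\hat\beta_k^{(i)}\,x_{jk}^{\text{test}}\Bigr)^{2}.
\end{align*}
```

The only difference between the two lines is the data being predicted; the coefficients are the same in both.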
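# Base R only; no extra packages are needed
set.seed(1)  # optional, only so the run is reproducible
n <- 25      # observations per variable
p <- 25      # variables, including the response

# Build a data frame of p independent N(0,1) columns and rename column `resp` to y
make.data <- function(resp) {
  df <- data.frame(replicate(p, rnorm(n)))
  names(df)[resp] <- "y"
  df
}

# (a) and (b): training and test sets
resp <- sample(p, 1)         # randomly chosen response column
df.train <- make.data(resp)
df.test <- make.data(resp)   # same response position so predictor names match df.train

# (c): regress y on the first i predictors, i = 1, ..., 24
preds <- setdiff(names(df.train), "y")  # y itself must not appear as a predictor
m <- length(preds)
MSE.train <- numeric(m)
MSE.test <- numeric(m)
for (i in 1:m) {
  # the model is always fitted on the training set
  fit <- lm(reformulate(preds[1:i], response = "y"), data = df.train)
  MSE.train[i] <- mean(residuals(fit)^2)
  # test error: predict the test responses with the training-set fit
  MSE.test[i] <- mean((df.test$y - predict(fit, newdata = df.test))^2)
}

# show both curves side by side
par(mfrow = c(1, 2))
plot(1:m, MSE.train, type = "b", xlab = "Number of predictors", ylab = "Training MSE")
plot(1:m, MSE.test, type = "b", xlab = "Number of predictors", ylab = "Test MSE")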
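A quick sanity check on what the training-error vector should look like: for nested least-squares fits, adding a predictor can never increase the training MSE, and with an intercept plus 24 predictors there are 25 coefficients for 25 observations, so the last model should fit the training data essentially exactly. The sketch below is self-contained and uses its own simulated data; the names dat, train.mse, is.monotone and is.saturated are illustrative, not part of the solution above.

```r
# Illustrative check, assuming 24 noise predictors and an unrelated response
set.seed(42)                        # arbitrary seed, for reproducibility only
n <- 25
X <- matrix(rnorm(n * 24), n, 24)   # 24 pure-noise predictors
y <- rnorm(n)                       # response with no relationship to X
dat <- data.frame(y = y, X)         # columns: y, X1, ..., X24

# training MSE of the model that uses the first i predictors
train.mse <- sapply(1:24, function(i) {
  fit <- lm(y ~ ., data = dat[, 1:(i + 1), drop = FALSE])
  mean(residuals(fit)^2)
})

# nested least-squares fits: training MSE should never go up (small numerical slack)
is.monotone <- all(diff(train.mse) <= 1e-12)
# 25 coefficients for 25 observations: the last fit should interpolate the data
is.saturated <- train.mse[24] < 1e-10

print(is.monotone)
print(is.saturated)
```

If the same checks fail on MSE.train, the usual causes are the response being included among the predictors or the model being refitted on a different data set each time.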