Question

Problem 1. (Bootstrap tests for goodness-of-fit) We saw in lecture that when it comes to goodness-of-fit (GOF) testing, it is quite “natural” to obtain a p-value by permutation. It is also possible, however, to use the bootstrap for that purpose. Consider the two-sample situation for simplicity, although this generalizes to any number of samples. Thus assume a situation where we observe X_1, ..., X_m iid from F and (independently) Y_1, ..., Y_n iid from G, where F and G are two distributions on the real line. We want to test F = G versus F ≠ G. We may want to use a statistic T = T(X_1, ..., X_m, Y_1, ..., Y_n) for that purpose, and the question is how to obtain a p-value for T via a bootstrap. The idea is, as usual, to estimate the “best” null distribution and bootstrap from that distribution. A natural approach to estimate the null distribution is to simply combine the two samples as one, and estimate the corresponding distribution via the empirical distribution. We thus use the empirical distribution from the combined sample to bootstrap from.

A. Write a function bootGOFdiff(x, y, B = 2000) that takes in two samples as vectors x and y, and a number of replicates B (Monte Carlo samples from the estimated null distribution), and returns the bootstrap GOF p-value for the difference in means T = |X̄ − Ȳ|.

Homework Answers

Answer #1

Here I write the function for calculating the bootstrap p-value and then apply it to two samples of size 10 from the standard normal distribution. One can start with any two samples of any sizes. Under the null hypothesis F = G, both bootstrap samples are drawn with replacement from the combined sample, keeping the original sizes m and n. The p-value is the proportion of bootstrap statistics at least as large as the observed statistic, so no critical region has to be chosen by hand.

##.. function for bootstrap

bootGOFdiff=function(x,y,B=2000)
{
   m=length(x)
   n=length(y)
   z=c(x,y) ##..combined sample, estimates the common null distribution
   T.obs=abs(mean(x)-mean(y)) ##..value of the test statistic for the original sample
   ##..(not named T, so that the shorthand T for TRUE is not masked)
   T.boot=numeric(B)
   for(i in 1:B)
   {
      x.boot=sample(z,size=m,replace=TRUE)
      y.boot=sample(z,size=n,replace=TRUE)
      T.boot[i]=abs(mean(x.boot)-mean(y.boot))
   }
   ##..the +1 counts the observed statistic itself, so the p-value is never 0
   p=(sum(T.boot>=T.obs)+1)/(B+1)
   return(p)
}

##.. example

x=rnorm(10) ##..sample X
y=rnorm(10) ##..sample Y
B=2000 ##..no. of bootstrap iterations

bootGOFdiff(x,y,B) ##.. required P-value
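
For comparison, the permutation p-value that the question calls “natural” can be sketched in the same way. This is only an illustration: the name permGOFdiff is hypothetical and not part of the assignment, and the seed and the shifted mean are arbitrary choices. The one difference from the bootstrap is that the combined sample is split without replacement, so every permuted pair of samples is a partition of the original data.

```r
## Hypothetical helper (the name is not from the assignment): Monte Carlo
## permutation p-value for T = |mean(x) - mean(y)| under H0: F = G
permGOFdiff = function(x, y, B = 2000)
{
  m = length(x)
  z = c(x, y)                       ## combined sample
  N = length(z)
  T.obs = abs(mean(x) - mean(y))    ## observed test statistic
  T.perm = numeric(B)
  for (b in 1:B)
  {
    idx = sample(N, size = m)       ## positions relabelled as "x", drawn without replacement
    T.perm[b] = abs(mean(z[idx]) - mean(z[-idx]))
  }
  ## the +1 counts the observed statistic itself, so the p-value is never 0
  (sum(T.perm >= T.obs) + 1) / (B + 1)
}

set.seed(1)                         ## arbitrary seed, for reproducibility only
x = rnorm(10)
y = rnorm(10, mean = 3)             ## clearly shifted sample (assumed for illustration)
permGOFdiff(x, y)                   ## should be small
permGOFdiff(x, x)                   ## identical samples: T.obs = 0, so the p-value is 1
```

On samples like these, the permutation and bootstrap p-values would be expected to be close; they differ only in whether the pooled data are resampled without or with replacement.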
