## (3.32) Case Study. Refer to the **Prostate cancer** dataset in Appendix C.5
Build a regression model to predict PSA level $(Y)$ as a function
of cancer volume $(X)$. The analysis should include an assessment
of the degree to which the key regression assumptions are
satisfied. If the regression assumptions are not met, include and
justify appropriate remedial measures. Use the final model to
estimate mean PSA level for a patient whose cancer volume is $20$
cc. Assess the strengths and weaknesses of the final model.
Dataset can be found here:
http://users.stat.ufl.edu/~rrandles/sta4210/Rclassnotes/data/textdatasets/KutnerData/Appendix%20C%20Data%20Sets/APPENC05.txt
Cancer Volume is column 3 and PSA level is column 2
You may use:
df <- read.delim("APPENC05.txt", header = FALSE, sep = "")
df <- cbind(df[2],df[3])
The response variable ($y$) is PSA level and the predictor (independent) variable ($x$) is cancer volume,
so the regression model is
$y = \beta_0 + \beta_1 x$
Both variables are continuous, so we use a simple linear regression model.
We check the regression assumptions by examining the residual plots, in particular whether the residuals follow the normal probability (Q-Q) plot.
The R code is as follows.
Copy the data from the given website into Excel, add a header row naming the columns, then copy the cells from Excel and run the code as follows:
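# Read the copied cells from the clipboard (works on Windows); the pasted data
# must include a header row with the names psa_level and cancer_volume
data <- read.table("clipboard", header = TRUE)
head(data)
# Fit the simple linear regression of PSA level on cancer volume
Model <- lm(psa_level ~ cancer_volume, data = data)
# Coefficient estimates, standard errors and p-values
summary(Model)
# Diagnostic plots (residuals vs fitted, normal Q-Q, and others)
plot(Model)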
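As an alternative to the copy-and-paste route, the file can be read directly. The sketch below is a minimal example, assuming `APPENC05.txt` has been downloaded to the working directory and has no header row. `load_prostate` is a hypothetical helper name, and the column positions (2 = PSA level, 3 = cancer volume) are taken from the question.

```r
# Hypothetical helper: read the whitespace-delimited file and keep only the
# two columns needed, selected by position as stated in the question.
load_prostate <- function(path) {
  raw <- read.table(path, header = FALSE)
  data.frame(psa_level = raw[[2]], cancer_volume = raw[[3]])
}

# Runs only if the file is present in the working directory.
if (file.exists("APPENC05.txt")) {
  data <- load_prostate("APPENC05.txt")
  str(data)
  Model <- lm(psa_level ~ cancer_volume, data = data)
  print(summary(Model))
}
```

Selecting columns by position rules out mislabeling them while pasting, which is worth checking whenever a fitted slope comes out very close to zero.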
[Figure: diagnostic plots produced by plot(Model)]
Based on these plots, the residuals are judged to satisfy the normality assumption.
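The visual check can be supplemented with formal tests. The following is a rough sketch, not the method used in the answer: `check_assumptions` is a hypothetical helper that runs a Shapiro-Wilk test on the residuals and, as an informal check of constant variance, a Spearman correlation between the absolute residuals and the fitted values. The toy data are simulated only to make the example runnable and are not the prostate data.

```r
# Hypothetical helper: numeric checks to accompany the diagnostic plots.
check_assumptions <- function(fit) {
  res <- residuals(fit)
  # Shapiro-Wilk test: a small p-value is evidence against normal errors.
  normality <- shapiro.test(res)
  # Informal constant-variance check: a monotone trend in |residuals| versus
  # fitted values suggests the error variance changes with the mean.
  spread <- cor.test(abs(res), fitted(fit), method = "spearman", exact = FALSE)
  c(shapiro_p = unname(normality$p.value), spread_p = unname(spread$p.value))
}

# Simulated stand-in data (NOT the prostate data), using the same column names.
set.seed(42)
toy <- data.frame(cancer_volume = runif(60, 0, 40))
toy$psa_level <- 5 + 2 * toy$cancer_volume + rnorm(60, sd = 4)
toy_fit <- lm(psa_level ~ cancer_volume, data = toy)
print(check_assumptions(toy_fit))

# With the real fit: check_assumptions(Model)
```

If either p-value is small, transforming the response (for example, modeling the log of PSA level) is a common remedial measure to try before refitting and rechecking the plots.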
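The summary of the model gives:
intercept ($\hat{\beta}_0$) = 6.9586219, slope ($\hat{\beta}_1$) = 0.0008806
The p-value for the slope is 0.9604.
Decision rule: if the p-value is greater than the 0.05 level of significance, we fail to reject the null hypothesis $H_0: \beta_1 = 0$.
Here the p-value = 0.9604 > 0.05, so the slope is not statistically significant; this fit gives no evidence of a linear relationship between PSA level and cancer volume.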
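The decision rule can also be applied programmatically rather than read off the printed summary. This is a minimal sketch: `slope_test` is a hypothetical helper, the default term name `cancer_volume` assumes the column naming used in the code above, and the toy data are simulated stand-ins rather than the prostate data.

```r
# Hypothetical helper: pull the slope estimate and its t-test p-value out of
# the coefficient table and apply the decision rule at level alpha.
slope_test <- function(fit, term = "cancer_volume", alpha = 0.05) {
  coefs <- coef(summary(fit))
  p <- coefs[term, "Pr(>|t|)"]
  list(estimate = coefs[term, "Estimate"],
       p_value = p,
       reject_h0 = p < alpha)  # TRUE means reject H0: slope = 0
}

# Simulated stand-in data (NOT the prostate data) with a clear linear trend.
set.seed(1)
toy <- data.frame(cancer_volume = runif(50, 0, 40))
toy$psa_level <- 2 + 3 * toy$cancer_volume + rnorm(50, sd = 5)
res <- slope_test(lm(psa_level ~ cancer_volume, data = toy))
print(res)

# With the real fit: slope_test(Model)
```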
To estimate the mean PSA level for a patient whose cancer volume is $20$ cc,
put $x = 20$ and the estimates of $\beta_0$ and $\beta_1$ into the model:
$y = \beta_0 + \beta_1 x$
$\hat{y} = 6.9586219 + 0.0008806 \times 20$
$\hat{y} = 6.976234$
So 6.976234 is the estimated mean PSA level for a patient whose cancer volume is $20$ cc.
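The arithmetic can be checked in R, and the same estimate can be obtained from the fitted object together with a confidence interval for the mean response. This is a sketch under the assumption that the model was fit with the formula `psa_level ~ cancer_volume` as above; `mean_psa_at` is a hypothetical helper name.

```r
# Coefficients as reported in the model summary.
b0 <- 6.9586219
b1 <- 0.0008806
y_hat <- b0 + b1 * 20
print(y_hat)

# Hypothetical helper: point estimate from a fitted lm object plus a
# confidence interval for the mean response at the given cancer volume.
mean_psa_at <- function(fit, volume, level = 0.95) {
  new_x <- data.frame(cancer_volume = volume)
  predict(fit, newdata = new_x, interval = "confidence", level = level)
}

# With the fit from above: mean_psa_at(Model, 20)
```

Reporting the interval alongside the point estimate shows how precisely the mean is estimated at $x = 20$, which helps when assessing the strengths and weaknesses of the final model.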