Old Faithful is a geyser located in Yellowstone National Park in Wyoming. It was given the name “Old Faithful” by the Washburn-Langford-Doane expedition, whose members noted that the geyser seemed to erupt at regular intervals. In this problem, we will use data collected by park rangers to analyze whether the eruptions are as regular as originally thought. If you feel so inclined, you can watch Old Faithful in real time at the National Park Service website. The dataset is available on Canvas and has 2 columns and 272 rows. Each row is one observation: the first column gives the length of the eruption (in minutes), while the second column gives the waiting time (in minutes) since the previous eruption.
1. Use the R function read.table to import the dataset. If you don’t know how it works, run help(read.table) to check its usage.
2. In R, make a boxplot of all 272 waiting times. Comment on what the boxplot shows you.
3. In general, boxplots cannot be used to determine the shape of a distribution. Describe one scenario in which a boxplot may not show a feature of the distribution (Hint: Do boxplots show multiple modes?).
4. In R, write and execute the code to make a histogram of all 272 waiting times. Set the number of breaks to be 20. Comment on what you see in the histogram. Does the histogram reveal information about the distribution of waiting times you didn’t notice in the boxplot? Include the histogram in your submission.
5. To explain the differences in waiting times, it may be interesting to look at how long each eruption lasts. The following R code creates a new categorical variable, eruption.length, labeling each eruption as long or short:
       eruption.length = ifelse(eruptions > 3.5, "Long", "Short")
   Using this new variable, create side-by-side boxplots of waiting times split by their eruption length category. Discuss similarities and differences between the waiting times for long and short eruptions.
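As a rough sketch of how the plotting steps in parts 1, 2, 4, and 5 might fit together, the R code below uses R's built-in faithful data frame as a stand-in for the Canvas file. That substitution is an assumption: faithful also has 272 rows and two columns (eruptions and waiting), but the Canvas file's name, header, and column names are not given here, so the read.table call is shown only as a commented-out placeholder with a hypothetical file name. The variable name geyser and the plot titles are likewise illustrative.

```r
# Part 1 (assumption): the Canvas file is not available here, so the built-in
# `faithful` data frame stands in for it. With the real file, the import might
# look like the line below; "faithful.txt" is a hypothetical file name.
#   geyser <- read.table("faithful.txt", header = TRUE)
geyser <- faithful

# Part 2: a single boxplot of all waiting times.
boxplot(geyser$waiting,
        ylab = "Waiting time (minutes)",
        main = "Waiting times between eruptions")

# Part 4: histogram of waiting times, requesting 20 breaks.
hist(geyser$waiting,
     breaks = 20,
     xlab = "Waiting time (minutes)",
     main = "Histogram of waiting times")

# Part 5: label each eruption as long or short, then draw side-by-side
# boxplots of waiting time for the two groups.
eruption.length <- ifelse(geyser$eruptions > 3.5, "Long", "Short")
boxplot(geyser$waiting ~ eruption.length,
        xlab = "Eruption length category",
        ylab = "Waiting time (minutes)",
        main = "Waiting times by eruption length")
```

Note that hist treats a single number passed as breaks as a suggestion and picks "pretty" cut points, so the plot may not show exactly 20 bins. The column names eruptions and waiting belong to the built-in data; if the Canvas file uses different names, the $ references would need to change accordingly.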