Question


Use R to do each of the following. Use R code instructions that are as general as possible, and also as efficient as possible. Use the Quick-R website for help on finding commands.

1. Enter the following values into a data vector named Dat: 45.4 44.2 36.8 35.1 39.0 60.0 47.4 41.1 45.8 35.6
2. Calculate the difference between the 2nd and 7th entries of this vector using only reference indices.
3. Calculate the median of Dat.
4. Sort the values in this data vector from highest to lowest, and save the sorted version as a vector named sortDat.
5. What does sortDat[-4] do?
6. What does Dat[-c(2,7,9)] do? Why do you think the c() function is necessary here?
7. Suppose a random variable X has a normal distribution with a mean of µ = 60 and a standard deviation of σ = 4. Find the following using R functions:
   a. The probability that X is less than or equal to 66.
   b. The probability that X is between 50 and 60.
   c. The probability that X is greater than 68.
8. Use the Dat data vector below to answer the following questions: 45.4 44.2 36.8 35.1 39.0 60.0 47.4 41.1 45.8 35.6
   a. Use logical referencing to calculate the standard deviation of only those values that are less than 45.
   b. Write an R command that will determine how many of the vector entries are less than 45.
   c. Write an R command that will determine how many vector entries are greater than 40 but less than 55 (i.e. how many entries are between 40 and 55).
   d. Write an R command that will calculate what proportion of the data vector has values exceeding 40.
9. This exercise uses an existing R data frame called cigsales that contains state-level data regarding cigarette sales and other variables.
   a. The variable black indicates the percentage of a given state that is African-American. Using logical referencing, select from this data frame only those states that have over a 15% African-American population. What states get selected?
   b. Extract the variable price from this data frame, and place it into a vector of its own called price.vec.
   c. Use logical referencing to create two separate vectors

Homework Answers

Answer #1

Each R command is shown first, followed by its output (the lines beginning with [1]).

1.

Dat = c(45.4, 44.2, 36.8, 35.1, 39.0, 60.0, 47.4, 41.1, 45.8, 35.6)

2.

Dat[2] - Dat[7]
[1] -3.2

3.

median(Dat)
[1] 42.65

4.

sortDat = sort(Dat, decreasing = TRUE)

5.

sortDat[-4]
[1] 60.0 47.4 45.8 44.2 41.1 39.0 36.8 35.6 35.1

It returns sortDat without its fourth element (45.4). The sortDat vector itself is not changed unless the result is assigned back to it.
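To illustrate that negative indexing returns a shortened copy rather than editing the vector in place, here is a minimal sketch. The name `dropped` is introduced only for this illustration and is not part of the original exercise.

```r
# Rebuild the vectors from questions 1 and 4
Dat <- c(45.4, 44.2, 36.8, 35.1, 39.0, 60.0, 47.4, 41.1, 45.8, 35.6)
sortDat <- sort(Dat, decreasing = TRUE)

# A negative index returns a copy with that position left out
dropped <- sortDat[-4]   # 'dropped' is a hypothetical name for this sketch

length(sortDat)   # the original vector still has all 10 values
length(dropped)   # the copy has 9 values
sortDat[4]        # the value that was left out of the copy
```

To make the removal permanent, the result would have to be assigned back, for example with `sortDat <- sortDat[-4]`.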

6.

Dat[-c(2,7,9)]
[1] 45.4 36.8 35.1 39.0 60.0 41.1 35.6

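It returns Dat without its second, seventh, and ninth elements (44.2, 47.4, and 45.8).

The c() function is necessary here because more than one index is being dropped. The indices must first be combined into a single vector, and the minus sign is then applied to that whole vector. Writing Dat[-2, -7, -9] instead would be read as indexing along three dimensions, which fails for a one-dimensional vector.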
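The role of c() can be checked directly. The sketch below stores the indices in a variable first, which also keeps the command general, and then captures the error that the comma-separated form produces. The names `drop_idx`, `kept`, and `msg` are introduced only for this illustration.

```r
Dat <- c(45.4, 44.2, 36.8, 35.1, 39.0, 60.0, 47.4, 41.1, 45.8, 35.6)

# Combine the positions to drop into one index vector, then negate it
drop_idx <- c(2, 7, 9)        # hypothetical helper name
kept <- Dat[-drop_idx]
kept

# Without c(), the commas are treated as separating dimensions,
# so R raises an error; tryCatch() captures the message instead of stopping
msg <- tryCatch(Dat[-2, -7, -9], error = function(e) conditionMessage(e))
msg
```

Keeping the indices in their own vector means the same command works unchanged if the positions to drop are later changed.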

7.

a. The probability that X is less than or equal to 66.

pnorm(66, mean = 60, sd = 4)
[1] 0.9331928

b. The probability that X is between 50 and 60.

pnorm(60, mean = 60, sd = 4) - pnorm(50, mean = 60, sd = 4)
[1] 0.4937903
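Since the question asks for commands that are as general as possible, one alternative is to pass both bounds to pnorm() in a single vectorized call and take the difference with diff(). This is a sketch of an equivalent calculation, not the only way to do it; the names `mu`, `sigma`, `bounds`, and `p_between` are assumptions made for this example.

```r
# Parameters of the normal distribution from question 7
mu <- 60
sigma <- 4

# Lower and upper bound of the interval, in increasing order
bounds <- c(50, 60)

# pnorm() is vectorized, so it returns P(X <= 50) and P(X <= 60) together;
# diff() subtracts the first from the second
p_between <- diff(pnorm(bounds, mean = mu, sd = sigma))
p_between
```

Changing the interval or the distribution then only requires editing the values at the top.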

c. The probability that X is greater than 68.

pnorm(68, mean = 60, sd = 4, lower.tail = FALSE)
[1] 0.02275013
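The answer above stops at question 7. For question 8, a minimal sketch using logical referencing could look like the following. It is offered as one possible approach rather than the original poster's solution, and the variable names (`below45`, `sd_below45`, `n_below45`, `n_between`, `prop_above40`) are introduced only for this example.

```r
Dat <- c(45.4, 44.2, 36.8, 35.1, 39.0, 60.0, 47.4, 41.1, 45.8, 35.6)

# 8a. A logical vector selects only the entries where the condition is TRUE
below45 <- Dat < 45
sd_below45 <- sd(Dat[below45])
sd_below45

# 8b. TRUE counts as 1 and FALSE as 0, so sum() counts the matches
n_below45 <- sum(Dat < 45)
n_below45

# 8c. Combine two conditions element-wise with &
n_between <- sum(Dat > 40 & Dat < 55)
n_between

# 8d. The mean of a logical vector is the proportion of TRUE values
prop_above40 <- mean(Dat > 40)
prop_above40
```

Question 9 depends on the cigsales data frame, which is not included here, so it is left unanswered.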

Similar Questions
8. The airquality data set has 153 rows, one for each day in May through September of 1973. One of the variables is named "Wind", for wind speed. We will calculate some values of the distribution of Wind using R. Suppose we are interested in the proportion of days for which the wind speed was greater than 12. Attach the airquality data frame to your R workspace with the command > attach(airquality) This allows you to address the variable Wind...
The built-in R dataset swiss gives Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888. The dataset is a data frame containing 6 columns (variables). The column Infant.Mortality represents the average number of live births who live less than 1 year over a 3-year period. We are interested in the Infant.Mortality column. We can convert the data in this column to an ordinary vector x by making the assignment x <- swiss$Infant.Mortality....
Use the Frequencies option in SPSS to answer each of the questions based on the following scenario. Scenario A superintendent of a school district was requested to present to the school board demographic data based on the schools within the district. One item that the superintendent had to present was the percent of students who were eligible for free or reduced-price lunch (a commonly used proxy for socio-economic status). The following percentages were reported for the 40 schools in the...
A gardener plants 300 sunflower seeds (of a brand called KwikGrow) and, after 2 weeks, measures the seedlings’ heights (in mm). These heights are recorded in the file Sunflower.csv. He is interested in testing whether the mean height of sunflowers grown from KwikGrow seeds is greater than 33 mm two weeks after planting. He decides to conduct a hypothesis test by assuming that the sampling distribution of the sample mean has a normal distribution. For the purposes of this question,...