Question

R Language

Use the state.x77 data matrix and the tapply() function to obtain:

(1) The number of states in each region.

(2) The names of the states in each division.

(3) The median high school graduation rates for groups of states defined by combinations of the factors state.region and state.size.

Answer #1

(1) Call tapply() with state.name as the data vector, state.region as the grouping factor, and length as the function.

```
##     Northeast         South North Central          West
##             9            16            12            13
```
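The call behind part (1), together with a possible call for part (2), can be sketched as follows. It relies only on the built-in `state.name`, `state.region`, and `state.division` objects; the variable names `counts` and `by_division` are illustrative and not from the original answer.

```r
# Part (1): count the states in each region.
counts <- tapply(state.name, state.region, length)
print(counts)

# Part (2): list the state names in each division.
# The identity function returns each group's names unchanged, so the
# result is a list-like array with one character vector per division.
by_division <- tapply(state.name, state.division, function(x) x)
print(by_division)
```

Because the groups in part (2) have different lengths, tapply() cannot simplify the result to a plain vector and returns one element per division instead.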

(3) Pass the HS Grad column of state.x77 as the first argument, the list of factors state.region and state.size as the second argument, and median as the function.

```
##               Small Medium Large
## Northeast      55.9  56.00 51.45
## South          48.1  41.30 47.40
## North Central  53.3  54.50 52.90
## West           62.4  61.75 62.60
```
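A minimal sketch of the call for part (3). Note that `state.size` is not a built-in R object, so it has to be defined first; the definition below (cutting the Population column at 2,000 and 10,000 thousand) is an assumption and may differ from the one the question intends. The name `hs_median` is also illustrative.

```r
# ASSUMPTION: state.size is not built into R. Here it is created by
# cutting Population (in thousands) into three bins; the breakpoints
# are illustrative and may not match the course's definition.
state.size <- cut(state.x77[, "Population"],
                  breaks = c(0, 2000, 10000, Inf),
                  labels = c("Small", "Medium", "Large"))

# Median high school graduation rate for each region/size combination.
hs_median <- tapply(state.x77[, "HS Grad"],
                    list(state.region, state.size),
                    median)
print(hs_median)
```

Supplying a list of two factors makes tapply() return a two-way table, with regions as rows and size classes as columns.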

R Language:
Examine the R expression
pairs(iris[1:4], main="Anderson's Iris Data -- 3 species",
pch=20, col=unclass(iris$Species)+2)
Use a similar expression to produce a scatter plot matrix of the
variables mpg, disp, hp, drat, and qsec in the data frame mtcars.
Use different colors to identify cars belonging to each of the
categories defined by the carsize variable.
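One way to adapt the iris expression is sketched below. `carsize` is not a column of `mtcars`, so the sketch derives it from the weight column `wt`; that definition, its breakpoints, and the name `point_colors` are assumptions made for illustration.

```r
# ASSUMPTION: carsize is not part of mtcars. Here it is derived from
# weight (wt, in 1000 lbs) using illustrative breakpoints.
carsize <- cut(mtcars$wt,
               breaks = c(0, 2.5, 3.5, 5.5),
               labels = c("Compact", "Midsize", "Large"))

# Integer factor codes 1, 2, 3 shifted by 1 select palette colors 2, 3, 4.
point_colors <- as.integer(carsize) + 1

pairs(mtcars[, c("mpg", "disp", "hp", "drat", "qsec")],
      main = "mtcars -- colored by carsize",
      pch = 20, col = point_colors)
```

As in the iris example, the factor is converted to integer codes so that each category maps to a distinct color index.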

Suppose that the following frequency table contains data on the
number of absences in a school year due to illness or injury for
100 randomly selected high school students. Use the table to
determine the median number of absences for these 100 students.
Provide your answer with precision to one decimal place.
Number of absences 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Frequency 25, 10, 10, 5, 9, 12, 9, 9, 4, 2, 3, 2...
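A short sketch of how the median could be computed in R from the table: expand the frequencies back into the 100 individual observations and call median(). The variable names are illustrative.

```r
absences  <- 0:11
frequency <- c(25, 10, 10, 5, 9, 12, 9, 9, 4, 2, 3, 2)

# Rebuild the 100 individual observations, then take the median.
observations <- rep(absences, times = frequency)
median_absences <- median(observations)
print(median_absences)  # 3.5
```

With 100 observations the median is the average of the 50th and 51st ordered values. The cumulative frequencies reach 50 at 3 absences, so those two values are 3 and 4.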

Use the programming language R to code the following
project.
* Make sure you turn in your code and answers for
each question (not the raw data).
1. Generate 1000 random samples of size 40 from the normal
distribution with mean µ = 3 and
standard deviation σ = 2. Compute the 95% confidence interval for
each of the 1000 samples and find the proportion
of confidence intervals that contain the true mean. What did you learn
from this simulation study?
2. For each...
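The simulation in item 1 could be sketched as below. The sketch assumes a t-based interval (σ treated as unknown) and an arbitrary seed; neither choice is specified in the question.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility

n_sims <- 1000
n      <- 40
mu     <- 3
sigma  <- 2

# For each simulated sample, check whether its 95% t-interval covers mu.
covers <- replicate(n_sims, {
  x  <- rnorm(n, mean = mu, sd = sigma)
  ci <- t.test(x, conf.level = 0.95)$conf.int
  ci[1] <= mu && mu <= ci[2]
})

coverage <- mean(covers)  # proportion of intervals containing mu
print(coverage)
```

The observed proportion should land near 0.95, which illustrates what the confidence level means: it describes the long-run share of intervals that capture the true mean, not the probability for any single interval.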

(1 point) College Graduation
Rates. Data from the College Results Online
website compared the 2011 graduation rate and school size for 92
similar-sized public universities and colleges in the United
States. Statistical software was used to create the linear
regression model using size as the explanatory variable and
graduation rate as the response variable. Summary output from the
software and the scatter plot are shown below. Round all calculated
results to four decimal places.
Coefficients: Estimate, Std. Error, t value, Pr(>|t|)...

*The data frame `x77` contains data from each of the fifty United
States. First coerce the `state.x77` variable into a data frame
with:*
```{r, eval=FALSE}
x77 <- data.frame(state.x77)
```
*For the following, make a scatter plot with the regression
line:*
1. *The model of illiteracy rate (`Illiteracy`) modeled by high
school graduation rate (`HS.Grad`).*
2. *The model of life expectancy (`Life.Exp`) modeled by the
murder rate (`Murder`).*
3. *The model of income (`Income`) modeled by the illiteracy
rate (`Illiteracy`).*
*Write...
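A minimal sketch for the first model, assuming base R graphics; the axis labels and the name `fit` are illustrative. The other two models follow the same pattern with the formula changed.

```r
x77 <- data.frame(state.x77)  # column "HS Grad" becomes "HS.Grad"

# Model 1: illiteracy rate explained by high school graduation rate.
fit <- lm(Illiteracy ~ HS.Grad, data = x77)

plot(Illiteracy ~ HS.Grad, data = x77,
     xlab = "High school graduation rate (%)",
     ylab = "Illiteracy rate (%)")
abline(fit)  # overlay the fitted least-squares line
```

Passing the fitted `lm` object to abline() draws the regression line using the model's intercept and slope.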

QUESTION 1 · 1 POINT
The following frequency table summarizes a set of data. What is
the five-number summary?
Value 2, 3, 5, 6, 7, 8, 11
Frequency 3, 2, 1, 3, 1, 2, 3
Select the correct answer below:
Min 2, Q1 7, Median 9, Q3 10, Max 11
Min 2, Q1 3, Median 6, Q3 8, Max 11
Min 2, Q1 3, Median 4, Q3 8, Max 11
Min 2, Q1 6, Median 8, Q3 10, Max 11...
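The five-number summary can be checked in R by expanding the frequency table into raw observations. This is a sketch using fivenum(), which reports Tukey's hinges; other quartile conventions can give different values in general, although for this data set the hinges and the median-of-halves quartiles agree.

```r
values    <- c(2, 3, 5, 6, 7, 8, 11)
frequency <- c(3, 2, 1, 3, 1, 2, 3)

observations <- rep(values, times = frequency)  # 15 data points

# fivenum() returns min, lower hinge, median, upper hinge, max.
five <- fivenum(observations)
print(five)  # 2 3 6 8 11
```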

Suppose that the following frequency table contains data on the
number of absences in a school year due to illness or injury for 80
randomly selected high school students. Use the table to determine
the median number of absences for these 80 students. Provide your
answer with precision to one decimal place.
Number of absences 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Frequency 20, 9, 11, 8, 5, 10, 4, 3, 3, 1, 3, 3...

APPLIED STATISTICS 2
USE R CODE ! SHOW R CODE!
Write a function to calculate the sum of cubes from 1 to
n, skipping multiples of 5. For example, if n=10, the result is
1^3+2^3+3^3+4^3+6^3+7^3+8^3+9^3. The input is the value of n and the
output is the sum. (You need an if statement to check whether a
number is a multiple of 5. In R, a%%b is the remainder when a is divided
by b, so 10%%5=0.)
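One possible implementation, following the hint about the if statement and the %% operator. The function name `sum_cubes_skip5` is illustrative.

```r
sum_cubes_skip5 <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    # Skip multiples of 5: their remainder on division by 5 is 0.
    if (i %% 5 != 0) {
      total <- total + i^3
    }
  }
  total
}

sum_cubes_skip5(10)  # 1900
```

Using seq_len(n) rather than 1:n means the loop body never runs when n is 0, so the function returns 0 in that case.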

1- The following data show the number of times
a sample of students cut their hair in a year:
7,9,20,18,7,10,12,12,20,18,6,8,9,12,20,18,16,12,6,7,10,12,4,6,9
Obtain the 95% confidence interval for the mean number of times
the students cut their hair in a year.
2- A survey of 1006 Americans showed that 804
believe that Congress should consider reallocating federal
subsidies from fossil fuels to solar. Find the 94% confidence
interval for the proportion of Americans who believe that Congress
should so reallocate subsidies. Obtain...
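Both intervals can be sketched in R as follows. The sketch assumes a t-interval for the mean (population standard deviation unknown) and a Wald normal-approximation interval for the proportion; the questions do not say which methods are expected, and the variable names are illustrative.

```r
haircuts <- c(7, 9, 20, 18, 7, 10, 12, 12, 20, 18, 6, 8, 9, 12, 20, 18,
              16, 12, 6, 7, 10, 12, 4, 6, 9)

# Problem 1: 95% t-interval for the mean (population sd unknown).
mean_ci <- t.test(haircuts, conf.level = 0.95)$conf.int
print(mean_ci)

# Problem 2: 94% Wald (normal-approximation) interval for a proportion.
n_resp <- 1006
p_hat  <- 804 / n_resp
z      <- qnorm(1 - 0.06 / 2)  # critical value for 94% confidence
se     <- sqrt(p_hat * (1 - p_hat) / n_resp)
prop_ci <- c(lower = p_hat - z * se, upper = p_hat + z * se)
print(prop_ci)
```

A 94% level leaves 3% in each tail, which is why the critical value is the 0.97 quantile of the standard normal distribution.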

