Question

normal quantile plot

Hello!

I'm supposed to make a normal quantile plot, and I read that "The basic idea of the normal quantile plot is to compare the data values with the values one would predict for a standard normal distribution".

But I thought that only the residuals were normally distributed in linear regression.

Can somebody explain why we suddenly state that the raw data are normally distributed?

Homework Answers

Answer #1

Consider the simple linear regression model,

that is, a model with only one predictor.

So, our model is y_i = beta0 + beta1*x_i + epsilon_i ; i = 1, 2, ..., n

and each epsilon_i follows N(0, sigma^2).

As we know, a linear combination of normal random variables is itself normal, and beta0 + beta1*x_i is a fixed constant for a given x_i.

So, for a given x_i, y_i also follows a normal distribution, N(beta0 + beta1*x_i, sigma^2). The mean and variance are easy to compute.

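That is the sense in which the data values are normal: each y_i is normally distributed around its own mean, beta0 + beta1*x_i. Because those means differ from one observation to the next, the comparison with the values one would predict for a standard normal distribution is usually made after subtracting the fitted mean, that is, on the residuals.

This is true for any linear regression model with normal errors (not just the simple linear regression model).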
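As a rough illustration of the point about y_i having mean beta0 + beta1*x_i, here is a minimal Python sketch using only the standard library. It simulates the model, then measures how straight the normal quantile plot is (the correlation between sorted values and standard normal quantiles) for the raw y values and for the residuals. The parameter values, the two-cluster x design, the plotting positions (i - 0.5)/n, and the helper names normal_qq_points, correlation and fit_line are all assumptions made for this example, not something from the question.

```python
import random
from statistics import NormalDist, mean


def normal_qq_points(values):
    """Return (theoretical quantile, observed value) pairs for a normal quantile plot."""
    n = len(values)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    observed = sorted(values)
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


def correlation(pairs):
    """Pearson correlation of the points; values near 1 mean a nearly straight plot."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def fit_line(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x."""
    mx, my = mean(x), mean(y)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b1 * mx, b1


rng = random.Random(0)  # fixed seed so the run is repeatable
beta0, beta1, sigma = 1.0, 2.0, 1.0  # illustrative values, not from the question
x = [0.0] * 100 + [10.0] * 100  # two clusters of x, so the means of y_i differ a lot
y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]

b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

r_raw = correlation(normal_qq_points(y))
r_resid = correlation(normal_qq_points(residuals))
print(f"Q-Q correlation, raw y:     {r_raw:.3f}")
print(f"Q-Q correlation, residuals: {r_resid:.3f}")
```

In this setup the residuals should give a nearly straight plot, while the pooled y values should not, even though every single y_i is normal around its own mean. With x values spread more evenly the contrast can be much weaker.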

If you have further problems, ask me in the comments.
