Question

There is a dataset with missing values that are missing not at random (MNAR), and the probability of missing is related to the values themselves.

Regarding this, what would happen when imputing the missing values with the mean strategy?

Homework Answers

Answer #1

The goal of mean imputation is to replace missing data with a statistical estimate of the missing values.

In mean substitution, each missing value of a variable is replaced with the mean of that variable's observed values. This leaves the sample mean of the variable unchanged. The rationale is that the mean is a reasonable estimate for a randomly selected observation from a normal distribution. When the data are MNAR, however, the observed values are not a random sample: the probability of being missing depends on the value itself, so the observed mean is already a biased estimate of the true mean, and imputing with it carries that bias into the completed dataset. The bias can be worse when the number of missing values differs greatly between variables. Mean substitution also has two other major drawbacks: it shrinks the variance of the imputed variable, and it distorts (typically weakens) its covariance with the other variables in the dataset.
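A small simulation can make these effects concrete. The sketch below is illustrative only and is not taken from the original answer: the distributions, the logistic missingness rule (larger values of x are more likely to be missing), and the function name `simulate_mnar_mean_imputation` are all assumptions chosen for the example.

```python
import math
import random
import statistics


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def simulate_mnar_mean_imputation(n=20000, seed=0):
    """Compare a fully observed variable with its mean-imputed version under MNAR.

    All settings here (normal x, linear y, logistic missingness) are
    hypothetical choices for illustration.
    """
    rng = random.Random(seed)

    # Complete data: x is normal, y is linearly related to x plus noise.
    x = [rng.gauss(50, 10) for _ in range(n)]
    y = [0.5 * xi + rng.gauss(0, 5) for xi in x]

    # MNAR mechanism: the chance that x is missing grows with x itself.
    def p_missing(value):
        return 1.0 / (1.0 + math.exp(-(value - 50) / 5))

    missing = [rng.random() < p_missing(xi) for xi in x]

    # Mean imputation: fill every missing x with the mean of the observed x.
    observed = [xi for xi, m in zip(x, missing) if not m]
    fill = statistics.fmean(observed)
    x_imputed = [fill if m else xi for xi, m in zip(x, missing)]

    return {
        "true_mean": statistics.fmean(x),
        "imputed_mean": statistics.fmean(x_imputed),
        "true_sd": statistics.stdev(x),
        "imputed_sd": statistics.stdev(x_imputed),
        "true_cov": covariance(x, y),
        "imputed_cov": covariance(x_imputed, y),
    }


if __name__ == "__main__":
    for name, value in simulate_mnar_mean_imputation().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the imputed mean equals the observed mean and therefore sits below the true mean, because large values were preferentially dropped. The imputed standard deviation and the covariance with y both come out smaller than their true values, since every imputed point sits exactly at the fill value and contributes no spread.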
