Question

How do data analysts measure the effect of an outlier on results? Once the analyst can...

How do data analysts measure the effect of an outlier on results? Once the analyst can measure their effects, what are the benefits of removing the outliers? Provide specific examples to illustrate your ideas.

Homework Answers

Answer #1

An outlier is a value that is very different from the other data in your data set. This can skew your results.

Let's examine what can happen to a data set with outliers. For the sample data set:

1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4

We find the following mean, median, mode, and standard deviation:

Mean = 2.58

Median = 2.5

Mode = 2

Standard Deviation = 1.08

If we add an outlier to the data set:

1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 400

The new values of our statistics are:

Mean = 35.38

Median = 2.5

Mode = 2

Standard Deviation = 114.74

As you can see, having outliers often has a significant effect on your mean and standard deviation. Because of this, we must take steps to remove outliers from our data sets.

Know the answer?
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for?
Ask your own homework help question
Similar Questions
What is decision support? How does decision support relate to the knowledge worker's role? Provide specific...
What is decision support? How does decision support relate to the knowledge worker's role? Provide specific examples to illustrate your ideas.
Question: How would you scan for outliers in your dataset? What would you do with data...
Question: How would you scan for outliers in your dataset? What would you do with data points that are considered outliers?
Completing regression modeling can require various steps in order to obtain the best results. The process...
Completing regression modeling can require various steps in order to obtain the best results. The process can involve creating a graph to illustrate the relationship between variables, determining best-fit lines, and also approximating or producing smooth-fit lines to represent the data. What are the necessary commands required to carry out such an analysis in R? Please provide an example to illustrate your assertions.
Can you show me how to do this using Excel? Suppose we gathered the data below...
Can you show me how to do this using Excel? Suppose we gathered the data below by an unbiased method.  (The data is sorted DOWN the columns.) 111 116 119 120 123 126 128 131 136 142 115 117 119 120 123 127 129 132 136 142 115 117 120 121 123 127 129 133 138 143 115 118 120 121 123 128 131 133 138 158 116 118 120 123 125 128 131 135 141 163 (4 pts) Use the...
How would you define quality of care from the provider and patient perspectives? Why do you...
How would you define quality of care from the provider and patient perspectives? Why do you believe that quality can be viewed as a strength and a weakness of the U.S. health care system? Be sure to provide at least 2 reasons for this, and provide specific examples for discussion. Identify several difficulties and several benefits of working with licensed independent practitioners to improve quality in a typical hospital. What considerations should managers keep in mind when working with them?
The photoelectric effect can be used to measure the value of Planck’s constant. Suppose that a...
The photoelectric effect can be used to measure the value of Planck’s constant. Suppose that a photoelectric effect experiment was carried out using light with ν = 7.50×1017 s-1 and ejected electrons were detected with a kinetic energy of 2.50×10-11 J. The experiment was then repeated using light with ν = 1.00×1018 s-1 and the same metal target, and electrons were ejected with a kinetic energy of 5.00×10-11 J. Use these data to find a value for Planck’s constant. HINTS:...
Suppose a data set with 15 observations has an outlier present (i.e. the extreme observation very...
Suppose a data set with 15 observations has an outlier present (i.e. the extreme observation very far away from the rest of the data). Would the mean of the data set be affected by this outlier? Explain: Would the median of the data set be affected by this outlier? Explain: 0 A student took a Biology exam and scored 77. If the class exam scores were mound-shaped with a mean score of 70 and standard deviation of 16, use z-score...
How do I create a project to measure the surface tension of water? What example can...
How do I create a project to measure the surface tension of water? What example can I use for an in-class presentation?
Do you believe the results? Why or why not? Can a reported mean of your article...
Do you believe the results? Why or why not? Can a reported mean of your article data be treated as a value from a population having a normal distribution? Why or why not? Report either the value(s) of the reported Correlation Coefficient(s) and your interpretation, or report the reported Regression Equation and provide an example of a "Predicted" value.
How are data used in your business / organization? Is it basic reporting or advance predictive...
How are data used in your business / organization? Is it basic reporting or advance predictive analysis and data mining? What questions are being answered? How are strategic business decisions being made in your organization? Can you provide an example of a business decision that did or did not leverage data in your organization, what were the results? If you do not work for an organization right now, do a Google search to find the answers to the questions above...