Question

In this class you will need to decide whether a data value is unusual (an outlier). One way to determine whether a datum is an outlier is to construct fences (lower fence = Q1 - 1.5*IQR, upper fence = Q3 + 1.5*IQR) and compare the data value to the fences. If the data value lies outside either fence, it is considered an outlier. For the normal model, some would consider any data value more than two standard deviations away from the mean to be an outlier, while others would require a data value to be more than three standard deviations away from the mean. The "fences" for a normal model are 2.698 standard deviations above and below the mean. How many standard deviations away from the mean do you think a data value needs to be in order to be considered an outlier (unusual)? Discuss your reasoning for your choice.
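The 2.698 figure quoted in the question can be checked numerically. The snippet below is a minimal sketch, assuming a standard normal model (mean 0, standard deviation 1) and Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Assumption: a standard normal model, so results are in units of standard deviations.
std_normal = NormalDist(mu=0.0, sigma=1.0)

q1 = std_normal.inv_cdf(0.25)   # first quartile of the normal model
q3 = std_normal.inv_cdf(0.75)   # third quartile of the normal model
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Probability that a normal value falls outside either fence.
outside = std_normal.cdf(lower_fence) + (1.0 - std_normal.cdf(upper_fence))

print(f"fences: {lower_fence:.3f}, {upper_fence:.3f}")
print(f"P(outside fences) = {outside:.4f}")
```

Under this assumption the fences sit at roughly 2.698 standard deviations on either side of the mean, so the fence rule falls between the two-standard-deviation and three-standard-deviation rules mentioned in the question.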

Answer #1

Usually, when data are normally distributed, the cutoff values are z = -2 and z = 2. A point lying outside the interval (-2, 2) occurs with probability less than 5 percent (about 4.6 percent), so such points are termed outliers. For other distributions, if the sample size is large enough, the same cutoff values are often applied by appealing to a normal approximation, although that approximation strictly describes sample means rather than individual observations.
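To make the "less than 5 percent" claim concrete, here is a small sketch that computes the two-sided tail probability for a standard normal variable. It assumes an exactly normal model; `two_sided_tail` is a helper name made up for this illustration.

```python
from statistics import NormalDist

def two_sided_tail(k):
    """Return P(|Z| > k) for a standard normal Z (assumes exact normality)."""
    return 2.0 * (1.0 - NormalDist().cdf(k))

# Compare the two-standard-deviation rule with the stricter three-standard-deviation rule.
for k in (2.0, 3.0):
    print(f"P(|Z| > {k}) = {two_sided_tail(k):.4f}")
```

With a cutoff of 2, roughly 4.6 percent of normal data would be flagged; with a cutoff of 3, only about 0.3 percent would be.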

For other distributions with small sample sizes, we can compute the z-score z = (X - μ)/σ, which rescales the point into standard-deviation units. If the absolute value of the z-score is greater than 2, the point is treated as an outlier. Again, this is a subjective rule: some use 2.5 as the cutoff and some use 3. But generally (-2, 2) is good enough.
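The z-score rule can be sketched in a few lines. The mean, standard deviation, and data value below are hypothetical numbers chosen only for illustration, and `z_score` and `is_outlier` are made-up helper names, not part of any library.

```python
def z_score(x, mean, sd):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mean) / sd

def is_outlier(x, mean, sd, cutoff=2.0):
    """Flag x when its absolute z-score exceeds the chosen cutoff."""
    return abs(z_score(x, mean, sd)) > cutoff

# Hypothetical values for illustration only (not taken from the question).
mean, sd, x = 100.0, 15.0, 140.0

print(f"z = {z_score(x, mean, sd):.3f}")
for cutoff in (2.0, 2.5, 3.0):
    print(f"cutoff {cutoff}: outlier = {is_outlier(x, mean, sd, cutoff)}")
```

In this made-up case the value is flagged under the 2 and 2.5 cutoffs but not under 3, which shows how much the verdict depends on the cutoff chosen.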

For a normal model, what standardized scores
do you think would make a data value an outlier?
Justify your response.

In a study of environmental lead exposure and IQ, the data was
collected from 148 children in Boston, Massachusetts. Their IQ
scores at age of 10 approximately follow a normal distribution with
mean of 115.9 and standard deviation of 14.2. Suppose one child had
an IQ of 74. The researchers would like to know whether an IQ of 74
is an outlier or not.
Calculate the lower fence for the IQ data, which is the lower
limit value that the...

The data below indicate the contamination in parts per million
in each of 50 samples of drinking water at a specific location.
hw1_q6.csv
a) What is the first quartile of the data?
b) What is the third quartile of the data?
c) What is the median of the data?
d) What is the mean of the data? Give your answer to three
decimal places.
e) Values that are greater than Q3 + 1.5 IQR or less than Q1 -
1.5 IQR are typically considered...

The mean value of land and buildings per acre from a sample of
farms is $1400, with a standard deviation of $200. The data set
has a bell-shaped distribution. Using the empirical rule,
determine which of the following farms, whose land and building
values per acre are given, are unusual (more than two standard
deviations from the mean). Are any of the data values very
unusual (more than three standard deviations from the mean)?
$1619 $1881 $1163 $658 $1118 $1122

For a normal distribution, find the percentage of data that are
(a) Within 1 standard deviation of the mean __________ %
(b) Between μ − 3.5σ and μ + 2.5σ ____________ %
(c) More than 2 standard deviations away from the mean _________ %

What is the difference between data that is normally distributed
versus data that is not? What can you say about small sample sizes?
How many standard deviations from the mean is considered unusual
and why?
Typed responses only please :)

What is the probability that a normal random variable will take a
value that is less than 1.05 standard deviations above its mean?
In other words, what is P(Z < 1.05)?
.8531
.1468
.9332
.0668
What is the probability that a normal random variable will take a
value that is between 1.5 standard deviations below the mean and
2.5 standard deviations above the mean? In other words, what is
P(-1.5 < Z < 2.5)?
.9938
.0668
.9270
.0730
What is...

Hello, can you please be sure to show all work and I'll be sure
Mean of stock price = 1117.64
STDEV (Population) = 67.61
If a person bought 1 share of Google stock within the
last year, what is the probability that the stock on that day
closed within $50 of the mean for that year (round to two places)?
(Hint: this means the probability of being between 50 below...

According to Chebyshev's theorem, the maximum proportion of data
values from a data set that are more than 1.5 standard deviations
from the mean is ________. Round your answer to 2 decimal places.

Data collected at an airport suggests that an exponential
distribution with mean value 2.855 hours is a good model for
rainfall duration.
What is the probability that the duration of a particular
rainfall event at this location is at least 2 hours? At most 3
hours? Between 2 and 3 hours?
What is the probability that rainfall duration exceeds the mean
value by more than 2 standard deviations?

