Question


1. Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order) 13, 13, 16, 18, 18, 20, 20, 20, 21, 21, 22, 25, 26, 30, 32, 33, 33, 33, 36, 36, 36, 37, 47, 47, 49, 61, 100. Use smoothing by bin means to smooth the data above with an equal bin depth of 3. Illustrate your steps. Comment on the effect of this technique for the given data.

Homework Answers

Answer #1

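Step 1: Sort the data. (Here the data is already sorted.)

Step 2: Partition the sorted data into equal-depth bins of 3 values each. The 27 values give 9 bins:

Bin 1: 13, 13, 16
Bin 2: 18, 18, 20
Bin 3: 20, 20, 21
Bin 4: 21, 22, 25
Bin 5: 26, 30, 32
Bin 6: 33, 33, 33
Bin 7: 36, 36, 36
Bin 8: 37, 47, 47
Bin 9: 49, 61, 100

Step 3: Calculate the arithmetic mean of each bin (rounded to two decimals).

Bin 1: 42/3 = 14
Bin 2: 56/3 = 18.67
Bin 3: 61/3 = 20.33
Bin 4: 68/3 = 22.67
Bin 5: 88/3 = 29.33
Bin 6: 99/3 = 33
Bin 7: 108/3 = 36
Bin 8: 131/3 = 43.67
Bin 9: 210/3 = 70

Step 4: Replace every value in each bin with the arithmetic mean calculated for that bin. The smoothed data is:

14, 14, 14, 18.67, 18.67, 18.67, 20.33, 20.33, 20.33, 22.67, 22.67, 22.67, 29.33, 29.33, 29.33, 33, 33, 33, 36, 36, 36, 43.67, 43.67, 43.67, 70, 70, 70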
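The four steps can also be checked with a short script. This is a minimal Python sketch, not part of the original answer; the function name smooth_by_bin_means and its return shape are assumptions made for illustration.

```python
from statistics import mean

# Age values from the question (already in increasing order).
ages = [13, 13, 16, 18, 18, 20, 20, 20, 21, 21, 22, 25, 26, 30,
        32, 33, 33, 33, 36, 36, 36, 37, 47, 47, 49, 61, 100]


def smooth_by_bin_means(values, depth):
    """Hypothetical helper: equal-depth binning followed by bin-mean smoothing."""
    ordered = sorted(values)               # Step 1: sort
    bins = []
    smoothed = []
    for start in range(0, len(ordered), depth):
        bin_values = ordered[start:start + depth]   # Step 2: next bin of `depth` values
        bin_mean = mean(bin_values)                 # Step 3: arithmetic mean of the bin
        bins.append((bin_values, bin_mean))
        smoothed.extend([bin_mean] * len(bin_values))  # Step 4: replace values by the mean
    return bins, smoothed


bins, smoothed = smooth_by_bin_means(ages, depth=3)
for number, (bin_values, bin_mean) in enumerate(bins, start=1):
    print(f"Bin {number}: {bin_values} -> mean {bin_mean:.2f}")
```

Running it prints one line per bin with the bin's values and its mean; smoothed holds the 27 replaced values in order.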

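Effects:

1. Smoothing by bin means reduces noise in the attribute: each value is replaced by the mean of its neighbours in the sorted order, so small fluctuations within a bin (for example 18, 18, 20) are averaged out.

2. It reduces the effect of errors in the data, but only partly here. With a bin depth of just 3 the smoothing is light, and the mean is sensitive to outliers: in the last bin the value 100 pulls the mean up to 70, so 49 and 61 are shifted by 21 and 9 respectively.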
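To make the comment on the effect concrete, one can measure how far smoothing moves the values in each bin. The sketch below is an illustration only; the variable names (depth, max_shift_per_bin) are assumptions and not part of the original answer.

```python
# Age values from the question (already in increasing order).
ages = [13, 13, 16, 18, 18, 20, 20, 20, 21, 21, 22, 25, 26, 30,
        32, 33, 33, 33, 36, 36, 36, 37, 47, 47, 49, 61, 100]
depth = 3

# For each bin, record the largest distance between an original value
# and the bin mean that replaces it.
max_shift_per_bin = []
for start in range(0, len(ages), depth):
    bin_values = ages[start:start + depth]
    bin_mean = sum(bin_values) / len(bin_values)
    max_shift_per_bin.append(max(abs(v - bin_mean) for v in bin_values))

for number, shift in enumerate(max_shift_per_bin, start=1):
    print(f"Bin {number}: largest change {shift:.2f}")
```

Most bins change by only a few years, while the last bin, which contains 100, changes by far the most. That is the bin where the mean is least representative of its values.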

Similar Questions
A sample data set includes the following age of employees. What is the range of values using empirical rule for 68%: 43 22 24 26 28 32 34 36 38 44 49 51 53 61 (26.67,50.92) (24.33,55.47) (26.71,50.58) (22.76,53.52)
descriptive statistics for the given data 4 8 18 20 21 22 23 24 25 26 27 28 29 30 31 31 32 32 33 33 34 35 36 37 38 39 47 53
Using the sample standard deviation of age as an estimate of the population standard deviation, calculate by hand the standard error of the mean. Show your calculations and the answer. Calculate by hand a 95% confidence interval for "Age" based on the sample mean. Age 18 20 21 24 26 29 30 31 32 35 35 36 37 39 40 42 42 42 44 45 46 48 49 50 52 53 54 58 59 61
B. Determine whether or not the table is a valid probability distribution of a discrete random variable. Explain fully. X 0.23 0.14 0.17 P(X=x) -0.22 0.38 0.84 Daily sales of a medium sized restaurant are normally distributed with a mean of $720 and a standard deviation of $80. What is the probability that a randomly selected day will make a sales of at least $656? What is the probability that a randomly selected day will make a sales between $680...
4th Grade (Class 1) 4th Grade (Class 2) 12 10 15 12 21 16 21 17 22 17 22 19 22 19 25 22 26 22 27 22 27 27 31 28 32 29 33 29 33 31 36 31 37 31 38 33 41 33 43 37 44 39 45 43 45 43 47 47 55 49 57 57 The collected data is from two 4th grade (All female classes - Age 10) Fitnessgram pacer tests. Once you have...
Obs # Age Obs # Age Obs # Age Obs # Age Obs # Age 1 2019 11 2019 21 1976 31 2019 41 2006 2 1998 12 2019 22 2013 32 2018 42 2013 3 2019 13 2019 23 2019 33 2019 43 1982 4 1995 14 1980 24 1994 34 1997 44 2019 5 2018 15 2019 25 1979 35 2015 45 1988 6 2011 16 2016 26 1974 36 2019 46 2019 7 1974 17 1998 27...
2017-2018 Goals 49 44 43 42 42 41 40 40 39 39 39 37 36 36 35 35 34 34 34 34 2012-2013 Goals 32 29 28 26 23 23 23 22 22 21 21 21 20 20 20 19 19 18 18 18 2007-2008 Goals 65 52 50 47 43 43 42 41 40 40 38 38 36 36 35 34 34 33 33 32 Given the above three sets of data, we want to compare the three seasons...
The results of a sample of 372 subscribers to Wired magazine shows the time spent using the Internet during the week. Previous surveys have revealed that the population standard deviation is 10.95 hours. The sample data can be found in the Excel test data file. What is the probability that another sample of 372 subscribers spends less than 19.00 hours per week using the Internet? Develop a 95% confidence interval for the population mean. If the editors of Wired wanted to have the...
We want to compare the average gas mileage of American-made cars vs. Japanese-made cars. The claim is that the Japanese cars and American cars do not get the same gas mileage. Use the list data below to test the hypothesis that the Japanese cars and American cars do not get the same gas mileage. The American cars are in list1 and the Japanese cars in list2 below We do not know whether the mileages are normally distributed or not, but...
Driving under the influence of alcohol (DUI) is a serious offense. The following data give the ages of random sample of 50 Drivers arrested while driving under the influence of alcohol. Using the listed ages; find 16 18 20 21 21 22 22 22 23 24 24 25 26 26 26 27 27 27 29 30 30 31 31 32 33 34 34 35 35 36 37 38 39 40 40 41 43 45 46 47 47 49 49 51...