Question

Execute the BDMO Algorithm with p = 3 on the following 1-dimensional, Euclidean data:

1, 45, 80, 24, 56, 71, 17, 40, 66, 32, 48, 96, 9, 41, 75, 11, 58, 93, 28, 39, 77

The clustering algorithm is k-means with k = 3. Only the centroid of a cluster, along with its count, is needed to represent a cluster.

Using your clusters from the above, produce the best centroids in response to a query asking for a clustering of the last 10 points.

Show all calculations and steps.

Homework Answers

Answer #1
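# Custom clustering library (not shown here); it provides initBuckets()
source("clustering_library.R")

# Load in data
x <- c( 1, 45, 80, 24, 56, 71, 17,
       40, 66, 32, 48, 96,  9, 41,
       75, 11, 58, 93, 28, 39, 77)

records <- data.frame(x = x)

# Cluster parameters
p <- 3
k <- 3

# k-means starts from random centers; fix the seed so runs are repeatable
set.seed(1)

# Break up data into buckets (arguments are as in the original call)
buckets <- initBuckets(records, 3, 2)

# Summarize each bucket by its cluster centroids and per-cluster counts
bdmo <- lapply(buckets, function(b) {
  rec <- b[['records']]
  if (nrow(rec) > k) {
    km <- kmeans(rec, k)
    # km$size holds the number of points in each cluster
    list(centroid = km$centers, count = km$size)
  } else {
    # k or fewer points: every point is its own cluster of size 1
    list(centroid = rec, count = rep(1, nrow(rec)))
  }
})

for (b in bdmo) {
  cat("Centroid:\n")
  print(t(b[['centroid']]))
  cat("Count:", b[['count']], "\n")
  cat("---------------------------\n")
}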
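
The script above depends on clustering_library.R, which is not shown, so the bucket bookkeeping itself is hidden. As a rough, self-contained sketch of that part only, the code below assumes the usual BDMO rule: every p points start a new bucket of size p, and whenever three buckets share a size, the two oldest are merged into one of twice that size. The helper names bucket_sizes and merge_clusters are hypothetical and are not part of the original answer or its library. merge_clusters shows why a centroid and a count are enough to represent a cluster: the merged centroid is the count-weighted average of the two centroids.

```r
# Hypothetical helper: merge two clusters that are each summarized
# by a centroid and a count (count-weighted average of the centroids).
merge_clusters <- function(c1, n1, c2, n2) {
  list(centroid = (n1 * c1 + n2 * c2) / (n1 + n2),
       count    = n1 + n2)
}

# Hypothetical helper: track only the bucket sizes as points arrive.
# Buckets are stored oldest first, newest last.
# Assumed rule: three buckets of one size -> merge the two oldest.
bucket_sizes <- function(n_points, p) {
  sizes <- integer(0)
  for (i in seq_len(n_points %/% p)) {
    sizes <- c(sizes, p)                # every p points start a new bucket
    repeat {
      tab <- table(sizes)
      triple <- as.integer(names(tab)[tab >= 3])
      if (length(triple) == 0) break    # no size occurs three times
      s <- min(triple)
      idx <- which(sizes == s)[1:2]     # the two oldest buckets of size s
      sizes[idx[1]] <- 2L * s           # merged bucket keeps the older slot
      sizes <- sizes[-idx[2]]
    }
  }
  sizes
}

# Bucket sizes after all 21 points with p = 3, oldest first
print(bucket_sizes(21, 3))

# Example merge: centroid 10 (2 points) with centroid 40 (1 point)
print(merge_clusters(10, 2, 40, 1))
```

Under that assumed rule the 21 points end up in buckets of sizes 12, 6, and 3, oldest first. The two newest buckets cover only 9 points, so a query for the last 10 points would need to draw on all three buckets, with their clusters merged down to k = 3 using the weighted-centroid rule.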