Question

We have a dataset with 10 different features for 500 college students. We want to know...

We have a dataset with 10 different features for 500 college students. We want to know if we can group the students based on some similarities in their features. What method can be used for this problem.

Group of answer choices

A. K Fold

B. Classification Tree

C. K-Means

D. Logistic Regression

Homework Answers

Answer #1

Please don't hesitate to give a "thumbs up" for the answer in case the answer has helped you

We have 10 features, with 500 observations.

We want to check if we can see group students by similarties of their features. We don't know the groupings yet - so Classification Tree and Logistic Regression is ruled out. Also, K-Fold technique is for cross validation it is not a technique used to what we want to acheive in the question.

So, we must use the unsupervised learning technique of K-Means Clustering which takes into account all the features, and groups observations into different "clusters" - groups with same properties/ very similar features.

Answer: C. K-Means is correct

Know the answer?
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for?
Ask your own homework help question
Similar Questions
You want to estimate the average time college students spend studying per week. You know that...
You want to estimate the average time college students spend studying per week. You know that the population standard deviation for studying time per week for college students is 3.5 hours. You want a 99% confidence interval with a margin of error of 30 minutes (0.50 hours). How many students do you need in your study? Group of answer choices 188.24 students 325 students 324.9 students 189 students
10) We conduct a study to determine whether the majority of community college students plan to...
10) We conduct a study to determine whether the majority of community college students plan to vote in the next presidential election. We choose a significance level of 0.05. We survey 650 randomly selected community college students and find that 54% of them plan to vote. The P-value is 0.02. H0: 50% of community college students plan to vote in the next presidential election. Ha: More than 50% of community college students plan to vote in the next presidential election....
Multiple regression model Consider a dataset for 500 students living in Melbourne with the following variables...
Multiple regression model Consider a dataset for 500 students living in Melbourne with the following variables measured in 2019 --- expenditure on public transport, ?E, measured in dollars; number of days he/she attended school, ?S; number of days not attended school, ??NS (which is equal to 365−?365−S); income, ?I, measured in dollars; and age in year, ?A. A researcher is interested in estimating the following model using Eviews: ??=?1+?2??+?3???+?4??+?5??+??Ei=β1+β2Si+β3NSi+β4Ii+β5Ai+ei. Without any further information, we know for sure that one assumption...
Suppose you perform a study about the hours of sleep that college students get. You know...
Suppose you perform a study about the hours of sleep that college students get. You know that for all people, the average is about 7 hours. You randomly select 35 college students and survey them on their sleep habits. From this sample, the mean number of hours of sleep is found to be 6.2 hours with a standard deviation of 0.97 hours. We want to construct a 95% confidence interval for the mean nightly hours of sleep for all college...
Suppose you perform a study about the hours of sleep that college students get. You know...
Suppose you perform a study about the hours of sleep that college students get. You know that for all people, the average is about 7 hours. You randomly select 35 college students and survey them on their sleep habits. From this sample, the mean number of hours of sleep is found to be 6.2 hours with a standard deviation of 0.97 hours. We want to construct a 99% confidence interval for the mean nightly hours of sleep for all college...
1) When we fit a model to data, which is typically larger? a) Test Error b)...
1) When we fit a model to data, which is typically larger? a) Test Error b) Training Error 2) What are reasons why test error could be LESS than training error? (Pick all that applies) a) By chance, the test set has easier cases than the training set. b) The model is highly complex, so training error systematically overestimates test error c) The model is not very complex, so training error systematically overestimates test error 3) Suppose we want to...
1. If you want to study the effect of hormonal changes in adolescent boys, your population...
1. If you want to study the effect of hormonal changes in adolescent boys, your population would be Group of answer choices All adolescents All the people in the world All males All adolescent males 2. Which of the following is most likely to be measured categorically? Group of answer choices Weight gain in first year college students Deterioration in driving performance under the influence of alcohol Level of authoritarianism in a sample of public accountants Species of dog appearing...
7.1 Regional College increased their admission standards so that they only admit students who have scored...
7.1 Regional College increased their admission standards so that they only admit students who have scored in the 80th percentile or higher (top 20%) of all scores on the SAT. What SAT cutoff will the school use this year? (Recall that for the SAT μ = 500, σ = 100, and the scores are normally distributed) 7.2 Even with the increase in standards, the Registrar at Regional College notices that people who don’t meet the minimum continue to apply for...
(15%) For Applied Management Statistics class you want to know how college students feel about the...
(15%) For Applied Management Statistics class you want to know how college students feel about the transportation system in Barcelona. What is the population in this study? What type of sample would you use and why? (25%) A manager of an e-commerce company would like to determine average delivery time of the products. A sample of 25 customers is taken. The average delivery time in the sample was four days. Suppose the delivery times are normally distributed with a standard...
We compared the mean number of hours spent playing video games between students who ride the...
We compared the mean number of hours spent playing video games between students who ride the bus, take a car, and walk or ride bike as the primary means of transportation to school. The results are below. Is there strong evidence to show a difference in the mean number of hours played by these students? μ i= the mean number of hours playing video games by group i. H0: μ b u s = μ c a r = μ...