Explain whether each scenario below is a regression, classification or unsupervised learning problem. If it is a supervised learning scenario, indicate whether we are more interested in inference or prediction. Finally, provide in each case the number of observations, n, and the number of predictors, p.
(a) A policy analyst is interested in discovering factors that are associated with the unemployment rate across different U.S. cities. For each of 400 cities, the policy analyst gathers the following data: the population, state, average income, crime rate, percentage of students who graduate high school and unemployment level.
(b) Stanford received 42,000 undergraduate applications in the year 2017. The application includes the following data for each applicant: age, high school GPA, scores in the SAT Critical Reading, SAT Math and SAT Writing exams, and whether they are domestic or international. The university wishes to understand the different subtypes of students in the application pool.
(c) A neuroscientist wishes to develop a tool that can identify the type of cells based on a few measurements. Each cell is one of three types: glial cell, motor neuron cell, or horizontal cell. The neuroscientist has 68 labeled cells, each with three measurements available: the number of branch points, the number of active processes, and the average process length.
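a) The response is the unemployment rate, which is quantitative: it takes values in a continuous range rather than a set of discrete categories. This is therefore a regression problem.
The analyst wants to discover which factors are associated with unemployment, so we are more interested in inference than in prediction. Here n = 400 (cities) and p = 5 (population, state, average income, crime rate and percentage of students who graduate high school).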
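b) The university wants to find subtypes of students, but no subtype labels are recorded, so there is no response variable. This is an unsupervised learning problem, and the inference/prediction question does not apply. Here n = 42,000 (applicants) and p = 6 (age, high school GPA, SAT Critical Reading score, SAT Math score, SAT Writing score and domestic/international status).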
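c) Each cell belongs to one of 3 types and we have 68 labelled cells, so the response is qualitative and this is a classification problem. The neuroscientist wants a tool that identifies the type of a cell from its measurements, so we are more interested in prediction than in inference. Here n = 68 (cells) and p = 3 (number of branch points, number of active processes and average process length).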
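The reasoning used in all three parts can be sketched as a small rule: no recorded response means unsupervised learning, a quantitative response means regression, and a qualitative response means classification. The Python below is only an illustration of that rule. The `Scenario` class, the `problem_type` function and the short predictor names are made up for this sketch and are not part of the original question.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scenario:
    """Hypothetical container for one scenario from the question."""
    label: str
    n: int                   # number of observations
    predictors: List[str]    # one entry per predictor, so p = len(predictors)
    response: Optional[str]  # "quantitative", "qualitative", or None if no response is recorded


def problem_type(s: Scenario) -> str:
    """Apply the rule: no response -> unsupervised, numeric -> regression, categorical -> classification."""
    if s.response is None:
        return "unsupervised"
    if s.response == "quantitative":
        return "regression"
    if s.response == "qualitative":
        return "classification"
    raise ValueError(f"unknown response kind: {s.response!r}")


scenarios = [
    # (a) response: unemployment rate, a continuous quantity
    Scenario("a", 400,
             ["population", "state", "average_income", "crime_rate", "hs_graduation_pct"],
             "quantitative"),
    # (b) no subtype labels are recorded, so there is no response
    Scenario("b", 42000,
             ["age", "hs_gpa", "sat_reading", "sat_math", "sat_writing", "domestic_or_international"],
             None),
    # (c) response: cell type, one of three categories
    Scenario("c", 68,
             ["branch_points", "active_processes", "avg_process_length"],
             "qualitative"),
]

for s in scenarios:
    print(f"({s.label}) {problem_type(s)}: n = {s.n}, p = {len(s.predictors)}")
```

Note that the rule only settles the problem type. Whether inference or prediction matters more still depends on the stated goal in each scenario (understanding associations in (a), building an identification tool in (c)).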