Question

Subject Name: Data Management for Analytics
Software used: SAS


Q2. Which of the following statements are true regarding dummy coding?

a. Dummy coding high cardinality inputs can lead to overfitting a predictive model.
b. The reference level should always be the second level of the categorical variable after sorting in ascending order.
c. In SAS, dummy variables can only be created with the following code: if var1="X" then var1_x_dum = 1; else var1_x_dum = 0;
d. Dummy coding is also known as “Cold Encoding”.
e. None of the above.
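
As a point of reference for option c, here is a minimal sketch of a second way to build the same dummy variable in a DATA step. It relies on the fact that a SAS comparison evaluates to 1 when true and 0 when false. The data set names (work.example, work.dummies), the sample values, and the choice of "Z" as the reference level are assumptions made for illustration only; they do not come from the question.

```sas
/* Hypothetical input data: the data set name and values are assumptions. */
data work.example;
    input var1 $;
    datalines;
X
Y
Z
X
;
run;

data work.dummies;
    set work.example;

    /* The IF/ELSE form quoted in option c. */
    if var1 = "X" then var1_x_dum = 1;
    else var1_x_dum = 0;

    /* Alternative form: a comparison returns 1 (true) or 0 (false),
       so it can be assigned directly. */
    var1_x_dum2 = (var1 = "X");
    var1_y_dum  = (var1 = "Y");

    /* "Z" gets no dummy here, so it acts as the reference level.
       That choice is arbitrary in this sketch. */
run;

proc print data=work.dummies;
run;
```

In this sketch var1_x_dum and var1_x_dum2 hold the same values on every row, which suggests the IF/ELSE statement is one of several ways to create a dummy variable rather than the only one.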

Similar Questions
Subject Name: Data Management for Analytics
Software used: SAS

Choose the best single option.

Q1. Which of the following statements is false regarding categorical input variables and cardinality?

a. Cardinality is the number of distinct levels in a categorical variable.
b. Categorical variables with high cardinality can lead to overfitting in a predictive model.
c. Calculating cardinality is an important exploratory tool before transforming the categorical input.
d. Categorical variables with a low cardinality ratio have more distinct levels than...

Subject Name: Data Management for Analytics
Software used: SAS

Choose the single best option.

Q4. Consider the technique for identifying the optimal number of aggregations (clusters) of categorical levels using PROC CLUSTER. Which of the following statements is true?

a. The optimal number of clusters is determined by the “knee rule”.
b. The technique explicitly uses information on the variable which would be the target in a predictive modeling analysis.
c. The technique automatically creates the dummy variables representing the...

Subject Name: Data Management for Analytics
Software used: SAS

Choose the best single option.

Q10. Consider a categorical variable LOCATION with 10 levels: “location1”-“location10”. In order to create a new numeric variable which has only the numeric value extracted from each level name, which of the following SAS statements will work?

a. location_num = input(substr(location,2,2),2.);
b. location_num = substr(location,2,2);
c. location_num = substr(location,9,2);
d. location_num = input(substr(location,9,1),1.);
e. location_num = input(substr(location,9,2),2.);
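
To make the mechanics behind Q10 concrete, here is a small sketch of how SUBSTR and INPUT combine in a DATA step. The word "location" is 8 characters long, so the numeric suffix starts at position 9 and is at most 2 characters wide. The data set names (work.locations, work.location_num) and the two sample rows are assumptions for illustration, not part of the question.

```sas
/* Hypothetical sample rows: only two of the ten levels are shown. */
data work.locations;
    length location $10;
    input location $;
    datalines;
location1
location10
;
run;

data work.location_num;
    set work.locations;

    /* SUBSTR(location, 9, 2) returns up to two characters starting at
       position 9, i.e. the suffix after the 8-character prefix "location".
       SUBSTR alone returns a character value, so INPUT with the 2.
       informat is used to convert it to a numeric value. */
    location_num = input(substr(location, 9, 2), 2.);
run;

proc print data=work.location_num;
run;
```

With these two rows, location_num is 1 for "location1" and 10 for "location10". Reading only one character from position 9 would turn "location10" into 1, and starting at position 2 would pick up letters rather than digits.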