Subject Name: Data Management for Analytics
Software used: SAS
Choose the best single option.
Q1. Which of the following statements is false regarding
categorical input variables and cardinality?
a. Cardinality is the number of distinct levels in a categorical
variable.
b. Categorical variables with high cardinality can lead to
overfitting in a predictive model.
c. Calculating cardinality is an important exploratory tool before
transforming the categorical input.
d. Categorical variables with a low cardinality ratio have more
distinct levels than variables with a high cardinality ratio.
e. All are false.
a) Statement a is true: cardinality is the number of distinct levels in a categorical variable.
b) Statement b is true: categorical variables with high cardinality can lead to overfitting in a predictive model.
c) Statement c is true: calculating cardinality is an important exploratory step before transforming the categorical input.
d) Statement d is false: categorical variables with a low cardinality ratio have fewer distinct levels, not more, than variables with a high cardinality ratio.
e) Option e does not hold, since statements a, b and c are true.
Hence, option d is the correct answer.
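
As a quick check of the reasoning, here is a minimal sketch in Python (used only for illustration; the course itself uses SAS). It assumes the cardinality ratio is the number of distinct levels divided by the number of observations, which the question does not define explicitly. The column names and values are made up.

```python
def cardinality(values):
    """Number of distinct levels in a categorical variable."""
    return len(set(values))


def cardinality_ratio(values):
    """Distinct levels divided by number of observations (assumed definition)."""
    return cardinality(values) / len(values)


# Hypothetical toy columns, 8 observations each.
region = ["N", "S", "E", "W", "N", "S", "E", "W"]
customer_id = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]

for name, column in [("region", region), ("customer_id", customer_id)]:
    print(name, cardinality(column), cardinality_ratio(column))
```

With the same number of observations, the column with the lower ratio (region) is the one with fewer distinct levels, which is why statement d is false.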