7. A mobile app developer wants to test the difference of usage time (hours) among Texas users, New York users, and California users. She collected the usage time from these three states. When the developer codes the dummy variables, she chooses New York as the base category, and then creates two dummy variables: StateTX and StateCA. For StateTX, if a user is from Texas, the user receives value 1, and value 0 otherwise. For StateCA, if a user is from California, the user receives value 1, and value 0 otherwise. After the linear regression analysis, the developer finds that the coefficient of StateTX is 1.23, and the coefficient of StateCA is -0.15. Which statement is correct?
A) Users from New York use the mobile app the longest.
B) Users from Texas use the mobile app the longest.
C) Users from California use the mobile app the longest.
D) There is no difference in usage time among users from New York and California.
8. Relative to data quality, _____ is a phenomenon where the more attributes there are the easier it is to build a model that fits the sample data but does not necessarily provide a good predictor.
A) Data inconsistency
B) Variable randomness
C) Data normalization
D) Curse of dimensionality
10. ___ provides methods that reduce the number of features used as inputs to a classifier or regressor.
A) k-means clustering
B) Hierarchical clustering
C) Reduction regression
D) Dimension reduction
12. After you run the Linear Regression task with Y as the dependent variable and X1 as the sole explanatory variable, if the parameter estimate (slope) of X1 is 0, then the best guess (predicted value) of Y when X1 = 13 is which of the following?
A) 13
B) The mean of Y
C) A random number
D) The mean of X1
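7. B) Users from Texas use the mobile app the longest.
New York is the base category, so each dummy coefficient measures the difference in mean usage time relative to New York users. The StateTX coefficient of 1.23 means Texas users average 1.23 hours more than New York users, and the StateCA coefficient of -0.15 means California users average 0.15 hours less. The ordering is therefore Texas > New York > California. Being the base category does not make a group the largest; it only sets the reference point.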
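To check the interpretation of the dummy coefficients in question 7, here is a minimal Python sketch. It plugs the two reported coefficients into the regression equation and ranks the three states. The intercept value and the function names are assumptions made for illustration; the question does not give an intercept, and the ranking does not depend on it.

```python
# Coefficients reported in the question; New York is the base category.
COEF_TX = 1.23
COEF_CA = -0.15


def predicted_usage(intercept, state_tx, state_ca):
    """Predicted usage hours from the dummy-coded regression."""
    return intercept + COEF_TX * state_tx + COEF_CA * state_ca


def rank_states(intercept):
    """Return state names sorted from longest to shortest predicted usage."""
    predictions = {
        "New York": predicted_usage(intercept, 0, 0),    # both dummies are 0
        "Texas": predicted_usage(intercept, 1, 0),       # StateTX = 1
        "California": predicted_usage(intercept, 0, 1),  # StateCA = 1
    }
    return sorted(predictions, key=predictions.get, reverse=True)


# The intercept is a made-up value; any intercept gives the same ranking.
print(rank_states(intercept=3.0))
```

Because the intercept is added to every group, only the signs and sizes of the two coefficients decide the order.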
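8. D) Curse of dimensionality
The curse of dimensionality describes this situation: as the number of attributes grows relative to the number of observations, a model can fit the sample data more and more closely, yet that fit often reflects noise and does not carry over to new data. Data normalization is something else; it rescales or reorganizes values and does not describe this effect.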
10. D) Dimension reduction
Dimension reduction cuts the number of input features, and fewer inputs mean fewer parameters and a simpler model structure. k-means and hierarchical clustering group observations rather than reduce the number of features.
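As a small illustration of reducing the number of input features, the sketch below drops columns whose variance is at or below a threshold. This is only one simple form of dimension reduction (feature selection by variance); the function name, the threshold, and the data are assumptions made for this example, and other methods such as principal component analysis work differently.

```python
from statistics import pvariance


def drop_low_variance(rows, threshold=0.0):
    """Keep only the columns whose population variance exceeds the threshold.

    Returns the reduced rows and the indices of the columns that were kept.
    """
    columns = list(zip(*rows))  # transpose rows into columns
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return reduced, keep


# Made-up data: the middle feature is constant, so it carries no information.
data = [
    [1.0, 7.0, 0.2],
    [2.0, 7.0, 0.4],
    [3.0, 7.0, 0.1],
]

reduced, kept = drop_low_variance(data)
print(kept)     # indices of the features that remain
print(reduced)  # the same rows with the constant feature removed
```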
12. B) The mean of Y
The fitted line is predicted Y = b0 + b1 * X1. With least squares, b0 = mean(Y) - b1 * mean(X1), so a slope of b1 = 0 gives b0 = mean(Y). The prediction is then the mean of Y for every value of X1, including X1 = 13.
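For question 12, a short sketch can confirm what a zero slope implies. It fits a simple least-squares line by hand and predicts Y at X1 = 13. The data values and the function name are made up for illustration and were chosen so that the slope comes out as exactly 0.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Made-up data chosen so that X1 and Y are uncorrelated (slope is exactly 0).
x1 = [1, 2, 3, 4, 5]
y = [4, 6, 5, 6, 4]

intercept, slope = fit_simple_ols(x1, y)
prediction_at_13 = intercept + slope * 13
y_mean = sum(y) / len(y)

print(slope, prediction_at_13, y_mean)
```

With a slope of 0 the X1 term drops out, so the prediction equals the intercept, which equals the mean of Y.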