Refer to the ROC curve:
As you move along the curve, what changes?
A. The priors in the population
B. The true negative rate in the population
C. The proportion of events in the training data
D. The probability cutoff for scoring
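A minimal Python sketch of how each point on an ROC curve corresponds to one probability cutoff. The labels and scores below are hypothetical, and `roc_point` is an illustrative helper, not part of any library.

```python
def roc_point(y_true, scores, cutoff):
    """Return (false positive rate, true positive rate) at one cutoff."""
    # An observation is classified as an event when its score >= cutoff.
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cutoff)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= cutoff)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return fp / neg, tp / pos

# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

# The data and the model stay fixed; only the cutoff varies.
roc = {c: roc_point(y_true, scores, c) for c in (0.25, 0.5, 0.75)}
for cutoff, (fpr, tpr) in roc.items():
    print(f"cutoff={cutoff:.2f}  FPR={fpr:.3f}  TPR={tpr:.3f}")
```

Each cutoff yields a different (FPR, TPR) pair, while the underlying data and fitted scores do not change.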
Which SAS program will divide the original data set into 60% training and 40% validation data sets, stratified by county?
A. Option A
B. Option B
C. Option C
D. Option D
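The answer programs are not reproduced in this text. As a rough illustration of what a 60/40 stratified partition means, here is a Python sketch (not SAS syntax); `stratified_split` and the sample rows are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(rows, key, train_frac=0.6, seed=0):
    """Split rows into train/validation, sampling within each stratum."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    train, valid = [], []
    for members in groups.values():
        rng.shuffle(members)
        # Take train_frac of every stratum so proportions are preserved.
        n_train = round(len(members) * train_frac)
        train.extend(members[:n_train])
        valid.extend(members[n_train:])
    return train, valid

# Hypothetical data: 10 rows in county A, 20 rows in county B.
rows = [{"id": i, "county": "A" if i < 10 else "B"} for i in range(30)]
train, valid = stratified_split(rows, "county")
print(len(train), len(valid))
```

Because sampling happens inside each county, both partitions keep the same county mix as the original data.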
Refer to the REG procedure output:
A. 0.4115
B. 0.6994
C. 0.5884
D. 0.1372
The selection criterion used in the forward selection method in the REG procedure is:
A. Adjusted R-Square
B. SLE
C. Mallows' Cp
D. AIC
Which statistic, calculated from a validation sample, can help decide which model to use for prediction of a binary target variable?
A. Adjusted R-Square
B. Mallows' Cp
C. Chi-Square
D. Average Squared Error
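For reference, average squared error for a binary target is the mean of the squared differences between the 0/1 outcome and the predicted probability. A small Python sketch with hypothetical validation outcomes and two hypothetical sets of model predictions:

```python
def average_squared_error(y_true, p_hat):
    """Mean squared difference between 0/1 outcomes and predicted probabilities."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)

# Hypothetical validation outcomes and predictions from two candidate models.
y_valid = [1, 0, 1, 0]
model_a = [0.8, 0.2, 0.6, 0.4]
model_b = [0.6, 0.5, 0.5, 0.5]

ase_a = average_squared_error(y_valid, model_a)
ase_b = average_squared_error(y_valid, model_b)
print(f"ASE model A: {ase_a:.4f}")
print(f"ASE model B: {ase_b:.4f}")
```

The model with the lower value on the validation sample would be the one favored under this criterion.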
Refer to the exhibit.
Output from a multiple linear regression analysis is shown.
What is the most appropriate statement concerning collinearity between the input variables?
A. Collinearity is a problem since all variance inflation values are less than 10.
B. Collinearity is not a problem since all variance inflation values are less than 10.
C. Collinearity is not a problem since all Pr>|t| values are less than 0.05.
D. Collinearity is a problem since all Pr>|t| values are less than 0.05.
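The variance inflation factor for input j is 1 / (1 - R_j^2), where R_j^2 comes from regressing input j on the other inputs. The sketch below covers only the two-input special case, where R_j^2 equals the squared Pearson correlation; the data and helper names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_inputs(x1, x2):
    """VIF shared by both inputs when the model has exactly two inputs."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical input variables, for illustration only.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

vif = vif_two_inputs(x1, x2)
print(f"VIF = {vif:.3f}")
```

A value below the commonly used threshold of 10 is usually read as no serious collinearity, though that threshold is a rule of thumb rather than a formal test.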
The selection criterion used in the forward selection method in the GLMSELECT procedure is:
A. RSQ
B. MSE
C. R-squared
D. AIC
When working with smaller data sets (N<200), which method is preferred to perform honest assessment?
A. Training: 40% Validation: 30% Testing: 30%
B. K-fold cross validation
C. Cross validation using 4th quartile observations
D. Use the AIC goodness of fit statistic
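A minimal Python sketch of how k-fold cross validation partitions a small data set. `k_fold_indices` is a hypothetical helper, and the interleaved fold assignment is just one simple choice (in practice the rows would usually be shuffled first).

```python
def k_fold_indices(n, k):
    """Yield (train_idx, valid_idx) pairs; each row is held out exactly once."""
    # Interleaved assignment: row i goes to fold i % k.
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        valid = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, valid

# Hypothetical example: 10 observations, 5 folds.
splits = list(k_fold_indices(10, 5))
for train_idx, valid_idx in splits:
    print("train:", train_idx, "validate:", valid_idx)
```

Every observation is used for both fitting and assessment across the k rounds, which is why this approach is often suggested when there is too little data for a fixed holdout.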
While building a predictive model, median imputations are performed while preparing the training data. How should the imputations be addressed in the validation data?
A. The imputed values are irrelevant to the validation data, and are not used.
B. The imputed values must be applied directly to the validation data without recalculation.
C. The imputed values must be recalculated using the validation data.
D. The imputed values must be recalculated using both the training and the validation data.
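A short Python sketch of one common convention: the median is computed once from the training data and then reused unchanged when filling missing values in the validation data. The values and helper names are hypothetical.

```python
from statistics import median

def fit_median(values):
    """Median of the non-missing values (missing is represented as None)."""
    return median(v for v in values if v is not None)

def apply_imputation(values, fill):
    """Replace missing values with a precomputed fill value."""
    return [fill if v is None else v for v in values]

# Hypothetical training and validation columns with missing entries.
train = [3.0, None, 5.0, 9.0]
valid = [None, 4.0, None]

fill = fit_median(train)                       # learned from training data only
train_imputed = apply_imputation(train, fill)
valid_imputed = apply_imputation(valid, fill)  # same value, no recalculation
print(fill, train_imputed, valid_imputed)
```

Reusing the training statistic keeps the validation data from influencing any part of the model-building process.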
What is a benefit of performing data cleansing (imputation, transformations, etc.) after partitioning the data for honest assessment, rather than before partitioning?
A. It makes inference on the model possible.
B. It is computationally easier and requires less time.
C. It omits the training (and test) data sets from the benefits of the cleansing methods.
D. It allows for the determination of the effectiveness of the cleansing method.