ISYE 6414 Regression Analysis, Georgia Institute of Technology: Solution, Endterm Closed Book Section, Part 1
Question 1
We should always use mean squared error to determine the best value of lambda in lasso regression.
False
True
Question 2 1 / 1 pts
Standard linear regression is an example of a generalized linear model where the response is normally distributed and the link is the identity function.
True
False
Question 3 0 / 1 pts
Goodness-of-fit assessment for logistic regression with replications involves checking for the independence, constant variance, and normality of the deviance residuals.
True
False
Question 4 1 / 1 pts
You are interested in understanding the relationship between stress level and exercise, with stress as the response. In your model, the number of hours a person spends exercising per week would be considered an explanatory variable while the person’s age would be a controlling variable.
True
False
Question 5 1 / 1 pts
The hypothesis test for goodness-of-fit using Pearson residuals and the test using deviance residuals will always reach the same conclusion.
False
True
Question 6 1 / 1 pts
A logistic regression model with high goodness of fit can have low predictive power.
True
False
Question 7 1 / 1 pts
If we apply a Poisson regression model using a small sample size, the estimators of the regression coefficients may not follow an approximate Normal distribution, affecting the reliability of the statistical inference on the coefficients.
True
False
Question 8 1 / 1 pts
You fit a regression model using three predictors. You notice the estimated coefficient for predictor X1 is an order of magnitude larger than the estimated coefficient for predictor X2. It is correct to conclude that X1 has a greater effect on the response than X2.
False
True
Question 9 0 / 1 pts
Regularized regression with a lambda value equal to 0 is equivalent to regression model estimation without penalization.
True
False
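The lambda = 0 case can be checked numerically. The following is a minimal sketch, assuming a single predictor with no intercept so that the ridge estimate has the short closed form sum(x*y) / (sum(x*x) + lambda). The function name `ridge_slope` and the data are illustrative and not part of the exam.

```python
def ridge_slope(x, y, lam):
    """Closed-form ridge estimate for one predictor and no intercept (illustrative)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

# Made-up data, roughly y = 2x plus noise.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.2, 2.1, 3.9]

# Ordinary least squares slope for the same no-intercept model.
ols_slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

slope_lam0 = ridge_slope(x, y, 0.0)  # no penalty
slope_lam5 = ridge_slope(x, y, 5.0)  # penalized, shrunk toward zero

print(ols_slope, slope_lam0, slope_lam5)
```

With lambda = 0 the penalty term vanishes and the two estimates coincide; any positive lambda shrinks the slope toward zero.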
Question 10 0 / 1 pts
In a Poisson regression model, the difference between the null deviance and residual deviance follows a normal distribution.
True
False
Question 11 1 / 1 pts
You want to examine the relationship between study time and score on exams. You create five exams and recruit 50 participants. For each participant in your study, you record their time studying for and grade on each of those five exams. If you were to use all the data you recorded to build a simple linear regression model, you would violate the independence assumption.
True
False
Question 12 1 / 1 pts
Maximum likelihood estimation produces unbiased coefficient estimates for logistic and Poisson regression.
True
False
Question 13 1 / 1 pts
If considering only BIC for the model selection criterion, a model with lower BIC is preferred over a model with higher BIC.
True
False
Question 14 0 / 1 pts
If we do not penalize the training risk, our variable selection method would always prefer more complex models.
True
False
Question 15 1 / 1 pts
One goal of variable selection is to balance the bias-variance tradeoff when making predictions.
True
False
Question 16 1 / 1 pts
Forward stepwise selection is computationally more expensive than backward stepwise selection because it takes more iterations to terminate.
False
True
Question 17 0 / 1 pts
When testing for the significance of a subset of predictors, the null hypothesis is that all coefficients for variables in that subset are 0 and the alternative is that all those coefficients are not 0.
True
False
Question 18 1 / 1 pts
Classification error estimated from leave-one-out cross validation tends to have higher bias but lower variability than classification error estimated from 2-fold cross validation.
True
False
Question 19 0 / 1 pts
One reasonable method for variable selection is to fit the full model and then drop all variables with a high p-value in that full model.
True
False
Question 20 1 / 1 pts
The logit link is the only link function that yields s-shaped curves.
False
True
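To make the link-function comparison concrete, here is a minimal sketch that evaluates three inverse links commonly used for binary responses: logit, probit, and complementary log-log. The function names and the grid of linear-predictor values are illustrative assumptions. Each inverse link maps the real line into (0, 1) and increases monotonically, which is what produces the s-shaped curve.

```python
import math

def inv_logit(eta):
    """Inverse logit link: logistic function."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Inverse probit link: standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta):
    """Inverse complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))

# Linear-predictor values from -3.0 to 3.0 in steps of 0.5.
grid = [i / 2.0 for i in range(-6, 7)]

curves = {
    "logit": [inv_logit(e) for e in grid],
    "probit": [inv_probit(e) for e in grid],
    "cloglog": [inv_cloglog(e) for e in grid],
}

for name, probs in curves.items():
    print(name, [round(p, 3) for p in probs])
```

The logit and probit curves are symmetric about 0.5, while the complementary log-log curve is asymmetric, but all three rise from near 0 to near 1.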
Question 21 2 / 2 pts
Recall the formula for adjusted R-squared.
Which of the following is TRUE about adjusted R-squared?
The adjusted R-squared can be used to compare models, and its value will always be less than or equal to that of R-squared.
The adjusted R-squared cannot be used to compare models, and its value will always be less than or equal to that of R-squared.
The adjusted R-squared can be used to compare models, and its value will always be greater than or equal to that of R-squared.
The adjusted R-squared cannot be used to compare models, and its value will always be greater than or equal to that of R-squared.
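For reference, a common form of the adjusted R-squared formula is sketched here, assuming n observations and p predictors not counting the intercept, with n - p - 1 > 0. The notation used in the course may differ.

```latex
R^2_{\text{adj}} = 1 - \frac{(1 - R^2)\,(n - 1)}{n - p - 1}
```

Since (n - 1)/(n - p - 1) is at least 1 whenever p is at least 0, and 1 - R^2 is non-negative, this form gives an adjusted R-squared that never exceeds R-squared, with equality only when p = 0 or R-squared = 1.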
Question 22 0 / 2 pts
Mary has a dataset with height (in inches), weight (in lbs), and math_score (final exam score out of 100) of 300 students in an undergraduate math course. She creates another field called BMI (Body Mass Index) calculated as BMI = 703*(weight/height^2). She wants to examine if math_score is related to height, weight and BMI. She plans to use a linear regression model math_score ~ height+weight+BMI to study this relationship.
Leonard hears about Mary’s plan and tells Mary that BMI should not be used in her experiment because it is created from the height and weight variables which are already included in the model. He says this leads to an issue called multicollinearity in linear regression. Which of the below options is TRUE?
Leonard is right; retaining height, weight, BMI in the model will definitely lead to multicollinearity.
Leonard is wrong because BMI is not a linear combination of weight and height.
Leonard is wrong; it is impossible to say whether multicollinearity is a problem in a proposed model without first fitting the model.
Leonard is right, but the correct name for this issue is homoscedasticity.
Question 23 0 / 2 pts
When working to create a logistic regression model, an analyst is considering two models:
Model one with only one predicting variable A.
Model two includes predicting variable B in addition to variable A.
The analyst notices that the sign of the estimated coefficient for A is negative in model one and positive in model two. This is most likely because:
A is a confounding variable
B is correlated with A
A is significant in Model two
B is significant in Model two
Question 24 2 / 2 pts
The regression glm(Y~X, family=poisson) was fitted to count data, resulting in the estimate of β0 to be 16 and the estimate of β1 to be 9.4. For a one unit increase in X,
the log rate increases by 9.4 units.
the rate increases by 9.4 units.
the log rate increases by exp(9.4) units.
the rate increases by exp(9.4) percent
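The arithmetic behind the Poisson interpretation can be sketched as follows, assuming the default log link of family=poisson and assuming that 16 is the intercept estimate and 9.4 is the coefficient of X. The helper names `log_rate` and `rate`, and the chosen value of x, are illustrative.

```python
import math

# Assumption: 16 is the intercept estimate, 9.4 is the slope estimate for X.
beta0, beta1 = 16.0, 9.4

def log_rate(x):
    """Linear predictor under the log link: log(rate) = beta0 + beta1 * x."""
    return beta0 + beta1 * x

def rate(x):
    """Expected count implied by the linear predictor."""
    return math.exp(log_rate(x))

x = 0.5  # any starting value gives the same differences

log_diff = log_rate(x + 1) - log_rate(x)  # additive change on the log scale
rate_ratio = rate(x + 1) / rate(x)        # multiplicative change on the rate scale

print(log_diff, rate_ratio, math.exp(beta1))
```

A one unit increase in X adds 9.4 to the log rate, which is the same as multiplying the rate by exp(9.4); neither quantity is an additive change of 9.4 in the rate itself.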
Question 25 2 / 2 pts
Which of the following is correct?
Lasso regression uses the L1 norm.
Ridge regression uses the L0 norm.
Ridge regression performs variable selection.
Elastic net does not perform variable selection.
Question 26 0 / 2 pts
A data point with predictor and response values three orders of magnitude higher than the corresponding values of all other observations is for certain
an outlier with respect to the data
an influential point but not an outlier
an erroneous data point
None of the above
Question 27 0 / 2 pts
In simple linear regression, what is the relation between the correlation coefficient and R-squared?
= sqrt(R-squared)
There is no relation between the two
= R-squared
= R-squared + Adjusted R-squared
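The link between the two quantities can be verified on a small example. This is a minimal sketch with made-up data: it computes the Pearson correlation r between x and y, fits the least-squares line, and compares r squared with R-squared = 1 - SSE/SST. All variable names are illustrative.

```python
import math

# Made-up data for a simple linear regression of y on x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)  # total sum of squares (SST)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

# Pearson correlation coefficient between x and y.
r = sxy / math.sqrt(sxx * syy)

# Least-squares fit and R-squared.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
r_squared = 1.0 - sse / syy

print(r, r ** 2, r_squared)
```

In simple linear regression the squared correlation equals R-squared, so the magnitude of the correlation is sqrt(R-squared) and its sign matches the sign of the fitted slope.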
Question 28 0 / 2 pts
A high VIF value (> 10) for predictors in linear regression suggests
multicollinearity.
autocorrelation.
homoscedasticity.
non-linear relation between dependent and independent variables.
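As a minimal sketch of how the VIF is computed, recall that VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. The example below assumes only two predictors, in which case R_j^2 is the squared correlation between them. The function names and data are illustrative.

```python
import math

def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_near = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1]  # nearly a copy of x1
x2_weak = [3.0, 1.0, 4.0, 1.0, 5.0, 2.0]  # weakly related to x1

vif_near = vif_two_predictors(x1, x2_near)
vif_weak = vif_two_predictors(x1, x2_weak)

print(vif_near, vif_weak)
```

The nearly collinear pair gives a VIF far above the usual threshold of 10, while the weakly related pair gives a VIF close to 1.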
Question 29 2 / 2 pts
Which of the following is true regarding logistic regression?
The logit link is the canonical link function.
Logistic regression can also be replaced by standard linear regression if with repetitions
Logistic regression can only be used with continuous predicting variables.
Logistic regression can only be used when the response variable is binary.
Question 30 2 / 2 pts
Which of the following is correct?
Overdispersion affects the reliability of our statistical inferences if not modeled correctly.
Overdispersion is a concern for Poisson regression but not for logistic regression.
For both logistic and Poisson regression, the variance of the response equals the expectation of the response given the predicting variables.
Under overdispersion, the observed variance is smaller than the variance implied by our model.
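A rough check for overdispersion in count data can be sketched as follows, assuming an intercept-only Poisson model (an assumption made to keep the example free of model fitting). In that case the fitted mean is the sample mean, and the Pearson dispersion estimate is the Pearson chi-square statistic divided by n - 1. The function name and the counts are illustrative.

```python
def dispersion_intercept_only(counts):
    """Pearson dispersion estimate for an intercept-only Poisson model."""
    n = len(counts)
    mean = sum(counts) / n  # fitted rate for every observation
    pearson_chi2 = sum((y - mean) ** 2 / mean for y in counts)
    return pearson_chi2 / (n - 1)  # residual degrees of freedom: n - 1

# Made-up counts with many zeros and a few large values.
counts = [0, 1, 0, 12, 2, 0, 9, 1, 0, 15]

phi = dispersion_intercept_only(counts)
print(phi)
```

A Poisson model implies a dispersion of about 1, so an estimate well above 1 indicates that the observed variance is larger than the variance implied by the model.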
Quiz Score: 24 out of 40