
ISYE 6414 Regression Analysis - Solution_Endterm Closed Book Section - Part 1_ Regression Analysis --Georgia Institute Of Technology. Correct Answers Highlighted.


Endterm Closed Book Section - Part 1

Question 1
We should always use mean squared error to determine the best value of lambda in lasso regression.
False
True

Question 2 (1 / 1 pts)
Standard linear regression is an example of a generalized linear model where the response is normally distributed and the link is the identity function.
True
False

Question 3 (0 / 1 pts)
Goodness-of-fit assessment for logistic regression with replications involves checking for the independence, constant variance, and normality of the deviance residuals.
True
False

Question 4 (1 / 1 pts)
You are interested in understanding the relationship between stress level and exercise, with stress as the response. In your model, the number of hours a person spends exercising per week would be considered an explanatory variable while the person’s age would be a controlling variable.
True
False

Question 5 (1 / 1 pts)
The hypothesis test for goodness-of-fit using Pearson residuals and the test using deviance residuals will always reach the same conclusion.
False
True

Question 6 (1 / 1 pts)
A logistic regression model with high goodness of fit can have low predictive power.
True
False

Question 7 (1 / 1 pts)
If we apply a Poisson regression model using a small sample size, the estimators of the regression coefficients may not follow an approximate Normal distribution, affecting the reliability of the statistical inference on the coefficients.
True
False

Question 8 (1 / 1 pts)
You fit a regression model using three predictors. You notice the estimated coefficient for predictor X1 is an order of magnitude larger than the estimated coefficient for predictor X2. It is correct to conclude that X1 has a greater effect on the response than X2.
False
True

Question 9 (0 / 1 pts)
Regularized regression with a lambda value equal to 0 is equivalent to regression model estimation without penalization.
True
False

Question 10 (0 / 1 pts)
In a Poisson regression model, the difference between the null deviance and residual deviance follows a normal distribution.
True
False

Question 11 (1 / 1 pts)
You want to examine the relationship between study time and score on exams. You create five exams and recruit 50 participants. For each participant in your study, you record their time studying for and grade on each of those five exams. If you were to use all the data you recorded to build a simple linear regression model, you would violate the independence assumption.
True
False

Question 12 (1 / 1 pts)
Maximum likelihood estimation produces unbiased coefficient estimates for logistic and Poisson regression.
True
False

Question 13 (1 / 1 pts)
If considering only BIC for the model selection criterion, a model with lower BIC is preferred over a model with higher BIC.
True
False

Question 14 (0 / 1 pts)
If we do not penalize the training risk, our variable selection method would always prefer more complex models.
True
False

Question 15 (1 / 1 pts)
One goal of variable selection is to balance the bias-variance tradeoff when making predictions.
True
False

Question 16 (1 / 1 pts)
Forward stepwise selection is computationally more expensive than backward stepwise selection because it takes more iterations to terminate.
False
True

Question 17 (0 / 1 pts)
When testing for the significance of a subset of predictors, the null hypothesis is that all coefficients for variables in that subset are 0 and the alternative is that all those coefficients are not 0.
True
False

Question 18 (1 / 1 pts)
Classification error estimated from leave-one-out cross validation tends to have higher bias but lower variability than classification error estimated from 2-fold cross validation.
True
False

Question 19 (0 / 1 pts)
One reasonable method for variable selection is to fit the full model and then drop all variables with a high p-value in that full model.
True
False

Question 20 (1 / 1 pts)
The logit link is the only link function that yields s-shaped curves.
False
True

Question 21 (2 / 2 pts)
Recall the formula for adjusted R-squared is [formula not shown]. Which of the following is TRUE about adjusted R-squared?
The adjusted R^2 can be used to compare models, and its value will always be less than or equal to that of R^2.
The adjusted R^2 cannot be used to compare models, and its value will always be less than or equal to that of R^2.
The adjusted R^2 can be used to compare models, and its value will always be greater than or equal to that of R^2.
The adjusted R^2 cannot be used to compare models, and its value will always be greater than or equal to that of R^2.

Question 22 (0 / 2 pts)
Mary has a dataset with height (in inches), weight (in lbs), and math_score (final exam score out of 100) of 300 students in an undergraduate math course. She creates another field called BMI (Body Mass Index) calculated as BMI = 703*(weight/height^2). She wants to examine if math_score is related to height, weight and BMI. She plans to use a linear regression model math_score ~ height+weight+BMI to study this relationship.
Leonard hears about Mary’s plan and tells Mary that BMI should not be used in her experiment because it is created from the height and weight variables which are already included in the model. He says this leads to an issue called multicollinearity in linear regression. Which of the below options is TRUE?
Leonard is right; retaining height, weight, BMI in the model will definitely lead to multicollinearity.
Leonard is wrong because BMI is not a linear combination of weight and height.
Leonard is wrong; it is impossible to say whether multicollinearity is a problem in a proposed model without first fitting the model.
Leonard is right, but the correct name for this issue is homoscedasticity.

Question 23 (0 / 2 pts)
When working to create a logistic regression model, an analyst is considering two models:
Model one with only one predicting variable A.
Model two includes predicting variable B in addition to variable A.
The analyst notices that the sign of the estimated coefficient for A is negative in model one and positive in model two. This is most likely because:
A is a confounding variable
B is correlated with A
A is significant in Model two
B is significant in Model two

Question 24 (2 / 2 pts)
The regression glm(Y~X, family=poisson) was fitted to count data, resulting in the estimate of β0 to be 16 and the estimate of β1 to be 9.4. For a one unit increase in X,
the log rate increases by 9.4 units.
the rate increases by 9.4 units.
the log rate increases by exp(9.4) units.
the rate increases by exp(9.4) percent.

Question 25 (2 / 2 pts)
Which of the following is correct?
Lasso regression uses the L1 norm.
Ridge regression uses the L0 norm.
Ridge regression performs variable selection.
Elastic net does not perform variable selection.

Question 26 (0 / 2 pts)
A data point with predictor and response values three orders of magnitude higher than the corresponding values of all other observations is for certain
an outlier with respect to the data
an influential point but not an outlier
an erroneous data point
None of the above

Question 27 (0 / 2 pts)
In simple linear regression, what is the relation between the correlation coefficient and R-squared?
Correlation coefficient = sqrt(R-squared)
There is no relation between the two
Correlation coefficient = R-squared
Correlation coefficient = R-squared + Adjusted R-squared

Question 28 (0 / 2 pts)
High VIF (> 10) value for predictors in linear regression suggests
multicollinearity.
autocorrelation.
homoscedasticity.
non-linear relation between dependent and independent variables.

Question 29 (2 / 2 pts)
Which of the following is true regarding logistic regression?
The logit link is the canonical link function.
Logistic regression can also be replaced by standard linear regression if with repetitions.
Logistic regression can only be used with continuous predicting variables.
Logistic regression can only be used when the response variable is binary.

Question 30 (2 / 2 pts)
Which of the following is correct?
Overdispersion affects the reliability of our statistical inferences if not modeled correctly.
Overdispersion is a concern for Poisson regression but not for logistic regression.
For both logistic and Poisson regression, the variance of the response equals the expectation of the response given the predicting variables.
Under overdispersion, the observed variance is smaller than the variance implied by our model.

Quiz Score: 24 out of 40
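The coefficient interpretation asked about in Question 24 can be checked numerically. The sketch below is only an illustration, not part of the exam solution: it assumes the two estimates in that question (16 and 9.4) are the intercept and the slope on X, labels them beta0 and beta1 (assumed names), and uses the log link that family=poisson applies by default. The value of x is arbitrary.

```python
import math

# Estimates quoted in Question 24; beta0/beta1 are assumed labels
# for the intercept and the coefficient on X.
beta0 = 16.0
beta1 = 9.4


def log_rate(x):
    """Linear predictor of a Poisson model with log link: log(rate) = beta0 + beta1 * x."""
    return beta0 + beta1 * x


def rate(x):
    """Expected count (rate) implied by the linear predictor."""
    return math.exp(log_rate(x))


x = 0.5  # arbitrary illustrative value of the predictor

# Additive change on the log scale for a one-unit increase in x.
log_rate_change = log_rate(x + 1) - log_rate(x)

# Multiplicative change on the rate scale for the same increase.
rate_ratio = rate(x + 1) / rate(x)

print(f"change in log rate: {log_rate_change:.4f}")
print(f"ratio of rates:     {rate_ratio:.4f}")
print(f"exp(beta1):         {math.exp(beta1):.4f}")
```

Under these assumptions the log rate moves additively by the slope, while the rate itself is multiplied by exp(slope); neither change depends on the starting value of x.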

Last updated: 2 years ago
