MSMIT CSC550 Week1_hw Chapter 2_Problems. All Answers Provided.

MSMIT CSC550 Week1_hw Chapter 2_Problems

1. Assuming that data mining techniques are to be used in the following cases, identify whether the task required is supervised or unsupervised learning.

(a) Deciding whether to issue a loan to an applicant based on demographic and financial data (with reference to a database of similar data on prior customers).

(b) In an online bookstore, making recommendations to customers concerning additional items to buy based on the buying patterns in prior transactions. This is unsupervised learning, because there is no recorded outcome showing whether a recommendation was followed.

(c) Identifying a network data packet as dangerous (virus, hacker attack) based on comparisons to other packets whose threat status is known. This is supervised learning, because the threat status of the comparison packets is known.

(d) Identifying segments of similar customers. This is unsupervised learning, because there is no known outcome.

(e) Predicting whether a company will go bankrupt based on comparing its financial data to those of similar bankrupt and nonbankrupt firms. This is supervised learning, because the outcome (bankrupt or not) is known for the comparison firms.

(f) Estimating the repair time required for an aircraft based on a trouble ticket. This is supervised learning, because the outcome value, repair time, is known.

(g) Automatic sorting of mail by zip code scanning. This is supervised learning, because the sorting has a defined outcome.

(h) Printing of customer discount coupons at the conclusion of a grocery store checkout based on what you just bought and what others have bought recently. This is unsupervised learning, because there is no obvious outcome and it is hard to guess what other customers will do.

2. Describe the difference in roles assumed by the validation partition and the test partition.

3. Consider the sample from a database of credit applications shown in Figure 2.13. Comment on the likelihood that it was sampled randomly, and whether it is likely to be a useful sample.

5. Using the concept of overfitting, explain why when a model is fit to training data, zero error with those data is not necessarily good.

8. Normalize the data in Table 2.3. Normalization of a measurement is obtained by subtracting the average from each measurement and dividing the difference by the standard deviation.

10. Two models are applied to a dataset that has been partitioned. Model A is considerably more accurate than model B on the training data, but slightly less accurate than model B on the validation data. Which one are you more likely to consider for final deployment?
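
The normalization rule stated in problem 8 (subtract the average, then divide by the standard deviation) can be sketched in a few lines of Python. This is only an illustration: Table 2.3 is not reproduced in this preview, so the values below are hypothetical, the function name normalize is made up for the example, and the choice of the sample standard deviation (n - 1 denominator) is an assumption the problem statement does not settle.

```python
from statistics import mean, stdev


def normalize(values):
    """Return z-scores: (measurement - average) / standard deviation."""
    avg = mean(values)
    # Assumption: sample standard deviation (n - 1 denominator).
    # Use statistics.pstdev instead for the population version.
    sd = stdev(values)
    if sd == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(x - avg) / sd for x in values]


# Hypothetical measurements, NOT the contents of Table 2.3.
ages = [25, 56, 65, 32, 41, 49]
z = normalize(ages)
print([round(v, 2) for v in z])
```

After normalization the values have an average of 0 and a standard deviation of 1, which puts variables measured on different scales on a comparable footing.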

Last updated: 2 years ago

Document information

Uploaded on: Sep 22, 2020
Number of pages: 3
Seller: QuizMaster
