INFO 367 Chapter 2 Problems
2/6/17

2.1) Assuming that data mining techniques are to be used in the following cases, identify whether the task required is supervised or unsupervised learning.

a. Deciding whether to issue a loan to an applicant based on demographic and financial data (with reference to a database of similar data on prior customers).
b. In an online bookstore, making recommendations to customers concerning additional items to buy based on the buying patterns in prior transactions.
c. Identifying a network data packet as dangerous (virus, hacker attack) based on comparison to other packets whose threat status is known.
d. Identifying segments of similar customers.
e. Predicting whether a company will go bankrupt based on comparing its financial data to those of similar bankrupt and non-bankrupt firms.
f. Estimating the repair time required for an aircraft based on a trouble ticket.
g. Automated sorting of mail by zip code scanning.
h. Printing of custom discount coupons at the conclusion of a grocery store checkout based on what you just bought and what others have bought previously.

2.2) Describe the difference in roles assumed by the validation partition and the test partition.

2.3) Consider the sample from the database of credit applicants in Table 2.5. Comment on the likelihood that it was sampled randomly, and whether it is likely to be a useful sample.

2.4) Consider the sample from a bank database shown in Table 2.6; it was selected randomly from a larger database to be the training set. "Personal loan" indicates whether a solicitation for a personal loan was accepted and is the response variable. A campaign is planned for a similar solicitation in the future, and the bank is looking for a model that will help identify likely responders. Examine the data carefully and indicate what your next step would be.

2.5) Using the concept of overfitting, explain why, when a model is fit to training data, zero error on those data is not necessarily good.

2.6) In fitting a model to classify prospects as purchasers or nonpurchasers, a certain company drew the training data from internal data that include demographic and purchase information. Future data to be classified will be lists purchased from other sources, with demographic data included. It was found that "refund issued" was a useful predictor in the training data. Why is this not an appropriate variable to include in the model?

2.7) A dataset has 1000 records and 50 variables with 5% of the values missing, spread randomly throughout the records and variables. An analyst decides to remove records with missing values. About how many records would you expect to be removed?

2.8) Normalize the data.

| Age | Age (z-score) | Income | Income (z-score) |
| 25 | -1.313253 | 49000 | -0.790027 |
| 56 | 0.7567898 | 156000 | 0.9119774 |
| 65 | 1.35777 | 99000 | 0.0053022 |
| 32 | -0.845824 | 192000 | 1.4846144 |
| 41 | -0.244844 | 39000 | -0.949093 |
| 49 | 0.2893608 | 57000 | -0.662774 |

| | Age | Income |
| Average | 44.66667 | 98666.67 |
| Standard Deviation | 14.97554 | 62867.06 |

2.9) Can normalizing the data change which records are furthest away from each other in terms of Euclidean distance?

2.10) Two models are applied to a dataset that has been partitioned. Model A is considerably more accurate than Model B on the training data, but slightly less accurate than Model B on the validation data. Which model are you more likely to consider for final development?
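
The normalized values in problem 2.8 can be checked with a short script, and the same data gives a concrete way to explore problem 2.9. The following is a minimal sketch, not the course's official solution. It assumes the normalization is a z-score, (value - average) / standard deviation, with the sample standard deviation (n - 1 denominator), which is what the averages and standard deviations listed in 2.8 correspond to. The helper names `z_scores` and `furthest_pair` are made up for this illustration.

```python
import math
from itertools import combinations
from statistics import mean, stdev

# Data from problem 2.8.
ages = [25, 56, 65, 32, 41, 49]
incomes = [49000, 156000, 99000, 192000, 39000, 57000]


def z_scores(values):
    """Return (value - average) / sample standard deviation for each value."""
    m = mean(values)
    s = stdev(values)  # statistics.stdev uses the n - 1 denominator
    return [(v - m) / s for v in values]


def furthest_pair(points):
    """Return the index pair (i, j), i < j, with the largest Euclidean distance."""
    return max(
        combinations(range(len(points)), 2),
        key=lambda ij: math.dist(points[ij[0]], points[ij[1]]),
    )


age_z = z_scores(ages)
income_z = z_scores(incomes)

# Furthest pair of records before and after normalization (0-based row indices).
raw_pair = furthest_pair(list(zip(ages, incomes)))
norm_pair = furthest_pair(list(zip(age_z, income_z)))

if __name__ == "__main__":
    for a, az, inc, iz in zip(ages, age_z, incomes, income_z):
        print(f"{a:>3} {az:>10.6f} {inc:>7} {iz:>10.6f}")
    print("furthest pair (raw):", raw_pair)
    print("furthest pair (normalized):", norm_pair)
```

On the raw data, income dominates the distance because its scale is thousands of times larger than that of age; after normalization both variables contribute on a comparable scale. Comparing `raw_pair` with `norm_pair` shows whether the furthest pair changes for these six records.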