
CS 412 Introduction To Data Mining - University of Illinois_Take-Home Midterm


CS 412: Spring’21
Introduction To Data Mining
Take-Home Midterm
(Due Tuesday, March 23, 10:00 am)

General Instructions

• You must answer the questions yourself; you cannot consult with other students in the class. It is an open-book exam, so you can use the textbook and the material shared in class, e.g., slides, lectures, etc.
• The take-home midterm is due at 10 am, Tue, March 23. We will be using Compass for collecting the homework assignments. Please submit your answers via Compass (http://compass2g.illinois.edu). Contact the TAs if you are having technical difficulties in submitting the assignment. We will NOT accept late submissions.
• Your answers should be typeset and submitted as a PDF. You cannot submit a hand-written and scanned version of your midterm.
• You DO NOT have to submit code for any of the questions.
• You will not get full credit for a question if you only give a final result. Please show the necessary details, calculation steps, and explanations as appropriate.
• If you have clarification questions, you can use Slack or Campuswire. However, since the midterm needs to be submitted within 24 hours, please try to do your best in answering the questions based on your own understanding, in case responses are delayed.

1. (18 points) This question considers summarization and visualization of probability distributions:
(a) (3 points) Describe what a five-number summary of a distribution is.
(b) (3 points) Describe what boxplots are and explain how boxplots incorporate the five-number summary.
(c) (3 points) Can two different distributions have the exact same boxplot? Clearly explain your answer.
(d) (3 points) Describe what quantile plots are.
(e) (3 points) Describe what quantile-quantile plots are.
(f) (3 points) How is a quantile-quantile plot different from a quantile plot? Clearly explain.

2. (22 points) Table 1 is a summary of customers’ purchase history of diapers and beer. In particular, for a total of 1000 customers, the table shows how many bought both Beer and Diapers, how many bought Beer but not Diapers, and so on. For the problem, we will treat both ‘Buy Beer’ and ‘Buy Diaper’ as binary attributes.

                 Buy Diaper    Not Buy Diaper
Buy Beer         100           300
Not Buy Beer     400           200

Table 1: Contingency table for Beer and Diaper sales.

(a) (3 points) Under the null hypothesis, i.e., ‘Buy Beer’ and ‘Buy Diaper’ are independent, what is the expected number for ‘Buy Beer’ and ‘Buy Diaper’?
(b) (3 points) Under the null hypothesis, i.e., ‘Buy Beer’ and ‘Buy Diaper’ are independent, what is the expected number for ‘Buy Beer’ and ‘Not Buy Diaper’?
(c) (4 points) What is the χ² statistic for the contingency table? Show the steps of your calculation.
(d) (4 points) At a significance level of α = 0.05, are these two variables ‘Buy Beer’ and ‘Buy Diaper’ independent? Explain your answer.
(e) (4 points) Consider an updated contingency table where the entry for ‘Not Buy Beer’ and ‘Not Buy Diaper’ is 20,000 instead of 200, and all other entries are the same. What is the χ² statistic for this updated contingency table? Show the steps of your calculation.
(f) (4 points) For the updated contingency table, at a significance level of α = 0.05, are these two variables ‘Buy Beer’ and ‘Buy Diaper’ independent? Explain your answer.

3. (24 points) This question considers frequent pattern mining and association rule mining.
(a) (12 points) A transaction database (Table 2) has 5 transactions, and we will consider frequent pattern and association mining with (relative) minimum support min_sup = 0.6 and (relative) minimum confidence min_conf = 0.6.
i. (6 points) What is the frequent k-itemset for the largest k? Explain your answer. If there are more than one, it is sufficient to mention (and explain) only on [Show More]
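
The χ² calculation that Question 2 asks for can be sketched in a few lines of Python. This is only an illustration, not an official solution: the function name `chi_square_2x2` is made up here, and the cell layout is an assumption, namely that the flattened counts 100, 300, 400, 200 read row by row with rows (Buy Beer, Not Buy Beer) and columns (Buy Diaper, Not Buy Diaper). The critical value 3.841 is the standard χ² cutoff for 1 degree of freedom at α = 0.05.

```python
def chi_square_2x2(observed):
    """Return (expected counts, chi-square statistic) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Under independence, expected count = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return expected, chi2


# ASSUMED layout: rows = (Buy Beer, Not Buy Beer), columns = (Buy Diaper, Not Buy Diaper).
table1 = [[100, 300], [400, 200]]
# Updated table from part (e): the last cell becomes 20,000.
updated = [[100, 300], [400, 20000]]

CRITICAL_05_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05

for name, table in (("Table 1", table1), ("Updated table", updated)):
    expected, chi2 = chi_square_2x2(table)
    verdict = "reject independence" if chi2 > CRITICAL_05_DF1 else "cannot reject independence"
    print(f"{name}: expected={expected}, chi2={chi2:.2f} -> {verdict}")
```

A 2×2 table has (2 − 1)(2 − 1) = 1 degree of freedom, which is why a single critical value serves both tables. A written answer would still need the per-cell steps shown by hand, as the instructions require.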

Last updated: 2 years ago

Document information


About the document


Uploaded On

Apr 02, 2023

Number of pages

4

Seller


PAPERS UNLIMITED™

Member since 3 years

509 Documents Sold

Additional information

Downloads

 0

Views

 98
