UIUC-CS412 An Introduction to Data Mining_University of Illinois (Fall 2020) Final Exam. 100 marks, brief answers

UIUC-CS412 "An Introduction to Data Mining" (Fall 2020) Final Exam

1 Minkowski Distance [10 points]

Given three data points in 2-D space: x1 = (1, 0)^T, x2 = (-1, 0)^T and x3 = (a, b)^T, where a and b are two unknown numbers. Let d1 be the distance between x1 and x3, and d2 be the distance between x2 and x3.

(a) [6 pts] What are the L2, L1 and L∞ distances between x1 and x2, respectively?
(b) [1.5 pts] If we use L2 distance, under which condition does d1 = d2?
(c) [2.5 pts] If we use L1 distance, under which condition does d1 = d2?

Half point for each part.

2 Basic Statistics and Normalization [10 points]

Table 1 provides the final exam scores of 9 randomly sampled students in an online course.

Table 1: Final Exam Scores of 9 Students.

(a) [3 pts] What is the median score?
(b) [3 pts] [True or False] If one student's score improves, the sample mean will definitely increase as well.
(c) [2 pts] [True or False] If the scores of six students improve, the median will definitely increase as well.
(d) [2 pts] Suppose the scores of k (1 ≤ k ≤ 9) students improve and the remaining (9 - k) students' scores remain the same. What is the minimal k such that the median will definitely increase?

3 Data Warehouse [10 points]

(a) [4 pts] Suppose we build a data warehouse with three dimensions: location, supplier, and time. If we do not consider the concept hierarchy, how many cuboids are there in total?
(b) [6 pts] Suppose the location dimension has three different values (Urbana, Chicago and New York City), the supplier dimension has two different values (Dairy Land and Land O'Lakes), and the time dimension has twelve different values, ranging from January to December. How many base cells are there in total (3 pts)? How many aggregated cells are there in total (3 pts)?

4 Pattern Evaluations [10 points]

Given two itemsets A and B and the following contingency table (Table 2).

          A        ¬A       Σ row
B         a        b        a + b
¬B        c        d        c + d
Σ col     a + c    b + d    a + b + c + d

Table 2: Contingency Table of Problem 4

(a) [4 pts] If we use lift as the interestingness measure, under which condition will we conclude that A and B are positively correlated?
Solution: ad > bc
(b) [3 pts] Suppose we conclude that A and B are positively correlated based on lift. [True or False] Now suppose we increase d while keeping a, b, c unchanged. A and B will still be positively correlated based on lift.
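The stated solution to Problem 4(a) can be checked numerically. The sketch below is not part of the exam. It assumes the usual definition lift(A, B) = P(A and B) / (P(A) P(B)) and reads Table 2 as a = count(A, B), b = count(¬A, B), c = count(A, ¬B), d = count(¬A, ¬B). Under that reading, lift > 1 reduces to a(a + b + c + d) > (a + c)(a + b), which simplifies to ad > bc. The function names are illustrative only.

```python
from fractions import Fraction
from itertools import product


def lift(a, b, c, d):
    """Lift of A and B from the Table 2 counts (assumed layout:
    a = (A, B), b = (not A, B), c = (A, not B), d = (not A, not B))."""
    n = a + b + c + d
    p_ab = Fraction(a, n)        # P(A and B)
    p_a = Fraction(a + c, n)     # P(A): column sum of A
    p_b = Fraction(a + b, n)     # P(B): row sum of B
    return p_ab / (p_a * p_b)


def positively_correlated(a, b, c, d):
    """Positive correlation under lift means lift > 1."""
    return lift(a, b, c, d) > 1


def condition_matches_on_grid(max_count=5):
    """Check that lift > 1 agrees with ad > bc for all positive counts up to max_count."""
    counts = range(1, max_count + 1)
    return all(
        positively_correlated(a, b, c, d) == (a * d > b * c)
        for a, b, c, d in product(counts, repeat=4)
    )


if __name__ == "__main__":
    print(lift(4, 1, 2, 3))
    print(condition_matches_on_grid())
```

Because exact fractions are used, the comparison with 1 is free of floating-point error. The same condition bears on part (b): with a, b, c fixed and positive, increasing d only makes ad larger, so ad > bc continues to hold.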

Last updated: 2 years ago

About the document

Uploaded on: Apr 02, 2023
Number of pages: 13
Seller: PAPERS UNLIMITED™
