UIUC-CS412 "An Introduction to Data Mining" (Fall 2019) Final Exam
(Tuesday, Dec. 17, 2019, 180 minutes, 100 marks, brief answers)

1 Minkowski Distance [10 points]

Given two data points in 2-D space: x1 = (1, 1)^T and x2 = (2, 3)^T.
(a) [4 pts] What is the L2 distance between x1 and x2?
(b) [3 pts] What is the L1 distance between x1 and x2?
(c) [3 pts] What is the L∞ distance between x1 and x2?

Table 1 provides the final exam scores of 9 randomly sampled students in an online course.

Table 1: Final Exam Scores of 9 Students.
Student No.   1    2    3    4    5    6    7    8    9
Final         99   82   78   100  86   92   88   60   75

(a) [3 pts] What is the median score?
Solution: 86
(b) [3 pts] If the score for student #8 drops to 40, how would that affect the median score of this
(c) [2 pts. True or False] If we perform normalization by decimal scaling, the normalized scores will be in the range of [-1, 1].
[...] If we perform z-score normalization on the input dataset, the maximum of the normalized scores is always less than or equal to 1.

Given the following contingency table, we want to use the χ² test to decide whether the two random variables (play chess vs. like science fiction) are correlated.

Table 2: Contingency Table.
                            play chess   not play chess   sum (row)
like science fiction        500          0                500
not like science fiction    300          200              500
sum (column)                800          200              1000

(a) [3 pts] Under the null hypothesis (i.e., 'play chess' and 'like science fiction' are independent of each other), what is the expected number for 'play chess' and 'like science fiction'?
(b) [3 pts] Under the null hypothesis (i.e., 'play chess' and 'like science fiction' are independent of each other), what is the expected number for 'not play chess' and 'like science fiction'?
If the χ² value of this dataset is big, we can conclude that 'play chess' and 'like science fiction' are correlated with each other.
(d) [2 pts.] If 'play chess' and 'like science fiction' are correlated with each other, are these two random variables positively or negatively correlated with each other?

(a) [3 pts] Suppose we will build a data warehouse with four dimensions: location, supplier, time and item. If we do not consider the concept hierarchy, how many cuboids are there?
(b) [4 pts] Suppose the location dimension has three different values (Urbana, Chicago and New York City); the supplier dimension has two different values (Dairy Land and Land O'Lakes); the time dimension has four different values (Q1, Q2, Q3 and Q4); the item dimension has two different values (milk and ham). How many base cells are there (2 pts)? How many cells are there in total (2 pts)?
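The arithmetic behind the questions above can be checked with a short Python sketch. It is not part of the exam or its official solutions. All function and variable names are illustrative, and two points are assumptions: the third Minkowski distance is treated as the supremum (L∞) distance, and the total cell count uses the usual convention that each dimension gains one extra aggregate value ("all").

```python
from math import prod
from statistics import mean, median, pstdev


def minkowski(x, y, p):
    """L_p distance between two points; p = inf gives the supremum distance."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


# Minkowski distances for x1 = (1, 1), x2 = (2, 3)
x1, x2 = (1, 1), (2, 3)
l2 = minkowski(x1, x2, 2)               # sqrt(1^2 + 2^2)
l1 = minkowski(x1, x2, 1)               # 1 + 2
linf = minkowski(x1, x2, float("inf"))  # max(1, 2)

# Median of the Table 1 scores, before and after student 8 drops to 40
scores = [99, 82, 78, 100, 86, 92, 88, 60, 75]
med = median(scores)
dropped = scores.copy()
dropped[7] = 40                          # student 8 is at index 7
med_dropped = median(dropped)

# z-score normalization (population standard deviation assumed)
mu, sigma = mean(scores), pstdev(scores)
max_z = max((s - mu) / sigma for s in scores)

# Expected counts under independence for Table 2: row_total * col_total / n
observed = [[500, 0], [300, 200]]
row_tot = [sum(r) for r in observed]
col_tot = [sum(c) for c in zip(*observed)]
n = sum(row_tot)
expected = [[r * c / n for c in col_tot] for r in row_tot]
chi2 = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

# Data cube counts: location, supplier, time, item
cardinalities = [3, 2, 4, 2]
num_cuboids = 2 ** len(cardinalities)            # each dimension kept or aggregated
base_cells = prod(cardinalities)                 # cells of the base cuboid
total_cells = prod(c + 1 for c in cardinalities) # each dimension plus "all"

print(l2, l1, linf)
print(med, med_dropped, max_z)
print(expected, chi2)
print(num_cuboids, base_cells, total_cells)
```

Comparing each observed count with its expected count also indicates the direction of the association: an observed count above the expected one for the ('play chess', 'like science fiction') cell points to a positive correlation.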