CS 412 Introduction To Data Mining - University of Illinois-2020 Midterm Exam Q&A

Midterm Exam

1. Fill in your information:
Full Name:
NetID:

Question 1 (20 points): Get to Know Your Data

A. [5] The least and greatest numbers in a list of 7 integers are 2 and 20, respectively. The median and mode of the list are 6 and 3, respectively. Which of the following options can be the mean of the list?
(a) 4
(b) 7
(c) 6.85
(d) 6.71
Note that there may be more than one correct option.

B. [5] Consider the following data on the age distribution of a population of 3594 people.

age       frequency
1-5       200
6-15      450
16-20     300
21-50     1500
51-80     700
81-110    444

(a) Compute the approximate median age for the given population.
(b) What special case will result in the maximum error in the approximate median you calculated above? Assume that the frequency for each interval remains unchanged.

C. [5] Consider the following data on two persons, Jack and Jill. Height and Weight are given in ft and lbs, respectively.

Name   Age   Height   Weight
Jack   24    7        210
Jill   14    4        140

Age, Height and Weight are ordinal attributes with the following interval states.
(a) Age: [3-18], [19-40], [41-80]
(b) Height: [3-5], [6-9]
(c) Weight: [70-110], [110-150], [150-190], [190-230]
Compute the Manhattan distance between the two persons. Show the steps of your calculation.

D. [5] The following table contains the medical records of two patients (John and Jane) on six lab tests. Calculate the dissimilarity between John and Jane based on these records (P and N mean positive and negative test results, respectively). Write down the steps.

        test-1  test-2  test-3  test-4  test-5  test-6
John    P       N       P       N       N       N
Jane    P       N       P       N       P       N

Solution.

A. The sorted list is 2, 3, 3, 6, x, y, 20. For 3 to remain the mode, x and y must be distinct values greater than 6, so x and y have to be at least 7 and 8, respectively. The smallest possible mean is therefore (2 + 3 + 3 + 6 + 7 + 8 + 20) / 7 = 49 / 7 = 7, which rules out 4, 6.85 and 6.71. (b) is the correct answer: 7 can be the mean of the list.

B. (a) Approximate median = 21 + ((1797 - 950) / 1500) × (50 - 21) = 37.38, where 1797 = 3594 / 2 is half the population and 950 = 200 + 450 + 300 is the cumulative frequency below the median interval 21-50.
(b) When all 1500 people in the median interval have age 21. The error is then 37.38 - 21 = 16.38.

C. Replacing each ordinal value by its rank gives Jack: (2, 2, 4) and Jill: (1, 1, 2). The ordinal attributes have different numbers of states, so the ranks need to be normalized with (rank - 1) / (M - 1), where M is the number of states of the attribute.
Jack: (1/2, 1, 1)
Jill: (0, 0, 1/3)
Manhattan distance: |1/2 - 0| + |1 - 0| + |1 - 1/3| = 13/6

D. These are asymmetric binary attributes, so we need to draw the contingency table and calculate q, r, and s: q = 2 (both positive: test-1 and test-3), r = 0 (John positive, Jane negative), s = 1 (John negative, Jane positive: test-5). The ratio q / (q + r + s) = 2 / (2 + 0 + 1) = 0.67 is the similarity between the two records; the dissimilarity is (r + s) / (q + r + s) = 1 / (2 + 0 + 1) = 0.33.

Question 2 (15 points): Data Preprocessing

A. [3] Consider the following data for the attribute price: 8, 9, 15, 16, 21, 21, 24, 26, 27, 30, 30, 34
Use smoothing by bin means to smooth this data, using equi-depth bins and a bin depth of 4.

B. [12] Consider the following data for two attributes A and B:

A    21   25   42   43   57   59
B    65   79   75   99   87   81

(a) [5] Normalize attribute 'A' based on z-score normalization.
(b) [5] Calculate the correlation coefficient. Are these two attributes positively correlated or negatively correlated?
(c) [2] Compute the covariance of the two attributes.

Solution.

A. Bin means:
Bin 1: (8 + 9 + 15 + 16) / 4 = 12
Bin 2: (21 + 21 + 24 + 26) / 4 = 23
Bin 3: (27 + 30 + 30 + 34) / 4 = 30.25
Smoothed data:
Bin 1 = 12, 12, 12, 12
Bin 2 = 23, 23, 23, 23
Bin 3 = 30.25, 30.25, 30.25, 30.25

B. (a) Mean µ = (1/n) Σ A_i = 247 / 6 = 41.167
Variance s² = (1 / (n - 1)) (Σ A_i² - (1/n) (Σ A_i)²) = (1 / (6 - 1)) (11409 - 247² / 6) = 248.167
Standard deviation s = √248.167 = 15.753
z-score = (A_i - µ) ...
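The Question 1 calculations can be checked with a short script. This is only a sketch: the function names (`grouped_median`, `normalized_ranks`, `binary_similarity_and_dissimilarity`) are hypothetical helpers, and the interval width is taken as upper bound minus lower bound (50 - 21), following the solution. For part D it returns both q / (q + r + s), which is a similarity, and (r + s) / (q + r + s), which is the dissimilarity for asymmetric binary attributes.

```python
from fractions import Fraction


def grouped_median(intervals, freqs):
    """Approximate median of grouped data by linear interpolation.

    Assumption: interval width is (upper - lower), as in the worked solution.
    """
    n = sum(freqs)
    cumulative = 0  # total frequency of all intervals below the current one
    for (low, high), freq in zip(intervals, freqs):
        if cumulative + freq >= n / 2:
            return low + (n / 2 - cumulative) / freq * (high - low)
        cumulative += freq
    raise ValueError("no data")


def normalized_ranks(ranks, num_states):
    """Map each rank r of an M-state ordinal attribute to (r - 1) / (M - 1)."""
    return [Fraction(r - 1, m - 1) for r, m in zip(ranks, num_states)]


def binary_similarity_and_dissimilarity(x, y, positive="P"):
    """Return (similarity, dissimilarity) for asymmetric binary records."""
    q = sum(a == positive and b == positive for a, b in zip(x, y))
    r = sum(a == positive and b != positive for a, b in zip(x, y))
    s = sum(a != positive and b == positive for a, b in zip(x, y))
    return Fraction(q, q + r + s), Fraction(r + s, q + r + s)


# Part B: age distribution of 3594 people
intervals = [(1, 5), (6, 15), (16, 20), (21, 50), (51, 80), (81, 110)]
freqs = [200, 450, 300, 1500, 700, 444]
median = grouped_median(intervals, freqs)

# Part C: ranks for (Age, Height, Weight), which have 3, 2 and 4 states
num_states = [3, 2, 4]
jack = normalized_ranks([2, 2, 4], num_states)
jill = normalized_ranks([1, 1, 2], num_states)
manhattan = sum(abs(a - b) for a, b in zip(jack, jill))

# Part D: six lab tests for John and Jane
similarity, dissimilarity = binary_similarity_and_dissimilarity("PNPNNN", "PNPNPN")

print(round(median, 2), manhattan, similarity, dissimilarity)
```

Using `Fraction` keeps the normalized ranks exact, so the Manhattan distance comes out as 13/6 rather than a rounded decimal.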
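The Question 2 steps can be sketched the same way. The helper name `smooth_by_bin_means` is hypothetical, and the z-score line assumes the standard definition (A_i - µ) / s with the sample standard deviation (n - 1 in the denominator), which matches the variance formula used in the solution. Note that the third bin sums to 121, so its mean is 121 / 4 = 30.25.

```python
import statistics


def smooth_by_bin_means(sorted_values, depth):
    """Equi-depth binning: replace every value by the mean of its bin."""
    smoothed = []
    for start in range(0, len(sorted_values), depth):
        bin_values = sorted_values[start:start + depth]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed


# Part A: price data, already sorted, bin depth 4
prices = [8, 9, 15, 16, 21, 21, 24, 26, 27, 30, 30, 34]
smoothed = smooth_by_bin_means(prices, 4)

# Part B(a): z-score normalization of attribute A
A = [21, 25, 42, 43, 57, 59]
mu = statistics.mean(A)            # 247 / 6
variance = statistics.variance(A)  # sample variance, divides by n - 1
s = statistics.stdev(A)            # square root of the sample variance
z_scores = [(a - mu) / s for a in A]

print(smoothed)
print(round(mu, 3), round(variance, 3), round(s, 3))
print([round(z, 3) for z in z_scores])
```

`statistics.variance` and `statistics.stdev` are the sample versions; the population versions (`pvariance`, `pstdev`) would divide by n and give slightly smaller values.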

Last updated: 2 years ago

Uploaded on: Apr 02, 2023
Number of pages: 8
Seller: PAPERS UNLIMITED™