CS 188 Spring 2020 Written Homework 4 Solutions (all answers correct, graded A)

Q1. [60 pts] Probabilistic Language Modeling

In lecture, you saw an example of supervised learning where we used Naive Bayes for a binary classification problem: to predict whether an email was ham or spam. To do so, we needed a labeled (i.e., ham or spam) dataset of emails. To avoid this requirement for labeled datasets, let's instead explore the area of unsupervised learning, where we don't need a labeled dataset.

In this problem, let's consider the setting of language modeling. Language modeling is a field of Natural Language Processing (NLP) that tries to model the probability of the next word, given the previous words. Here, instead of predicting a binary label of "yes" or "no," we instead need to predict a multiclass label, where the label is the word (from all possible words of the vocabulary) that correctly fills in the blank.

One possible way to model this problem is with Naive Bayes. Recall that in Naive Bayes, the features X1, …, Xm are assumed to be pairwise independent given the label Y. For this problem, let Y be the word we are trying to predict, and let the features be Xi for i = −n, …, −1, 1, …, n, where Xi is the word i places from Y. (For example, X−2 would be the word 2 places in front of Y.) Again, recall that we assume the features Xi to be independent of each other, given the word Y. For example, in the sequence "Neural networks ____ a lot", X−2 = Neural, X−1 = networks, Y = the blank word (our label), X1 = a, and X2 = lot.

(a) First, let's examine the problem of language modeling with Naive Bayes.

(i) [1 pt] Draw the Bayes Net structure for the Naive Bayes formulation of modeling the middle word of a sequence given two preceding words and two succeeding words. You may think of the example sequence listed above: "Neural networks ____ a lot."

[Figure: a Bayes net with Y as the single parent node and directed edges from Y to each of X−2, X−1, X+1, X+2.]

(ii) [1 pt] Write the joint probability P(X−2, X−1, Y, X1, X2) in terms of the relevant Conditional Probability Tables (CPTs) that describe the Bayes Net.

P(X−2, X−1, Y, X1, X2) = P(Y) P(X−2 | Y) P(X−1 | Y) P(X1 | Y) P(X2 | Y)

(iii) [1 pt] What is the size of the largest CPT involved in calculating the joint probability? Assume a vocabulary size of V, so each variable can take on one of V possible values.

The maximum CPT size is V², since each conditional table P(Xi | Y) holds one entry per pair of a value of Xi and a value of Y.

(iv) [1 pt] Write an expression of what label y tha…
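Parts (ii) and (iii) above amount to a concrete prediction rule: score each candidate word y by P(y) multiplied by the product of P(xi | y) over the observed context positions, and pick the highest-scoring y. Below is a minimal, self-contained Python sketch of that rule. It is not taken from the homework; the class name NaiveBayesLM, the Laplace smoothing parameter alpha, and all helper names are our own assumptions for illustration.

    import math
    from collections import Counter, defaultdict

    class NaiveBayesLM:
        """Naive Bayes middle-word predictor: Y is the blank, Xi its neighbors."""

        def __init__(self, n=2, alpha=1.0):
            self.n = n              # context positions -n..-1 and 1..n
            self.alpha = alpha      # Laplace smoothing strength (our assumption)
            self.label_counts = Counter()               # counts of Y = y
            self.feature_counts = defaultdict(Counter)  # (i, y) -> Counter over xi
            self.vocab = set()

        def fit(self, sentences):
            """Fill the count tables from an iterable of token lists."""
            offsets = [i for i in range(-self.n, self.n + 1) if i != 0]
            for tokens in sentences:
                self.vocab.update(tokens)
                for t in range(self.n, len(tokens) - self.n):
                    y = tokens[t]
                    self.label_counts[y] += 1
                    for i in offsets:
                        self.feature_counts[(i, y)][tokens[t + i]] += 1

        def predict(self, context):
            """Return argmax over y of log P(y) + sum_i log P(xi | y).

            context maps each offset i to the observed word xi.
            """
            total = sum(self.label_counts.values())
            V = len(self.vocab)
            best_y, best_score = None, float("-inf")
            for y, c_y in self.label_counts.items():
                score = math.log(c_y / total)           # log P(y)
                for i, x in context.items():
                    c_xy = self.feature_counts[(i, y)][x]
                    # Laplace-smoothed log P(xi | y); each table (i, y, xi)
                    # for a fixed i is one of the V^2-sized CPTs from part (iii)
                    score += math.log((c_xy + self.alpha) / (c_y + self.alpha * V))
                if score > best_score:
                    best_y, best_score = y, score
            return best_y

A quick check on a toy corpus mirroring the example sequence:

    corpus = [["neural", "networks", "help", "a", "lot"],
              ["neural", "networks", "learn", "a", "lot"]]
    model = NaiveBayesLM(n=2)
    model.fit(corpus)
    print(model.predict({-2: "neural", -1: "networks", 1: "a", 2: "lot"}))
    # "help" or "learn": the two labels tie, and the argmax keeps the first seen

Scoring in log space avoids floating-point underflow as the context window n grows, and is the standard way to implement the Naive Bayes argmax over a product of small probabilities.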
