CS_188_Spring_2020_Written_Homework_4_Solutions-all answers CORRECT-GRADED A

Q1. [60 pts] Probabilistic Language Modeling

In lecture, you saw an example of supervised learning where we used Naive Bayes for a binary classification problem: predicting whether an email was ham or spam. To do so, we needed a labeled (i.e., ham or spam) dataset of emails. To avoid this requirement for labeled datasets, let's instead explore the area of unsupervised learning, where we don't need a labeled dataset.

In this problem, we consider the setting of language modeling. Language modeling is a field of Natural Language Processing (NLP) that tries to model the probability of the next word given the previous words. Here, instead of predicting a binary label of "yes" or "no," we need to predict a multiclass label, where the label is the word (from all possible words of the vocabulary) that correctly fills the blank.

One possible way to model this problem is with Naive Bayes. Recall that in Naive Bayes, the features X1, ..., Xm are assumed to be pairwise independent given the label Y. For this problem, let Y be the word we are trying to predict, and let the features be Xi for i = -n, ..., -1, 1, ..., n, where Xi is the word i places from Y. (For example, X-2 is the word 2 places before Y.) Again, recall that we assume each feature Xi to be independent of the others given the word Y. For example, in the sequence "Neural networks ____ a lot", X-2 = Neural, X-1 = networks, Y = the blank word (our label), X1 = a, and X2 = lot.

(a) First, let's examine the problem of language modeling with Naive Bayes.

(i) [1 pt] Draw the Bayes Net structure for the Naive Bayes formulation of modeling the middle word of a sequence given two preceding words and two succeeding words. You may think of the example sequence listed above: "Neural networks ____ a lot".
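As a minimal sketch of the feature setup described above (not part of the homework; the function name and indexing convention are my own), the context words Xi around a blank can be extracted like this:

```python
# Sketch: extract Naive Bayes context features X_i around a blank,
# for a window of n words on each side. Indices i < 0 are words before
# the blank, i > 0 are words after it; i = 0 (the blank itself) is skipped.
def context_features(words, blank_idx, n=2):
    """Return a dict {i: X_i} for i in -n..-1, 1..n around the blank."""
    feats = {}
    for i in list(range(-n, 0)) + list(range(1, n + 1)):
        j = blank_idx + i
        if 0 <= j < len(words):  # drop features that fall off the sequence
            feats[i] = words[j]
    return feats

sentence = ["Neural", "networks", "____", "a", "lot"]
print(context_features(sentence, blank_idx=2))
# {-2: 'Neural', -1: 'networks', 1: 'a', 2: 'lot'}
```

This reproduces the worked example: X-2 = Neural, X-1 = networks, X1 = a, X2 = lot.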
[Figure: Bayes net with Y as the single parent node and a directed edge from Y to each of X-2, X-1, X1, X2.]

(ii) [1 pt] Write the joint probability P(X-2, X-1, Y, X1, X2) in terms of the relevant Conditional Probability Tables (CPTs) that describe the Bayes Net.

P(X-2, X-1, Y, X1, X2) = P(Y) P(X-2 | Y) P(X-1 | Y) P(X1 | Y) P(X2 | Y)

(iii) [1 pt] What is the size of the largest CPT involved in calculating the joint probability? Assume a vocabulary size of V, so each variable can take on one of V possible values.

The maximum CPT size is V^2: each conditional table P(Xi | Y) stores an entry for each of the V values of Xi for each of the V values of Y.

(iv) [1 pt] Write an expression of what label y tha [Show More]
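The factored joint probability above can be turned into a prediction rule by scoring every candidate word y and taking the argmax. A minimal sketch follows; the CPT numbers and the two-word vocabulary are made up purely for illustration, and the sum-of-logs form is used to avoid underflow with many features:

```python
import math

# Hypothetical toy CPTs (invented for illustration): P(Y = y) and
# cond[i][(x, y)] = P(X_i = x | Y = y) for each feature position i.
prior = {"learn": 0.5, "eat": 0.5}
cond = {
    -2: {("Neural", "learn"): 0.4, ("Neural", "eat"): 0.1},
    -1: {("networks", "learn"): 0.5, ("networks", "eat"): 0.1},
    1:  {("a", "learn"): 0.3, ("a", "eat"): 0.3},
    2:  {("lot", "learn"): 0.2, ("lot", "eat"): 0.2},
}

def predict(features):
    """Return argmax_y [ log P(y) + sum_i log P(X_i = x_i | y) ]."""
    best, best_score = None, -math.inf
    for y, p in prior.items():
        score = math.log(p)
        for i, x in features.items():
            score += math.log(cond[i][(x, y)])
        if score > best_score:
            best, best_score = y, score
    return best

print(predict({-2: "Neural", -1: "networks", 1: "a", 2: "lot"}))
# learn
```

Here "learn" wins because 0.5 * 0.4 * 0.5 * 0.3 * 0.2 = 0.006 exceeds 0.5 * 0.1 * 0.1 * 0.3 * 0.2 = 0.0003; maximizing the log of the joint is equivalent to maximizing the joint itself.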

Last updated: 3 years ago

Preview 1 out of 16 pages


Uploaded on: May 03, 2021

Seller: d.occ
