CS 234 Winter 2019: Assignment 2-solution-2022
Introduction

In this assignment we will implement deep Q-learning, following DeepMind's papers ([mnih2015human] and [mnih-atari-2013]) that learn to play Atari games from raw pixels. The purpose is to understand the effectiveness of deep neural networks, as well as some of the techniques used in practice to stabilize training and achieve better performance. You'll also have to get comfortable with TensorFlow.

We will train our networks on the Pong-v0 environment from OpenAI Gym, but the code can easily be applied to any other environment. In Pong, a player wins a game when the ball passes by the other player. Winning a game gives a reward of 1, while losing gives a reward of −1. An episode is over when one of the two players reaches 21 wins. Thus, the final score is between −21 (lost episode) and +21 (won episode). Our agent plays against a decent hard-coded AI player. Average human performance is −3 (reported in [mnih-atari-2013]). If you complete the homework successfully, you will train an AI agent with super-human performance, reaching at least +10 (hopefully more!).

1 Test Environment (5 pts)

Before running our code on Pong, it is crucial to test it on a test environment. You should be able to run your models on CPU in no more than a few minutes on the following environment:

• 4 states: 0, 1, 2, 3
• 5 actions: 0, 1, 2, 3, 4. Action i with 0 ≤ i ≤ 3 goes to state i, while action 4 makes the agent stay in the same state.
• Rewards: Going to state i from states 0, 1, and 3 gives a reward R(i), where R(0) = 0.1, R(1) = −0.2, R(2) = 0, R(3) = −0.1. If we start in state 2, then the rewards defined above are multiplied by −10. See Table 1 for the full transition and reward structure.
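The transition and reward rules of the test environment can be sketched in a few lines of Python. This is only an illustrative sketch, not the course's reference implementation: the names `BASE_REWARD`, `STAY`, and `step` are hypothetical, and because Table 1 is not reproduced here, the sketch assumes that the "stay" action (action 4) is rewarded as a move into the current state.

```python
# Minimal sketch of the 4-state, 5-action test environment described above.
# All names here are hypothetical and chosen for illustration only.

NUM_STATES = 4
NUM_ACTIONS = 5
STAY = 4  # action 4 keeps the agent in its current state

# Base rewards R(i) for arriving in state i, as given in the text.
BASE_REWARD = {0: 0.1, 1: -0.2, 2: 0.0, 3: -0.1}


def step(state, action):
    """Return (next_state, reward) for a single transition."""
    if not 0 <= state < NUM_STATES:
        raise ValueError(f"invalid state: {state}")
    if not 0 <= action < NUM_ACTIONS:
        raise ValueError(f"invalid action: {action}")

    # Actions 0..3 move to the state with the same index; action 4 stays.
    next_state = state if action == STAY else action

    # Assumption: staying is rewarded like arriving in the current state.
    reward = BASE_REWARD[next_state]

    # Transitions that start in state 2 have their reward multiplied by -10.
    if state == 2:
        reward *= -10

    return next_state, reward
```

For example, under these assumptions `step(0, 1)` moves the agent to state 1 with the base reward R(1), while `step(2, 1)` reaches the same state with that reward multiplied by −10. A lookup of this kind is a cheap way to check a Q-learning implementation by hand before moving on to Pong.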