File:Q-Learning on Taxi-v2 with Shifted Reward.png

Original file ‎(843 × 647 pixels, file size: 79 KB, MIME type: image/png)

Summary

Description	English: Cumulative true reward from the environment, plotted as a function of training episode, for Q-learning on Taxi-v2 with shifted reward. The run with perfect reward observation (beta0=0, blue) and the runs with negative shifts converge to the global optimum, while the runs with positive shifts remain stuck in a local optimum.
Date	15 April 2019 (2019-04-15)
File source	Own Work
Author	NamHeeKim

Licensing

I, the copyright holder of this work, hereby publish it under the following license:

Permission is granted to copy, distribute and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 4.0 License (CC BY-SA 4.0).

File history

	Date/Time	Thumbnail	Dimensions	User	Comment
current	05:37, 19 April 2019		843 × 647 (79 KB)	NamHeeKim	User created page with UploadWizard

File usage

The following page uses this file:

Course:CPSC522/Reinforcement Learning with Linear Model of Reward Corruption