File:Q-Learning on Taxi-v2 with Shifted Reward.png
Original file (843 × 647 pixels, file size: 79 KB, MIME type: image/png)
Summary
Description | English: Cumulative true environment reward plotted against training episode for Q-learning on Taxi-v2. Runs with perfect reward observation (beta0 = 0, blue) and runs with negative reward shifts converge to the global optimum, while runs with positive shifts get stuck in a local optimum. |
Date | 15 April 2019 |
File source | Own Work |
Author | NamHeeKim |
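To make the "shifted reward" in the description concrete, here is a minimal sketch of a single tabular Q-learning update in which the agent observes the true reward plus a constant offset. It assumes that beta0 is an additive shift on the observed reward; the file page does not state the exact corruption model. The function name `shifted_q_update` and the values of `alpha` and `gamma` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def shifted_q_update(q, state, action, true_reward, next_state, n_actions,
                     beta0=0.0, alpha=0.1, gamma=0.99, done=False):
    """Apply one tabular Q-learning step using a shifted observed reward.

    Assumption (not confirmed by the source): the agent never sees
    true_reward directly, only true_reward + beta0.
    Returns the observed (shifted) reward.
    """
    observed = true_reward + beta0
    # Bootstrap from the greedy value of the next state unless the episode ended.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    target = observed + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return observed


# Same transition under no shift and under a positive shift.
q_plain = defaultdict(float)
q_shifted = defaultdict(float)
shifted_q_update(q_plain, 0, 1, -1.0, 2, n_actions=6, beta0=0.0)
shifted_q_update(q_shifted, 0, 1, -1.0, 2, n_actions=6, beta0=2.0)
print(q_plain[(0, 1)], q_shifted[(0, 1)])
```

The plotted quantity is the cumulative true reward, so in an experiment of this kind the shift would affect only what the learner sees, not what is reported on the vertical axis.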
Licensing
File history
Version | Date/Time | Dimensions | User | Comment |
---|---|---|---|---|
current | 05:37, 19 April 2019 | 843 × 647 (79 KB) | NamHeeKim | User created page with UploadWizard |