File:Q-Learning on Tax-v2 with Scaled Reward.png

From UBC Wiki

Original file (858 × 694 pixels, file size: 74 KB, MIME type: image/png)

Summary

Description
English: Cumulative true (unscaled) reward from the environment, plotted as a function of the number of training episodes. Agents trained with positively scaled rewards converge to the global optimum. Negatively scaled rewards have a strongly adverse effect on the agent's performance.
Date 15 April 2019 (2019-04-15)
File source Own Work
Author NamHeeKim
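
The setup described above can be illustrated with a minimal sketch. This is not the author's experiment code: it assumes that "scaled reward" means multiplying the environment reward by a constant before the Q-learning update, and it replaces the environment named in the file title with a hypothetical six-cell corridor so that it runs with the standard library only. All names (`step`, `train`, `mean_last`) and all hyperparameters are illustrative assumptions. The agent learns from the scaled reward, while the unscaled (true) reward is what gets recorded per episode.

```python
import random

N_STATES = 6        # hypothetical corridor cells 0..5; cell 5 is the goal
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right
MAX_STEPS = 50      # episode length cap


def step(state, action):
    """Toy dynamics (an assumption): -1 per move, +10 on reaching the goal."""
    nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (10.0 if done else -1.0), done


def train(scale, episodes=300, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on scale * reward; returns (Q-table, true returns)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    true_returns = []
    for _ in range(episodes):
        state, total = 0, 0.0
        for _ in range(MAX_STEPS):
            # Epsilon-greedy action selection; ties resolve to action 0.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda a: q[state][a])
            nxt, reward, done = step(state, action)
            total += reward  # log the true, unscaled reward
            # The agent only ever sees the scaled reward.
            target = scale * reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
        true_returns.append(total)
    return q, true_returns


def mean_last(values, n=50):
    """Average of the last n entries."""
    tail = values[-n:]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    for scale in (10.0, 1.0, -1.0):
        _, returns = train(scale)
        print(f"scale={scale:+.1f}  mean true return (last 50): {mean_last(returns):.2f}")
```

In this toy setting a positive scale leaves the greedy policy unchanged, because the ordering of Q-values is preserved, whereas a negative scale reverses the objective: the agent is rewarded for each step and penalised for reaching the goal. Whether the same mechanism fully explains the plotted curves is an assumption, not something stated on this page.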

Licensing

I, the copyright holder of this work, hereby publish it under the following license:
Permission is granted to copy, distribute and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 4.0 license. The full text of this license may be found here: CC BY-SA 4.0

File history


Date/Time current, 05:41, 19 April 2019
Dimensions 858 × 694 (74 KB)
User NamHeeKim
Comment User created page with UploadWizard