Course talk:CPSC522/Reinforcement Learning


Feedback

Hey Adnan,

Awesome wiki page! It reads like a textbook: smooth, and it explains things very well.

Some feedback:

- The multi-armed bandit problem is mentioned briefly in the exploration vs. exploitation discussion but never defined, so it feels kinda out of place (a minimal sketch of that trade-off follows this list).

- The Bellman equation is referenced before it is defined.

- The example of an RL agent playing chess could be expanded a bit. What are the state space and action space? How is the reward function defined? Essentially, how does this RL agent fit into the framework described in this wiki page? (A sketch of one possible framing also follows the list.)

- The "The Standard Reinforcement Learning Model" assumes discrete state-space, finite number of actions, and discrete time. Maybe a brief mention on whether these constraints can be relaxed and/or why it is difficult to do so.

Overall, amazing read!

Ricky


  • [5] The topic is relevant for the course.
  • [5] The writing is clear and the English is good.
  • [5] The page is written at an appropriate level for CPSC 522 students (where the students have diverse backgrounds).
  • [5] The formalism (definitions, mathematics) was well chosen to make the page easier to understand.
  • [5] The abstract is a concise and clear summary.
  • [3] There were appropriate (original) examples that helped make the topic clear.
  • [3] There was appropriate use of (pseudo-) code.
  • [5] It had a good coverage of representations, semantics, inference and learning (as appropriate for the topic).
  • [5] It is correct.
  • [4] It was neither too short nor too long for the topic.
  • [5] It was an appropriate unit for a page (it shouldn't be split into different topics or merged with another page).
  • [5] It links to appropriate other pages in the wiki.
  • [5] The references and links to external pages are well chosen.
  • [4.5] I would recommend this page to someone who wanted to find out about the topic.
  • [4] This page should be highlighted as an exemplary page for others to emulate.
TianQiChen (talk) 05:20, 5 February 2016

Thank you for pointing out the inconsistencies in the page. I've simplified and revised the page based on your suggestions.

AdnanReza (talk) 07:34, 9 February 2016
 

Critiquing Assignment - Junyuan Zheng

This page has good coverage of the topic, but I have several suggestions:

1. Some places need clear references added.

2. The page says: "The final fundamental issue is generalization: given that we can only visit a subset of the exponential number of states, how can we know the value of all the states? The most common approach is to approximate the Q/V functions using, say, a neural net. A more promising approach (according to many experts) uses the factored structure of the model to allow safe state abstraction."

The page does not define the Q/V functions before this point, so some explanation may be needed there (the sketch after this list shows one way to introduce Q).

3. The "Value function approaches" part is not very clear to me, maybe you can use Q-learning function or SARSA Algorithm to explain this part.

JunyuanZheng (talk) 05:24, 2 February 2016

Thank you for your feedback. I've added annotated references. I've also revised the generalization and value function approaches sections, keeping them as simple as possible while maintaining the important features.

AdnanReza (talk) 07:20, 9 February 2016
 

Suggestions

Hi Adnan,

Your work is good. My topic is control theory, and your content reminds me that I need to add some more information on adaptive controllers to support your page. The last part, the limitations covering when RL fails, is exactly what I am interested in. I have some suggestions about your work:

1. You have some references to other wiki pages of this course, but most of them are empty. Can you add some general information or a brief explanation of those terms instead of linking directly to pages that have not been edited yet?

2. I found that the "Value function approaches" and "Monte Carlo methods" sections reference Wikipedia, and I think you might want to revise them a bit more later. Please do not forget!

3. Can you explain more about how the reinforcement learning model is applied in practice in the "Applications" section?

DandanWang (talk) 23:21, 4 February 2016

Thank you for your feedback. I've fixed all the broken links and tried to simplify a lot of the material. I've also revised the Value function approaches, Monte Carlo methods and Applications sections based on your suggestions.

AdnanReza (talk) 07:16, 9 February 2016
 

Suggestions for Reinforcement Learning

Hi Adnan,

First, let me thank you for contributing to our course wiki. What a fantastic job you have done. Also, the quality of the material, as well as its correctness, is really good. I also like its length and level for a Master's course wiki.
One thing that concerns me is that a lot of links on your page point to other wiki pages that are not developed yet. This does not help the reader (I understand that you wanted to follow the guidelines of the course, but maybe pointing to Wikipedia or other valid sources would be more useful).
Although it is optional to include the "To Add" section, I really suggest putting some material there to let other readers (and perhaps contributors) know which closely related fields and topics they can add to and expand (after all, this is a wiki page and should be developed through different users' contributions).
Other than these two minor points, I honestly cannot suggest anything else, and I believe you have done a great job.

Cheers!

MehrdadGhomi (talk) 06:59, 5 February 2016

Thank you for your feedback. I've fixed all the broken links and included the "To Add" section based on your suggestions.

AdnanReza (talk) 07:12, 9 February 2016