Course talk:CPSC522/Markov Decision Process

From UBC Wiki

Contents

Thread title | Replies | Last modified
Suggestions for MDP | 1 | 18:29, 5 February 2016
MDP suggestions | 3 | 06:06, 5 February 2016
Suggestion | 1 | 04:59, 5 February 2016
Comments | 1 | 04:57, 5 February 2016

Suggestions for MDP

Hi Abed,

First, let me thank you for contributing to the development of the course Wiki. You have done a fantastic job: I really like the length of the page and found the material completely correct (as far as I can tell).
Your figures and pseudocode were really useful and easy to understand as well.
I can suggest 3 points for your page:
1) Add some links from your page to web pages outside our Wiki to help the reader understand some of the terminology you have used (readers are not experts on the topic like you, and some pointers can really help).
2) The language is good for an expert reader who already knows the topic, but most of the time people read wiki pages to get an overview of the topic and its main related subjects. So I would say it is a bit technical for someone who lacks knowledge of the subject.
3) Although including the "To Add" section is optional, I personally believe it can really help the future development of this page and make it easier for other readers (perhaps contributors) to start developing it.
Other than these minor points, I would say you have done a great job.
Cheers!

MehrdadGhomi (talk) 07:08, 5 February 2016

Thanks Mehrdad for your valuable feedback and keen insights. I will try my best to make the necessary modifications.

MDAbedRahman (talk) 18:29, 5 February 2016
 

MDP suggestions

Hi Md Abed Rahman, Yaashaar Hadadian Pour, Adnan Reza. Awesome page! It helped me a lot in understanding MDPs. Here are some of the things I need clarification on:

  1. What is  ? Is it the action at time t?
  2. In the reward and optimal policy section, I couldn't understand how you got the values 1.6284, 0.4278, 0.0850, etc.
  3. For value iteration, could you provide an explanation of the pseudocode, like the one given for policy iteration?
  4. As Jiahong Chen has mentioned, a few practical examples would be fun to read.
SamprityKashyap (talk) 04:15, 5 February 2016

Thanks Samprity for your feedback. We will definitely try to make the necessary improvements. Any future comments and/or feedback will be greatly appreciated.

MDAbedRahman (talk) 04:46, 5 February 2016
 

Also, I forgot to answer the points you asked about. Here they are; we will also try to add them to the Wiki.
Regarding question 1: yes, it is the action at time t picked by the policy.
Regarding question 2: those values were picked to show how rewards for traversing nodes, ranging from highly punishing to highly rewarding, can change the optimal policy. The value ranges are mostly empirical and calibrated only to this situation. We are not sure whether adding this explanation to the Wiki would be useful. If you have any insights on that, we would be happy to hear them.
Regarding questions 3 and 4: we will definitely try our best :)
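To illustrate how the traversal reward can change the optimal policy, here is a minimal sketch of value iteration. It does not use the grid or the numbers from the page: the five-state corridor, the exit rewards, the discount factor and every name in it are hypothetical and chosen only for illustration.

```python
# Hypothetical corridor MDP: states 0..4, deterministic moves left/right.
# States 0 and 4 are exits; entering them pays the exit reward and ends the episode.
GAMMA = 0.9                       # assumed discount factor
N_STATES = 5
EXIT_REWARD = {0: -1.0, 4: 1.0}   # bad exit on the left, good exit on the right
ACTIONS = {"left": -1, "right": +1}


def q_value(values, state, move, step_reward):
    """Reward for one move plus the discounted value of the state it leads to."""
    nxt = state + move
    reward = step_reward + EXIT_REWARD.get(nxt, 0.0)
    return reward + GAMMA * values[nxt]


def value_iteration(step_reward, tol=1e-9):
    """Return (values, greedy policy) for the given per-move traversal reward."""
    values = [0.0] * N_STATES     # exit states keep value 0 throughout
    interior = [s for s in range(N_STATES) if s not in EXIT_REWARD]
    while True:
        delta = 0.0
        for s in interior:
            # Bellman optimality backup, applied in place
            best = max(q_value(values, s, m, step_reward) for m in ACTIONS.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(values, s, ACTIONS[a], step_reward))
        for s in interior
    }
    return values, policy


for r in (-0.04, -2.0):
    _, policy = value_iteration(r)
    print(r, policy)
```

With a mild traversal cost (-0.04) the greedy policy heads right from every state towards the good exit. With a harsh cost (-2.0) state 1 switches to left, because taking the bad exit immediately is cheaper than paying for three more moves.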

MDAbedRahman (talk) 04:56, 5 February 2016

Thanks for the clarifications! I thought you might have used some equations to reach the values.

SamprityKashyap (talk) 06:06, 5 February 2016
 
 

Suggestion

Hi,

It was a pleasure to read your page. I think you have given fantastic citations and mastered using LaTeX to represent formulas. Also, you gave a detailed example and pseudocode to explain what you mean. You really helped me a lot in understanding this topic.

My only suggestion is that your page might be too technical for a wiki page; maybe you can add some applications of MDPs to help readers better understand this field.

Best regards,

Jiahong Chen

JiahongChen (talk) 04:02, 5 February 2016

Thanks Jiahong for your feedback. I agree with you, and we will try to add some applications of MDPs to improve the page even more. We would love to hear any other comments and/or ideas you have for improving the page.

MDAbedRahman (talk) 04:14, 5 February 2016
 

Comments

Looks good to me! Here are my scores:

(5) The topic is relevant for the course.

(4) The writing is clear and the English is good.

(5) The page is written at an appropriate level for CPSC 522 students (where the students have diverse backgrounds).

(3) The formalism (definitions, mathematics) was well chosen to make the page easier to understand.

(5) The abstract is a concise and clear summary.

(3) There were appropriate (original) examples that helped make the topic clear.

(4) There was appropriate use of (pseudo-) code.

(4) It had a good coverage of representations, semantics, inference and learning (as appropriate for the topic).

(5) It is correct.

(4) It was neither too short nor too long for the topic.

(5) It was an appropriate unit for a page (it shouldn't be split into different topics or merged with another page).

(5) It links to appropriate other pages in the wiki.

(5) The references and links to external pages are well chosen.

(4) I would recommend this page to someone who wanted to find out about the topic.

(4) This page should be highlighted as an exemplary page for others to emulate.

Comments:

1. Add more examples to aid understanding.

2. The format of the algorithm should be clearer.

3. Maybe we should add a section on applications to show the power of MDPs?

In summary, the content is pretty good. It would be even better with more examples and applications.

YanZhao (talk) 04:45, 5 February 2016

Thanks YanZhao for your feedback. We will definitely try to make the modifications needed.

MDAbedRahman (talk) 04:57, 5 February 2016