Feedback

Hey Mehrdad,

The background section is well put together and provides just enough information to be useful. My only gripe is that you try to explain how HMMs are limited by not having "features" for the observations. This shortcoming doesn't make sense until the reader reaches the MEMM section. (Because, like you've said, "features" don't exist in the HMM model, so I think this shortcoming should be discussed where MEMMs are explained.)

Regarding the MEMM section: as mentioned above, I feel a more comprehensive comparison between this model and the HMM is warranted. Right now, half of the comparison sits in the HMM background section and half in this one. I'm also not sure what you mean by HMMs requiring features to be independent, since HMMs don't model any features...

I think someone without a background in maximum-entropy/energy-based models and generalized linear models may have a hard time understanding why the formula for P(s|s',o) takes that form. Also, something looks off about the formula: the features depend on o and s, while the normalization constant depends on o and s'. Shouldn't both depend on all three variables (s, s', o)? A more thorough explanation of the formula would be helpful.
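For reference, the transition distribution I would expect here, following the standard maximum-entropy parameterization (with one exponential model per previous state s'; the page's notation may differ, so take this as my reading rather than the paper's exact form), is:

```latex
P(s \mid s', o) \;=\; \frac{1}{Z(o, s')}\,
  \exp\!\Big( \sum_{a} \lambda_a \, f_a(o, s) \Big),
\qquad
Z(o, s') \;=\; \sum_{s} \exp\!\Big( \sum_{a} \lambda_a \, f_a(o, s) \Big).
```

If this is the intended form, a sentence pointing out that Z sums over the current state s (which is why it ends up a function of o and s', while each feature only sees o and s) would go a long way toward clearing up my confusion.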

For the MoP-MEMM model, it is not clear how the mixture weights are learned. I'm also not sure why the mixture weight depends on the indices k and j; shouldn't it be indexed by the variables themselves instead? (Unless the model assumes a fixed sequence length?)

Just out of curiosity: there must be a limit to how many skip connections (parents) we can add to each observation. Does the paper mention any complexity estimates? While the motivation is to model long-term dependencies in the sequence, are there any statistics on how hard these are to learn in MoP-MEMM?

Overall, an informative wiki page: it introduces many simple concepts that all integrate into a larger, more complex model.

Thanks for the great read,

Ricky

TianQiChen (talk) 02:19, 11 March 2016

Hi Ricky,

Thank you for your very insightful feedback. I will work on my page and address the problems you mentioned. What I meant is that a generative model like the HMM would have to assume such features are independent, features such as capitalization of the first letter and, let's say, the position of the word in the sentence. I will rearrange some of the sections so the flow becomes better. One difficulty I have is that this paper assumes a lot of background knowledge; perhaps I should explain more of it in the background section. I am not a Skip-chain expert, but there are certainly restrictions like that. Thank you again!
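For instance, here is the kind of overlapping feature pair I have in mind (a rough, hypothetical sketch; the feature names are made up, not from the paper):

```python
# Two overlapping binary features of the same word. An MEMM can condition on
# both directly, while an HMM's generative model cannot use such arbitrary
# functions of the observation. (Hypothetical sketch; names are made up.)

def f_capitalized(word, position):
    """1 if the word starts with a capital letter, else 0."""
    return 1 if word[:1].isupper() else 0

def f_sentence_initial(word, position):
    """1 if the word is the first token of the sentence, else 0."""
    return 1 if position == 0 else 0

# These two features fire together for sentence-initial capitalized words,
# so they are clearly not independent of each other.
features = [f_capitalized, f_sentence_initial]
print([f("John", 0) for f in features])  # both fire: [1, 1]
```

The point is that "John" at position 0 triggers both features at once, which is exactly the kind of correlation an independence assumption would get wrong.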

Cheers,

Mehrdad Ghomi

MehrdadGhomi (talk) 20:56, 12 March 2016