Course talk:CPSC522/Maximum Entropy Markov Models

[View source↑]
[History↑]

Suggestions

Hi Mehrdad,

A good wiki page, but I have some suggestions,

1. There is a figure (with GIS) in this page. I think you should explain this figure by providing more detail information. It's a little hard to understand without knowing what is f1, f2, f3.

2. In the "Some of Applications" parts, what's the difference between "Part-of-speech tagging" and "Named entity recognition" based on the example I cannot tell their difference.

3. The "MoP-MEMM" part is a too simple. Without any figure its really hard to imagine the usage of those equations.

Sincerely,

Junyuan Zheng

JunyuanZheng (talk)‎

Hi Junyuan,

Thank you for your very insightful feedback. I will work on my page and solve the problems you mentioned. But for your second pointer: Part of Speech tagging is basically giving tags such as: proper noun, prepositions, various types of verbs, etc. to each word. While Named entity recognition is extracting information such as: "Person" works at "Organization" --- "Person" lives in "City" --- etc. I will add more material for MoP-MEMM part.

Thank you again!

Cheers,

Mehrdad Ghomi

MehrdadGhomi (talk)‎

Hi Mehrdad,

I understand this now! Thanks for explain this to me!

Sincerely,

Junyuan Zheng

JunyuanZheng (talk)‎

critique

Hi Merhdad,

A solid first draft! Fairly easy to read, and the overall flow is reasonable.

General comments:

In the way the page is organized, it's not intuitive to me where the two papers come in, or what parts of the content are from which paper. Also, it doesn't seem to be clearly stated what exactly the contribution of the second paper is.
For your references, instead of using superscript tags, there are reference tags you should be using. It makes it a lot easier to maintain the references (the wiki does it for you).
Some of your equations are done using images instead of math tags; it's better to consistently use the math tags to fit the assignment specification.

Section-specific comments:

Abstract
- There seems to be some overlap between your abstract and background knowledge sections. I'd recommend removing the background knowledge part from the abstract and just cover it in a "Builds on" section, as per the page template.
- Some parts of the abstract feel more like an introduction than a summary of the content.
- There is some redundancy (eg. reference to conditional model twice)
HMMs
- Markov networks are undirected, so HMMs are more like Bayes networks with Markov properties.
Maximum Entropy
- The writing style is a bit confusing here. For example, the first sentence seems to say that maximum entropy models have entropy, which is obvious; and it takes until after the part in parentheses to realize that the sentence was actually saying something else.
Skip-chain Models
- I see that the images are credited to the proper source if you click on them, but I'd recommend crediting them directly in the page as well. You can see how figures are handled in the Leaning Markov Logic Network Structure page for an example of what I mean, and the help pages in the UBC wiki give different options for how to do it.
- I was able to glean what $y_{\pi _{k}}$ is from the figure, but it would be good to state it explicitly.
Motivation
- Why do we want to model the posterior? Can you give an example?
Model and Algorithm
- What is $\lambda _{a}$ ?
- What I'm getting from this section is that MEMMs are HMMs where the transition probabilities are learned from a set of features applied to a maximum entropy model. Is that correct?
Other Alternatives
- A bullet list would be more readable here.
- "Here we present its model and we provide a specific task and analyze its performance on it." Are you providing the task and analyzing the performance? Or are you summarizing the authors' work in providing the task and analyzing the performance? If this is a quote from the paper, it needs to be quoted and cited.
MoP-MEMM
- It would be good to explicitly state the new terms and symbols used in the equations
- I'm having trouble understanding how this would be useful (i.e. how it solves the problems you mentioned earlier).
The Task and Results
- Which authors?

Clear skies,
Jordon

JordonJohnson (talk)‎

Hi Jordon,

Thank you for your very insightful feedback. I will work on my page and solve the problems you mentioned. I will rearrange my page. I thought we should read the 2 papers and develop a page about the concept in the same format of the previous page we developed. So rearranging would help a lot. I will change the images into math format and add some citations. and remove all the duplicity and redundancies. And explain in more detail that which paper contributed to which parts. and add a legend kind of thing for the formulas. Thank you again!

Cheers,

Mehrdad Ghomi

MehrdadGhomi (talk)‎

Feedback

Hey Mehrdad,

The backgrounds section is well put together and provides just enough sufficient information to be useful. My only gripe is that you try to explain how HMMs are limited by not having "features" for the observations. This shortcoming doesn't make sense until reading about MEMMs. (Because like you've said, "features" don't exist in the HMM model, so I think this explanation of a shortcoming of the model should be placed when MEMMs are explained.)

Regarding the MEMM section, as mentioned above, I feel that a more comprehensive comparison between this model and the HMM model is warranted. Right now, half of this comparison is in the HMM background section and half is in this section. I'm also not sure what you mean by HMMs requiring features to be independent, as HMMs don't model any features...

I think someone without a background on entropy/energy models and generalized linear models may have a hard time understanding why the formula for P(s|s',o) takes that form. Also, something is off about the formula: while the features depend on o and s, the normalization constant depends on o and s'? Shouldn't they both depend on all three variables (s, s', o)? A more thorough explanation of the formula may be helpful.

For the MoP-MEMM model, it is not clear how the mixture weights are learned. I'm also not sure why this mixture weight ( $\alpha _{kj}$ ) depend on the indices k and j? Shouldn't it be indexing using the variables $y_{k}$ and $y_{j}$ instead? (Unless the model assumes a fixed length?)

Just out of curiosity, there must be a limit to how many skip connections (parents) we add to each observation. Does the paper mention any complexity estimates? While the motivation is to model long-term dependencies in the sequence, are there any statistics for how hard it is to learn this in MoP-MEMM?

Overall, informative wiki page, as it introduces many simple concepts that all integrate into a large, more complex model.

Thanks for the great read,

Ricky

TianQiChen (talk)‎

Hi Ricky,

Thank you for your very insightful feedback. I will work on my page and solve the problems you mentioned. HMMs assume that these features are independent, things such as: Capitalization of the first letter and lets say the position of the word in the sentence. I will re-arrange some of the sections so the flow becomes better. One of the difficulties that I have is that this paper assumes a lot of background knowledge. Perhaps I should explain a lot in background knowledge. I am not a Skip-chain expert, but certainly there are restrictions like that. Thank you again!

Cheers,

Mehrdad Ghomi

MehrdadGhomi (talk)‎

Some suggestions

Hi Mehrdad,

It is overall a good wiki page, it is easy to understand and there are many external links for me to refer to. However, I have some small suggestions to refer to. 1. The motivation section should be introduced earlier to help people to understand. 2. Figures should better have captions, thus they will be easy to refer to in your page. 3. The description for MOP-MEMM is a little short, could you discuss more about it? It is a little hard for me to understand some of the fomulas.

Best regards, Jiahong Chen

JiahongChen (talk)‎

Hi Jiahong,

Thank you for your very insightful feedback. I will work on my page and solve the problems you mentioned. I will bring the motivation part upper in the page. I will change the captions of the images and describe the MoP-MEMM part more and explain the formulas more. Thank you again!

Cheers,

Mehrdad Ghomi

MehrdadGhomi (talk)‎

Thread title	Replies	Last modified
Suggestions	2	20:32, 13 March 2016
critique	1	21:01, 12 March 2016
Feedback	1	20:56, 12 March 2016
Some suggestions	1	20:35, 12 March 2016

Contents

Suggestions

critique

Feedback

Some suggestions