Course talk:CPSC522/Latent Dirichlet Allocation
|Thread title||Replies||Last modified|
|Suggestions||1||03:39, 16 March 2016|
|Critique||0||07:51, 11 March 2016|
|Suggestions||2||22:24, 10 March 2016|
|Some suggestions regarding Latent Dirichlet Allocation||1||06:56, 10 March 2016|
This page is very interesting; I can understand those two models.
However, I think the one problem is that this page uses many mathematical equations, and some of them are presented without sufficient explanation, for instance the equation introduced by "Density of dirichlet is given by: ". Because the topic is abstract, some of the equations do not help explain it and instead confuse the reader, so it would be better to delete the unnecessary equations or add more explanation.
Hi Junyuan Zheng, thanks for your feedback. I edited my page to address your concerns and provided a layman's explanation after the Dirichlet description. Please also note that understanding the Dirichlet distribution is not necessary to understand the concept of LDA, so I refrained from giving a full-fledged description of all the notation. However, I kept the bare minimum so that an interested reader can play with it and see how different choices of the Dirichlet parameters affect the final outcome of LDA. Hence I just mentioned the mathematical form of the Dirichlet distribution, without going into a detailed description of it.
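As a rough illustration of the point about Dirichlet parameters, here is a minimal sketch using only the Python standard library. It assumes a symmetric Dirichlet over five hypothetical topics and relies on the standard fact that normalizing independent Gamma(alpha_i, 1) draws yields a Dirichlet sample. The function names, the number of topics, and the alpha values are illustrative assumptions, not something taken from the page.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet(alphas) sample by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def mean_max_component(alpha, k=5, n_samples=2000, seed=0):
    """Average size of the largest component under a symmetric Dirichlet.

    k, n_samples and seed are arbitrary choices made for this sketch.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += max(sample_dirichlet([alpha] * k, rng))
    return total / n_samples


# Small alpha: each sampled topic mixture is dominated by one topic.
sparse = mean_max_component(0.1)
# Large alpha: each sampled topic mixture is close to uniform.
smooth = mean_max_component(10.0)
print(f"alpha=0.1  -> mean largest component: {sparse:.2f}")
print(f"alpha=10.0 -> mean largest component: {smooth:.2f}")
```

With a small alpha most of the probability mass tends to land on a single topic, while a large alpha spreads it almost evenly across topics, which is one way to see how this parameter shapes the document-topic mixtures in LDA.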
Hi Prithu, great page! You wrote the basics in a lucid manner for us to understand. Thumbs up for the Layman's Explanation of LDA! You could consider adding an abstract; that would help readers understand at a glance what the paper is about. Some questions I had:
- In the generative model of LDA you have written that the matrix is k × V, where k is the number of topics; what is V in this case?
- How do we decide on the values of the topics, or are they known in advance like the number of topics?
- Does LDA essentially act like a recommendation model?
- Is CTM always better than LDA? Is there any scenario where considering the correlation between topics is actually harmful?
Hi Samprity, thanks for your feedback. Here is a quick rebuttal:
1. V is explained in the first bullet of the notation section. Essentially it is the vocabulary, i.e. the collection of all the words in the corpus (after preprocessing such as stemming and stopword removal).
2. By "values of topics" do you mean the names of the topics? I am answering on that assumption. The name of a topic does not really matter; think of it as unsupervised clustering, where what a cluster means is not so relevant (or rather, is application dependent). Having said that, when LDA is applied to text data there is an easy hack for naming the clusters. Remember that topics are distributions over words, so by looking at the most probable words in a topic's distribution one can make a fair guess at the topic name.
3. No, LDA on its own is not a recommendation model. It just maps each observation (i.e. document) to a smaller space in most cases (i.e. topics), from which the observations are assumed to be generated. This lower-dimensional mapping helps in comparing observations, so an application may, if needed, use this similarity measure to recommend similar observations.
4. I guess CTM is always at least as good as LDA. If there is no bound on inference time, there is no need to prefer LDA over CTM. As you can also see from the generative model, it is mostly similar to LDA. If there is no correlation across topics in the dataset, CTM will learn a covariance matrix that is effectively diagonal, so it will essentially be an LDA. Thus, according to my understanding, performance-wise it is never inferior to LDA.
I will include an abstract and also some of the material from my response, if this looks convincing to you. Thanks again for your suggestions.
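To make answers 2 and 3 above a little more concrete, here is a small sketch in plain Python. It assumes we already have topic-word and document-topic distributions from some fitted model; the vocabulary, the numbers, and the helper names (`top_words`, `cosine`, `most_similar`) are all made up for illustration and do not come from the page or from any particular library.

```python
import math

# Hypothetical toy data: two topics over a six-word vocabulary.
vocab = ["gene", "dna", "cell", "ball", "team", "score"]
topics = [
    [0.40, 0.30, 0.20, 0.04, 0.03, 0.03],  # topic 0, each row sums to 1
    [0.02, 0.03, 0.05, 0.30, 0.35, 0.25],  # topic 1
]

# Hypothetical document-topic proportions (each vector sums to 1).
doc_topics = {
    "doc_a": [0.9, 0.1],
    "doc_b": [0.8, 0.2],
    "doc_c": [0.1, 0.9],
}


def top_words(topic, vocab, n=3):
    """Return the n most probable words of a topic, a cheap way to guess its name."""
    ranked = sorted(range(len(vocab)), key=lambda i: topic[i], reverse=True)
    return [vocab[i] for i in ranked[:n]]


def cosine(u, v):
    """Cosine similarity between two topic-proportion vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(name, doc_topics):
    """Return the other document whose topic mixture is closest to `name`."""
    scored = [
        (cosine(doc_topics[name], vec), other)
        for other, vec in doc_topics.items()
        if other != name
    ]
    return max(scored)[1]


for k, topic in enumerate(topics):
    print(f"topic {k}: {top_words(topic, vocab)}")
print("closest to doc_a:", most_similar("doc_a", doc_topics))
```

Cosine similarity is only one possible choice here; an application could just as well compare topic mixtures with another distance between probability vectors.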
Hi Prithu, it is very nice to see your wiki page on LDA; it really helped me understand a topic that I had never stepped into. You give a very detailed explanation covering most aspects of LDA. I am wondering whether you could give an incremental comparison between the first paper and the second paper and their respective contributions?
Regards, Arthur Sun
Hi Arthur, glad it helped you understand LDA. Yes, I have already added a section on that. The section "Comparing CTM and LDA" is intended precisely to highlight that incremental difference. Let me know if that serves the purpose, or if you suggest elaborating on it further.