Suggestions

Hi Samprity, thanks for your feedback. Here is a quick rebuttal:

1. V is explained in the first bullet of the notation section. Essentially it is the vocabulary, i.e. the collection of all the words of the corpus (after preprocessing such as stemming, stopword removal, etc.).

2. By "values of topic" do you mean the name of the topic? I am answering assuming so. The name of a topic does not really matter; think of it as unsupervised clustering, where what a cluster "means" is not that relevant (or rather, it is application dependent). Having said that, when LDA is applied to text data there is an easy hack to name the clusters: since topics are distributions over words, looking at the most probable words in a topic's distribution lets one fairly guess a name for it (see the sketch after this list).

3. No, LDA on its own is not a recommendation model. It just maps each observation (i.e. document) to a (in most cases) smaller space (the topics) from which the observations are assumed to be generated. This smaller-space mapping makes observations easy to compare, so an application can use the resulting similarity measure to recommend other, similar observations (also illustrated in the sketch below).

4. To my understanding, CTM is always at least as good as LDA. Given no bound on inference time, there is no reason to prefer LDA over CTM; as the generative model shows, it is mostly the same as LDA (the differing step is spelled out below). If there is no correlation across topics in the dataset, CTM will simply learn a covariance matrix with (near-)zero off-diagonal entries, so it essentially reduces to LDA. Performance-wise, then, it should never be inferior to LDA.
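To make points 2 and 3 concrete, here is a minimal sketch using scikit-learn's LatentDirichletAllocation on a made-up toy corpus (the corpus, number of topics, and variable names are illustrative assumptions, not taken from the page): the fitted topics are "named" by their most probable words, and the per-document topic proportions give the smaller space in which documents can be compared for recommendation.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy corpus; in practice this would be the preprocessed documents.
docs = [
    "the striker scored a goal in the football match",
    "the team won the match after a late goal",
    "the election results gave the party a clear majority",
    "voters went to the polls for the election",
]

# Bag-of-words counts over the vocabulary V (after stopword removal).
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

# Fit LDA; doc_topics maps each document into the smaller topic space.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)

# Point 2: "name" each topic by its most probable words.
words = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [words[i] for i in topic.argsort()[-5:][::-1]]
    print(f"topic {k}: {top}")

# Point 3: compare documents in topic space; an application could then
# recommend the most similar other documents.
sims = cosine_similarity(doc_topics)
print(sims.round(2))
```

In practice one would apply the same preprocessing described in the notation section (stemming, stopword removal) before building the counts.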
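And for point 4, the only step that differs between the two generative processes is the prior on the per-document topic proportions (paraphrasing in the notation of Blei and Lafferty's CTM, with theta_d the proportions for document d; not copied from the page):

\theta_d \sim \mathrm{Dirichlet}(\alpha) \quad \text{(LDA)}

\eta_d \sim \mathcal{N}(\mu, \Sigma), \qquad \theta_{d,k} = \frac{\exp(\eta_{d,k})}{\sum_{k'} \exp(\eta_{d,k'})} \quad \text{(CTM)}

The off-diagonal entries of the covariance matrix Sigma are exactly what capture correlation between topics; when they are close to zero the logistic-normal prior carries no extra correlation structure, which is why CTM then behaves essentially like LDA.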

If this looks convincing to you, I will add an abstract and also fold some of this response into the page. Thanks again for your suggestions,

Best, Prithu

PrithuBanerjee (talk) 17:44, 10 March 2016

Thanks for the clarifications! Looks good to me now :)

SamprityKashyap (talk) 22:24, 10 March 2016