response

Hi Prithu,

I'm not sure I understand the question correctly, but I believe you're asking about looking at applications at a smaller granularity than the sentence level. I suppose there would be two chief considerations:

  • What is the smallest granularity at which the idea of a "topic" is meaningful? I suspect it would be at the clause level, where each clause represents a complete thought. For example, if we take the sentence "I love pie, and my son plays Minecraft all the time," there are two thoughts there, and each of those thoughts is about at least one topic (favourite foods, favourite video game, how we spend our time, etc.). So, if we were to attempt to break down sentences for more detailed subject agreement, I suspect we'd have to stop at the clause level; but that leads me to the next consideration:
  • How do we deal with aggregation? For example, consider a conversation with two topics: dogs and cats. If I say "My dog and my cat are both very good with children," then in a way I've aggregated the topics into one clause; and it might be worthwhile to split the aggregation and deal with that sentence as if it were two sentences ("My dog is good with kids" and "My cat is good with kids"). From a cosine similarity point of view, the aggregate clause would score relatively well with sentences about each of the two topics, but the split pair of sentences would likely score higher with their respective topics.
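
To illustrate the cosine similarity point, here is a toy sketch in Python. It assumes a plain bag-of-words representation with a small hand-picked stop-word list. The two "topic" sentences, the stop-word list, and the function names are all made up for the example, so the numbers show only the direction of the effect on this one input and are not evidence about real data.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, chosen by hand for this toy example.
STOP_WORDS = {"my", "and", "are", "both", "very", "is", "with", "to"}


def bag_of_words(text):
    """Lowercase, tokenize on letters/apostrophes, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up sentences standing in for "sentences about each topic".
dog_topic = bag_of_words("My dog loves to fetch")
cat_topic = bag_of_words("My cat loves to nap")

aggregate = bag_of_words("My dog and my cat are both very good with children")
split_dog = bag_of_words("My dog is good with kids")
split_cat = bag_of_words("My cat is good with kids")

agg_vs_dog = cosine_similarity(aggregate, dog_topic)
agg_vs_cat = cosine_similarity(aggregate, cat_topic)
split_vs_dog = cosine_similarity(split_dog, dog_topic)
split_vs_cat = cosine_similarity(split_cat, cat_topic)

print(f"aggregate vs dog topic: {agg_vs_dog:.3f}")
print(f"aggregate vs cat topic: {agg_vs_cat:.3f}")
print(f"split dog vs dog topic: {split_vs_dog:.3f}")
print(f"split cat vs cat topic: {split_vs_cat:.3f}")
```

In this toy setup the aggregate clause overlaps with both topic sentences, while each split sentence overlaps more strongly with its own topic. The result is sensitive to the representation: without the stop-word filtering, for instance, the repeated "my" in the aggregate clause changes the ordering.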

Of course, I don't have any actual data to back this up, but it seems reasonable to me. Did I actually address what it was you were asking, or did I misunderstand the question completely?

JordonJohnson (talk) 19:34, 22 April 2016

Yes, granularity was what I was asking about, and what you said makes perfect sense. I ran into this granularity barrier in my project as well. For example, given the review "I wont the place was bad. If you have anything but taco in mind then you are safe", it's challenging to infer that the taco at the restaurant was bad while the other items were alright.

PrithuBanerjee (talk) 05:56, 23 April 2016