Feedback

I found it difficult to understand EmQL from the "Representation" section. The Query2Box paragraph was easy to understand, but the EmQL paragraph was impenetrable for me. E.g., I don't know what an "entity set" is! Is it a set of a fixed number of entities (e.g., the set of 6 students in CPSC 532) or the set of all graduate students (which keeps changing)? Why does that formula for a_X give a weighted centroid? S_H(V_x) is not defined. Don't use notation you haven't defined. What is circle-dot? I would also like to see a clearer explanation of relation following in EmQL. Can it be written in a more intuitive way like Query2Box?

DavidPoole (talk)‎

The circle-dot is the Hadamard (element-wise) product; I have now noted this under the formula table. S_H(v_x) is a count-min sketch of v_x. There is a whole paper (which I have referenced) introducing the count-min sketch, and I tried to explain it briefly in the "Introduction": it is a randomized data structure that approximates v_x with limited storage. The EmQL paper presents it as optional. The reason for the approximation is that v_x has one dimension per entity, which can be very large for practical KBs, so this representation can be inefficient for some operations. The count-min sketch S_H therefore approximates v_x in fewer dimensions using hash functions.

The entity set in EmQL is similar to the one in Query2Box: a_x is the weighted centroid of the set X, which identifies the general region containing the elements of X. Query2Box uses a box, while EmQL uses the region around the centroid, to encode sets of entities. A major drawback of their system is that, for it to work, entities that can appear together in a set need to have entity vectors that are close to each other. So the entity vectors need to be pre-trained with this in mind.
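To make the count-min sketch idea above a bit more concrete, here is a minimal Python sketch of how a sparse weight vector v_x could be stored in a small table using hash functions. This is only an illustration under my own assumptions, not the implementation from the EmQL paper: the class name `CountMinSketch`, the helper `sketch_of`, the hash family `((a*i + b) mod p) mod width`, the depth and width values, and the example entity ids are all hypothetical.

```python
import random


class CountMinSketch:
    """Approximates a sparse, non-negative vector indexed by entity id."""

    def __init__(self, depth, width, seed=0):
        rng = random.Random(seed)
        self.depth = depth          # number of hash functions (rows)
        self.width = width          # number of buckets per row
        self.prime = 2_147_483_647  # assumed to exceed every entity id
        # One (a, b) pair per row defines h(i) = ((a*i + b) mod p) mod width.
        self.params = [
            (rng.randrange(1, self.prime), rng.randrange(self.prime))
            for _ in range(depth)
        ]
        self.table = [[0.0] * width for _ in range(depth)]

    def _bucket(self, row, entity_id):
        a, b = self.params[row]
        return ((a * entity_id + b) % self.prime) % self.width

    def add(self, entity_id, weight):
        # Each row receives the weight in the bucket its hash selects.
        for row in range(self.depth):
            self.table[row][self._bucket(row, entity_id)] += weight

    def estimate(self, entity_id):
        # Collisions can only add weight, so the minimum over the rows
        # is the tightest available upper bound on the true weight.
        return min(
            self.table[row][self._bucket(row, entity_id)]
            for row in range(self.depth)
        )


def sketch_of(weights, depth=4, width=64, seed=0):
    """Build a sketch from a {entity_id: weight} mapping (a sparse v_x)."""
    sketch = CountMinSketch(depth, width, seed)
    for entity_id, weight in weights.items():
        sketch.add(entity_id, weight)
    return sketch


# Hypothetical weighted set X drawn from a KB with about a million entities.
v_x = {12: 1.0, 40_321: 0.5, 987_654: 2.0}
s = sketch_of(v_x)
for entity_id, weight in v_x.items():
    print(entity_id, weight, s.estimate(entity_id))
```

The table here has only depth × width = 256 cells no matter how many entities the KB has, which is the storage saving. The price is that an estimate can overshoot the true weight when two entities hash to the same bucket in every row; with non-negative weights it never undershoots.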

I hope this clears up some of the doubts.

Also, I will try to edit the article so that it is more intuitive.

MAULIKMAHESHBHAIPARMAR (talk)‎

Also, the paper does not say how the initial v_x vector for a set is created. How the weights for sets are decided is not mentioned anywhere either (not even in the supplement).

I have created an "Ambiguity" section at the end of the article for questions that the paper leaves unanswered.

MAULIKMAHESHBHAIPARMAR (talk)‎