Course talk:CPSC532:StaRAI2020:Query2Box and Faithful Embeddings for Knowledge Base Queries.

From UBC Wiki

Contents

Thread title | Replies | Last modified
Feedback | 2 | 08:00, 23 March 2021
Aid understanding | 1 | 06:35, 23 March 2021
Question | 1 | 05:16, 22 March 2021

I found it difficult to understand EmQL from the "Representation" section. The Query2Box paragraph was easy to understand, but the EmQL paragraph was impenetrable for me. E.g., I don't know what an "entity set" is! Is it a set of a fixed number of entities (e.g., the set of 6 students in CPSC 532) or the set of all graduate students (which keeps changing)? Why does that formula for a_X give a weighted centroid? S_H(V_x) is not defined. Don't use notation you haven't defined. What is circle-dot? I would also like to see a clearer explanation of relation following in EmQL. Can it be written in a more intuitive way like Query2Box?

DavidPoole (talk)00:36, 23 March 2021

The circle-dot is the Hadamard (element-wise) product; I have now noted this under the formula table. S_H(v_x) is a count-min sketch of v_x. The count-min sketch was introduced in a separate, lengthy paper (which I have referenced), and I tried to explain it briefly in the "Introduction". It is a randomized data structure that approximates v_x with limited storage, and the paper treats it as optional. The reason for the approximation is that v_x has one dimension per entity, which can be very large for practical KBs, so this representation can be inefficient for some operations. The authors therefore introduce the count-min sketch S_H, which approximates v_x in fewer dimensions using hash functions. The entity set in EmQL is similar to the one in Query2Box: a_x is the weighted centroid of the set X and identifies the general region containing the elements of X. Query2Box uses a box, while EmQL uses the region around the centroid to encode sets of entities. A major drawback is that, for this kind of system to work, entities that can appear together in a set need to have entity vectors that are close to each other, so the entity vectors must be pre-trained with this in mind.
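To make the two pieces concrete, here is a minimal sketch in Python with NumPy. It is only an illustration, not the paper's implementation: the function names (`make_hashes`, `sketch`, `estimate`, `centroid`), the random lookup tables used as hash functions, and the toy sizes are all my own assumptions. It also assumes that a_x is the weighted sum of entity vectors with weights taken from v_x, which is a centroid only up to scale unless the weights sum to one.

```python
import numpy as np

def make_hashes(num_entities, depth, width, seed=0):
    """Hypothetical hash functions: hashes[i, j] is the bucket of entity j in row i."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, width, size=(depth, num_entities))

def sketch(v, hashes, width):
    """Count-min sketch of a non-negative weight vector v (one entry per entity)."""
    depth = hashes.shape[0]
    s = np.zeros((depth, width))
    for i in range(depth):
        # Add each entity's weight to its bucket; colliding entities accumulate.
        np.add.at(s[i], hashes[i], v)
    return s

def estimate(s, hashes, entity):
    """Approximate v[entity] as the minimum over rows; never underestimates when v >= 0."""
    return min(s[i, hashes[i, entity]] for i in range(hashes.shape[0]))

def centroid(v, entity_vecs):
    """Weighted sum of entity vectors, with weights given by v."""
    return v @ entity_vecs

# Toy example: 1000 entities, a set X holding three of them with different weights.
num_entities, dim = 1000, 8
rng = np.random.default_rng(1)
entity_vecs = rng.normal(size=(num_entities, dim))
v_x = np.zeros(num_entities)
v_x[[3, 42, 977]] = [1.0, 0.5, 2.0]

hashes = make_hashes(num_entities, depth=4, width=64)
s_x = sketch(v_x, hashes, width=64)   # 4 x 64 numbers instead of 1000
a_x = centroid(v_x, entity_vecs)      # one vector of length dim
```

The sketch stores 4 x 64 numbers in place of the 1000-dimensional v_x, and `estimate(s_x, hashes, j)` returns a value that is at least v_x[j], with any error coming only from hash collisions. In NumPy the Hadamard product of two arrays of the same shape is simply `a * b`.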

I hope this clears some doubts.

Also, I will try to edit the article so that it is more intuitive.

MAULIKMAHESHBHAIPARMAR (talk)07:02, 23 March 2021

Also, the paper does not say how the initial v_x vector for a set is created. How the weights for sets are decided is not mentioned anywhere either (not even in the supplement).

I have created an Ambiguity section at the end of the article for questions that the paper leaves unanswered.

MAULIKMAHESHBHAIPARMAR (talk)07:25, 23 March 2021
 
 

Aid understanding

I think the main issue, "the problem of faithfulness to deductive reasoning", is not explained on the page. Can you explain what is meant by faithfulness to deductive reasoning, maybe in layman's terms or with an example?
ObadaAlhumsi (talk)18:27, 22 March 2021

Hi Obada, to make it clear, I have added a short subsection under the introduction to define what is meant by "faithfulness".

MAULIKMAHESHBHAIPARMAR (talk)06:35, 23 March 2021
 

I'm trying to visualize the regions defined by EmQL. Quick question: are those regions convex?

LuccaSiaudzionis (talk)01:13, 22 March 2021

Hi Lucca, the issue here is that they represent a set of entities as a weighted sum of entity vectors, so the logic is similar to that of point-based embeddings. You could say a point is a trivial convex set.

Also, when computing the answer to a query such as relation following, the answer is generally the top-k most similar triplet embeddings. But because k is the same for every query, this region is not necessarily convex (unlike in a threshold-based system, where the region would be a ball centered at the centroid with radius equal to the threshold).
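The difference between the two retrieval rules can be sketched as follows. This is a toy illustration under my own assumptions, not the paper's code: the function names `top_k` and `within_threshold` are hypothetical, similarity is taken to be the inner product, and the threshold rule is taken to be a Euclidean distance cutoff.

```python
import numpy as np

def top_k(query, entity_vecs, k):
    """Indices of the k entities with the highest inner product with the query."""
    scores = entity_vecs @ query
    return np.argsort(-scores)[:k]

def within_threshold(query, entity_vecs, radius):
    """Indices of the entities inside the ball of the given radius around the query."""
    dists = np.linalg.norm(entity_vecs - query, axis=1)
    return np.flatnonzero(dists <= radius)

# Four toy entities in 2-D and a query located at the first one.
entity_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
query = np.array([1.0, 0.0])

near = within_threshold(query, entity_vecs, radius=0.2)  # only the two nearby entities
top2 = top_k(query, entity_vecs, k=2)                    # the same two entities
top3 = top_k(query, entity_vecs, k=3)                    # also pulls in a distant entity
```

With the threshold rule the answer region is fixed by the radius, so distant entities are never returned. With top-k the answer always has exactly k members, so the region that contains them depends on where the other entities happen to lie rather than on a fixed shape.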

MAULIKMAHESHBHAIPARMAR (talk)05:16, 22 March 2021