Course talk:CPSC522/Knowledge-Aware Graph Networks for Commonsense Reasoning
Contents
| Thread title | Replies | Last modified |
| --- | --- | --- |
| Critique 2 | 1 | 04:52, 19 February 2023 |
| Critique | 1 | 04:47, 19 February 2023 |
| Initial Comments | 1 | 04:24, 19 February 2023 |
This is a well-written article. I enjoyed reading about an application of GCNs that uses knowledge graphs for a task such as commonsense reasoning. The second paper, on GCNs, provides the background needed to understand the architecture of the first paper. Each mathematical equation is nicely explained, and I liked that the author outlined drawbacks of the current approach and opinions on future research. Here are some of my suggestions and thoughts after reading this page.
- Although a KG can encode topological information between concepts, one drawback I see in this approach is that it can lack rich contextual information. For example, if a graph node is “Mona Lisa”, the graph depicts its relations to multiple other entities, but from this neighbourhood information alone it might be hard to infer that it is a painting, or doing so might require many hops in the KG. Wouldn't it make more sense to retrieve a more precise definition/knowledge from external sources? E.g., the definition of Mona Lisa in Wiktionary is “A painting by Leonardo da Vinci, widely considered as the most famous painting in history”. I am not sure whether there has been work in this direction, but it would be nice to read the next page along these lines (a small sketch of this idea appears after this list).
- One part that I like is that the architecture can provide interpretable inferences. At the same time, the approach relies heavily on ConceptNet as the CKG, which is a static KG. I am wondering how this approach would work if a concept is not found in this KG, or if the KG provides older, outdated inferences.
- What other downstream tasks can this approach be used for besides Commonsense QA?
- Do you think that a Graph Attention Network could be used instead of a GCN? If so, could that remove the need for the hierarchical attention layer in the architecture? (See the GAT sketch after this list.)
- In the performance section, under Table 1, KagNet's performance is 58.9 on the OF test set, but under Table 3 the accuracy is 82.15 on the IH test set. Is there a difference between the OF and IH test sets? Some clarity would be appreciated.
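Here is a minimal sketch of the retrieval idea from the first point (my own illustration, not from the paper; `lookup_gloss` and `encode_text` are hypothetical stand-ins for a Wiktionary lookup and a real sentence encoder):

```python
# Sketch: augment a KG node embedding with an embedding of a textual gloss
# retrieved from an external source such as Wiktionary.
# `lookup_gloss` and `encode_text` are hypothetical stand-ins.
import numpy as np

def lookup_gloss(concept: str) -> str:
    """Stand-in for an external lookup; a real system might query Wiktionary."""
    toy_glosses = {
        "mona_lisa": "A painting by Leonardo da Vinci, widely considered "
                     "as the most famous painting in history.",
    }
    return toy_glosses.get(concept, "")

def encode_text(text: str, dim: int = 16) -> np.ndarray:
    """Toy encoder (hash-seeded random vector); use a sentence encoder in practice."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(dim)

def augment_node_embedding(concept: str, node_emb: np.ndarray) -> np.ndarray:
    """Concatenate the graph-side embedding with the gloss embedding."""
    gloss_emb = encode_text(lookup_gloss(concept), dim=node_emb.shape[0])
    return np.concatenate([node_emb, gloss_emb])

node_emb = encode_text("mona_lisa")                        # stand-in for a GCN node state
print(augment_node_embedding("mona_lisa", node_emb).shape)  # (32,) = graph + gloss features
```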
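And to make the GAT question concrete, a minimal single-head GAT layer in NumPy (a sketch of the standard formulation from Velickovic et al. 2018, not KagNet's code), contrasted with GCN's fixed neighbourhood averaging:

```python
import numpy as np

def gcn_layer(H, A_hat, W):
    """GCN: neighbour weights are fixed by the normalised adjacency A_hat."""
    return np.maximum(A_hat @ H @ W, 0.0)            # ReLU(Â H W)

def gat_layer(H, adj, W, a):
    """Single-head GAT: neighbour weights are learned per edge."""
    Z = H @ W                                        # projected features (n, d')
    n = Z.shape[0]
    scores = np.full((n, n), -np.inf)                # -inf masks non-edges
    for i in range(n):
        for j in range(n):
            if adj[i, j]:                            # assumes self-loops exist
                s = a @ np.concatenate([Z[i], Z[j]])
                scores[i, j] = s if s > 0 else 0.2 * s   # LeakyReLU
    alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)        # softmax over neighbours
    return np.maximum(alpha @ Z, 0.0)

rng = np.random.default_rng(0)
adj = np.eye(4) + np.diag(np.ones(3), 1)             # toy graph with self-loops
H = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 8))
a = rng.standard_normal(16)
print(gat_layer(H, adj, W, a).shape)                 # (4, 8)
```

The per-edge attention replaces GCN's fixed averaging; whether it could also absorb the path-level hierarchical attention seems like an open design question.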
Thank you, Nikhil, for your critique! I am glad you enjoyed reading this page. I added a link to your excellent foundation page on Deep Learning on Graph structures.
I will use your feedback to decide on the papers for the March submission. Thank you for letting me know about the confusion between the OF and IH terms; I have fixed that.
Cheers
I think the explanation of the papers was very understandable. I had only a few comments, all in the introduction section. I have put them as PDF comments here.
Thank you for your critique, Yinian. This helped me add specific links and additional descriptions so that a reader with less background knowledge can follow the page! Cheers!
Overall this is great! Well done!
Sometime later in the term, we should have a discussion on:
- the "magic" of GCN which just means that it provides a strong bias/prior that may or may not correspond with what is desired (the example you give is a property of aggregation used)
- the appropriateness of paths (paths provide limited representation power, e.g., the graphs formed by reification are not defined by paths, but by more structured graphs)
- the assumption that commonsense is captured by finding the most likely answer in multiple choice questions. (Children learn commonsense by playing and interacting in the world.)
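A tiny numeric illustration of the first point (a sketch of the aggregation bias, not taken from the papers): mean aggregation maps different neighbourhoods with the same average to identical messages, so the layer can only distinguish what the mean preserves.

```python
import numpy as np

# Two different neighbourhoods: distinct node features vs. identical ones...
neighbours_a = np.array([[1.0, 0.0], [0.0, 1.0]])
neighbours_b = np.array([[0.5, 0.5], [0.5, 0.5]])

# ...produce the same aggregated message under mean aggregation.
print(neighbours_a.mean(axis=0))   # [0.5 0.5]
print(neighbours_b.mean(axis=0))   # [0.5 0.5] -> indistinguishable to the layer
```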
Thank you for this comment, Prof. David. I agree with these points, and I would love to have a discussion on them; I will probably focus my second page (March) on these key points as well! Cheers, Mehar Bhatia