Course talk:CPSC522/Pretraining Methods for Graph Neural Networks


Discussion 1

I loved reading the article. It was easy to understand. I have the following minor comments:

Abstract: what are the other ways of improving generalization, apart from pre-training? For the sake of novice readers, it would be better to explain pre-training in a single line in the abstract itself.

Background: Please make it more interactive, and add images of GNNs and GCNs for better visualization.

Minor fix (grammar): the sentence "Can self-supervised training can improve the generalization and robustness capacity of the GCNs?" has a repeated "can".

MAYANKTIWARY (talk) 02:21, 14 February 2023

Thanks a lot for the review,

  • I have slightly modified the abstract to introduce what pre-training is. I am not sure if I should discuss other methods that can improve generalization (e.g., batch normalization, larger batch sizes, a more generalizable loss, or a more expressive GNN architecture), since that is a different rabbit hole. I am happy to hear your thoughts if you think otherwise.
  • I have added an image of a GCN to the Background section. Based on David's point, I also feel there should be a foundation page on graph neural networks, which would contain the figures you are requesting. For this page, I am assuming some background knowledge of Graph Neural Networks.
  • Fixed the grammatical error you pointed out, thanks!
NIKHILSHENOY (talk) 02:46, 15 February 2023
 

Critique

Outstanding read! I thoroughly enjoyed reading this page. It helped me understand some core concepts. I have a few questions that I feel are worth discussing.

1. Can you make the transition from the first paper to the second smoother? It feels a bit arbitrary. Related to David’s point, some clarity on how these two papers are linked would be helpful.

2. It’s a bit difficult to understand which method (self-supervised learning / pre-training) works best for improving generalization. Can you conclude with the method you think would be ideal?

3. I can see that you have added figures to highlight concepts. Can you add a line in each section pointing to the subfigure that corresponds to that subsection?

4. What does it mean that the GIN network is the most expressive? Can you maybe add a line on that?

5. Can you explain intuitively why node-level or graph-level pretraining alone does not work as well as the introduced method?

MEHARBHATIA (talk) 07:35, 14 February 2023

Thanks a lot for the detailed review,

  • Since the connection between the two papers wasn't clear, I have modified the abstract to make it explicit that the two papers solve separate problems, renamed the headers of the two paper sections, and also modified the conclusion. I hope this makes it clearer and easier to understand.
  • Since they solve separate problems, I have mentioned in the conclusion the key takeaways for each task (node classification and graph classification).
  • Added some pointers to figures.
  • Added a line about the expressivity of the Graph Isomorphism Network (a quick sketch of what "most expressive" means is given after this list).
  • Added a section (https://wiki.ubc.ca/Course:CPSC522/Pretraining_Methods_for_Graph_Neural_Networks#Issues_with_using_only_Node-Level_and_Graph-Level_Pre-training_Strategy) on why using node-level or graph-level pre-training on its own does not yield useful results.
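
As a quick, informal sketch of the expressivity point (this is my paraphrase of the standard GIN formulation from Xu et al., not a quote from either paper on the page): a GIN layer updates a node embedding as

  h_v^{(k)} = \mathrm{MLP}^{(k)}\big( (1 + \epsilon^{(k)}) \cdot h_v^{(k-1)} + \sum_{u \in N(v)} h_u^{(k-1)} \big)

Because the sum over neighbours followed by an MLP is an injective aggregation (unlike mean or max pooling), GIN can distinguish every pair of graphs that the 1-dimensional Weisfeiler-Lehman test can distinguish, which is the upper bound on the expressive power of message-passing GNNs. That is roughly what "most expressive" refers to; the article states the precise claim.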
NIKHILSHENOY (talk) 02:31, 15 February 2023
 

Initial Feedback

Overall this looks really good.

  • I see that you give the source when clicking on the figures, but you should give the source on the web page itself (in the caption) so the casual reader can see what is yours and what comes from others.
  • It might be better to move the background into a page on GNNs, GCNs, or SSL for GNNs, and then use one as the Jan/Feb page and the other as the March page. Then the background can be explained better as a tutorial introduction (including more words/explanations around the equations). Only one of the pages needs to be finished for the upcoming deadline; the other can still be referred to, just mark it as an unfinished draft.
  • Some of the formulae are difficult to parse if the reader doesn't already know what is going on, e.g., it isn't always clear whether letters written together denote matrix multiplication, since sometimes they do and sometimes they don't. Perhaps use lowercase single letters for functions and reserve uppercase for matrices (you need to choose and explain the meaning).
  • I didn't see a comparison of the papers. What was the incremental contribution of one over the other? Did one replace the other, or are they solving different problems? What can we learn from the two papers that we couldn't learn from each one separately? Perhaps a short comparison should be in the abstract and a more concrete one in the conclusion.
DavidPoole (talk) 18:27, 12 February 2023

Thanks a lot for the detailed review.

  • I added the source for each figure/table that has been used.
  • I moved most of the background to another foundation page (https://wiki.ubc.ca/Graph_Neural_Networks) and added a "Builds On" section to my article. This is currently a work in progress and will be completed in the next assignment.
  • The equation has now been moved to the foundation page, and I have added a few lines (https://wiki.ubc.ca/Graph_Neural_Networks#Graph_Convolutional_Networks) explaining how the equation is to be read (a quick sketch of the notation is also given after this list).
  • The two papers discuss pre-training strategies for different graph-based tasks (node classification and graph classification). Since this was not clear, I have modified the abstract to make it explicit that the two papers solve separate problems, renamed the headers of the two paper sections, and also modified the conclusion. I hope this makes it clearer and easier to understand.
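
To sketch what "how the equation is to be read" means (this is my paraphrase of the standard GCN propagation rule from Kipf and Welling; the exact symbols on the foundation page may differ slightly): a single GCN layer computes

  H^{(l+1)} = \sigma\big( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)} \big)

where \tilde{A} = A + I is the adjacency matrix with added self-loops, \tilde{D} is its diagonal degree matrix, H^{(l)} is the matrix of node features at layer l, and W^{(l)} is a learnable weight matrix. Letters written next to each other denote matrix multiplication, and \sigma is a non-linearity applied element-wise.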
NIKHILSHENOY (talk) 02:28, 15 February 2023