Course talk:CPSC522/Ontology Extraction

Feedback from F5

Overall:

  • Your article provides a good overview of the problem domain, approach and experiments of the work.
  • It could reasonably be tilted away from characterizing the nuances of the field and towards a more technical exposition of the core challenges and methods.
  • I feel that the quality of your writing could profit from more concision. To be more precise, your style is pleasantly literary and coherent (local text property), but the information density (global text property) is lower than it could be.

Technical Detail:

  • More references to established methods and concepts would be helpful, apart from the papers presenting complete frameworks which use those methods as components.
  • While the lack of mathematical formalism seems to be inherent to the nature of the field, there are many algorithmic notions central to the work which you could highlight, e.g., text pre-processing, term frequency and similarity/distance measures (on strings, graphs, semantic entities), hierarchical clustering algorithms, and ontological relations; a rough sketch of what I mean follows after this list. These are probably of higher interest than software modules and file formats (although it is worthwhile commenting on OWL, for example).
  • In brief, you could amplify the NLP aspects and dampen the software architecture aspects of the paper.
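
For concreteness, here is a minimal sketch of the kind of algorithmic core I have in mind. It is purely my own illustration, not taken from your article or from the papers: the toy corpus, the stop-word list, the similarity threshold and the single-linkage clustering are all invented, and a real system would make more principled choices at each step.

# Minimal illustration (invented toy data, not from the reviewed papers):
# term frequency over a tiny corpus, a string similarity measure, and a
# naive single-linkage clustering of candidate terms into concept groups.
from collections import Counter
from difflib import SequenceMatcher

corpus = [
    "the hotel offers rooms and a conference room",
    "the hostel offers cheap rooms near the station",
    "the conference venue has a large lecture room",
]

# 1. Text pre-processing: lowercasing, whitespace tokenisation, stop-word removal.
stop_words = {"the", "a", "and", "has", "near", "offers"}
tokens = [t for doc in corpus for t in doc.lower().split() if t not in stop_words]

# 2. Term frequency: keep terms occurring at least twice as candidate terms.
tf = Counter(tokens)
terms = sorted(t for t, n in tf.items() if n >= 2)

# 3. String similarity between terms (a semantic or graph-based distance
#    could be substituted here in a real system).
def sim(a, b):
    return SequenceMatcher(None, a, b).ratio()

# 4. Single-linkage agglomerative clustering with a fixed similarity threshold.
def cluster(items, threshold=0.6):
    clusters = [[t] for t in items]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(sim(a, b) >= threshold
                       for a in clusters[i] for b in clusters[j]):
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters

print(cluster(terms))  # e.g. [['conference'], ['room', 'rooms']]

To me, the interesting questions are exactly the choices hidden behind each of these four steps, rather than the surrounding software architecture.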
BoyanBeronov (talk) 14:09, 18 March 2019

I am certainly aware that there are more technical NLP aspects in general. However, the nature of the assignment is to discuss incremental progress, primarily through two papers, and those two papers present overall systems for ontology extraction. In particular, their authors do not advance novel NLP algorithms as the primary contributions, nor discuss them in detail. If I were to focus on NLP aspects, these would not be the right papers; and I am not as interested in NLP as in ontology per se. One of the Maedche and Staab papers I cited has over 2,000 citations, so I am fairly sure that, at least at the time, there was something novel in their idea of the framework. Also, given the nature of the task, discussion of the supporting technical aspects themselves seems more appropriate for topic articles, or better relegated to the sources mentioned in the "Build on" section.

EDIT: In hindsight, I may have given the wrong impression that ontology extraction solely concerns natural language texts. The paper by Maedche and Staab that I mentioned concerns the semantic web in particular, and Gaeta et al. also mention that they might explore other sources of knowledge. The former authors seem to do only shallow parsing; I suppose a sophisticated natural language parser would not necessarily be useful for handling other forms of information. What algorithms enable such a system to extract conceptual structures, and how they compare with full-fledged NLP algorithms, might be an interesting topic in its own right, though unrelated to the task I worked on with respect to the two papers. In any case, the big picture I am interested in construes the knowledge base as an interface between the query side (natural language processing that eventually yields first-order logic queries on the knowledge base) and the knowledge extraction side, which includes ontology construction from web sources. I first saw this picture in CPSC 422, and it appears in the second slide set I added to the "Build on" section. Of the two sides, I am especially interested in the latter, but it is the entire picture that seems to hold something capable of revealing the nature of the relationship between language and the world.
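
To make that big picture slightly more concrete, here is a toy sketch of my own; it is not taken from either paper, and the concepts, instances and query are all invented. The extraction side is reduced to a hand-written taxonomy, and the query side to a single first-order-logic-style predicate evaluated against it.

# Toy sketch (invented example, not from either paper): an extracted
# taxonomy stored as a small knowledge base, queried with a simple
# first-order-logic-style predicate.

# Knowledge extraction side: concepts linked by is-a relations, plus
# instances attached to concepts.
is_a = {                      # child concept -> parent concept
    "Hotel": "Accommodation",
    "Hostel": "Accommodation",
    "Accommodation": "Thing",
}
instances = {"RitzParis": "Hotel", "BackpackerInn": "Hostel"}

def ancestors(concept):
    """All concepts above `concept` in the taxonomy, following is-a edges."""
    chain = []
    while concept in is_a:
        concept = is_a[concept]
        chain.append(concept)
    return chain

# Query side: instance_of(x, C), interpreted against the knowledge base
# with taxonomic inference over the is-a relation.
def instance_of(x, concept):
    direct = instances.get(x)
    return direct == concept or concept in ancestors(direct)

# "Which known entities are Accommodations?"
print([x for x in instances if instance_of(x, "Accommodation")])
# -> ['RitzParis', 'BackpackerInn']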

Although I do not agree with all the points you made, I acknowledge that they are well-articulated criticisms.

ShunsukeIshige (talk) 21:45, 18 March 2019
 

Peer Review

Hi Shunsuke,

Here is some feedback for your submission.

Overall, the problem definition is clear and interesting. You also cite everything properly, which helps distinguish which claims are the authors'.

I think you used bullet points very well in the section introducing the two papers' methods. However, some of them might need more primitive definitions and fewer details to help the reader understand the big picture.

In the Case Study section, it's not clear what exactly "the additional features provided by the ontology extraction" were. You summarize some of the components under the section header, but I don't see how the learning features in the experiment use the relations and concept ordering.

I agree with your criticism of the high-level design of the experiments, i.e., the sample size and lack of randomization. However, I think some criticism of the methods themselves might also be helpful.

NamHeeKim (talk) 21:54, 18 March 2019