Pldiff

From UBC Wiki

Pldiff - A diff tool implemented in PL!

Authors: (Derek Huang, Quinn Tao, Chendong Luo)

What is the problem?

A diff algorithm outputs the set of differences between two inputs, which is the basis of a number of commonly used developer tools. Git is one good example where files differences are generated by a diff algorithm.

Inspired by the `dif` primitive call in PL, our goal is to learn and understand diff algorithms commonly used and extent the functionality of 'diff' so that we can do something like `A:- diff(fileA, fileB)`, `A:- diff(Str1, Str2)` etc.


Something extra

The easiest way to implement such alogrithms takes O(N^2) time and space. After researching most current widely used diff algorithms, we found Myers describes a impressive algorithm to look at two sequences and tries to find the minimum number of insertions, deletions, or substitutions required to change sequence A into sequence B. This algorithm can improve the time and space to O(ND). The original paper can be found here AnO(ND) difference algorithm and its variations. We thought that would be cool if we implement a CLI tool to report differences between strings and files using Myers Algorithms in PL.

What did we learn from doing this?

  1. Learned how to leverage academic research papers to improve our algorithm and prove the correctness.
  2. Learned how difficult to keep track of the execution flow when the program is getting big in Prolog. The non-linear execution flow due to bracktracking make it very difficult for us to trace bugs. Also we can't declare a data type and have to use pattern-matching to deal with composite data in Prolog.
  3. Non-Determinism is a powerful feature that allows PL to expolre multiple paths in search of all possible solutions in a query, but It's also annoying and challenging for us to make our method deterministic. We have to be very careful when writing the base cases and using recursions.
  4. Prolog works very well in scenarios where we can describe problems in terms of relations or rules like data systems and knowledge-based sytems (like wiki seen in the lectures). However, it's not designed for general-purpose programming and lacks the efficiency and simplicity for tasks better suited to imperative languages.


Howerver, It's quite fun to work on the a project using Prolog. This experience helps us understand the differences between functional programming and logic programming better. We believe the experience could help us learn any kinda programming languages better in future.


Work division

1. Middle Snake: a module that calculates the middle point of the shortest edit script - Quinn.

2. Search: Divide and Conquer to search for an optimal edit script given an implementation of Middle Snake - Derek.

3. Rendering Differences & CLI: Determine the insertions, deletions, calculate the minimum changes required, and rendering the differences. - Chendong

Everyone had a main focus and also everyone was also doing pair programming, brainstorming, and figuring out problems together. We worked as a efficient team, and we deserve thumbs up. For this project, everyone of us put lots of efforts. We hope people like it. Enjoy!

Link to Implementation

https://github.com/Chendong-Luo/Pldiff