Functional data structures and algorithms
Authors: Imelda S, Rachel Z, Josh Z
What is the problem?
In CPSC 221, we learn more about various sorting algorithms and data structures in C++, such as lists, trees, and graphs. We would like to explore how to implement these algorithms and data structures in Haskell and how it differs from those implemented in a object-orientated language like C++.
Prim's algorithm: Prim's algorithm finds a minimum spanning tree for a weighted undirected graph. Implementing a recursive Prim's algorithm that find a weight minimized subset of edges that connects all the vertices in the graph.
What is the something extra?
Most algorithms and data structures we learn about in class assume imperative programming. We will explore the idea of purely functional data structures (immutable) and implement a few to compare it with our other data structures.
Prim's algorithm is often implemented using priority queue which keeps track of a sorted list of vertices based on their edge weight. However, here we show that it can also be implemented using an unsorted array with a time complexity trade-off between maintaining a sorted list (priority queue) and traversing an entire list of edges to find the minimum weighted edge (unsorted array).
What did we learn from doing this?
The pattern matching and function abstraction available in Haskell significantly reduced the amount of code needed for implementing many algorithms while increasing their readability. Sorting algorithms, for example, can be implemented in relatively few lines of code, especially quicksort. This version of the Prim's algorithm takes advantage of the built-in lazy evaluation and memoization of Haskell as the same computation appears in multiple places in the recursion which could be costly in other languages without implementing memoization mechanism.
While Haskell does not have mutation-related features like pointers (particularly relevant in linked lists), lazy evaluation makes it so that accession is lightweight (i.e. extracting an element can be considered a function that takes two arguments (the node's value and the rest of the linked list) and return only the first; the second (i.e. the rest of the list) is ignored, and thus is essentially constant time, even if the linked list is huge). However, the loss of mutation makes something like selection sort a little cumbersome (ie need steps for 1) producing the min item and 2) removing it from the unsorted portion of the list). Also, as a functional programming languages, Haskell supports self-organizing structures especially poorly; as self-organizing can be considered a side effect, this means that the list does not always remain the same after, for instance, finding an element! This is essentially a form of mutation, and thus we must be able to simulate it in some way. The easiest way is to simply return a tuple - the result of the operation performed, and the updated data structure as the result of any side effects. However, this complicates code readability and ease of use as one now has to deal with multi-valued function returns.