Markdown to HTML Parser

From UBC Wiki

Authors: Shiv Vashisht, Kabir Chattopadhyay, Nam Hee Gordon Kim

What is the problem?

Markdown is a lightweight markup language designed so that HTML documents can be generated without writing HTML tags (https://en.wikipedia.org/wiki/Markdown). In this project, we will create a primitive Markdown-to-HTML translator in Haskell, which breaks down into two problems:

  1. Parse a markdown document and store the document as an object
  2. Render the object to generate an HTML document corresponding to the input

We will limit the scope of our program to the most frequently used Markdown syntax, such as headings (“#”, “##”, etc.) and code blocks (```).
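The two steps above (parse into an object, then render that object) can be sketched in a few lines of Haskell. This is a minimal illustration under our own assumptions, not the code in the repository: the `Block` type and the names `parseMarkdown`, `parseLine`, `escapeHtml`, `renderBlock` and `renderHtml` are hypothetical, and only headings, fenced code blocks and one-line paragraphs are handled.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- Hypothetical document model: a document is a list of blocks.
data Block
  = Heading Int String   -- heading level (1-6) and heading text
  | CodeBlock [String]   -- raw lines found between ``` fences
  | Paragraph String     -- any other non-blank line
  deriving (Eq, Show)

-- Step 1: parse the Markdown source into a list of blocks.
parseMarkdown :: String -> [Block]
parseMarkdown = go . lines
  where
    go [] = []
    go (l : ls)
      | "```" `isPrefixOf` l =
          -- Collect everything up to the closing fence (or end of input).
          let (body, rest) = break ("```" `isPrefixOf`) ls
          in CodeBlock body : go (drop 1 rest)
      | all isSpace l = go ls              -- skip blank lines
      | otherwise = parseLine l : go ls

-- A line is a heading if it starts with 1 to 6 '#' followed by a space.
parseLine :: String -> Block
parseLine l =
  case span (== '#') l of
    (hashes, ' ' : text)
      | not (null hashes), length hashes <= 6 -> Heading (length hashes) text
    _ -> Paragraph l

-- Escape the characters that are special in HTML text.
escapeHtml :: String -> String
escapeHtml = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc c = [c]

-- Step 2: render each block as an HTML element.
renderBlock :: Block -> String
renderBlock (Heading n t) =
  "<h" ++ show n ++ ">" ++ escapeHtml t ++ "</h" ++ show n ++ ">"
renderBlock (CodeBlock ls) =
  "<pre><code>" ++ escapeHtml (unlines ls) ++ "</code></pre>"
renderBlock (Paragraph t) = "<p>" ++ escapeHtml t ++ "</p>"

renderHtml :: [Block] -> String
renderHtml = unlines . map renderBlock
```

In GHCi, `putStr (renderHtml (parseMarkdown "# Title"))` should print `<h1>Title</h1>`. Keeping the parsed document as a plain data type means the renderer never has to look at the Markdown source again.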

What is the something extra?

There is a plethora of Markdown translators that we can use as reference solutions. Beyond the most frequently used features, the in-depth aspect of our project is a custom Markdown syntax for embedding a YouTube video by citing its URL (e.g. !!['http://youtu.be/1234'] inserts an embedded YouTube player into the document).
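One way the custom syntax could be handled is sketched below. It is an illustration under stated assumptions rather than our actual implementation: the function names `parseEmbed`, `videoId` and `renderEmbed` are hypothetical, the video ID is assumed to be the text after the last "/" of the cited URL (as in short youtu.be links), and the output uses YouTube's standard embed URL form, https://www.youtube.com/embed/ID.

```haskell
import Data.Char (isAlphaNum)
import Data.List (isSuffixOf, stripPrefix)

-- Recognize a line of the form !!['URL'] and return the URL.
parseEmbed :: String -> Maybe String
parseEmbed line = do
  rest <- stripPrefix "!!['" line
  if "']" `isSuffixOf` rest
    then Just (take (length rest - 2) rest)   -- drop the closing ']
    else Nothing

-- Assumption: the video ID is whatever follows the last '/' in the URL.
-- Only letters, digits, '-' and '_' are accepted, so nothing unexpected
-- ends up inside the generated HTML attribute.
videoId :: String -> Maybe String
videoId url
  | not (null ident) && all ok ident = Just ident
  | otherwise = Nothing
  where
    ident = reverse (takeWhile (/= '/') (reverse url))
    ok c = isAlphaNum c || c `elem` "-_"

-- Turn an embed line into an iframe pointing at the YouTube player.
renderEmbed :: String -> Maybe String
renderEmbed line = do
  url <- parseEmbed line
  ident <- videoId url
  Just ("<iframe src=\"https://www.youtube.com/embed/" ++ ident
        ++ "\" allowfullscreen></iframe>")
```

With these definitions, `renderEmbed "!!['http://youtu.be/1234']"` evaluates to `Just` an iframe whose `src` is https://www.youtube.com/embed/1234, while a line that does not match the syntax gives `Nothing` and can fall through to the ordinary Markdown rules.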

What did we learn from doing this?

We successfully implemented basic Markdown-to-HTML translation. Given the time constraints, we limited our scope to only a few elements (see our GitHub repo for details). The combination of functional programming and regular expressions makes language parsing very feasible. In fact, the Haskell ecosystem offers parser combinator libraries (Parsec is a well-known example), which are built specifically for the task of parsing.
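To give a flavour of the parser combinator style, here is a small sketch that parses a single heading line with Text.ParserCombinators.ReadP, a combinator module that ships with GHC's base library. It is not how our program is written (we relied on regular expressions); the names `heading` and `parseHeading` are hypothetical and chosen only for this example.

```haskell
import Text.ParserCombinators.ReadP
  (ReadP, char, eof, munch, munch1, pfail, readP_to_S)

-- Parse 1 to 6 '#' characters, one space, then the rest of the line.
heading :: ReadP (Int, String)
heading = do
  hashes <- munch1 (== '#')                      -- one or more '#'
  if length hashes > 6 then pfail else pure ()   -- HTML stops at <h6>
  _ <- char ' '
  text <- munch (/= '\n')                        -- heading text
  eof                                            -- nothing may follow
  pure (length hashes, text)

-- Run the parser and keep the result only if the whole input was consumed.
parseHeading :: String -> Maybe (Int, String)
parseHeading s =
  case readP_to_S heading s of
    [(result, "")] -> Just result
    _ -> Nothing
```

Here `parseHeading "## Scope"` gives `Just (2,"Scope")` and `parseHeading "no heading"` gives `Nothing`. Small parsers like `munch1` and `char` are glued together with `do` notation, so the grammar rule reads almost like its description in words.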

The code for a parser can get very involved and redundant. We learned that the functional approach lets us build a leaner, more readable codebase: our entire program is about 100 lines long, including the definitions of our constants. This is admittedly not the leanest code we could achieve, but the structure of functional programming trims away most of the boilerplate. Overall, functional programming proved to be a practical fit for parsing.

The embedded YouTube video syntax lets readers watch linked videos without having to navigate to a different window. This is interesting as a unique application, and it is also convenient for the end user.

Link to code:

https://github.ugrad.cs.ubc.ca/v0x0b/cpsc312parser