Course:CPSC312-2019-cindy-theodorus
World Trivia Chatbot
Authors: Cindy Hsu, Theodorus Jossoey
What is the problem?
We want to examine if Haskell is well-suited for building natural language parsing systems.
One popular topic for trivia is world geography - we wanted to build a chatbot that can answer common questions about different countries. One challenge will be converting the string into a format suitable for data look-ups. This program will be different compared to the 20 questions program: in this case, the chatbot will answer the questions given by the user, rather than being the one to ask questions.
What is the something extra?
We will allow the user to add new question-answer entries through the interface, and also to overwrite answers if they are wrong. This means we need to keep track of changes to the data store during the course of the program. We will also consider the multiple ways that one could phrase or order the same question, and make the search result consistent across the rewordings.
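As a minimal sketch of the add/overwrite idea, assuming a store keyed by a triple of keywords: the names `Key`, `Store`, `normalize`, `addEntry`, and `answer` are hypothetical and are not taken from the project's code, and `Data.Map` is used only as a convenient stand-in for whatever container the project actually uses.

```haskell
import Data.Char (toLower)
import qualified Data.Map.Strict as Map

-- Hypothetical key shape: (question word, location, object of interest).
type Key = (String, String, String)

-- Hypothetical store: maps a key to its answer.
type Store = Map.Map Key String

-- Lower-case every component so capitalisation does not change the key.
normalize :: Key -> Key
normalize (q, loc, obj) = (lower q, lower loc, lower obj)
  where
    lower = map toLower

-- Adding and overwriting are the same operation: Map.insert replaces any
-- existing value for the key and returns a new store. The old store is
-- left untouched.
addEntry :: Key -> String -> Store -> Store
addEntry key = Map.insert (normalize key)

-- Look up an answer; Nothing means the store has no entry for the key.
answer :: Key -> Store -> Maybe String
answer key = Map.lookup (normalize key)
```

Because `addEntry` returns a new store instead of mutating the old one, the caller has to hold on to the result, which is exactly the "keep track of changes" requirement described above.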
What did we learn from doing this?
There were three aspects of the program to consider: the interaction with the user, the question parsing, and the data storage.
- The IO monad was well suited to text interaction. No additional libraries needed to be imported (unlike with a language like Java), and it was easy to follow the data flow within the `do` blocks because every value had to be `return`ed in its desired state before a calling function could use it. This follows from immutability in functional programming: there were no side effects apart from the IO to keep track of, and no global variables were used to store things. The `Maybe` type also helped with validating the input. Instead of try-catch blocks as in OOP, which lead to jumps in control flow, returning `Nothing` let failures propagate up as ordinary values, making the program easier to debug and understand.
- The parsing aspect of this project required a lot of list manipulation: removing, adding, and filtering elements are the most frequent operations in the parsing process. We considered two ways of splitting a question: on whitespace, or on words like "is", "of", and "the". Given the structure of our database, we chose to split on words, which also includes splitting on the shortened "is" (e.g. "Canada's"). Splitting on whitespace was not feasible because our database structure is limited to 3 keywords/phrases, which does not work for a two-word keyword like "prime minister" or "capital city". Haskell is a good language for writing a parser because parsing needs a lot of recursion; most of the functions in our parser file are recursive. Functional programming also made it quick to debug the code, which mattered because writing the parsing functions (and functions in general) involved a lot of trial and error.
- Data storage, however, was quite a bit more difficult due to the stateless nature of Haskell. Normally, one would have an object to `insert` into at any time and then read the new version on the next access, but in Haskell we had to keep track of the modified data store ourselves as the program updated it. This meant that the main program loop took the data store as a parameter that had to be passed in on every iteration, and that store could theoretically get quite big. Luckily, lazy evaluation may have helped when a question was not parsable, since the database would then not have needed to be read. While developing the program, we noticed that the large majority of the sample questions we came up with had a 3-point structure consisting of a question word, a location, and an object of interest, so we chose to store the answers in a hash table instead of a tree (which would have been extremely shallow and unbalanced). However, due to Haskell's strong typing, wanting to use a list meant that we had to limit our support to 3-keyword tuples, since lists cannot contain arbitrary mixed types as they can in a language like JavaScript.
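The three aspects above can be sketched together as follows. This is an illustration under stated assumptions, not the project's actual code: every name (`separators`, `tokens`, `phrases`, `parseQuestion`, `step`, `chatLoop`) is hypothetical, the `question = answer` syntax for teaching a new entry is invented for the example, `Data.Map` stands in for the project's hash table, and sorting the two non-question phrases is just one possible way to make reordered questions produce the same key.

```haskell
import Data.Char (isAlpha, toLower)
import Data.Maybe (fromMaybe)
import qualified Data.Map.Strict as Map

type Key = (String, String, String)
type Store = Map.Map Key String

-- Assumed separator words; "s" catches the possessive in "Canada's".
separators :: [String]
separators = ["is", "of", "the", "s"]

-- Lower-case the input and turn every non-letter into a space,
-- so "Canada's" becomes the two tokens "canada" and "s".
tokens :: String -> [String]
tokens = words . map (\c -> if isAlpha c then toLower c else ' ')

-- Recursively split the tokens on separator words, joining each run of
-- ordinary words into one phrase (so "prime minister" stays together).
phrases :: [String] -> [String]
phrases ts = case break (`elem` separators) ts of
  ([], [])       -> []
  ([], _ : rest) -> phrases rest
  (run, rest)    -> unwords run : phrases (drop 1 rest)

-- Exactly three phrases give a key; anything else is Nothing.
-- Assumption: the last two phrases are ordered alphabetically so that
-- rewordings of the same question map to the same key.
parseQuestion :: String -> Maybe Key
parseQuestion input = case phrases (tokens input) of
  [q, a, b] -> Just (q, min a b, max a b)
  _         -> Nothing

-- One turn of the conversation as a pure function: the store goes in and
-- the reply comes out together with the (possibly updated) store.
step :: Store -> String -> (String, Store)
step store line = case break (== '=') line of
  (question, '=' : rest) -> case parseQuestion question of
    Just key -> ("Noted.", Map.insert key (unwords (words rest)) store)
    Nothing  -> (unparsable, store)
  _ -> case parseQuestion line of
    Just key -> (fromMaybe "I do not know that yet." (Map.lookup key store), store)
    Nothing  -> (unparsable, store)
  where
    unparsable = "Sorry, I could not parse that."

-- The store is threaded through the loop as a parameter; each iteration
-- recurses with the store returned by 'step'.
chatLoop :: Store -> IO ()
chatLoop store = do
  line <- getLine
  if line == "quit"
    then putStrLn "Bye!"
    else do
      let (reply, store') = step store line
      putStrLn reply
      chatLoop store'
```

To try it, add `main = chatLoop Map.empty` and enter `What is the capital of Canada? = Ottawa` followed by `What is Canada's capital?`. Both lines parse to the key `("what","canada","capital")`, so the second one is answered with `Ottawa`. Keeping `step` pure and confining IO to `chatLoop` is a design choice made for this sketch so that the parsing and lookup logic can be checked without any input or output.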