Authors: Tim Straubinger, Hyojin (Ginnie) Yi
What is the problem?
We would like to create a natural language interface to a virtual friend named Patrick. Patrick can be asked basic questions about his feelings and his opinions (which we hope to give a humorous bias). He can also respond to a statement with a well-composed what, why or how question that applies to the input sentence. If the input doesn't make sense, Patrick will respond with a confused retort. If he is asked a question that he isn't well-informed enough to answer meaningfully, he will either humbly acknowledge that he cannot say or cleverly dodge the original question.
What is the something extra?
Our project will involve a very detailed language interpreting system that builds upon Dr. Poole's provided simple noun-phrase-verb-phrase tree structure. We will need to classify input sequences as statements or questions, possibly using clues from punctuation such as commas and question marks, so that Patrick can respond differently to each. We will also need to distinguish a failed grammatical interpretation from a lack of knowledge in the database, so that Patrick can answer nonsensical input with mock confusion rather than a bare "false". Finally, we will need to build a rich database of Patrick's personal information, detailing his likes and dislikes and the relationships between the things he is aware of, together with a richer mapping from grammatical structure to the relations between entities mentioned in the input sentence and what those relations imply for the queries made to this database.
What did we learn from doing this?
The sentence parsing builds upon Dr. Poole's provided natural language parser, with the key addition of verb conjugation. Every noun phrase and verb phrase carries an extra parameter, the conjugation, which determines which verb forms may be paired with which kinds of noun phrases. In verb phrases, this is done by giving each verb definition multiple parameters, one for each conjugation. In noun phrases, this is done by categorizing determiners as singular or plural (this, these, that, those, etc.) and by categorizing nouns as singular or plural, again through a multi-parameter definition of each noun that includes both its singular and plural forms.
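The agreement check described above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's Prolog code: the lexicon, the function names (noun_number, verb_number, agrees) and the fixed "determiner noun verb" sentence shape are all assumptions made for the example.

```python
# Hypothetical mini-lexicon; the project's actual Prolog vocabulary is not shown here.
# A determiner maps to its number, or None if it fits both (e.g. "the").
DETERMINERS = {"this": "singular", "that": "singular",
               "these": "plural", "those": "plural", "the": None}
# Each noun is defined once with both forms: (singular, plural).
NOUNS = [("lemon", "lemons"), ("computer", "computers")]
# Each verb is defined once with one form per conjugation.
VERBS = [{"singular": "is", "plural": "are"},
         {"singular": "likes", "plural": "like"}]


def noun_number(word):
    """Return 'singular' or 'plural' for a known noun, else None."""
    for singular, plural in NOUNS:
        if word == singular:
            return "singular"
        if word == plural:
            return "plural"
    return None


def verb_number(word):
    """Return the conjugation a known verb form belongs to, else None."""
    for forms in VERBS:
        for number, form in forms.items():
            if word == form:
                return number
    return None


def agrees(sentence):
    """Accept 'determiner noun verb ...' only if all three share one number."""
    words = sentence.lower().split()
    if len(words) < 3 or words[0] not in DETERMINERS:
        return False
    det_num = DETERMINERS[words[0]]
    n_num = noun_number(words[1])
    v_num = verb_number(words[2])
    if n_num is None or v_num is None:
        return False  # unknown noun or verb: no parse
    if det_num is not None and det_num != n_num:
        return False  # e.g. "these lemon"
    return n_num == v_num  # e.g. rejects "this lemon are"
```

Under these assumptions, agrees("these lemons are yellow") succeeds, while agrees("this lemon are yellow") and agrees("these lemon is yellow") both fail, mirroring how a shared conjugation parameter would rule out mismatched phrases in the grammar.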
In Patrick's knowledge base, there is an inheritance system which works something like that of Java or C++. Types are defined as extending one another, and one type inherits from another through any indirect chain of extend relationships. Each type has a number of type properties which are always true for that type, and these extend down to derived types. Individuals are defined as having one or more types (e.g. Patrick is a person and a computer), and properties can also be defined uniquely for an individual (e.g. Donald Trump is a man and is old and evil, but this doesn't mean that all men are old and evil, nor that Donald Trump is an altogether different type of man). The most basic type is simply 'thing.' The system is written this way so that Patrick can know type relationships without needing to work strictly with individuals, which allows questions like "what is a yellow thing?" to be answered meaningfully with lemon, banana, apple, etc.
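A minimal sketch of this kind of inheritance lookup is given below, in Python rather than the project's Prolog. All facts and names here (EXTENDS, TYPE_PROPS, types_with, and so on) are hypothetical, and for brevity the sketch assumes each type extends at most one other type, which the original system may not require.

```python
# Hypothetical facts for illustration only; not taken from Patrick's database.
EXTENDS = {"fruit": "thing", "lemon": "fruit", "banana": "fruit",
           "person": "thing", "computer": "thing"}
TYPE_PROPS = {"lemon": {"yellow", "sour"}, "banana": {"yellow"},
              "fruit": {"edible"}, "computer": {"electronic"}}
INDIVIDUAL_TYPES = {"patrick": {"person", "computer"}}
INDIVIDUAL_PROPS = {"patrick": {"friendly"}}


def ancestors(type_name):
    """Return type_name followed by every type it extends, directly or indirectly."""
    chain = [type_name]
    while chain[-1] in EXTENDS:
        chain.append(EXTENDS[chain[-1]])
    return chain


def is_a(type_name, base):
    """True if type_name is base or inherits from it through any chain."""
    return base in ancestors(type_name)


def type_properties(type_name):
    """Collect the properties of a type and of everything it inherits from."""
    props = set()
    for t in ancestors(type_name):
        props |= TYPE_PROPS.get(t, set())
    return props


def individual_properties(name):
    """An individual's own properties plus those of each of its types."""
    props = set(INDIVIDUAL_PROPS.get(name, set()))
    for t in INDIVIDUAL_TYPES.get(name, set()):
        props |= type_properties(t)
    return props


def types_with(prop, base="thing"):
    """Answer 'what is a <prop> <base>?' at the type level, without individuals."""
    all_types = set(EXTENDS) | set(EXTENDS.values())
    return sorted(t for t in all_types
                  if is_a(t, base) and prop in type_properties(t))
```

With these made-up facts, types_with("yellow") yields the types lemon and banana, and individual_properties("patrick") combines his own property with the one inherited from computer, without that individual property leaking back onto the type.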
The language parsing and database searching parts work largely independently. As a result, Patrick doesn't need to say 'false' both for things he doesn't know how to answer and for things that don't make grammatical sense to him; instead, he can respond more meaningfully to each of these distinct cases.
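One way to picture this separation is a two-stage pipeline in which a parse failure and a knowledge gap produce different outcomes. The Python sketch below is only an illustration under assumed names (Outcome, parse, respond), a toy "is a <noun> <adjective>" grammar and made-up facts; it is not how the Prolog program is actually structured.

```python
from enum import Enum


class Outcome(Enum):
    ANSWERED = "answered"   # parsed and found in the knowledge base
    UNKNOWN = "unknown"     # parsed, but the knowledge base has nothing to say
    CONFUSED = "confused"   # could not be parsed at all


# Hypothetical vocabulary and facts for illustration only.
KNOWN_WORDS = {"is", "a", "lemon", "banana", "yellow", "purple"}
FACTS = {("lemon", "yellow"): True, ("banana", "yellow"): True}


def parse(text):
    """Return (noun, adjective) for 'is a <noun> <adjective>', else None."""
    words = text.lower().rstrip("?").split()
    if (len(words) == 4 and words[:2] == ["is", "a"]
            and all(w in KNOWN_WORDS for w in words)):
        return words[2], words[3]
    return None


def respond(text):
    """Keep grammatical failure and missing knowledge as separate cases."""
    query = parse(text)
    if query is None:
        return Outcome.CONFUSED, "Huh? That makes no sense to me."
    if query not in FACTS:
        return Outcome.UNKNOWN, "I honestly cannot say."
    reply = "Yes." if FACTS[query] else "No."
    return Outcome.ANSWERED, reply
```

Because parsing returns before the database is ever consulted, nonsense such as "lemon lemon is" ends up in the CONFUSED branch, while a well-formed question about an unrecorded fact ends up in the UNKNOWN branch.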
The project has three main limitations: (1) Patrick's knowledge is restricted by his limited vocabulary, (2) due to the structure of Prolog, some queries result in repetitive answers, and (3) Patrick's conversation is not completely natural language. The first and last issues can be remedied by adding a more extensive database of verbs, nouns, adjectives, etc. The second issue arises from the way the inheritance among types was defined in the program. As far as Patrick is aware, all types in his universe, such as "fruit", "human" and "computer", branch off from "thing". This allows Patrick to recognize general questions like "who is X", but it also results in repetitive answers, because the query lists all the "thing"s that satisfy X as well as the properties of X. For example, "who is a computer" will list Patrick as a computer but will also list all the characteristics Patrick has at the individual level. Perhaps an askPatrick 2.0 could generate all possible answers and then either pick an appropriate response at random or apply some sort of filtering to the answers generated.
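The filtering idea suggested for an askPatrick 2.0 might look something like the following Python sketch, which collects all answers, removes repeats and then picks one at random. The function names are hypothetical and nothing like this exists in the current code.

```python
import random


def unique_answers(answers):
    """Drop repeated answers while keeping first-seen order."""
    return list(dict.fromkeys(answers))


def pick_response(answers, rng=random):
    """Return one distinct answer at random, or None if there are none."""
    distinct = unique_answers(answers)
    if not distinct:
        return None
    return rng.choice(distinct)
```

For instance, unique_answers(["patrick", "patrick", "laptop", "patrick"]) keeps only "patrick" and "laptop". Within Prolog itself, a comparable effect could probably be had by collecting solutions with setof/3, which returns a sorted list without duplicates.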
A complete list of sample queries can be found in the README.txt of the handin. A copy of the *working code*, not the version handed in, can be found at https://github.com/hyojinyi/askPatrick