COGS 200: Ngram/ Maylin Cen

From UBC Wiki

Ngram Assignment

Compare words

799px-Compare_two_words.png

Comparison of two synonyms: Floppy Disk and USB Flash Drive

Code: Floppy Disk, USB Flash Drive

The graph above shows all the states in the state space of the Ngram for Floppy Disk and USB Flash Drive from 1900 to 2008. In this case, the collective variable is the year (1900-2008). Around 1987, Floppy Disk was an attractor; however, this attractor was later replaced by USB Flash Drive. The floppy disk was invented by Alan Shugart in 1967, so it is reasonable to see an increase after that time. The USB flash drive, on the other hand, was invented in 1994, and the graph likewise shows an increase in the use of the phrase "USB flash drive" around 1995. In 1996, the usage of the phrase "Floppy Disk" began to decrease, which might be due to the invention of the USB flash drive. The USB flash drive is more practical and convenient and has more storage than the floppy disk, which led people to stop using floppy disks.
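The comparison above comes from entering the two phrases, separated by a comma, into the Ngram Viewer. As a minimal sketch, the same search can be expressed as a URL. The parameter names (content, year_start, year_end, smoothing) are assumed from the Viewer's address bar and may change, and ngram_url is a hypothetical helper written for this page, not an official API.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed base address of the Google Books Ngram Viewer graph page.
NGRAM_BASE = "https://books.google.com/ngrams/graph"

def ngram_url(phrases, year_start=1900, year_end=2008, smoothing=3):
    """Build a Viewer URL that compares the given phrases."""
    params = {
        "content": ",".join(phrases),  # comma-separated, as in the "Code:" line
        "year_start": year_start,
        "year_end": year_end,
        "smoothing": smoothing,
    }
    return NGRAM_BASE + "?" + urlencode(params)

url = ngram_url(["Floppy Disk", "USB Flash Drive"])
print(url)

# Decode the query string again to confirm the phrases survive URL encoding.
print(parse_qs(urlparse(url).query)["content"][0])
```

Opening the printed URL in a browser should show a graph like the one above, provided the assumed parameters still match the live site.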

Wildcard Search

800px-Google_Ngram.png

Comparison of the phrase "where to *"

Code: where to *

The graph above shows all the states in the state space of the Ngram for the phrase "where to *" from 1900 to 2008. The years (1900-2008) are the collective variable in this case. A shallow attractor is found for the phrase "where to find", but it was interrupted by a second attractor. Even so, "where to find" is the leading attractor and it has remained stable. In terms of the phrases, "where to find" and "where to go" have always been used more often than the others. Both phrases started to increase in 1970 and they continue to increase. This might be due to the growing number of travel books being published for travellers. "Where to find" helps travellers find restaurants, parks and hotels, and "where to go" tells travellers some of the "must go" places in a particular country.

Inflection Search

800px-INF_verb.png

Comparison of the phrase "buy_INF a car" (verb case)

Code: buy_INF a car

The graph above shows the different states in the state space of the Ngram "buy_INF a car" from 1900 to 2008. In this case, the collective variable is the years (1900-2008). Between around 1945 and 1965, there was one attractor, and it remained stable. In terms of the phrases, "buy a car" is the most commonly used in books, followed by "buying a car". "Buy a car" started to increase around 1965 and it continues to increase. This might be due to the car magazines and comprehensive guides published in recent years to help people make decisions when buying a car. People are buying more and more cars each year; for example, more than 10 million cars are sold each year. As a result, the need for car magazines and guides increases. On the other hand, the phrases "bought a car" and "buys a car" are most likely to appear in literary works and novels, though not frequently.
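The _INF suffix asks the Viewer to match every inflected form of a word, which is why the graph has separate lines for "buy a car", "buying a car", "bought a car" and "buys a car". The sketch below illustrates that expansion. It assumes a small hand-written table limited to forms named on this page; INFLECTIONS and expand_inf are hypothetical names, and the Viewer itself relies on its own linguistic data rather than a table like this.

```python
# Hand-written table for illustration only (forms taken from this page).
INFLECTIONS = {
    "buy": ["buy", "buying", "bought", "buys"],
    "book": ["book", "booking", "booked"],
}

def expand_inf(query):
    """Expand every word_INF token in the query into its listed forms."""
    results = [[]]
    for token in query.split():
        if token.endswith("_INF"):
            lemma = token[: -len("_INF")]
            # Unknown lemmas fall back to the bare word.
            forms = INFLECTIONS.get(lemma, [lemma])
        else:
            forms = [token]
        # Combine each partial phrase with each candidate form.
        results = [partial + [form] for partial in results for form in forms]
    return [" ".join(words) for words in results]

for phrase in expand_inf("buy_INF a car"):
    print(phrase)
```

Each printed phrase corresponds to one line in the graph for "buy_INF a car".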

800px-Book_a_hotel.png

Comparison of the phrase "book_INF a hotel" (noun case)

Code: book_INF a hotel

The graph above shows all the states in the state space of the Ngram "book_INF a hotel" from 1900 to 2008, the collective variable. Two stable attractors were found between roughly 1955 and 1985, one deeper than the other. However, the leading attractor is the phrase "book a hotel". In terms of the phrases, "book a hotel" is more commonly used in books than "booking a hotel". It started to increase around 1982 and kept increasing until 2004. This increase might be due to the publishing of language guides for travellers. For example, many travel phrasebooks have a section called "How to book a hotel?", which teaches travellers phrases in a specific language that help them book a hotel room. On the other hand, the phrase "booked a hotel" is most likely to appear in literary works and novels.

Search for a word using Part-of-Speech tags

800px-Cook.png

Comparison of the phrase "cook_*"

Code: cook_*

The graph above shows all the states in the state space of the Ngram "cook_*" from 1900 to 2008, the collective variable. Between 1915 and 1980, two stable attractors were found; however, the leading attractor in this scenario is the word "cook" as a verb. In terms of usage, "cook" appears more often as a verb than as a noun. "Cook" as a verb started to increase around 1965, while "cook" as a noun started to decline around 1945. The decline might be due to the word "chef", which started to appear in dictionaries around 1934. Since there is a word that better represents someone who cooks, "cook" is more likely to be used as a verb. However, it is still widely used as a noun to refer to someone who cooks.

Search for Parts of Speech (not a specific word)

799px-No.png

Comparison of "*_VERB"

Code: *_VERB

The graph above shows all the states in the state space of the Ngram "*_VERB" from 1900 to 2008, the collective variable. In this case, there was only one attractor. In terms of the verbs, the most common verb is "is", followed by "was". These two verbs have neither increased nor declined; they have remained constant. The reason that "is" is the most used verb in books might be that it is required in most simple sentences in the present tense. Similarly, the verb "was" is required in the past tense. These verbs are used constantly in books, magazines, guides, novels, textbooks and more, that is, in all written works in English.
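The searches on this page use four kinds of operators: a bare * (wildcard), a word followed by _INF (inflection), a word followed by _* (any part of speech for that word), and * followed by a tag such as _VERB (any word with that part of speech). The sketch below labels each token of a query accordingly. The function names and labels are hypothetical and chosen for this page, and the tag set is assumed from the Ngram Viewer's documentation; the Viewer's real parser is not public.

```python
# Part-of-speech tags assumed to be accepted by the Viewer.
POS_TAGS = {"NOUN", "VERB", "ADJ", "ADV", "PRON", "DET", "ADP", "NUM", "CONJ", "PRT"}

def classify_token(token):
    """Return a label describing which search operator a token uses."""
    if token == "*":
        return "wildcard"
    word, sep, suffix = token.rpartition("_")
    if sep and word:
        if suffix == "INF":
            return "inflection"
        if suffix == "*":
            return "word with any part of speech"
        if suffix in POS_TAGS:
            return "any word with tag" if word == "*" else "tagged word"
    return "plain word"

def classify_query(query):
    """Label every whitespace-separated token in a query."""
    return [(token, classify_token(token)) for token in query.split()]

for query in ["where to *", "buy_INF a car", "cook_*", "*_VERB"]:
    print(query, "->", classify_query(query))
```

Separating the operator check from the query loop keeps each rule easy to compare with the section of this page that uses it.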