Course:COGS200/2017W1/NGramAssignment-DHone

From UBC Wiki

1 - Compare words

I compared the words "boy" and "girl." As you can see, for most of the graph "boy" is used steadily more often than "girl." While it rises and falls somewhat, from the 17th through the 20th century the use of "boy" stays fairly stable in comparison to "girl." This may be attributed to girls being less favored as children, and to literature's tendency to use masculine words to speak about humans in general. As the chart goes on, however, the two words become closer and closer in usage, until "girl" actually surpasses "boy" in 1974. There is also an odd spike around 1620-1630 in the word "boy," but the same spike shows up even more strongly for the word "child" (that n-gram chart can be seen here: http://bit.ly/2y9RnwR). I would think this is more reflective of an influx of material for the Ngram Viewer to pull from than of an influx in usage.
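The point where one word overtakes the other can also be found programmatically rather than by eye. The following is only a rough sketch: the function name is hypothetical, and the numbers are made-up placeholder values, not data exported from the Ngram Viewer.

```python
from typing import Optional, Sequence


def first_crossover_year(
    years: Sequence[int],
    freq_a: Sequence[float],
    freq_b: Sequence[float],
) -> Optional[int]:
    """Return the first year where word B exceeds word A after having
    been at or below it, or None if that never happens."""
    was_at_or_below = False
    for year, a, b in zip(years, freq_a, freq_b):
        if b <= a:
            was_at_or_below = True
        elif was_at_or_below:
            return year
    return None


# Placeholder values for illustration only (NOT real Ngram Viewer data).
toy_years = [1, 2, 3, 4, 5]
toy_word_a = [5.0, 4.0, 3.0, 2.0, 1.0]  # falling series
toy_word_b = [1.0, 2.0, 3.0, 4.0, 5.0]  # rising series

print(first_crossover_year(toy_years, toy_word_a, toy_word_b))
```

With real yearly frequencies for "boy" and "girl" in place of the toy lists, the same check would report the year in which "girl" first moves ahead.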

2 - Wildcard search

(this iframe wouldn't work: https://books.google.com/ngrams/interactive_chart?content=i%27m+feeling+*&year_start=1800&year_end=2000&corpus=15&smoothing=3&share=&direct_url=t2%3B%2CI%20%27m%20feeling%20%2A%3B%2Cc0%3B%2Cs0%3B%3BI%20%27m%20feeling%20a%3B%2Cc0%3B%3BI%20%27m%20feeling%20better%3B%2Cc0%3B%3BI%20%27m%20feeling%20very%3B%2Cc0%3B%3BI%20%27m%20feeling%20fine%3B%2Cc0%3B%3BI%20%27m%20feeling%20so%3B%2Cc0%3B%3BI%20%27m%20feeling%20pretty%3B%2Cc0%3B%3BI%20%27m%20feeling%20like%3B%2Cc0%3B%3BI%20%27m%20feeling%20much%3B%2Cc0%3B%3BI%20%27m%20feeling%20rather%3B%2Cc0%3B%3BI%20%27m%20feeling%20good%3B%2Cc0)

For the wildcard, I chose the phrase "i'm feeling *". Ignoring the most popular result, which just follows with the article "a," the results show two interesting ideas. One, that we may tend to use more positive language when talking about how we are feeling. This can be seen in the frequency of the words "better," "good," and "fine" following the phrase "I'm feeling." And two, that we may emphasize or exaggerate how we are feeling more often than not. The third most frequent phrase in modern usage is "I'm feeling very," which may point towards a tendency to use intensifying words that emphasize our feelings.

3 - Inflection search

While playing around with the inflection search function, I stumbled across something interesting. I searched the word "better," not quite sure what I was expecting to get from it. The Ngram Viewer surprised me by coming back with words such as "well," "good," and "best," which at first look like synonyms. This is interesting because it seems that whoever programmed the INF function has it group these words together for adjectives, which I wouldn't have expected the function to do. A likely reason is that "good," "better," and "best" (and likewise "well") are irregular degree forms of the same word, so the function may be treating them as inflections rather than as synonyms.

4 - Part-of-Speech tags

(this iframe wouldn't work: https://books.google.com/ngrams/interactive_chart?content=eat%3D%3E*_NOUN&year_start=1800&year_end=2000&corpus=15&smoothing=3&share=&direct_url=t2%3B%2Ceat%3D%3E%2A_NOUN%3B%2Cc0%3B%2Cs0%3B%3Beat%3D%3Edrink_NOUN%3B%2Cc0%3B%3Beat%3D%3Ebread_NOUN%3B%2Cc0%3B%3Beat%3D%3Eflesh_NOUN%3B%2Cc0%3B%3Beat%3D%3Efood_NOUN%3B%2Cc0%3B%3Beat%3D%3Emeat_NOUN%3B%2Cc0%3B%3Beat%3D%3Edinner_NOUN%3B%2Cc0%3B%3Beat%3D%3Epeople_NOUN%3B%2Cc0%3B%3Beat%3D%3Eman_NOUN%3B%2Cc0%3B%3Beat%3D%3Eanything_NOUN%3B%2Cc0%3B%3Beat%3D%3Efruit_NOUN%3B%2Cc0)

By perusing the Ngram Viewer info page, I found out about a cool little hidden feature. One can use part-of-speech tags (such as _NOUN) and wildcards together with a dependency relation provided by the Ngram Viewer (the => operator). This creates really cool and complex graphs, such as the one above, which shows that the nouns used with the word "eat" have changed drastically over time. Today, the most prevalent combinations are things such as "eat=>food_NOUN" and "eat=>meat_NOUN" (although there is a concerning popularity of the combination "eat=>people_NOUN"), but the 1800s are entirely different. With combinations like "eat=>drink_NOUN" and "eat=>bread_NOUN" being more popular than "eat=>food_NOUN," and the surprising popularity of the combination "eat=>flesh_NOUN," this graph could describe cultural shifts in how our society views food and what we consume.
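Since the embedded charts wouldn't load, here is a small sketch of how a query like this could be assembled and turned into a chart link. The base URL and the parameter names (content, year_start, year_end, corpus, smoothing) are copied from the links in this write-up; what each one does on Google's side is assumed from its name, and the helper functions are hypothetical, not part of any Ngram Viewer API.

```python
from urllib.parse import urlencode

# Taken from the chart links above; not an officially documented endpoint.
BASE_URL = "https://books.google.com/ngrams/interactive_chart"


def dependency_query(head: str, modifier: str = "*", pos_tag: str = "") -> str:
    """Compose a dependency query such as 'eat=>*_NOUN' from its parts."""
    tag = f"_{pos_tag}" if pos_tag else ""
    return f"{head}=>{modifier}{tag}"


def chart_url(content: str, year_start: int = 1800, year_end: int = 2000,
              corpus: int = 15, smoothing: int = 3) -> str:
    """Build a chart link with the query percent-encoded.

    The default values mirror the ones used in the links above.
    """
    params = {
        "content": content,
        "year_start": year_start,
        "year_end": year_end,
        "corpus": corpus,
        "smoothing": smoothing,
    }
    return f"{BASE_URL}?{urlencode(params)}"


query = dependency_query("eat", pos_tag="NOUN")  # wildcard modifier by default
print(query)
print(chart_url(query))
```

Swapping in dependency_query("eat", "food", "NOUN") would give the single combination "eat=>food_NOUN" instead of the wildcard version.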

5 - Searching for Parts of Speech

Searching "*_VERB" shows an interesting composition of the tenses of the English language. My first thought would be to say that this shows present tense is more prevalent, as the most popular result is "is_VERB." However, the results are actually a very even mix of both past and present tense, including "are" next to "be," "were" next to "has," "would" next to "been." I'm not sure what this means or portrays about the human condition, but it's certainly intriguing!