Course:COGS200/2017W1/NGramAssignment/Lachance

From UBC Wiki

For EACH of the sections below, (a) create a graph making a comparison, (b) include the “code” used to create the graph, (c) describe what is shown by the graph, (d) double click on the words to see whether there is anything unexpected driving the effect, and (e) if possible, explain what factors are driving the differences between ngrams and their changes over time. Cultural changes, scientific discoveries, and historical events are all likely to drive interesting changes.

Use the language of dynamic systems in your descriptions, including state, attractor, and collective variables.

Compare words

I compared cyclist, biker, and bicyclist, using a case-insensitive search because I didn't consider capitalisation germane.

'Biker' is almost nonexistent until it takes off in the mid-1960s, presumably referring primarily to motorcycle riders, though there is a period of ascendancy in the late 1920s, which may have coincided with increased material wealth and consumption of motorcycles (this begins to taper back down when the Depression starts). 'Cyclist' is the most popular term overall, rising from the 1880s, probably with the Golden Age of Bicycling. It peaks in 1917 and then follows a downward trajectory, especially during the 1950s, when car consumption rose and public transit availability was reduced. 'Bicyclist', while much less common, follows a somewhat similar trajectory (rising from the Golden Age, though it does not show an increase during WWI). Both 'bicyclist' and 'cyclist' increase starting in the 1970s, potentially with the beginnings of the modern bicycle advocacy movement. 'Biker' is currently used slightly more than 'cyclist', perhaps because 'biker' can also refer to a cyclist, while 'cyclist' is never used to refer to a motorcycle rider.
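The assignment asks for the "code" behind each graph. One way to record it is as a shareable query URL. The sketch below is a minimal illustration, not a documented API: it assumes the viewer at books.google.com/ngrams accepts `content`, `case_insensitive`, `year_start`, `year_end` and `smoothing` as query parameters. Those parameter names, the 1800 to 2000 range, and the smoothing value are assumptions and are not taken from this page; `ngram_query_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed base address of the Ngram Viewer graph page (not stated on this page).
NGRAM_VIEWER_BASE = "https://books.google.com/ngrams/graph"


def ngram_query_url(terms, year_start=1800, year_end=2000,
                    case_insensitive=False, smoothing=3):
    """Build a query URL for a comma-separated comparison of terms.

    All parameter names are assumptions about the viewer's URL format.
    """
    params = {
        "content": ",".join(terms),  # terms are compared when comma-separated
        "year_start": year_start,
        "year_end": year_end,
        "smoothing": smoothing,
    }
    if case_insensitive:
        params["case_insensitive"] = "on"
    return NGRAM_VIEWER_BASE + "?" + urlencode(params)


# The comparison described above: three terms, case-insensitive.
url = ngram_query_url(["cyclist", "biker", "bicyclist"], case_insensitive=True)
print(url)
```

The same helper could be reused for the later sections by passing the query text typed into the viewer (for example the wildcard or inflection query) as a single term.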


Wildcard search

I used 'why is the *' for my wildcard search. The most popular completion currently in my results is 'why is the world'. The highest usage rate of any phrase, by far, occurs in 1800 for 'why is the punishment'. Perhaps there was a lot of literature on penal or prison reform at the time, or perhaps the lion's share of written material containing 'why is the' came from written legal judgements, hence the emphasis on punishment. Alternatively, this popularity may reflect a trend in moralising literature.

Inflection search

Pick a phrase and use the _INF on a noun and on a verb. Look to see which inflection is most frequent. Describe the effect. It may be the case that you can identify a reason for the effect, but just describing the effect in words is sufficient.

I used computer run_INF and got 'computer runs', 'computer ran' and 'computer running'. No inflection shows any usage until 1927, which is presumably when this usage was first coined; before then, computers did other things besides running! 'Runs' and 'run' peak in the early 1970s and are the most popular, but then decline as 'running' steadily gains ground starting in the 1980s. Perhaps this switch had something to do with the rising popularity of personal computing.

'Computer ran' has the lowest usage overall and, perhaps unsurprisingly, is almost imperceptible relative to the others until the 1980s, when presumably computer obsolescence and breakdown became more widely discussed, and there were computers that had once been running but no longer were.

Search for a word using Part-of-Speech tags

Part-of-speech tags can be used to disambiguate homographic words that differ in part of speech, for example catch_NOUN and catch_VERB. It is also possible to see all parts of speech associated with a form: catch_*

I used object_NOUN and object_VERB

The noun form of 'object' was consistently more common, whereas the verb form reached its lowly peak of 0.00099% in 1841. Interestingly, both forms peaked at around the same time: the noun form peaked in 1835 and then declined until the mid-1980s, when it began to make a comeback. I'm not sure why this pattern emerged; perhaps Enlightenment modes of inquiry had something to do with the changing frequency and usage.
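The tagged queries on this page all follow the same word_TAG pattern, which can be sketched as a small string helper. This is only an illustration of the query syntax quoted above: `tagged` and `inflections` are hypothetical helper names, and the tag set is deliberately limited to the tags that appear on this page.

```python
# Only the part-of-speech tags mentioned on this page; the viewer may accept others.
KNOWN_TAGS = {"NOUN", "VERB", "ADJ"}


def tagged(word, tag):
    """Return a query term such as 'object_NOUN', 'catch_*' or '*_ADJ'."""
    if tag != "*" and tag not in KNOWN_TAGS:
        raise ValueError(f"unrecognised tag: {tag}")
    return f"{word}_{tag}"


def inflections(word):
    """Return an inflection query term such as 'run_INF'."""
    return f"{word}_INF"


# The queries used in the sections of this page.
pos_query = ",".join([tagged("object", "NOUN"), tagged("object", "VERB")])
all_tags_query = tagged("catch", "*")      # every part of speech for one form
any_adjective_query = tagged("*", "ADJ")   # most frequent adjectives
inflection_query = "computer " + inflections("run")

print(pos_query)
print(all_tags_query)
print(any_adjective_query)
print(inflection_query)
```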

Search for Parts of Speech (not a specific word)

Use *_NOUN, *_VERB, etc.

I used *_ADJ to take a look at adjectives historically.

'Other' is the clear and consistent frontrunner among adjectives for the entire time period, with a slight decline from 0.166% in 1800 to 0.135% in 2000. Interestingly, 'great' comes in second at the turn of the 19th century at 0.132% and peaks slightly above that in 1810, but then shows a steady decline; fast-forward 200 years and we find it in last place at 0.028%. This may be due to a shift in usage: according to the Online Etymology Dictionary, the first recorded use of 'great' to mean "excellent" appears in the late 1840s. Other usages had already become archaic by this point (Old and Middle English had verb forms like greaten, meaning "to become bigger"), which again underscores changing usage. 'Great' in its sense of "excellent" may have declined in popularity since slang fashion trends can lead to obsolescence.
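To put the two declines on a common footing, the relative change can be computed from the percentages quoted above. This is a small arithmetic sketch that assumes the quoted figures are the 1800 and 2000 endpoints for each word; `relative_change` is a hypothetical helper name.

```python
def relative_change(start, end):
    """Fractional change from start to end (negative means a decline)."""
    return (end - start) / start


# Frequencies in percent of all words, as quoted in the paragraph above.
other_change = relative_change(0.166, 0.135)
great_change = relative_change(0.132, 0.028)

print(f"other: {other_change:.1%}")
print(f"great: {great_change:.1%}")
```

On these figures, 'other' loses a little under a fifth of its starting frequency, while 'great' loses more than three quarters of its starting frequency.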