By Maxwell Doroen

Starting Out

The first thing I did was load the packages: tidyverse, ggplot2, and lubridate. Tidyverse helps me clean up my data, and ggplot2 (which is also attached as part of the tidyverse) helps me plot it. Lubridate makes my dates easier to parse and format.
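For reference, the loading step could look like the short sketch below. It assumes all three packages are already installed. The separate ggplot2 call is redundant once tidyverse is attached, but it is harmless and keeps the plotting dependency explicit.

```r
# Attach the packages used in this notebook (assumes they are installed).
library(tidyverse)  # data-cleaning tools such as dplyr; also attaches ggplot2
library(ggplot2)    # plotting; redundant after tidyverse, kept to be explicit
library(lubridate)  # date-parsing helpers such as mdy()
```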

A Note on the Dataset

For this project, I used a dataset about Urban Dictionary from Athontz on Kaggle. Urban Dictionary is an online “dictionary” of terms that anyone can add to. Some people use Urban Dictionary to learn slang they are unfamiliar with. Many of the terms on the website are considered derogatory. The site uses “upvotes” to sort words and phrases by popularity. If someone likes a word and its definition, they can upvote it, and the more upvotes a word has, the more likely it is to be seen by a random user. The site also uses “downvotes”, which anyone can add if they do not like the word or content. For the purpose of this project, I will only be examining upvotes, not downvotes.

Filtering the Data

Let’s talk about how I filtered the data! I started with several thousand rows, which was certainly not easy to work with. I kept only the entries with more than 6000 upvotes, so I could see the words that were upvoted most often. Next, I filtered for dates in 2016 only. 2016 is the last full year in the dataset (and an election year), so I figured it would be fun to work with.
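The two filters described above can be sketched roughly as follows. The tiny `urban_dictionary` table, its column names, and the month/day/year date format are assumptions made for illustration; the real Kaggle file may be laid out differently.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the Kaggle data; real column names may differ.
urban_dictionary <- tibble(
  word    = c("alpha", "beta", "gamma", "delta"),
  upvotes = c(7200, 150, 9100, 6500),
  date    = c("03/14/2016", "07/01/2016", "11/08/2015", "12/31/2016")
)

ud_2016 <- urban_dictionary %>%
  filter(upvotes > 6000) %>%        # keep only the most-upvoted entries
  mutate(dates_ud = mdy(date)) %>%  # parse month/day/year text into Date values
  select(-date) %>%                 # drop the original text column
  filter(between(dates_ud,
                 as.Date("2016-01-01"),
                 as.Date("2016-12-31")))  # keep 2016 only (both ends inclusive)

ud_2016
```

In this toy table only "alpha" and "delta" survive: "beta" has too few upvotes and "gamma" was posted in 2015.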

Graph 1

This graph was difficult to make. First, I had to ensure that the values in my date column were actually stored as dates in R. Next, I had to actually make the graph. I had to adjust the date scale a bit, since I wanted the data to look neater. I made the bars wider and maroon, since that looks prettier to me.
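A minimal sketch of a plot like this one is shown below. The three rows of data are made up, and the hex colour is just one maroon-like shade; the monthly breaks and abbreviated month labels are one reasonable way to tidy a date axis, not necessarily the only one.

```r
library(ggplot2)

# Hypothetical 2016 entries; dates_ud starts as text to show the conversion.
ud_2016 <- data.frame(
  upvotes  = c(7200, 6500, 8800),
  dates_ud = c("2016-03-14", "2016-08-02", "2016-11-20")
)

# Make sure the column is a real Date so that scale_x_date() applies.
ud_2016$dates_ud <- as.Date(ud_2016$dates_ud)

p <- ggplot(ud_2016, aes(x = dates_ud, y = upvotes)) +
  geom_col(width = 5, fill = "#911C06") +  # wider, maroon bars
  scale_x_date(
    date_breaks = "1 month",  # one tick per month
    date_labels = "%b"        # abbreviated month names
  ) +
  labs(x = "Date", y = "Number of Upvotes")

p
```

Here `geom_col()` draws bar heights straight from the `upvotes` column, which is the same as `geom_bar(stat = "identity")`. For a date axis, `width` is measured in days.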

Assigning a POS Column

In this step, I assigned each word a part of speech (POS). This was difficult because some of the “words” were actually entire complex phrases. In those cases, I simply assigned the tag “phrase” instead of breaking the phrase down. Additionally, some of these words are completely made up, and the definitions come from random people on the internet. Without knowing how a word is actually used, it was difficult to classify its part of speech. I did my best to tag each one as accurately as possible.
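Hand-tagging like this can be sketched as a plain character vector attached as a new column. The three entries below are invented placeholders; the length check is an optional guard, since a vector that does not line up with the rows would put tags on the wrong words.

```r
# Hypothetical entries; in the real data the tags were chosen by hand.
ud_2016 <- data.frame(
  word    = c("example noun", "to example", "an entire example phrase"),
  upvotes = c(7200, 6500, 8800)
)

# One tag per row, in the same order as the rows of ud_2016.
POS <- c("noun", "verb", "phrase")

# Guard against a tag vector that is too short or too long.
stopifnot(length(POS) == nrow(ud_2016))

# Attach the tags as a new column.
ud_2016$POS <- POS
ud_2016
```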

Graph 2

The next graph took a bit more effort on my part. I went through the list of words and tagged them for part of speech to the best of my abilities, given the definitions. This graph shows how often each type of word is upvoted. In the data there are 20 nouns, 8 verbs, 3 phrases, and 3 adjectives (34 entries in total). Based on that information, it seems POS probably does not contribute to how often something is upvoted.
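To make the tag counts above easier to compare, here is a small sketch that rebuilds them and turns them into percentages. It uses only the counts reported in the text; the order of tags in the vector is arbitrary and does not reflect the real row order.

```r
# Tag counts as reported above; the order of tags here is arbitrary.
POS <- rep(c("noun", "verb", "phrase", "adjective"), times = c(20, 8, 3, 3))

counts <- table(POS)                          # entries per part of speech
shares <- round(100 * prop.table(counts), 1)  # percentage of all entries

counts
shares
```

Nouns make up more than half of the 34 entries, so any chart that adds up upvotes per tag will tend to favor nouns simply because there are more of them.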

Conclusion

In the end, the date of a post and its part of speech did not tell me much about why certain content is upvoted on Urban Dictionary. If I had more time, I would like to look into other factors. For now, I at least know that these two factors do not seem to correspond to why content may be upvoted. This information is valuable because it lets me tentatively rule out two candidates, at least among the most-upvoted entries from 2016.
