```r
babynames %>%
  filter(year == 1966, sex == "M") %>%
  mutate(rank = row_number()) %>%
  mutate(percent = round(prop * 100, 1)) %>%
  filter(name == "Matthew")
```

This was practice code to make sure the pipeline worked correctly. It gives the rank and percentage of males named Matthew in 1966.
Now I have changed the name from Matthew to Morgan to see how popular my own name was in 1999.
First, I am going to write code that gives the popularity rank of my name in the year I was born.
```r
babynames %>%
  filter(year == 1999, sex == "F") %>%
  mutate(rank = row_number()) %>%
  mutate(percent = round(prop * 100, 1)) %>%
  filter(name == "Morgan")
```
This code showed that my name was the 24th most popular female name in 1999, given to roughly 0.5% of girls born that year (a proportion of about 0.005).
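As a quick check of what `rank` and `percent` mean, here is a minimal sketch on a tiny made-up data frame. The names and counts are invented for illustration and are not `babynames` values. It uses base R only, and it assumes the rows are already sorted from most to least common, which is what lets a row number act as a rank.

```r
# Toy data: invented counts, already sorted from most to least common
toy <- data.frame(
  name = c("Ann", "Bea", "Cat", "Dee"),
  n    = c(500, 300, 150, 50),
  stringsAsFactors = FALSE
)

toy$prop    <- toy$n / sum(toy$n)        # share of all births in the toy data
toy$rank    <- seq_len(nrow(toy))        # row position works as rank because rows are sorted
toy$percent <- round(toy$prop * 100, 1)  # proportion expressed as a percentage

toy[toy$name == "Cat", ]                 # look up one name's rank and percent
```

A proportion is turned into a percentage by multiplying by 100, so a `prop` of 0.15 reads as 15%.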
Next, I am going to create a word cloud of the female names the year I was born.
```r
babynames %>%
  filter(year == 1999) %>%   # use only one year
  filter(sex == "F") %>%     # use only one sex
  select(name, n) %>%        # select the two relevant variables: the name and how often it occurs
  top_n(100, n) %>%          # use only the top names or it could get too big
  wordcloud2()               # generate the word cloud
```
The word cloud shows the 100 most popular female names of 1999. The most popular names are drawn largest, and the size shrinks as the names become less popular.
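To illustrate what keeping only the top names does, here is a rough base R sketch on invented data (the names and counts are made up). It sorts by count and keeps the first few rows. This is only an approximation of `top_n()`: `top_n()` keeps extra rows when there are ties, while `head()` does not.

```r
# Toy data: invented names and counts, deliberately unsorted
toy <- data.frame(
  name = c("Ann", "Bea", "Cat", "Dee", "Eve"),
  n    = c(120, 480, 60, 300, 15),
  stringsAsFactors = FALSE
)

# Sort from most to least common, then keep the first 3 rows
sorted <- toy[order(toy$n, decreasing = TRUE), ]
top3   <- head(sorted, 3)

top3$name
```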
Next, I am going to graph the popularity of my name.
```r
babynames %>%                                  # start with the data
  filter(name == "Morgan", sex == "F") %>%     # choose the name and sex
  ggplot(aes(x = year, y = prop)) +            # put year on the x-axis and prop (proportion) on y
  geom_line()                                  # make it a line graph
```
This line graph shows that my name did not start gaining popularity until the period between 1980 and 2000. After 2000, it began dropping in popularity.
Next, I am going to create a table that shows which year my name was most popular.
```r
babynames %>%                                  # start with the dataset
  filter(name == "Morgan", sex == "F") %>%     # only look at the name you want
  top_n(1, prop)                               # keep the year with the highest proportion for that name
```
This table shows that my name was most popular in 1995.
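The same idea of picking the single peak year can be sketched in base R with `which.max()`. The years and proportions below are invented toy numbers, not values from `babynames`.

```r
# Toy data: invented proportions for one name over five years
toy <- data.frame(
  year = c(2001, 2002, 2003, 2004, 2005),
  prop = c(0.010, 0.030, 0.020, 0.015, 0.005)
)

# which.max() returns the position of the largest value,
# so this keeps the one row where prop is highest
peak <- toy[which.max(toy$prop), ]

peak$year
```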
Next, I am going to compare the popularity of my name with other female names in a graph.
```r
babynames %>%
  filter(name == "Morgan" | name == "Madison" | name == "Mackenzie") %>%
  filter(sex == "F") %>%
  ggplot(aes(x = year, y = n, color = name)) +
  geom_line()
```

This line graph shows that Madison was by far the most popular of the three names over the years.
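When comparing several names, the chain of `|` conditions can also be written with `%in%`. Here is a small base R sketch on invented data (the counts are made up) that checks the two forms select the same rows.

```r
# Toy data: invented counts for a handful of names
toy <- data.frame(
  name = c("Morgan", "Madison", "Emily", "Mackenzie", "Sarah"),
  n    = c(5, 9, 7, 3, 4),
  stringsAsFactors = FALSE
)

wanted <- c("Morgan", "Madison", "Mackenzie")

# Form 1: one comparison per name, joined with | (or)
with_or <- toy[toy$name == "Morgan" | toy$name == "Madison" | toy$name == "Mackenzie", ]

# Form 2: a single membership test against a vector of names
with_in <- toy[toy$name %in% wanted, ]

identical(with_or, with_in)
```

The `%in%` form is easier to extend, since adding a fourth name only means adding it to `wanted`.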