Problem 1:

Our team interpreted this question as follows: we aggregated the total frequency of each male and female baby name across all years, so that each name appears only once, and output the top names for each sex with their respective frequencies.

The most popular male baby name is James, with a count of 5164280. The most popular female baby name is Mary, with a count of 4125675.

library(plyr) # provides ddply, numcolwise, arrange and desc

male = subset(mydf, Sex == 'M')
female = subset(mydf, Sex == 'F')

lookupTop = function(dataframe, iterations){ # takes a data frame and the number of top entries to print
  aggregates = ddply(dataframe, 'Name', numcolwise(sum)) # sum every numeric column (including Count) per name
  sorted_frame = arrange(aggregates, desc(Count)) # most frequent names first
  for(i in seq_len(iterations)){ # loop over the top entries
    print(paste(as.character(sorted_frame[i,]$Name), sorted_frame[i,]$Count)) # print name and total frequency of the i-th entry
  }
}

lookupTop(male, 10) # prints the ten most popular male names and their frequencies
## [1] "James 5164280"
## [1] "John 5124817"
## [1] "Robert 4820129"
## [1] "Michael 4362731"
## [1] "William 4117369"
## [1] "David 3621322"
## [1] "Joseph 2613304"
## [1] "Richard 2565301"
## [1] "Charles 2392779"
## [1] "Thomas 2311849"
lookupTop(female, 10) # prints the ten most popular female names and their frequencies
## [1] "Mary 4125675"
## [1] "Elizabeth 1638349"
## [1] "Patricia 1572016"
## [1] "Jennifer 1467207"
## [1] "Linda 1452668"
## [1] "Barbara 1434397"
## [1] "Margaret 1248985"
## [1] "Susan 1121703"
## [1] "Dorothy 1107635"
## [1] "Sarah 1077746"

Problem 2:

The top five baby names in 1950 for males and females are shown below.
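
A minimal sketch of how a per-year, per-sex ranking like this could be computed in base R. The toy data frame and its counts are made up for illustration, the column names Year, Sex, Name and Count are assumed to match mydf, and the helper topNForYear is hypothetical rather than the code used for this report.

```r
# Toy data for illustration only; these counts are made up, not real results.
toy <- data.frame(
  Year  = c(1950, 1950, 1950, 1950, 1950, 1950, 1980),
  Sex   = c("M", "M", "M", "F", "F", "F", "M"),
  Name  = c("Adam", "Brian", "Adam", "Cara", "Dina", "Cara", "Adam"),
  Count = c(10, 30, 25, 40, 15, 5, 99),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the n most frequent names for one year and one sex.
topNForYear <- function(dataframe, year, sex, n = 5) {
  # Keep only the rows for the requested year and sex.
  rows <- dataframe[dataframe$Year == year & dataframe$Sex == sex, ]
  # Sum counts per name, in case a name appears on more than one row.
  totals <- aggregate(Count ~ Name, data = rows, FUN = sum)
  # Sort by total count, largest first, and keep the first n rows.
  totals <- totals[order(totals$Count, decreasing = TRUE), ]
  head(totals, n)
}

top_m <- topNForYear(toy, 1950, "M")
top_f <- topNForYear(toy, 1950, "F")
print(top_m)
print(top_f)
```

With the real data, the same helper could presumably be called as topNForYear(mydf, 1950, "M") and topNForYear(mydf, 1950, "F"), provided mydf has the assumed columns.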

Problem 3:

The top five baby names for males and females for 1980 are shown below.

Problem 4:

The top 10 baby names ever have been saved to a CSV file named mostpop.csv in the current working directory.

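df_count <- aggregate(df$Count, by = list(name = df$Name), FUN = sum) # total count per name; the summed column is named x
df_output <- df_count[order(df_count$x, decreasing = TRUE), ] # sort by total count, largest first
only_10 <- head(df_output, 10) # keep the ten most frequent names
print(only_10)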
##           name     x
## 17688     Liam 19860
## 9039      Emma 18697
## 21579     Noah 18442
## 21943   Olivia 17929
## 3578       Ava 14933
## 27868  William 14526
## 11757 Isabella 14479
## 25674   Sophia 13943
## 12546    James 13569
## 17971    Logan 13426
write.csv(only_10,'mostpop.csv')
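
By default, write.csv also writes the row names (here the original row indices, such as 17688) as an unnamed first column. A minimal sketch, assuming that column is not wanted in the file: the stand-in data frame top copies only the first two rows of the printed output above, and a temporary file is used so that mostpop.csv is not overwritten.

```r
# Stand-in for only_10, using the first two rows of the printed output.
top <- data.frame(name = c("Liam", "Emma"), x = c(19860, 18697),
                  stringsAsFactors = FALSE)

# Write to a temporary file so nothing in the working directory is touched.
path <- tempfile(fileext = ".csv")
write.csv(top, path, row.names = FALSE) # row.names = FALSE drops the index column

# Read the file back to confirm that only the two data columns were saved.
check <- read.csv(path, stringsAsFactors = FALSE)
print(check)
```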