# 2,091 observations is quite unwieldy; I'm aiming for a more manageable size.
unique(us_contagious_diseases2$state)
[1] Alabama Alaska Arizona
[4] Arkansas California Colorado
[7] Connecticut Delaware District Of Columbia
[10] Florida Georgia Hawaii
[13] Idaho Illinois Indiana
[16] Iowa Kansas Kentucky
[19] Louisiana Maine Maryland
[22] Massachusetts Michigan Minnesota
[25] Mississippi Missouri Montana
[28] Nebraska Nevada New Hampshire
[31] New Jersey New Mexico New York
[34] North Carolina North Dakota Ohio
[37] Oklahoma Oregon Pennsylvania
[40] Rhode Island South Carolina South Dakota
[43] Tennessee Texas Utah
[46] Vermont Virginia Washington
[49] West Virginia Wisconsin Wyoming
51 Levels: Alabama Alaska Arizona Arkansas California Colorado ... Wyoming
# I also want to check how the names are spelled in the state variable (note the uppercase "O" in "District Of Columbia"), because a filter on state has to match these spellings exactly.
us_contagious_diseases3 <- filter(us_contagious_diseases2, state %in% c("District Of Columbia", "Maryland", "Virginia"))
head(us_contagious_diseases3)
disease state year weeks_reporting count population
1 Polio District Of Columbia 1928 41 33 472771
2 Polio District Of Columbia 1929 52 6 478871
3 Polio District Of Columbia 1930 51 9 486869
4 Polio District Of Columbia 1931 52 15 497179
5 Polio District Of Columbia 1932 52 34 509735
6 Polio District Of Columbia 1933 52 7 524346
# Here I create a new data frame that includes only the jurisdictions of the Washington Metropolitan Area (District Of Columbia, Maryland, and Virginia).
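The spelling matters because `%in%` compares strings exactly, including case. A minimal base R sketch of that behavior follows; the `typed` vector is hypothetical input made up for illustration, and the `tolower()` workaround is one possible option, not something used in the filter above.

```r
# Spellings as they appear in the state variable.
wanted <- c("District Of Columbia", "Maryland", "Virginia")

# Hypothetical input using the more common lowercase "of".
typed <- c("District of Columbia", "Maryland")

# Exact comparison: the lowercase "of" does not match.
exact_match <- typed %in% wanted
exact_match
# [1] FALSE  TRUE

# Possible workaround (an assumption): compare lowercase copies of both sides.
loose_match <- tolower(typed) %in% tolower(wanted)
loose_match
# [1] TRUE TRUE
```

With the exact spelling, no such normalization is needed, which is why checking `unique()` first is worthwhile.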
# Before creating the visualization, it is important to know the range of years covered.
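One way to check the year range is base R's `range()`, sketched below. So that the block runs on its own, it uses a small stand-in data frame (`polio_dc`, a name made up here) built from the six rows printed by `head()` above; on the full `us_contagious_diseases3` the same call would be `range(us_contagious_diseases3$year)`.

```r
# Stand-in for us_contagious_diseases3: only the six rows shown by head() above.
polio_dc <- data.frame(
  disease = "Polio",
  state = "District Of Columbia",
  year = 1928:1933,
  weeks_reporting = c(41, 52, 51, 52, 52, 52),
  count = c(33, 6, 9, 15, 34, 7),
  population = c(472771, 478871, 486869, 497179, 509735, 524346)
)

# range() returns the minimum and maximum as a length-2 vector.
year_range <- range(polio_dc$year)
year_range
# [1] 1928 1933
```

The result here reflects only the six stand-in rows; the full data frame covers more years.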
Creating the Visualization
p1 <- ggplot(data = us_contagious_diseases3, mapping = aes(x = year, y = count)) +
  geom_point() +
  xlab("Year") +
  theme_minimal(base_size = 12) +
  ylab("Number of Reported Polio Cases") +
  ggtitle("Polio Cases in the Washington Metropolitan Area, 1928-1968") +
  scale_color_brewer(palette = "Set1") +
  geom_line(mapping = aes(color = state))
p1
# This chunk contains the code for the scatterplot of polio cases per year in each jurisdiction (state), with axis labels, an adjusted base font size, and the Set1 color palette. I find the plot very informative: the number of cases rose and then fell sharply around mid-century.
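To back up a visual reading of the peak with a number, the year with the highest count in each state can be pulled out in base R. This is a sketch under assumptions: `polio_sample` is a made-up name for a stand-in built from the six District Of Columbia rows printed earlier, so its result describes only those rows, not the full 1928-1968 series.

```r
# Stand-in rows copied from the head() output (District Of Columbia only).
polio_sample <- data.frame(
  state = "District Of Columbia",
  year = 1928:1933,
  count = c(33, 6, 9, 15, 34, 7)
)

# Split by state, then keep the row with the largest case count in each group.
peak_by_state <- do.call(rbind, lapply(
  split(polio_sample, polio_sample$state),
  function(d) d[which.max(d$count), c("state", "year", "count")]
))
peak_by_state
```

Run against `us_contagious_diseases3` instead of the stand-in, the same pattern would return one row per jurisdiction.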