The data set chosen for this visualization is an overview of 7 contagious diseases in the United States. It has 6 variables: disease, count, state, population, year, and weeks_reporting. It contains reported disease cases for all 50 states, spanning 1928 to 2011. The example visualization from the notes focused on one year; I would instead like to look at one state across all diseases. Since we are based in Maryland, my visualization looks at how the number of cases per disease changed over time in Maryland.
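The description above can be checked directly against the data. The following is a minimal sketch, assuming the dslabs package is installed; it only inspects the data set and does not change it.

```r
library(dslabs)

data("us_contagious_diseases")

# Variable names (the text lists six)
names(us_contagious_diseases)

# Distinct diseases covered by the data set
unique(us_contagious_diseases$disease)

# First and last year with reports
range(us_contagious_diseases$year)
```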
Load the libraries needed
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.2 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.2 ✔ tibble 3.2.1
✔ lubridate 1.9.2 ✔ tidyr 1.3.0
✔ purrr 1.0.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(RColorBrewer)
library(highcharter)
Registered S3 method overwritten by 'quantmod':
method from
as.zoo.data.frame zoo
Highcharts (www.highcharts.com) is a Highsoft software product which is
not free for commercial and Governmental use
library(ggplot2)
library(dslabs)
Attaching package: 'dslabs'
The following object is masked from 'package:highcharter':
stars
Filtering to the state of Maryland let me create a line graph showing the trends of all diseases over several decades. However, even with the scope narrowed to a single state, there are still so many observations that the visualization looks messy. In addition, the graph was a bit generic.
data("us_contagious_diseases")
diseases <- us_contagious_diseases %>%
  filter(state %in% c("Maryland"))
ggplot(diseases, aes(x = year, y = count, color = disease)) +
  geom_point() +
  geom_line() +
  labs(x = "Year", y = "Count", title = "Diseases Cases in Maryland") +
  theme_minimal() +
  scale_color_discrete(name = "Diseases")
Final Plot Visualization
Using a different library and the newly discovered highchart() function, I created a more visually appealing line graph by keeping only every fifth year from 1930 through 1980. I stopped at 1980 because the WHO declared smallpox eradicated that year, meaning the number of smallpox cases would be extremely low. With that said, the selected years should offer a solid representation of the trend in disease cases over time in Maryland.
diseases2 <- us_contagious_diseases |>
  filter(year %in% c(1930, 1935, 1940, 1945, 1950, 1955, 1960, 1965, 1970, 1975, 1980) & state %in% c("Maryland"))
colors <- brewer.pal(7, "Set1")
highchart() |>
  hc_add_series(data = diseases2, type = "line", hcaes(x = year, y = count, group = disease)) |>
  hc_title(text = "Development of Diseases in Maryland") |>
  hc_xAxis(title = list(text = "Year")) |>
  hc_yAxis(title = list(text = "Total Number of Reported Cases")) |>
  hc_colors(colors) |>
  hc_legend(verticalAlign = "top") |>
  # hc_add_theme() applies a theme to a chart; hc_theme() only builds a theme object
  hc_add_theme(hc_theme_darkunica())
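The year selection above can also be generated rather than typed out. This is a minimal sketch, assuming the same dplyr and dslabs setup as the rest of the document; the names years and md_5yr are illustrative and not part of the original code.

```r
library(dplyr)
library(dslabs)

data("us_contagious_diseases")

# Every fifth year from 1930 through 1980 (11 values)
years <- seq(1930, 1980, by = 5)

# Same filter as above, written with the generated vector
md_5yr <- us_contagious_diseases |>
  filter(state == "Maryland", year %in% years)

# Number of rows kept per disease
count(md_5yr, disease)
```

Generating the vector with seq() makes it easier to change the interval or the end year later without retyping the list.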
Essay: As stated earlier, the data set used in this visualization is the us_contagious_diseases data set. Using the variables state, year, disease, and count, I created a line graph with the highchart() function to break down the number of cases for each disease in selected years in Maryland. highchart() comes from the highcharter package, an R wrapper around the Highcharts JavaScript library, which lets users interact with the graph. Since it is part of a different library, the commands used to modify the graph differ from the usual commands in a ggplot graph. This chart uses functions prefixed with hc_ to set the axis titles and add color to the graph. Instead of looking at a specific year and disease for all 50 states, I wanted to look at one state (Maryland) for all diseases to see how the number of reported cases developed over time for each disease. I was intrigued by how readable the highchart was, and the ability to hover over a point on a line makes me want to keep using highchart for other graphs in the future.
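The point-by-point interactivity mentioned above comes from the Highcharts tooltip, which highcharter exposes through hc_tooltip(). The following is a sketch of one possible customization, assuming the highcharter, dplyr, and dslabs packages are installed; shared and valueSuffix are Highcharts tooltip options passed through as-is, and the object names md and chart are illustrative, not part of the original code.

```r
library(dplyr)
library(dslabs)
library(highcharter)

data("us_contagious_diseases")

# Maryland only, every fifth year from 1930 through 1980
md <- us_contagious_diseases |>
  filter(state == "Maryland", year %in% seq(1930, 1980, by = 5))

chart <- highchart() |>
  hc_add_series(data = md, type = "line",
                hcaes(x = year, y = count, group = disease)) |>
  # One tooltip listing every disease for the hovered year
  hc_tooltip(shared = TRUE, valueSuffix = " cases")

chart
```

A shared tooltip may make it easier to compare diseases within a single year, since hovering shows all series at once instead of one line at a time.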