First, we load the packages we need

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.0     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.1     ✔ tibble    3.1.8
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(highcharter)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

Set the working directory and load the data set

setwd("~/Desktop/RWD")
diseases <- read_csv("us_contagious_diseases.csv")
## Rows: 18870 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): disease, state
## dbl (4): year, weeks_reporting, count, population
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Combine all the states being used into one vector

states <- c("California", "Arizona", "Alabama", "Connecticut", "New York")

Create a new data frame that contains only polio records for the states listed above

Polio <- diseases %>%
  filter(disease == "Polio", state %in% states)
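
To see what this filter keeps, here is a minimal sketch on a small made-up tibble. The `toy` data, its values, and the `keep_states` vector are hypothetical and are not taken from the data set. Conditions separated by a comma in `filter()` must all hold, and `%in%` tests membership in the vector.

```r
library(dplyr)

# Hypothetical toy data, only to illustrate the filter; not real case counts
toy <- tibble(
  disease = c("Polio", "Polio", "Measles", "Polio"),
  state   = c("California", "Texas", "California", "Alabama"),
  count   = c(10, 20, 30, 40)
)

# Hypothetical selection of states
keep_states <- c("California", "Alabama")

# Comma-separated conditions in filter() are combined with AND
toy_polio <- toy %>%
  filter(disease == "Polio", state %in% keep_states)

toy_polio
```

Only the California and Alabama polio rows remain; the Texas row fails the state test and the measles row fails the disease test.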

Create a plot showing the polio count through the years.

ggplot(Polio, aes(x = year, y = count, color = state)) +
  # alpha is a fixed setting, not a mapping, so it goes outside aes()
  geom_point(aes(size = count), alpha = .5) +
  # linewidth replaces size for lines as of ggplot2 3.4.0
  geom_line(linewidth = .3) +
  scale_color_manual(values = c("#DE7CEB", "#FF79C5", "#574143", "#00C4B9", "#FFCAFF")) +
  labs(title = "Total Number of Polio Cases in the US by State",
       x = "Year",
       y = "Total Cases",
       color = "State") +
  # reference lines at the two years discussed below
  geom_vline(xintercept = 1956, color = "pink") +
  geom_vline(xintercept = 1973, color = "pink") +
  theme_minimal(base_size = 14, base_family = "serif")

Explaining what I did

I chose the contagious diseases data set and graphed polio case counts over the years in five randomly selected states. I filtered the data so that my new data frame contains only polio records for those states, then plotted the result. I made a custom color palette and changed the font to make the graph prettier. I added vertical lines at 1956 and 1973: 1956 is when cases went to 0, and in 1973 some cases came back. I made the points translucent and sized them by the number of cases. I also used a minimal theme so the focus stays on the points.
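
One way to pick or double-check years for reference lines like these is to total the cases per year and look for years where the total is zero. The sketch below uses made-up numbers and deliberately unrelated years. `toy_cases`, `yearly`, and `zero_years` are hypothetical names, and with the real data one would presumably substitute the `Polio` data frame.

```r
library(dplyr)

# Hypothetical stand-in for the filtered data; years and counts are made up
toy_cases <- tibble(
  year  = c(2001, 2001, 2002, 2002, 2003, 2003),
  state = rep(c("California", "New York"), times = 3),
  count = c(12, 8, 0, 0, 1, 0)
)

# Total cases per year across the selected states
yearly <- toy_cases %>%
  group_by(year) %>%
  summarise(total = sum(count, na.rm = TRUE), .groups = "drop")

# Years in which no cases were reported in any selected state
zero_years <- yearly %>%
  filter(total == 0) %>%
  pull(year)

zero_years
```

The resulting vector could be passed to `geom_vline(xintercept = ...)` instead of typing the years by hand.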