Data110 Week 8 HW

Load Packages

library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.3.3
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)
Warning: package 'dslabs' was built under R version 4.3.3
library(highcharter)
Warning: package 'highcharter' was built under R version 4.3.3
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 
Highcharts (www.highcharts.com) is a Highsoft software product which is
not free for commercial and Governmental use

Attaching package: 'highcharter'

The following object is masked from 'package:dslabs':

    stars
library(RColorBrewer)

Choosing a Dataset

data("us_contagious_diseases")
head(us_contagious_diseases)
      disease   state year weeks_reporting count population
1 Hepatitis A Alabama 1966              50   321    3345787
2 Hepatitis A Alabama 1967              49   291    3364130
3 Hepatitis A Alabama 1968              52   314    3386068
4 Hepatitis A Alabama 1969              49   380    3412450
5 Hepatitis A Alabama 1970              51   413    3444165
6 Hepatitis A Alabama 1971              51   378    3481798

Clean the Dataset

#Creates a new data frame called states that is used for the graph
states <- us_contagious_diseases |> 
#Keeps only observations from 2000 through 2006 (year is numeric, so no quotes are needed)
  filter(year %in% 2000:2006) |>
#Keeps only observations from California, Florida, New York, and Maryland
  filter(state %in% c("California", "Florida", "New York", "Maryland")) |>
#Sorts the rows by year
  arrange(year)
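
The head() output above shows a disease column, so the filtered data can hold more than one row per state and year (one per disease). If the graph is meant to show a total per state and year, one option is to sum the counts before plotting. This is only a sketch: the names state_totals and total_count are made up for illustration, and treating missing counts as zero via na.rm = TRUE is an assumption.

```r
library(dplyr)
library(dslabs)

data("us_contagious_diseases")

# Sum reported cases across all diseases for each state and year
state_totals <- us_contagious_diseases |>
  filter(year %in% 2000:2006,
         state %in% c("California", "Florida", "New York", "Maryland")) |>
  group_by(state, year) |>
  # na.rm = TRUE is an assumption: missing counts are simply skipped
  summarise(total_count = sum(count, na.rm = TRUE), .groups = "drop") |>
  arrange(year)

head(state_totals)
```

With a table like this, the chart could map y to total_count instead of count, which would give one point per state per year.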

Creating a Line Chart

#Stores seven colors from the RColorBrewer palette "Dark2"
cols <- brewer.pal(7, "Dark2")
#Uses highcharter to create a graph
chart <- highchart() |>
#Adds one line series per state, with year on the x axis and count on the y axis
  hc_add_series(data = states,
                type = "line",
                hcaes(x = year,
                      y = count,
                      group = state)) |>
#Passes the stored colors to the chart
  hc_colors(cols) |>
#Sets the x axis title
  hc_xAxis(title = list(text = "Year")) |>
#Sets the y axis title
  hc_yAxis(title = list(text = "Total Number of Reported Cases")) |>
#Uses circles as the point marker for every series
  hc_plotOptions(series = list(marker = list(symbol = "circle"))) |>
#Dictates where the legend is located and how it is displayed
  hc_legend(align = "right",
            verticalAlign = "middle",
            layout = "vertical") |>
#Customizes the tooltip shown on hover (counts are whole numbers, so no decimals)
  hc_tooltip(shared = TRUE,
             borderColor = "green",
             pointFormat = "{point.state}: {point.count:.0f}<br>") |>
#Adds a title to the graph
  hc_title(
    text = "Number of Total Cases in Certain States by Year",
    margin = 20,
    align = "left"
  )
#Calls the graph to be displayed
chart

Essay

In this homework assignment I wanted to try something new by experimenting with the highcharter package. While exploring the dslabs package, I came across a dataset called us_contagious_diseases that caught my attention. Not only was the dataset interesting, but it also had enough variables to build a graph and enough observations to make that graph rich in information. I was heavily inspired by the highcharter tutorial provided by Professor Saidi in this week’s notes, but I also did my own research and added some of my own touches to the code. I ran into a couple of problems. First, I could not figure out how to add a title to the legend using highcharter. Second, although I tried changing the color palette to “Dark2”, the colors in the chart did not change. Either way, I thought this dataset was pretty cool, and I would like to explore it more in the future using other tools, such as basic ggplot.
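
On the legend title: Highcharts has a legend title option, and hc_legend() passes its arguments through to the Highcharts legend settings, so a title can probably be added as sketched below. The toy data frame and the name chart_with_legend_title are made up for illustration; this was not part of the original homework code.

```r
library(highcharter)

# Toy data, made up for illustration only (not from us_contagious_diseases)
toy <- data.frame(
  year  = rep(2000:2002, times = 2),
  count = c(10, 12, 9, 4, 6, 5),
  state = rep(c("State A", "State B"), each = 3)
)

chart_with_legend_title <- highchart() |>
  hc_add_series(data = toy,
                type = "line",
                hcaes(x = year, y = count, group = state)) |>
  # title = list(text = ...) sets the Highcharts legend title
  hc_legend(title = list(text = "State"),
            align = "right",
            verticalAlign = "middle",
            layout = "vertical")

chart_with_legend_title
```

The same title argument could be added to the hc_legend() call in the chart above without changing anything else.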