library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.3.3
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)
Warning: package 'dslabs' was built under R version 4.3.3
library(highcharter)
Warning: package 'highcharter' was built under R version 4.3.3
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo
Highcharts (www.highcharts.com) is a Highsoft software product which is
not free for commercial and Governmental use
Attaching package: 'highcharter'
The following object is masked from 'package:dslabs':
stars
      disease   state year weeks_reporting count population
1 Hepatitis A Alabama 1966              50   321    3345787
2 Hepatitis A Alabama 1967              49   291    3364130
3 Hepatitis A Alabama 1968              52   314    3386068
4 Hepatitis A Alabama 1969              49   380    3412450
5 Hepatitis A Alabama 1970              51   413    3444165
6 Hepatitis A Alabama 1971              51   378    3481798
Clean the Dataset
# Creates a new data group called states that can be used for the graphs
states <- us_contagious_diseases |>
  # Filters the dataset for observations only between 2000 and 2006
  filter(year %in% 2000:2006) |>
  # Filters the dataset for observations only in California, Florida, New York, and Maryland
  filter(state %in% c("California", "Florida", "New York", "Maryland")) |>
  arrange(year)
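The preview above lists a `disease` column, so the filtered data can hold several rows for the same state and year (one per disease). If the goal is a single total per state and year, one possible approach is to sum the counts first. This is only a sketch: the names `state_totals` and `total_count` are made up for illustration and are not part of the original assignment.

```r
library(dplyr)
library(dslabs)

# Assumption: one total per state and year is wanted,
# so counts are summed across every disease in the dataset.
state_totals <- us_contagious_diseases |>
  filter(year %in% 2000:2006,
         state %in% c("California", "Florida", "New York", "Maryland")) |>
  group_by(state, year) |>
  # na.rm = TRUE skips any missing counts instead of returning NA
  summarise(total_count = sum(count, na.rm = TRUE), .groups = "drop") |>
  arrange(year)

# Each state-year pair now appears at most once
head(state_totals)
```

With this version, `y = total_count` would take the place of `y = count` when plotting.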
Creating a Line Chart
# Loads RColorBrewer, which provides brewer.pal()
library(RColorBrewer)

# Sets the color to the color palette "Dark2"
cols <- brewer.pal(7, "Dark2")

# Uses highcharter to create a graph
chart <- highchart() |>
  # Maps year to the x axis and count to the y axis, with one line per state
  hc_add_series(data = states,
                type = "line",
                hcaes(x = year, y = count, group = state)) |>
  # Determines the color of the graph
  hc_colors(cols) |>
  # Renames the x axis
  hc_xAxis(title = list(text = "Year")) |>
  # Renames the y axis
  hc_yAxis(title = list(text = "Total Number of Reported Cases")) |>
  # Draws the point markers as circles
  hc_plotOptions(series = list(marker = list(symbol = "circle"))) |>
  # Dictates where the legend is located and how it is displayed
  hc_legend(align = "right", verticalAlign = "middle", layout = "vertical") |>
  # Configures the shared tooltip shown on mouse over
  hc_tooltip(shared = TRUE,
             borderColor = "green",
             pointFormat = "{point.state}: {point.count:.2f}<br>") |>
  # Adds a title to the graph
  hc_title(text = "Number of Total Cases in Certain States by Year",
           margin = 20,
           align = "left")

# Calls the graph to be displayed
chart
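A quick way to check whether a palette actually reached the chart is to look at the options stored inside the widget. The sketch below assumes RColorBrewer is installed and that highcharter keeps its options under `x$hc_opts`, which is an internal detail that could change between versions. The name `palette_demo` is hypothetical.

```r
library(highcharter)
library(RColorBrewer)

# Four colors, one per state in the chart above
cols <- brewer.pal(4, "Dark2")

# An empty chart is enough to see what hc_colors() stores
palette_demo <- highchart() |>
  hc_colors(cols)

# Compare the stored colors with the palette that was requested
stored <- unlist(palette_demo$x$hc_opts$colors)
identical(stored, cols)
```

If the comparison comes back `TRUE`, the palette was handed to Highcharts, and any remaining color problem would have to come from somewhere else, such as a theme applied later.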
Essay
In this homework assignment I wanted to try something new by experimenting with the highcharter package. While exploring the dslabs package, I came across a dataset called us_contagious_diseases that caught my attention. Not only was the dataset interesting, but it also had enough variables to build a graph and enough observations to make that graph large and full of information. Not going to lie, I was heavily inspired by the highcharter tutorial provided by Professor Saidi in this week’s notes. However, I also did my own research and incorporated some of my own touches in the code. I had trouble with a couple of things. For example, I could not figure out how to add a title to the legend using highcharter. Also, although I tried changing the color palette to “Dark2”, the colors would not change. Either way, I thought this dataset was pretty cool, and I would like to explore it more in the future using other tools, such as basic ggplot.
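For the legend title mentioned above, one option that may work is the `title` argument of `hc_legend()`, which highcharter passes through to the Highcharts `legend.title` setting. This is a minimal sketch, not a tested fix for the chart in this assignment. The data frame `toy` and its values are made up purely for illustration, and `legend_demo` is a hypothetical name.

```r
library(highcharter)

# Hypothetical toy data standing in for `states`; the values are invented
toy <- data.frame(
  state = rep(c("California", "Florida"), each = 3),
  year  = rep(2000:2002, times = 2),
  count = c(10, 12, 9, 7, 8, 11)
)

legend_demo <- highchart() |>
  hc_add_series(data = toy,
                type = "line",
                hcaes(x = year, y = count, group = state)) |>
  # `title` is forwarded to the Highcharts legend.title option
  hc_legend(title = list(text = "State"),
            align = "right",
            verticalAlign = "middle",
            layout = "vertical")
```

Printing `legend_demo` in an interactive session should display the chart with "State" above the legend entries.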