library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.3 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.2 ✔ tidyr 1.3.0
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)
library(ggthemes)
library(ggrepel)
# Load the murders dataset
data(murders)

# Create the scatterplot
ggplot(data = murders, aes(x = population / 10^6, y = total, color = region)) +
  geom_point(aes(size = population), alpha = 0.6) +
  geom_smooth(method = "lm", se = FALSE, color = "darkgrey", linetype = "dashed") +
  geom_text_repel(aes(label = abb), box.padding = 0.5, point.padding = 0.5) +
  labs(
    x = "Population (in millions)",
    y = "Total number of murders",
    title = "Total Murders vs. Population by Region",
    subtitle = "Data from the murders dataset in dslabs package",
    caption = "Source: dslabs"
  ) +
  theme_minimal() +
  scale_color_brewer(palette = "Set1", name = "Region of USA") +
  theme(legend.position = "bottom")
`geom_smooth()` using formula = 'y ~ x'
Warning: ggrepel: 37 unlabeled data points (too many overlaps). Consider
increasing max.overlaps
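The warning above suggests raising max.overlaps. As a minimal sketch, assuming every state should be labeled even if the plot gets crowded, the label layer could be given max.overlaps = Inf. The object name p_all_labels is hypothetical, and the plot is trimmed to the two layers that matter for the labels.

```r
library(ggplot2)
library(ggrepel)
library(dslabs)

data(murders)

# Same mappings as the scatterplot, reduced to the point and label layers
p_all_labels <- ggplot(murders, aes(x = population / 10^6, y = total, color = region)) +
  geom_point(aes(size = population), alpha = 0.6) +
  # max.overlaps = Inf asks ggrepel to keep every label, however crowded
  geom_text_repel(aes(label = abb), max.overlaps = Inf,
                  box.padding = 0.5, point.padding = 0.5) +
  theme_minimal() +
  theme(legend.position = "bottom")

p_all_labels
```

Labeling every point trades readability in the dense lower-left corner for completeness, so a finite value larger than the default may be a reasonable middle ground.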
For this graph, I chose the murders dataset from the dslabs package, which records gun murders in the USA by state. The scatterplot shows the relationship between each state’s population (in millions) and its total number of murders. Points are colored by region of the USA, which adds a third variable to the visualization, and sized by population. To set this plot apart from the common examples, I added a dashed linear regression line that shows the overall trend across all states. State abbreviations are added with the ggrepel package, which pushes labels away from each other and from the points; with the default settings it drops labels it cannot place cleanly, which is why the warning above reports 37 unlabeled points. theme_minimal() from ggplot2 provides a clean, unobtrusive background, and the ColorBrewer Set1 palette keeps the region colors distinct and visually pleasing. Together, these choices show how a state’s murder total scales with its population while also hinting at regional differences.
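The regression line in the plot can also be inspected numerically. The following is a minimal sketch, assuming the same model that geom_smooth(method = "lm") fits with formula y ~ x, where x is population in millions and y is total murders. The name fit is hypothetical.

```r
library(dslabs)

data(murders)

# Regress total murders on population expressed in millions,
# matching the x and y mappings used in the scatterplot
fit <- lm(total ~ I(population / 10^6), data = murders)

# Intercept and slope; the slope is the estimated change in total
# murders per additional million residents
coef(fit)
```

Calling summary(fit) would additionally report standard errors and R-squared, which the plot alone does not show.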