library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.1.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)
library(ggthemes)
library(ggrepel)
data("murders")
murders |>
  mutate(
    pop_millions = population / 1e6,
    murder_rate = total / population * 1e5
  ) |>
  mutate(label_state = if_else(rank(-murder_rate) <= 5, abb, NA_character_)) |>
  ggplot(aes(x = pop_millions, y = total, color = murder_rate, label = label_state)) +
  geom_point(alpha = 0.8, size = 4) +
  geom_text_repel(size = 3, color = "black", na.rm = TRUE) +
  geom_smooth(method = "lm", se = FALSE, linetype = "dashed", color = "gray40") +
  scale_color_gradientn(
    colors = c("darkgreen", "gold", "orangered2"),
    name = "Murder Rate per 100K"
  ) +
  labs(
    title = "U.S. Gun Murders by State, 2010",
    subtitle = "Population vs. Total Murders (Color = Murder Rate per 100K)",
    x = "Population (millions)",
    y = "Total Murders"
  ) +
  theme_fivethirtyeight()
`geom_smooth()` using formula = 'y ~ x'
Warning: The following aesthetics were dropped during statistical transformation: label.
ℹ This can happen when ggplot fails to infer the correct grouping structure in
the data.
ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
variable into a factor?
Essay Part
For this project, I used the “murders” dataset from the dslabs package. It gives the total number of gun murders in each U.S. state in 2010, along with each state's population and region. I made a scatterplot with population (in millions) on the x-axis and total murders on the y-axis. Each point represents a state, and its color shows the murder rate per 100,000 people on a green-to-yellow-to-red “traffic light” scale.
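As a quick check of what the color encodes, here is a small sketch of the rate calculation. The numbers are made up for illustration and are not taken from the dataset; only the formula matches the plot code.

```r
# Hypothetical state: these values are invented, not from the murders data
total <- 100
population <- 2e6

# Same formula as in the plot code: murders per 100,000 residents
murder_rate <- total / population * 1e5
print(murder_rate)
```

In other words, 100 murders among 2 million people works out to 5 per 100,000, which is the scale the color legend uses.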
To make the graph less crowded, I labeled only the five states with the highest murder rates instead of labeling all of them. This helps the viewer focus on the states that stand out the most. Red points mark the states with the highest murder rates, and green points mark the states with the lowest. The overall pattern, traced by the dashed trend line, is that states with larger populations usually have more total murders, while the color shows how murder rates differ between states.
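The labeling rule can also be checked as a table. This is only a sketch: it assumes the dplyr and dslabs packages are installed, reuses the rate formula from the plot, and the name top_five is just illustrative.

```r
library(dplyr)
library(dslabs)

data("murders")

top_five <- murders |>
  # Murder rate per 100,000 people, same formula as in the plot code
  mutate(murder_rate = total / population * 1e5) |>
  # Keep exactly five rows with the highest rate, sorted from high to low
  slice_max(murder_rate, n = 5, with_ties = FALSE) |>
  select(state, abb, population, total, murder_rate)

print(top_five)
```

The abbreviations in this table should be the same ones that appear as labels on the plot, which makes it easy to confirm the labels are attached to the right points.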