Assignment 7

Quarto

Quarto enables you to weave together content and executable code into a finished document. To learn more about Quarto, see https://quarto.org.

Running Code

When you click the Render button, Quarto generates a document that includes both the content and the output of the embedded code. You can embed code like this:

1 + 1
[1] 2

You can add options to executable code like this:

[1] 4

The echo: false option disables the printing of code (only output is displayed).
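The source of the hidden chunk is not shown in the rendered output, so the following is only a sketch of what it might look like. It assumes the chunk computes 2 * 2, which is consistent with the printed result of 4; the variable name `result` is made up for illustration. In the .qmd file these lines would sit inside an `{r}` code chunk, where the `#|` line is read as a chunk option.

```r
#| echo: false
# With echo: false, the code below is hidden in the rendered document
# and only its printed output appears.
result <- 2 * 2
result
```

Run as plain R, the `#|` line is treated as an ordinary comment, so the same lines also work at the console.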

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.2.0
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)
data(package = "dslabs")
head(murders)
       state abb region population total
1    Alabama  AL  South    4779736   135
2     Alaska  AK   West     710231    19
3    Arizona  AZ   West    6392017   232
4   Arkansas  AR  South    2915918    93
5 California  CA   West   37253956  1257
6   Colorado  CO   West    5029196    65
data(murders)

ggplot(murders, aes(x = population, y = total, color = region)) +
  geom_point(size = 3, alpha = 0.75) +
  scale_x_log10() +
  scale_y_log10() +
  labs(
    title = "Population and Total Murders by U.S. Region",
    subtitle = "A multivariable scatterplot using the murders dataset",
    x = "Population (log scale)",
    y = "Total Murders (log scale)",
    color = "Region",
    caption = "Source: DS Labs"
  ) +
  scale_color_manual(values = c(
    "Northeast" = "steelblue3",
    "South" = "firebrick2",
    "North Central" = "darkorange2",
    "West" = "darkgreen"
  )) +
  theme_bw()

For this assignment, I used the murders dataset from the dslabs package. This dataset includes information about U.S. states: their abbreviation, region, population, and total number of murders. I created a multivariable scatterplot with population on the x-axis, total murders on the y-axis, and region as the third variable, shown by color.

This graph shows that states with larger populations usually have a higher total number of murders. Because both axes use log scales, states of very different sizes can be compared on the same plot. The color legend shows how states from each region are distributed across the graph. One interesting pattern is that the South includes several states with relatively high total murders, while states in other regions are more spread out. Overall, this visualization makes it easier to compare population size and murder totals while also taking regional differences into account.
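Since total murders tend to rise with population, a per-capita rate is one way to compare states more directly. The sketch below is an illustration, not part of the original analysis: it rebuilds only the six rows printed by head(murders) above so that it runs without dslabs installed, and the names `murders_head` and `rate` are my own.

```r
# First six rows of the murders dataset, copied from the head(murders) output
murders_head <- data.frame(
  state      = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  region     = c("South", "West", "West", "South", "West", "West"),
  population = c(4779736, 710231, 6392017, 2915918, 37253956, 5029196),
  total      = c(135, 19, 232, 93, 1257, 65)
)

# Murders per 100,000 residents (column name `rate` is an assumption)
murders_head$rate <- murders_head$total / murders_head$population * 1e5

# Show the states ordered from highest to lowest rate
murders_head[order(-murders_head$rate), c("state", "region", "rate")]
```

With the full dataset loaded, the same calculation could be applied to `murders` directly. Six rows are too few to say anything about regional differences.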