DS Labs HW

Author

Jhonathan Urquilla

Loading the packages we might need

library("dslabs")
data(package="dslabs")
list.files(system.file("script", package = "dslabs"))
 [1] "make-admissions.R"                   
 [2] "make-brca.R"                         
 [3] "make-brexit_polls.R"                 
 [4] "make-calificaciones.R"               
 [5] "make-death_prob.R"                   
 [6] "make-divorce_margarine.R"            
 [7] "make-gapminder-rdas.R"               
 [8] "make-greenhouse_gases.R"             
 [9] "make-historic_co2.R"                 
[10] "make-mice_weights.R"                 
[11] "make-mnist_127.R"                    
[12] "make-mnist_27.R"                     
[13] "make-movielens.R"                    
[14] "make-murders-rda.R"                  
[15] "make-na_example-rda.R"               
[16] "make-nyc_regents_scores.R"           
[17] "make-olive.R"                        
[18] "make-outlier_example.R"              
[19] "make-polls_2008.R"                   
[20] "make-polls_us_election_2016.R"       
[21] "make-pr_death_counts.R"              
[22] "make-reported_heights-rda.R"         
[23] "make-research_funding_rates.R"       
[24] "make-stars.R"                        
[25] "make-temp_carbon.R"                  
[26] "make-tissue-gene-expression.R"       
[27] "make-trump_tweets.R"                 
[28] "make-weekly_us_contagious_diseases.R"
[29] "save-gapminder-example-csv.R"        

Working with the Mice Weights Data

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggthemes)

data("mice_weights")
Warning in data("mice_weights"): data set 'mice_weights' not found
# Create scatterplot

ggplot(
  # Keep only complete rows (no missing values) and the columns used in the plot
  subset(na.omit(mice_weights), select = c(body_weight, percent_fat, diet, sex, gen, bone_density)),
  aes(
    x = body_weight,
    y = percent_fat,
    color = factor(paste(sex, diet, sep = "_")),
    
    # I had trouble working out the bone density breaks, so I had AI help me set them up so that I could label them properly.
    ## OpenAI. ChatGPT. June 2025 version, OpenAI, https://chat.openai.com/.
    shape = cut(
      bone_density,
      breaks = c(-Inf, 0.4, 0.7, Inf),
      labels = c("Low (≤ 0.4)", "Medium (0.4–0.7)", "High (> 0.7)"),
      right = TRUE
    )
  )
) +
  
  # Style the points, facet by generation, and set up the legends
  
  geom_point(alpha = 0.8, size = 3) +
  facet_wrap(~ gen, scales = "free") +
  scale_color_manual(
    values = c(
      "F_chow" = "purple",
      "F_hf" = "#E69",
      "M_chow" = "#007",
      "M_hf" = "#4E9"
    ),
    name = "Sex & Diet"
  ) +
  scale_shape_manual(
    values = c("Low (≤ 0.4)" = 16, "Medium (0.4–0.7)" = 17, "High (> 0.7)" = 15),
    name = "Bone Density"
  ) +
  
  # Plotting titles and axis labels
  
  labs(
    title = "Mouse Body Weight vs Percent Fat\nwith Bone Density by Generation",
    subtitle = "Mice's Generation",
    x = "Body Weight (grams)",
    y = "Percent Fat (%)"
  ) +
  theme_economist() +
  theme(
    plot.title = element_text(face = "bold", size = 16),
    plot.subtitle = element_text(size = 12),
    axis.title = element_text(size = 12),
    strip.text = element_text(size = 12),
    legend.position = "right"
  )

I used the mice_weights dataset from the dslabs package for this project. It contains measurements of mice from several generations that were fed either a high-fat or a chow diet. Each entry records the mouse's sex, diet, body weight, percent fat, bone density, and generation. To produce a distinctive graphic, I combined several of these variables in one faceted scatterplot. Body weight is on the x-axis and percent fat is on the y-axis. Color encodes the combination of sex and diet, shape encodes bone density split into three bins (≤ 0.4, 0.4–0.7, and > 0.7), and the facets separate the plot by generation. I also applied theme_economist() from the ggthemes package to give the plot a cleaner, more professional look. The resulting multivariable graph shows how diet, sex, and bone density relate to body composition, and it allows a visual comparison across generations.
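To make the color and shape encodings easier to check, here is a minimal base R sketch of the two derived variables the plot relies on. The data frame `toy` and its values are hypothetical and chosen only to land on the bin edges; the column names mirror the ones used in the plot, and the bin labels are shortened to plain "Low", "Medium", and "High" for readability.

```r
# Toy data with hypothetical values (not taken from mice_weights)
toy <- data.frame(
  sex = c("F", "F", "M", "M", "M"),
  diet = c("chow", "hf", "chow", "hf", "hf"),
  bone_density = c(0.35, 0.40, 0.55, 0.70, 0.85)
)

# One key per sex/diet combination; these strings must match the names
# supplied to scale_color_manual(), e.g. "F_chow" or "M_hf"
toy$sex_diet <- paste(toy$sex, toy$diet, sep = "_")

# Three bins with the same breaks as the plot. With right = TRUE each
# interval is closed on the right, so 0.40 falls in "Low" and 0.70 in "Medium"
toy$density_bin <- cut(
  toy$bone_density,
  breaks = c(-Inf, 0.4, 0.7, Inf),
  labels = c("Low", "Medium", "High"),
  right = TRUE
)

print(toy)
```

Running the same two steps on the real data before plotting, and then calling table() on the results, is one way to confirm that every color key and shape label in the manual scales actually occurs in the data.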