This Quarto document is a practical illustration of the concepts covered in the productive workflow online course. It is designed primarily for educational purposes, so the focus is on demonstrating Quarto techniques rather than on the rigor of its scientific content. The callout reference page is here.
This case study shows how to build an HTML report with Quarto. You can read more about the penguin dataset here.
Let’s load libraries before we start!
# load the tidyverse
library(tidyverse)
library(patchwork)
library(ggthemes)
The dataset has already been loaded and cleaned in the previous step of this pipeline.
Let’s load the clean version, together with a few functions available in functions.R!
# Source functions
source(file = "functions.R")
# Read the clean dataset
data <- readRDS(file = "../input/clean_data.rds")
Now, let’s run a descriptive analysis, including summary statistics and graphs.
What’s striking is the slightly negative relationship between bill length and bill depth:
data %>%
  ggplot(
    aes(x = bill_length_mm, y = bill_depth_mm)
  ) +
  geom_point(color = "#69b3a2") +
  labs(
    x = "Bill Length (mm)",
    y = "Bill Depth (mm)",
    title = "Surprising relationship?"
  ) +
  theme_classic()
It is also interesting to note that bill length and bill depth differ considerably from one species to another. The average of a variable can be computed as follows:
\[Avg = \frac{1}{n} \displaystyle\sum_{i=1}^{n} a_i = \frac{a_1+a_2+\dots+a_n}{n}\]
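The formula above can be checked by hand in R. The following is a minimal sketch using made-up bill lengths (the vector `a` and its values are illustrative assumptions, not taken from the dataset); it compares the formula written out term by term with the built-in `mean()`, and shows why the chunks in this report pass `na.rm = TRUE`.

```r
# Made-up bill lengths in mm (illustrative values, not taken from the dataset)
a <- c(39.1, 39.5, 40.3, 36.7)
n <- length(a)

# Average written out as in the formula: (1 / n) * sum of a_i
avg_formula <- (1 / n) * sum(a)

# Built-in equivalent
avg_builtin <- mean(a)

# A single missing value makes mean() return NA unless it is dropped explicitly
a_with_na <- c(a, NA)
avg_with_na <- mean(a_with_na)               # NA
avg_na_rm <- mean(a_with_na, na.rm = TRUE)   # same as avg_builtin
```

Both `avg_formula` and `avg_builtin` hold the same value, and `avg_na_rm` matches them because the missing entry is ignored rather than counted in `n`.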
Bill length and bill depth averages are summarized in the two tables below.
# Calculate the mean bill length for each species using dplyr
data %>%
  group_by(species) %>%
  summarize(mean_bill_length = round(mean(as.numeric(bill_length_mm), na.rm = TRUE), 2))
# Calculate the mean bill depth for each species using dplyr
data %>%
  group_by(species) %>%
  summarize(mean_bill_depth = round(mean(as.numeric(bill_depth_mm), na.rm = TRUE), 2))
# Store the mean bill length for Adelie penguins only
adelie_avg <-
  data %>%
  filter(species == "Adelie") %>%
  summarize(mean_bill_length = round(mean(as.numeric(bill_length_mm), na.rm = TRUE), 2))
# A tibble: 3 × 2
  species   mean_bill_length
  <chr>                <dbl>
1 Adelie                38.8
2 Chinstrap             48.8
3 Gentoo                47.5
# A tibble: 3 × 2
  species   mean_bill_depth
  <chr>               <dbl>
1 Adelie               18.3
2 Chinstrap            18.4
3 Gentoo               15.0
For instance, the average bill length for the Adelie species is 38.81 mm.
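The table shows 38.8 while the sentence reports 38.81. This is most likely a display difference: a printed tibble shows about three significant digits by default, whereas a value extracted from the tibble keeps the two decimals it was rounded to. The sketch below illustrates this with made-up numbers (the `toy` data and the names `toy_avg` and `toy_value` are assumptions for illustration, not part of the report).

```r
library(dplyr)

# Made-up values standing in for the Adelie rows (illustrative only)
toy <- tibble(
  species = c("Adelie", "Adelie"),
  bill_length_mm = c(38.80, 38.82)
)

# Same summary pattern as in the report: mean rounded to two decimals
toy_avg <- toy %>%
  summarize(mean_bill_length = round(mean(bill_length_mm), 2))

# Printing the tibble abbreviates the number for display
print(toy_avg)

# Extracting the column returns the stored value with both decimals
toy_value <- toy_avg %>% pull(mean_bill_length)
print(toy_value)
```

Pulling the value out with `pull()` (or `$`) is a convenient way to reuse a summary statistic inside a sentence of the report.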
Now, let’s check the relationship between bill depth and bill length within each species:
# Use the function in functions.R
p1 <- create_scatterplot(data, "Adelie", "#6689c6")
p2 <- create_scatterplot(data, "Chinstrap", "#e85252")
p3 <- create_scatterplot(data, "Gentoo", "#9a6fb0")
# Combine the three plots side by side with patchwork
p1 + p2 + p3
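The body of `create_scatterplot()` lives in functions.R and is not shown in this report. As a rough guide, a helper with the same three arguments might look like the sketch below. The name `create_scatterplot_sketch`, its argument names, and the toy data are all hypothetical; the actual function may differ.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the helper in functions.R (the real one is not shown here).
# It keeps one species and draws bill depth against bill length in a given color.
create_scatterplot_sketch <- function(data, selected_species, selected_color) {
  data %>%
    filter(species == selected_species) %>%
    ggplot(aes(x = bill_length_mm, y = bill_depth_mm)) +
    geom_point(color = selected_color) +
    labs(
      x = "Bill Length (mm)",
      y = "Bill Depth (mm)",
      title = selected_species
    ) +
    theme_classic()
}

# Toy data standing in for the cleaned penguin dataset (made-up values)
toy <- data.frame(
  species = c("Adelie", "Adelie", "Gentoo"),
  bill_length_mm = c(39.1, 40.3, 47.5),
  bill_depth_mm = c(18.7, 18.0, 15.0)
)

p <- create_scatterplot_sketch(toy, "Adelie", "#6689c6")
```

Wrapping the plot in a function keeps the three calls above short and guarantees that the panels share the same axes labels and theme; patchwork then combines the returned ggplot objects with `+`.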