“The Association Between Bill Size and Body Mass of Penguins: With differing effects depending on the Species, Diet, and Island”

This is my research to explore how bill size of penguins influences body mass in relation to their species, diet, and island. I hypothesize that bill size is positively associated with the body mass of penguins, with varying effects depending on species, diet, and island. The results of the statistical analysis support my hypothesis that bill size is positively associated with body mass, with varying effects depending on species, diet, and island. For example, according to the linear regression model, there is approximately 70.27% of the variability in body mass (Adjusted R-squared = 0.7027). The overall model was highly significant, F-statistic is 208.8, with 39 and 3390 degrees of freedom, and the result was highly significant at p-value < 2.2e-16. This indicates that the combination of predictor variables effectively accounted for a substantial proportion of the variance in body mass.The ANOVA results also confirmed the significance of those variables, including bill size (F = 1752.61, p < 2.2e-16), species (F = 43.50, p < 2.2e-16), and diet (F = 542.27, p < 2.2e-16), as well as interactions such as bill size:species (F = 5.30, p = 0.005) and bill size:diet (F = 9.50, p < 0.001).

# Load the penguins data
penguins <- read.csv("palmerpenguins.csv", header = TRUE)

# Check the data to ensure it was loaded correctly

# View the first and last rows of the dataset
head(penguins) # first 6 rows
##   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## 1  Adelie Biscoe           53.4          17.8               219        5687
## 2  Adelie Biscoe           49.3          18.1               245        6811
## 3  Adelie Biscoe           55.7          16.6               226        5388
## 4  Adelie Biscoe           38.0          15.6               221        6262
## 5  Adelie Biscoe           60.7          17.9               177        4811
## 6  Adelie Biscoe           35.7          16.8               194        5266
##      sex diet life_stage health_metrics year
## 1 female fish      adult     overweight 2021
## 2 female fish      adult     overweight 2021
## 3 female fish      adult     overweight 2021
## 4 female fish      adult     overweight 2021
## 5 female fish   juvenile     overweight 2021
## 6 female fish   juvenile     overweight 2021
tail(penguins) # last 6 rows
##      species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## 3425  Gentoo Biscoe           39.6          21.7               265        7230
## 3426  Gentoo Biscoe           44.0          20.4               252        6447
## 3427  Gentoo Biscoe           54.5          25.2               245        6872
## 3428  Gentoo Biscoe           51.4          20.4               258        7409
## 3429  Gentoo Biscoe           55.9          20.5               247        6491
## 3430  Gentoo Biscoe           43.9          22.9               206        6835
##       sex  diet life_stage health_metrics year
## 3425 male squid      adult     overweight 2025
## 3426 male squid      adult        healthy 2025
## 3427 male squid      adult        healthy 2025
## 3428 male squid      adult     overweight 2025
## 3429 male squid      adult        healthy 2025
## 3430 male squid      adult        healthy 2025
colnames(penguins) # view column names
##  [1] "species"           "island"            "bill_length_mm"   
##  [4] "bill_depth_mm"     "flipper_length_mm" "body_mass_g"      
##  [7] "sex"               "diet"              "life_stage"       
## [10] "health_metrics"    "year"
# Check the structure of the data
str(penguins)# structure, giving a concise summary of this information, showing data types and first few values of each variable
## 'data.frame':    3430 obs. of  11 variables:
##  $ species          : chr  "Adelie" "Adelie" "Adelie" "Adelie" ...
##  $ island           : chr  "Biscoe" "Biscoe" "Biscoe" "Biscoe" ...
##  $ bill_length_mm   : num  53.4 49.3 55.7 38 60.7 35.7 61 66.1 61.4 54.9 ...
##  $ bill_depth_mm    : num  17.8 18.1 16.6 15.6 17.9 16.8 20.8 20.8 19.9 22.3 ...
##  $ flipper_length_mm: num  219 245 226 221 177 194 211 246 270 230 ...
##  $ body_mass_g      : num  5687 6811 5388 6262 4811 ...
##  $ sex              : chr  "female" "female" "female" "female" ...
##  $ diet             : chr  "fish" "fish" "fish" "fish" ...
##  $ life_stage       : chr  "adult" "adult" "adult" "adult" ...
##  $ health_metrics   : chr  "overweight" "overweight" "overweight" "overweight" ...
##  $ year             : int  2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
# Load dataset 
penguins <- read.csv("palmerpenguins.csv", header = TRUE)

# Step 1: Check the structure of the data ####
str(penguins) # This will show the structure of the dataset, including variable names and data types
## 'data.frame':    3430 obs. of  11 variables:
##  $ species          : chr  "Adelie" "Adelie" "Adelie" "Adelie" ...
##  $ island           : chr  "Biscoe" "Biscoe" "Biscoe" "Biscoe" ...
##  $ bill_length_mm   : num  53.4 49.3 55.7 38 60.7 35.7 61 66.1 61.4 54.9 ...
##  $ bill_depth_mm    : num  17.8 18.1 16.6 15.6 17.9 16.8 20.8 20.8 19.9 22.3 ...
##  $ flipper_length_mm: num  219 245 226 221 177 194 211 246 270 230 ...
##  $ body_mass_g      : num  5687 6811 5388 6262 4811 ...
##  $ sex              : chr  "female" "female" "female" "female" ...
##  $ diet             : chr  "fish" "fish" "fish" "fish" ...
##  $ life_stage       : chr  "adult" "adult" "adult" "adult" ...
##  $ health_metrics   : chr  "overweight" "overweight" "overweight" "overweight" ...
##  $ year             : int  2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
# Step 2: Count the number of variables ####
num_variables <- ncol(penguins) # Count the number of columns (variables)
cat("The dataset has", num_variables, "variables.\n")
## The dataset has 11 variables.
# Step 3: Identify categorical variables
# Categorical variables are stored as characters or factors
categorical.vars <- names(penguins)[sapply(penguins, is.character) | sapply(penguins, is.factor)]
cat("Categorical variables:", categorical.vars, "\n")
## Categorical variables: species island sex diet life_stage health_metrics
# Step 4: Find the levels within each categorical variable 
# Use the 'unique()' function to display levels (categories) for each categorical variable
cat("\nLevels within each categorical variable:\n")
## 
## Levels within each categorical variable:
cat("---------------------------\n")
## ---------------------------
# Print the variable name
for (var in categorical.vars) {cat("Variable:", var, "\n")
  print(unique(penguins[[var]])) 
  # Show unique levels
cat("---------------------------\n")}
## Variable: species 
## [1] "Adelie"    "Chinstrap" "Gentoo"   
## ---------------------------
## Variable: island 
## [1] "Biscoe"    "Dream"     "Torgensen"
## ---------------------------
## Variable: sex 
## [1] "female" "male"  
## ---------------------------
## Variable: diet 
## [1] "fish"     "krill"    "parental" "squid"   
## ---------------------------
## Variable: life_stage 
## [1] "adult"    "juvenile" "chick"   
## ---------------------------
## Variable: health_metrics 
## [1] "overweight"  "healthy"     "underweight"
## ---------------------------
# Summary statistics for each variable
summary(penguins)
##    species             island          bill_length_mm  bill_depth_mm  
##  Length:3430        Length:3430        Min.   :13.60   Min.   : 9.10  
##  Class :character   Class :character   1st Qu.:28.90   1st Qu.:16.60  
##  Mode  :character   Mode  :character   Median :34.50   Median :18.40  
##                                        Mean   :38.53   Mean   :18.45  
##                                        3rd Qu.:46.60   3rd Qu.:20.30  
##                                        Max.   :88.20   Max.   :27.90  
##  flipper_length_mm  body_mass_g        sex                diet          
##  Min.   :140       Min.   : 2477   Length:3430        Length:3430       
##  1st Qu.:185       1st Qu.: 3844   Class :character   Class :character  
##  Median :203       Median : 4634   Mode  :character   Mode  :character  
##  Mean   :207       Mean   : 4835                                        
##  3rd Qu.:226       3rd Qu.: 5622                                        
##  Max.   :308       Max.   :10549                                        
##   life_stage        health_metrics          year     
##  Length:3430        Length:3430        Min.   :2021  
##  Class :character   Class :character   1st Qu.:2022  
##  Mode  :character   Mode  :character   Median :2024  
##                                        Mean   :2023  
##                                        3rd Qu.:2024  
##                                        Max.   :2025
# Check the levels of each factor variable
sapply(penguins, function(x) if (is.factor(x)) levels(x)) # lists levels for categorical variables
## $species
## NULL
## 
## $island
## NULL
## 
## $bill_length_mm
## NULL
## 
## $bill_depth_mm
## NULL
## 
## $flipper_length_mm
## NULL
## 
## $body_mass_g
## NULL
## 
## $sex
## NULL
## 
## $diet
## NULL
## 
## $life_stage
## NULL
## 
## $health_metrics
## NULL
## 
## $year
## NULL
# Data summaries 

# How many observations per group? to check if the number of observations is evenly distributed across different groups in the categorical variables
table(penguins$species) # counts of species
## 
##    Adelie Chinstrap    Gentoo 
##      1560       623      1247
table(penguins$island)  # counts of islands
## 
##    Biscoe     Dream Torgensen 
##      1785      1133       512
table(penguins$sex)
## 
## female   male 
##   1726   1704
table(penguins$diet)
## 
##     fish    krill parental    squid 
##      958     1419      860      193
table(penguins$life_stage)
## 
##    adult    chick juvenile 
##     1029      860     1541
table(penguins$health_metrics)
## 
##     healthy  overweight underweight 
##        1550        1167         713
table(penguins$year)
## 
## 2021 2022 2023 2024 2025 
##  356  658  695  877  844
library(dplyr)    # Load the package 
## Warning: package 'dplyr' was built under R version 4.4.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# Calculate mean difference  between numerical variables across groups
# Select the 4 numerical variables
numerical_data <- penguins %>%
  select(bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g)

# Calculate mean and standard deviation for each variable
summary_stats <- numerical_data %>%
  summarise(
    mean_bill_length = mean(bill_length_mm, na.rm = TRUE),
    sd_bill_length = sd(bill_length_mm, na.rm = TRUE),
    mean_bill_depth = mean(bill_depth_mm, na.rm = TRUE),
    sd_bill_depth = sd(bill_depth_mm, na.rm = TRUE),
    mean_flipper_length = mean(flipper_length_mm, na.rm = TRUE),
    sd_flipper_length = sd(flipper_length_mm, na.rm = TRUE),
    mean_body_mass = mean(body_mass_g, na.rm = TRUE),
    sd_body_mass = sd(body_mass_g, na.rm = TRUE)
  )

# View the calculated means and spreads
print(summary_stats)
##   mean_bill_length sd_bill_length mean_bill_depth sd_bill_depth
## 1         38.52983       13.17517        18.44714      2.774428
##   mean_flipper_length sd_flipper_length mean_body_mass sd_body_mass
## 1            207.0289          28.94476        4834.71     1311.091
# Check for normal distribution 

# Body Mass
m_body_mass <- mean(penguins$body_mass_g, na.rm=TRUE)  # Calculate mean
std_body_mass <- sd(penguins$body_mass_g, na.rm=TRUE)  # Calculate standard deviation

hist(penguins$body_mass_g, 
     density=20, 
     breaks=20, 
     prob=TRUE, 
     xlab="Body Mass (g)", 
     ylim=c(0, 0.0005), 
     main="Normal Curve over Body Mass Histogram")
curve(dnorm(x, mean=m_body_mass, sd=std_body_mass), 
      col="red", 
      lwd=2, 
      add=TRUE, 
      yaxt="n")

# Repeat for Bill Length
m_bill_length <- mean(penguins$bill_length_mm, na.rm=TRUE)
std_bill_length <- sd(penguins$bill_length_mm, na.rm=TRUE)

hist(penguins$bill_length_mm, 
     density=20, 
     breaks=20, 
     prob=TRUE, 
     xlab="Bill Length (mm)", 
     ylim=c(0, 0.06), 
     main="Normal Curve over Bill Length Histogram")
curve(dnorm(x, mean=m_bill_length, sd=std_bill_length), 
      col="red", 
      lwd=2, 
      add=TRUE, 
      yaxt="n")

# Repeat for Bill Depth
m_bill_depth <- mean(penguins$bill_depth_mm, na.rm=TRUE)
std_bill_depth <- sd(penguins$bill_depth_mm, na.rm=TRUE)

hist(penguins$bill_depth_mm, 
     density=20, 
     breaks=20, 
     prob=TRUE, 
     xlab="Bill Depth (mm)", 
     ylim=c(0, 0.12), 
     main="Normal Curve over Bill Depth Histogram")
curve(dnorm(x, mean=m_bill_depth, sd=std_bill_depth), 
      col="red", 
      lwd=2, 
      add=TRUE, 
      yaxt="n")

# Repeat for Flipper Length
m_flipper_length <- mean(penguins$flipper_length_mm, na.rm=TRUE)
std_flipper_length <- sd(penguins$flipper_length_mm, na.rm=TRUE)

hist(penguins$flipper_length_mm, 
     density=20, 
     breaks=20, 
     prob=TRUE, 
     xlab="Flipper Length (mm)", 
     ylim=c(0, 0.04), 
     main="Normal Curve over Flipper Length Histogram")
curve(dnorm(x, mean=m_flipper_length, sd=std_flipper_length), 
      col="red", 
      lwd=2, 
      add=TRUE, 
      yaxt="n")

# Quantile-quantile (QQ) plot for normality check

# Body Mass
qqnorm(penguins$body_mass_g, ylab="Standardized Residuals", 
       xlab="Normal Scores", main="QQ Plot for Body Mass", pch=16)
qqline(penguins$body_mass_g, col="red")

# Bill Length
qqnorm(penguins$bill_length_mm, ylab="Standardized Residuals", 
       xlab="Normal Scores", main="QQ Plot for Bill Length", pch=16)
qqline(penguins$bill_length_mm, col="red")

# Bill Depth
qqnorm(penguins$bill_depth_mm, ylab="Standardized Residuals", 
       xlab="Normal Scores", main="QQ Plot for Bill Depth", pch=16)
qqline(penguins$bill_depth_mm, col="red")

# Flipper Length
qqnorm(penguins$flipper_length_mm, ylab="Standardized Residuals", 
       xlab="Normal Scores", main="QQ Plot for Flipper Length", pch=16)
qqline(penguins$flipper_length_mm, col="red")


# create two simple figures based on bill dimensions and body mass
# Load necessary library
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.4.3

citation("ggplot2")
## To cite ggplot2 in publications, please use
## 
##   H. Wickham. ggplot2: Elegant Graphics for Data Analysis.
##   Springer-Verlag New York, 2016.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Book{,
##     author = {Hadley Wickham},
##     title = {ggplot2: Elegant Graphics for Data Analysis},
##     publisher = {Springer-Verlag New York},
##     year = {2016},
##     isbn = {978-3-319-24277-4},
##     url = {https://ggplot2.tidyverse.org},
##   }
citation("lme4")
## To cite lme4 in publications use:
## 
##   Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015).
##   Fitting Linear Mixed-Effects Models Using lme4. Journal of
##   Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Article{,
##     title = {Fitting Linear Mixed-Effects Models Using {lme4}},
##     author = {Douglas Bates and Martin M{\"a}chler and Ben Bolker and Steve Walker},
##     journal = {Journal of Statistical Software},
##     year = {2015},
##     volume = {67},
##     number = {1},
##     pages = {1--48},
##     doi = {10.18637/jss.v067.i01},
##   }
# Scatterplot for bill length vs body mass
ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g)) +
  geom_point(size = 1, alpha = 0.7, color = "blue") +
  geom_smooth(method = "lm", se = TRUE, color = "red") + # Adds linear regression line
  labs(title = "Bill Length vs Body Mass",
       x = "Bill Length (mm)",
       y = "Body Mass (g)") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

#Scatterplot for bill depth vs body mass
ggplot(penguins, aes(x = bill_depth_mm, y = body_mass_g)) +
  geom_point(size = 1, alpha = 0.7, color = "pink") +
  geom_smooth(method = "lm", se = TRUE, color = "orange") + # Adds linear regression line
  labs(title = "Bill Depth vs Body Mass",
       x = "Bill Depth (mm)",
       y = "Body Mass (g)") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

#Hypothesis: The bill size is positively correlated to the body mass of the penguins with differing effects depending on the species, diet and island.
#Dependent variable(y): Body mass (g)
#Independent variable(x): Bill size (calculated as bill length × bill dept)
#Control variables: Species, Diet, and Island

# Load necessary libraries
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ lubridate 1.9.3     ✔ tibble    3.2.1
## ✔ purrr     1.0.2     ✔ tidyr     1.3.1
## ✔ readr     2.1.5
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(car)
## Warning: package 'car' was built under R version 4.4.3
## Loading required package: carData
## 
## Attaching package: 'car'
## 
## The following object is masked from 'package:purrr':
## 
##     some
## 
## The following object is masked from 'package:dplyr':
## 
##     recode
# Step 1: Create a new variable for bill size (bill length x bill depth)
penguins$bill_size <- penguins$bill_length_mm * penguins$bill_depth_mm

# Step 2: Remove missing values
penguins <- na.omit(penguins)

penguins$species <- as.factor(penguins$species)
penguins$diet <- as.factor(penguins$diet)
penguins$island <- as.factor(penguins$island)

# Step 3: Fit a linear regression model
# Include bill size, species, diet, and island, with interactions
model <- lm(body_mass_g ~ bill_size * species * diet * island, data = penguins)

# View the summary of the model
summary(model)
## 
## Call:
## lm(formula = body_mass_g ~ bill_size * species * diet * island, 
##     data = penguins)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2273.2  -445.0   -48.7   410.4  4180.2 
## 
## Coefficients: (32 not defined because of singularities)
##                                                           Estimate Std. Error
## (Intercept)                                              4.199e+03  2.403e+02
## bill_size                                                1.557e+00  2.447e-01
## speciesChinstrap                                        -7.282e+02  5.048e+02
## speciesGentoo                                           -2.290e+02  2.840e+02
## dietkrill                                               -1.320e+03  3.288e+02
## dietparental                                            -1.247e+03  3.747e+02
## dietsquid                                                1.834e+02  7.322e+02
## islandDream                                              2.232e+02  4.599e+02
## islandTorgensen                                         -2.079e+02  3.557e+02
## bill_size:speciesChinstrap                               9.402e-01  4.835e-01
## bill_size:speciesGentoo                                  6.290e-01  2.730e-01
## bill_size:dietkrill                                      1.504e+00  4.966e-01
## bill_size:dietparental                                  -8.474e-01  6.404e-01
## bill_size:dietsquid                                     -6.342e-01  8.627e-01
## speciesChinstrap:dietkrill                               6.316e+02  5.659e+02
## speciesGentoo:dietkrill                                  5.931e+02  3.980e+02
## speciesChinstrap:dietparental                            2.862e+02  6.791e+02
## speciesGentoo:dietparental                               3.528e+02  4.598e+02
## speciesChinstrap:dietsquid                               7.892e+02  9.695e+02
## speciesGentoo:dietsquid                                 -3.529e+02  8.417e+02
## bill_size:islandDream                                   -3.528e-01  4.612e-01
## bill_size:islandTorgensen                                1.459e-01  3.661e-01
## speciesChinstrap:islandDream                                    NA         NA
## speciesGentoo:islandDream                                       NA         NA
## speciesChinstrap:islandTorgensen                                NA         NA
## speciesGentoo:islandTorgensen                                   NA         NA
## dietkrill:islandDream                                    4.807e+02  5.395e+02
## dietparental:islandDream                                 7.796e+01  6.263e+02
## dietsquid:islandDream                                   -1.156e+03  1.042e+03
## dietkrill:islandTorgensen                                8.048e+02  4.788e+02
## dietparental:islandTorgensen                             1.292e+02  5.402e+02
## dietsquid:islandTorgensen                               -7.821e+02  9.702e+02
## bill_size:speciesChinstrap:dietkrill                    -4.918e-01  6.776e-01
## bill_size:speciesGentoo:dietkrill                       -8.465e-01  5.707e-01
## bill_size:speciesChinstrap:dietparental                  1.674e-01  9.925e-01
## bill_size:speciesGentoo:dietparental                    -1.102e-01  7.313e-01
## bill_size:speciesChinstrap:dietsquid                    -7.323e-01  1.130e+00
## bill_size:speciesGentoo:dietsquid                        8.255e-01  9.612e-01
## bill_size:speciesChinstrap:islandDream                          NA         NA
## bill_size:speciesGentoo:islandDream                             NA         NA
## bill_size:speciesChinstrap:islandTorgensen                      NA         NA
## bill_size:speciesGentoo:islandTorgensen                         NA         NA
## bill_size:dietkrill:islandDream                         -1.168e+00  7.153e-01
## bill_size:dietparental:islandDream                      -2.074e-01  9.747e-01
## bill_size:dietsquid:islandDream                          1.562e+00  1.258e+00
## bill_size:dietkrill:islandTorgensen                     -1.363e+00  7.130e-01
## bill_size:dietparental:islandTorgensen                   1.613e-02  8.937e-01
## bill_size:dietsquid:islandTorgensen                      1.219e+00  1.179e+00
## speciesChinstrap:dietkrill:islandDream                          NA         NA
## speciesGentoo:dietkrill:islandDream                             NA         NA
## speciesChinstrap:dietparental:islandDream                       NA         NA
## speciesGentoo:dietparental:islandDream                          NA         NA
## speciesChinstrap:dietsquid:islandDream                          NA         NA
## speciesGentoo:dietsquid:islandDream                             NA         NA
## speciesChinstrap:dietkrill:islandTorgensen                      NA         NA
## speciesGentoo:dietkrill:islandTorgensen                         NA         NA
## speciesChinstrap:dietparental:islandTorgensen                   NA         NA
## speciesGentoo:dietparental:islandTorgensen                      NA         NA
## speciesChinstrap:dietsquid:islandTorgensen                      NA         NA
## speciesGentoo:dietsquid:islandTorgensen                         NA         NA
## bill_size:speciesChinstrap:dietkrill:islandDream                NA         NA
## bill_size:speciesGentoo:dietkrill:islandDream                   NA         NA
## bill_size:speciesChinstrap:dietparental:islandDream             NA         NA
## bill_size:speciesGentoo:dietparental:islandDream                NA         NA
## bill_size:speciesChinstrap:dietsquid:islandDream                NA         NA
## bill_size:speciesGentoo:dietsquid:islandDream                   NA         NA
## bill_size:speciesChinstrap:dietkrill:islandTorgensen            NA         NA
## bill_size:speciesGentoo:dietkrill:islandTorgensen               NA         NA
## bill_size:speciesChinstrap:dietparental:islandTorgensen         NA         NA
## bill_size:speciesGentoo:dietparental:islandTorgensen            NA         NA
## bill_size:speciesChinstrap:dietsquid:islandTorgensen            NA         NA
## bill_size:speciesGentoo:dietsquid:islandTorgensen               NA         NA
##                                                         t value Pr(>|t|)    
## (Intercept)                                              17.473  < 2e-16 ***
## bill_size                                                 6.365 2.22e-10 ***
## speciesChinstrap                                         -1.443 0.149206    
## speciesGentoo                                            -0.807 0.419950    
## dietkrill                                                -4.013 6.12e-05 ***
## dietparental                                             -3.328 0.000883 ***
## dietsquid                                                 0.250 0.802234    
## islandDream                                               0.485 0.627373    
## islandTorgensen                                          -0.584 0.558942    
## bill_size:speciesChinstrap                                1.945 0.051888 .  
## bill_size:speciesGentoo                                   2.304 0.021274 *  
## bill_size:dietkrill                                       3.028 0.002480 ** 
## bill_size:dietparental                                   -1.323 0.185800    
## bill_size:dietsquid                                      -0.735 0.462315    
## speciesChinstrap:dietkrill                                1.116 0.264466    
## speciesGentoo:dietkrill                                   1.490 0.136198    
## speciesChinstrap:dietparental                             0.421 0.673511    
## speciesGentoo:dietparental                                0.767 0.442952    
## speciesChinstrap:dietsquid                                0.814 0.415674    
## speciesGentoo:dietsquid                                  -0.419 0.675077    
## bill_size:islandDream                                    -0.765 0.444416    
## bill_size:islandTorgensen                                 0.399 0.690168    
## speciesChinstrap:islandDream                                 NA       NA    
## speciesGentoo:islandDream                                    NA       NA    
## speciesChinstrap:islandTorgensen                             NA       NA    
## speciesGentoo:islandTorgensen                                NA       NA    
## dietkrill:islandDream                                     0.891 0.373004    
## dietparental:islandDream                                  0.124 0.900941    
## dietsquid:islandDream                                    -1.109 0.267479    
## dietkrill:islandTorgensen                                 1.681 0.092871 .  
## dietparental:islandTorgensen                              0.239 0.810911    
## dietsquid:islandTorgensen                                -0.806 0.420263    
## bill_size:speciesChinstrap:dietkrill                     -0.726 0.467970    
## bill_size:speciesGentoo:dietkrill                        -1.483 0.138108    
## bill_size:speciesChinstrap:dietparental                   0.169 0.866040    
## bill_size:speciesGentoo:dietparental                     -0.151 0.880187    
## bill_size:speciesChinstrap:dietsquid                     -0.648 0.517006    
## bill_size:speciesGentoo:dietsquid                         0.859 0.390527    
## bill_size:speciesChinstrap:islandDream                       NA       NA    
## bill_size:speciesGentoo:islandDream                          NA       NA    
## bill_size:speciesChinstrap:islandTorgensen                   NA       NA    
## bill_size:speciesGentoo:islandTorgensen                      NA       NA    
## bill_size:dietkrill:islandDream                          -1.633 0.102492    
## bill_size:dietparental:islandDream                       -0.213 0.831537    
## bill_size:dietsquid:islandDream                           1.242 0.214261    
## bill_size:dietkrill:islandTorgensen                      -1.912 0.055972 .  
## bill_size:dietparental:islandTorgensen                    0.018 0.985604    
## bill_size:dietsquid:islandTorgensen                       1.034 0.301170    
## speciesChinstrap:dietkrill:islandDream                       NA       NA    
## speciesGentoo:dietkrill:islandDream                          NA       NA    
## speciesChinstrap:dietparental:islandDream                    NA       NA    
## speciesGentoo:dietparental:islandDream                       NA       NA    
## speciesChinstrap:dietsquid:islandDream                       NA       NA    
## speciesGentoo:dietsquid:islandDream                          NA       NA    
## speciesChinstrap:dietkrill:islandTorgensen                   NA       NA    
## speciesGentoo:dietkrill:islandTorgensen                      NA       NA    
## speciesChinstrap:dietparental:islandTorgensen                NA       NA    
## speciesGentoo:dietparental:islandTorgensen                   NA       NA    
## speciesChinstrap:dietsquid:islandTorgensen                   NA       NA    
## speciesGentoo:dietsquid:islandTorgensen                      NA       NA    
## bill_size:speciesChinstrap:dietkrill:islandDream             NA       NA    
## bill_size:speciesGentoo:dietkrill:islandDream                NA       NA    
## bill_size:speciesChinstrap:dietparental:islandDream          NA       NA    
## bill_size:speciesGentoo:dietparental:islandDream             NA       NA    
## bill_size:speciesChinstrap:dietsquid:islandDream             NA       NA    
## bill_size:speciesGentoo:dietsquid:islandDream                NA       NA    
## bill_size:speciesChinstrap:dietkrill:islandTorgensen         NA       NA    
## bill_size:speciesGentoo:dietkrill:islandTorgensen            NA       NA    
## bill_size:speciesChinstrap:dietparental:islandTorgensen      NA       NA    
## bill_size:speciesGentoo:dietparental:islandTorgensen         NA       NA    
## bill_size:speciesChinstrap:dietsquid:islandTorgensen         NA       NA    
## bill_size:speciesGentoo:dietsquid:islandTorgensen            NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 714.9 on 3390 degrees of freedom
## Multiple R-squared:  0.7061, Adjusted R-squared:  0.7027 
## F-statistic: 208.8 on 39 and 3390 DF,  p-value: < 2.2e-16
# Step 5: Diagnostic plots for regression
par(mfrow = c(2, 2))  # Split plot into 2x2 grid
plot(model)  # Check residuals, normality, homoscedasticity

# Step 6: Visualize the relationship
# Scatter plot of bill size vs body mass, colored by species and diet, faceted by island
ggplot(penguins, aes(x = bill_size, y = body_mass_g, color = species, shape = diet)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~island) +  # Separate plots by island
  labs(title = "Relationship Between Bill Size and Body Mass Across Islands",
       x = "Bill Size (Bill Length x Bill Depth)",
       y = "Body Mass (g)") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

library(dplyr)


# Check the relationship between species, bill length, bill depth, and body mass

library(dplyr)

# Check the relationship between species, bill length, bill depth, and body mass
penguins %>%
  group_by(species) %>%
  summarize(mean_bill_length = mean(bill_length_mm, na.rm = TRUE),
            mean_bill_depth = mean(bill_depth_mm, na.rm = TRUE),
            mean_body_mass = mean(body_mass_g, na.rm = TRUE))
## # A tibble: 3 × 4
##   species   mean_bill_length mean_bill_depth mean_body_mass
##   <fct>                <dbl>           <dbl>          <dbl>
## 1 Adelie                35.4            17.5          4445.
## 2 Chinstrap             35.0            18.4          4603.
## 3 Gentoo                44.2            19.7          5438.
# Step 7: Hypothesis testing with the model
library(car)
Anova(model)
## Note: model has aliased coefficients
##       sums of squares computed by model comparison
## Anova Table (Type II tests)
## 
## Response: body_mass_g
##                                   Sum Sq   Df  F value    Pr(>F)    
## bill_size                      384645096    1 752.6134 < 2.2e-16 ***
## species                         44468270    2  43.5043 < 2.2e-16 ***
## diet                           831423576    3 542.2666 < 2.2e-16 ***
## island                            508055    2   0.4970  0.608372    
## bill_size:species                5414758    2   5.2974  0.005046 ** 
## bill_size:diet                  14559888    3   9.4962 3.032e-06 ***
## species:diet                     3416891    6   1.1143  0.351161    
## bill_size:island                 2166814    2   2.1198  0.120210    
## species:island                              0                       
## diet:island                      1425614    6   0.4649  0.834715    
## bill_size:species:diet           2211472    6   0.7212  0.632536    
## bill_size:species:island                    0                       
## bill_size:diet:island            3766734    6   1.2284  0.288279    
## species:diet:island                         0                       
## bill_size:species:diet:island               0                       
## Residuals                     1732558656 3390                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1