Observations and measurements of species within their natural habitats show how animal populations are distributed, how they physically develop, and how they change over time. This data is critical for assessing the baseline health of an ecosystem before environmental shifts occur. The Palmer Archipelago in Antarctica is home to a diverse mix of marine life, including several distinct species of penguins. Monitoring these colonies gives valuable insight into population dynamics, feeding capabilities, breeding behaviors, and overall environmental stability.
This report analyzes data from the Palmer Penguins data set, collected by Dr. Kristen Gorman and the Palmer Station Long Term Ecological Research Program. The data set contains size measurements and observations for three penguin species (Adelie, Chinstrap, and Gentoo) across three islands (Biscoe, Dream, and Torgersen) from 2007 to 2009.
By analyzing this data, this report aims to uncover trends in geographic segregation, evaluate the yearly stability of the colonies, assess breeding potential through gender ratios, and analyze physical scaling correlations. This analysis offers a comprehensive view of the ecological structure and health of these Antarctic penguin colonies.
We compute summary statistics for the primary numeric variables to gain a baseline understanding of the data set. The summary statistics highlight the central tendencies and range of the penguins’ physical traits across the entire observed population.
After removing rows with missing (NA) values from the data set, we convert the measurement columns to numeric and generate summary statistics for each variable.
# Packages used throughout the report
library(dplyr)
library(ggplot2)
# Assumption: `penguins` comes from the palmerpenguins package, with
# incomplete rows dropped as described in the text
library(palmerpenguins)
penguins <- na.omit(penguins)
# Creating a summary dataframe for numeric columns
numeric_columns <- penguins %>%
  mutate(across(c(bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g), as.numeric)) %>%
  select(bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g)
# Generating Filtered Summary
numeric_summary <- data.frame(
  Min    = sapply(numeric_columns, min, na.rm = TRUE),
  Mean   = round(sapply(numeric_columns, mean, na.rm = TRUE), 2),
  Median = sapply(numeric_columns, median, na.rm = TRUE),
  Max    = sapply(numeric_columns, max, na.rm = TRUE))
numeric_summary
##                      Min    Mean Median    Max
## bill_length_mm      32.1   43.99   44.5   59.6
## bill_depth_mm       13.1   17.16   17.3   21.5
## flipper_length_mm  172.0  200.97  197.0  231.0
## body_mass_g       2700.0 4207.06 4050.0 6300.0
While the summary table above provides a complete profile of the numeric variables in the data set, including bill length and depth for ecological context, this report narrows its focus to body mass and flipper length. Bill measurements are excellent for micro-level species identification, but mass and flipper length govern macro-level survival mechanics, such as how these animals move, hunt, and scale physically across different island environments. By isolating these two variables, the following visualizations provide a sharper look at the proportional growth and swimming capability of the Palmer Archipelago colonies.
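The mean and median are closely aligned for most variables, indicating that the physical measurements are relatively symmetrical across the sampled population without extreme outliers skewing the data. Body mass and flipper length are partial exceptions: their means sit slightly above their medians (4,207 g vs. 4,050 g and 201.0 mm vs. 197.0 mm), which points to a mild right skew.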
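One quick way to check how closely the mean and median agree is to express the gap between them as a percentage of the median. The sketch below re-enters the values from the summary table above; the 5% cutoff is an illustrative assumption, not a formal test of skewness.

```r
# Values copied from the summary table above
summary_tbl <- data.frame(
  variable = c("bill_length_mm", "bill_depth_mm",
               "flipper_length_mm", "body_mass_g"),
  mean     = c(43.99, 17.16, 200.97, 4207.06),
  median   = c(44.5, 17.3, 197.0, 4050.0)
)

# Gap between mean and median as a percentage of the median;
# a positive gap means the mean sits above the median
summary_tbl$gap_pct <- round(
  100 * (summary_tbl$mean - summary_tbl$median) / summary_tbl$median, 2)

# Illustrative 5% cutoff (an assumption, not a formal skewness test)
summary_tbl$roughly_symmetric <- abs(summary_tbl$gap_pct) < 5

print(summary_tbl)
```

Under this rough cutoff all four variables count as roughly symmetric, with body mass showing the largest gap at just under 4% above the median.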
The body mass of the observed penguins ranges widely from 2,700 g to 6,300 g, with an average of about 4,207 g. This wide range, in which the heaviest penguins are more than double the weight of the lightest, hints at significant physical differences either between the genders or among the three distinct species.
Flipper lengths range from 172 mm to 231 mm, with an average of about 201.0 mm. As with the penguins’ body mass, this spread suggests certain groups within the data set have distinct physical advantages for swimming and diving.
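The relative spread of the two variables can be compared directly from the summary table. This small sketch re-enters the minimum and maximum values reported above and computes the max-to-min ratio for each; the data frame and column names are introduced here purely for illustration.

```r
# Minimum and maximum values copied from the summary table above
ranges <- data.frame(
  variable = c("body_mass_g", "flipper_length_mm"),
  min      = c(2700, 172),
  max      = c(6300, 231)
)

# How many times larger the maximum is than the minimum
ranges$max_to_min <- round(ranges$max / ranges$min, 2)

print(ranges)
```

Body mass spans a ratio of about 2.33, while flipper length spans about 1.34, so mass varies far more in relative terms than flipper length does.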
Graphs and plots give a clearer view of the ecological data, highlighting trends in population distribution, physical size, yearly stability, and gender ratios. These visualizations help identify patterns and compare the distinct species within the Palmer Archipelago.
Penguin populations can vary dramatically depending on the specific environmental conditions of their habitats. This stacked bar chart displays how the three species are distributed across Biscoe, Dream, and Torgersen islands.
Island_pop_chart <- ggplot(penguins,
                           aes(x = reorder(island, island, function(x) length(x)),
                               fill = species)) +
  geom_bar(position = "stack") +
  coord_flip() +
  geom_text(stat = "count",
            aes(label = after_stat(count)),
            position = position_stack(vjust = 0.5),
            size = 4,
            color = "black") +
  scale_fill_manual(values = c("orange", "lightblue", "pink")) +
  labs(title = "Penguin Species Count by Island",
       x = "Island",
       y = "Count of Penguins",
       fill = "Species") +
  theme(plot.title = element_text(hjust = 0.5))
Island_pop_chart
This chart reveals clear geographic segregation among the species. Biscoe Island has the largest number of observed penguins and is the only island where Gentoo penguins were recorded. Adelie penguins maintain populations on all three islands, while Chinstraps are found only on Dream Island.
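The same pattern can be read from a plain cross-tabulation rather than from the chart. The sketch below uses a handful of toy records that mimic the pattern described above; the rows are hypothetical and are not the real observations. With the real data, the same two `table()` and `rowSums()` calls could be applied to the `species` and `island` columns of `penguins`.

```r
# Toy records (hypothetical, for illustration only; not the real observations)
toy <- data.frame(
  species = c("Adelie", "Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap"),
  island  = c("Biscoe", "Dream", "Torgersen", "Biscoe", "Biscoe", "Dream")
)

# Rows are species, columns are islands, cells are observation counts
presence <- table(toy$species, toy$island)

# Number of islands on which each species appears at least once
islands_per_species <- rowSums(presence > 0)

# Species confined to a single island
single_island <- names(islands_per_species)[islands_per_species == 1]

print(presence)
print(single_island)
```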
Ecological stability is measured over time. This dual-axis chart tracks both the total population count and average body mass across the observation years.
# Creating a DF of Yearly Average Mass of Penguins
year_stats <- penguins %>%
  filter(!is.na(body_mass_g)) %>%
  group_by(year) %>%
  summarise(population = n(),
            avg_mass = mean(body_mass_g))
# Scale factor that maps body mass (g) onto the population-count axis
Scale <- 50
# Creating a Dual Axis Bar Chart
Yearly_trends_chart <- ggplot() +
  geom_bar(data = penguins, aes(x = factor(year), fill = species),
           position = "stack", width = 0.6) +
  # `linewidth` replaces `size` for lines in ggplot2 >= 3.4.0
  geom_line(data = year_stats, aes(x = factor(year), y = avg_mass / Scale, group = 1,
                                   color = "Average Body Mass"), linewidth = 1.2, linetype = "dashed") +
  geom_point(data = year_stats, aes(x = factor(year), y = avg_mass / Scale),
             size = 4, color = "black") +
  geom_text(data = year_stats, aes(x = factor(year), y = avg_mass / Scale,
                                   label = paste0(scales::comma(round(avg_mass, 0)), "g")),
            vjust = 2) +
  scale_y_continuous(name = "Penguin Population",
                     sec.axis = sec_axis(~ . * Scale, name = "Average Body Mass (g)", labels = scales::comma)) +
  scale_fill_manual(values = c("darkorange", "purple", "lightblue")) +
  labs(title = "Yearly Trends: Population & Body Mass", x = "Year", fill = "Species") +
  scale_color_manual(name = "", values = c("Average Body Mass" = "black")) +
  theme(
    legend.position = "bottom",
    axis.title.y.right = element_text(margin = margin(l = 20)),
    plot.title = element_text(hjust = 0.5))
Yearly_trends_chart
The data suggests good ecological stability over the observation period. The population counts remained very consistent (indicated by the height of the bars, read against the left y-axis), and the average body mass fluctuated only slightly (indicated by the dashed line and points, read against the right y-axis). This is consistent with stable food availability and overall colony health throughout the observation years.
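The dual-axis chart depends on a single conversion: body mass is divided by `Scale` so that it fits on the count axis, and the secondary axis multiplies by the same factor so its labels read in grams again. The sketch below walks through that round trip and adds a year-over-year percentage change as one simple way to put a number on "slight" fluctuation. The yearly values are hypothetical placeholders, not the report's results, and `toy_stats` and `scale_factor` are names introduced here for illustration.

```r
# Hypothetical yearly summary (illustrative values, not the real results)
toy_stats <- data.frame(
  year       = c(2007, 2008, 2009),
  population = c(100, 110, 120),
  avg_mass   = c(4100, 4300, 4200)
)

scale_factor <- 50  # same value as `Scale` in the chart code

# Forward transform: grams -> position on the primary (count) axis
toy_stats$plotted_y <- toy_stats$avg_mass / scale_factor

# Inverse transform applied by sec_axis(): axis position -> grams
toy_stats$recovered_mass <- toy_stats$plotted_y * scale_factor

# Year-over-year percentage change in average mass (NA for the first year)
toy_stats$mass_change_pct <- c(
  NA,
  round(100 * diff(toy_stats$avg_mass) / head(toy_stats$avg_mass, -1), 1)
)

print(toy_stats)
```

A factor of 50 works here because it brings masses of roughly 4,000 g down to values near 80, which sit within the range of the yearly counts. A different data set would likely need a different factor.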
A healthy population relies on an even gender ratio. This diverging bar chart explores the balance of male and female penguins across the different islands and species.
# Creating DF to Quantify Gender Split per Species per Island
Gender_Split <- penguins %>%
  filter(!is.na(sex)) %>%
  count(island, species, sex)
# Creating Graph
Population_Balance_Chart <-
  ggplot(Gender_Split, aes(x = species, fill = sex)) +
  geom_col(data = subset(Gender_Split, sex == "female"),
           aes(y = n), width = 0.7) +
  geom_col(data = subset(Gender_Split, sex == "male"),
           aes(y = -n), width = 0.7) +
  geom_text(data = subset(Gender_Split, sex == "female"),
            aes(y = n, label = n), hjust = 1.5, size = 3) +
  geom_text(data = subset(Gender_Split, sex == "male"),
            aes(y = -n, label = n), hjust = -0.5, size = 3) +
  facet_wrap(~island) +
  coord_flip() +
  scale_y_continuous(labels = abs, limits = c(-70, 70)) +
  scale_fill_manual(values = c("female" = "pink", "male" = "lightblue")) +
  labs(title = "Population Balance by Island", subtitle = "Island",
       x = "Species", y = "Count", fill = "Sex") +
  theme(plot.title = element_text(hjust = 0.5),
        plot.subtitle = element_text(hjust = 0.5))
Population_Balance_Chart
Across nearly all islands and species, the bars are close to symmetrical. The Gentoo colony on Biscoe Island is the only group with an imbalance greater than one, with three more males than females. This suggests strong reproductive potential across the Archipelago, with no major indicators of threats to future breeding seasons.
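The imbalance quoted above is simply the male count minus the female count within each island and species. A minimal sketch of that calculation follows, using hypothetical counts laid out in the same shape as `Gender_Split`. The island and species labels are placeholders, and `toy_split` and `balance` are names introduced here for illustration.

```r
# Hypothetical counts in the same shape as `Gender_Split` (illustrative only)
toy_split <- data.frame(
  island  = c("IslandA", "IslandA", "IslandB", "IslandB"),
  species = c("SpeciesX", "SpeciesX", "SpeciesY", "SpeciesY"),
  sex     = c("female", "male", "female", "male"),
  n       = c(20, 23, 15, 15)
)

# Split by sex, then join so each island/species pair sits on one row
females <- subset(toy_split, sex == "female", select = c(island, species, n))
males   <- subset(toy_split, sex == "male",   select = c(island, species, n))
balance <- merge(females, males, by = c("island", "species"),
                 suffixes = c("_female", "_male"))

# Positive values mean more males than females
balance$male_minus_female <- balance$n_male - balance$n_female

print(balance)
```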
To survive in the harsh Antarctic waters, penguins rely on their flippers when hunting for fish. By isolating this physical trait, we can determine which species and gender has the longest average flipper length, used here as an indicator of swimming strength and hunting range.
# Heat Map Data Prep
heat_data <- penguins %>%
  filter(!is.na(flipper_length_mm) & !is.na(sex)) %>%
  group_by(species, sex) %>%
  summarise(avg_flipper = mean(flipper_length_mm), .groups = "drop")
# Creating Heat Map
Strong_Swim_Heatmap <- ggplot(heat_data, aes(x = sex, y = species, fill = avg_flipper)) +
  # `linewidth` replaces `size` for tile borders in ggplot2 >= 3.4.0
  geom_tile(color = "white", linewidth = 1) +
  geom_text(aes(label = paste0(round(avg_flipper, 1), " mm")),
            color = "white", fontface = "bold", size = 5) +
  scale_fill_gradient(low = "lightblue", high = "midnightblue",
                      name = "Avg Flipper\nLength (mm)") +
  labs(title = "Swimming Potential Heat Map",
       x = "Sex",
       y = "Species") +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"),
        plot.subtitle = element_text(hjust = 0.5),
        panel.grid = element_blank())
Strong_Swim_Heatmap
The visualization shows two distinct biological patterns within the colonies. First, there is an evident gap in average flipper length between male and female penguins in every species. However, the second and more striking finding is that species determines swimming potential much more than gender does. The darkest squares are found entirely in the Gentoo row. On average, female Gentoos have longer flippers than the males of both the Adelie and Chinstrap species. This suggests the Gentoo is physically adapted for different, more demanding aquatic behavior.
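The comparison between the sex gap and the species gap can be made explicit by arranging the group means in a small matrix and subtracting along each dimension. The sketch below does this with hypothetical measurements for two placeholder species; the values are illustrative only and do not reproduce the heat map.

```r
# Hypothetical measurements (illustrative only; not the real data)
toy <- data.frame(
  species = rep(c("Large", "Small"), each = 4),
  sex     = rep(c("female", "female", "male", "male"), times = 2),
  flipper_length_mm = c(210, 214, 220, 222, 186, 188, 192, 194)
)

# Mean flipper length per species (rows) and sex (columns)
means <- tapply(toy$flipper_length_mm, list(toy$species, toy$sex), mean)

# Gap between the sexes within each species
sex_gap <- means[, "male"] - means[, "female"]

# Gap between the two species within each sex
species_gap <- means["Large", ] - means["Small", ]

print(means)
print(sex_gap)
print(species_gap)
```

When every entry of `species_gap` exceeds every entry of `sex_gap`, as in this toy case, species separates the groups more than sex does.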
To test the relationship between flipper length and overall body size, this scatter plot places every penguin’s flipper length against its body mass and adds a fitted linear trend line for each species.
# Creating New Scatter Plot for Correlation: Flipper Length vs Body Mass
Correlation_Scatter_Plot <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point(alpha = 0.5, size = 3) +
  # `linewidth` replaces `size` for lines in ggplot2 >= 3.4.0
  geom_smooth(method = "lm", se = FALSE, linewidth = 1.5) +
  scale_color_manual(values = c("darkorange", "purple", "limegreen")) +
  labs(title = "Correlation: Flipper Length vs. Body Mass",
       x = "Flipper Length (mm)",
       y = "Body Mass (g)",
       color = "Species") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5),
        plot.subtitle = element_text(hjust = 0.5))
Correlation_Scatter_Plot
The scatter plot shows a strong positive relationship between flipper length and body mass. The regression lines for all three species slope upward, indicating that heavier penguins tend to have longer flippers. This relationship holds for every species in the Archipelago, suggesting that physical dimensions scale predictably as penguins grow.
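The plot shows the direction of the relationship, but its strength can also be stated numerically with a Pearson correlation coefficient. The sketch below computes it with base R's `cor()`, both pooled and per species, on hypothetical measurements for two placeholder species. The values are illustrative only, and the same calls could be applied to the `flipper_length_mm` and `body_mass_g` columns of `penguins`.

```r
# Hypothetical measurements (illustrative only; not the real data)
toy <- data.frame(
  species           = rep(c("SpeciesX", "SpeciesY"), each = 4),
  flipper_length_mm = c(180, 185, 190, 195, 210, 215, 220, 225),
  body_mass_g       = c(3400, 3500, 3700, 3800, 4900, 5000, 5300, 5400)
)

# Pearson correlation across all individuals
overall_r <- cor(toy$flipper_length_mm, toy$body_mass_g)

# Pearson correlation within each species
by_species_r <- sapply(split(toy, toy$species), function(d) {
  cor(d$flipper_length_mm, d$body_mass_g)
})

print(round(overall_r, 3))
print(round(by_species_r, 3))
```

Reporting both is worthwhile because a pooled coefficient can be inflated by size differences between species, even when the relationship within each species is weaker.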
This report analyzed the ecological structure and physical traits of the Palmer Penguins data set. By examining geographic distribution, yearly trends, population balance, and physical correlations, we identified distinct biological and spatial patterns among the Adelie, Chinstrap, and Gentoo populations.
The data revealed that these species are highly segregated by geography: the largest penguins (Gentoo) are found only on Biscoe Island, Chinstraps are isolated to Dream Island, and Adelies inhabit all three islands. Despite these differences, the colonies as a whole show strong signs of stability. Over the observation years, population counts and average body mass remained consistent, suggesting a stable environment. The symmetry in gender ratios across all islands also indicates high breeding potential.
Additionally, the analysis explored physical correlations, showing that body dimensions scale together. Heavier penguins consistently have longer flippers, highlighting a direct correlation between body mass and aquatic physical traits.
Ultimately, this analysis provides valuable descriptive insights into the Palmer Archipelago ecosystem, underscoring the health, stability, and distinct biological diversity of its penguin colonies throughout the observed years.