library(tidyverse)
library(conflicted)
library(readr)
library(tidyr)
library(dplyr)
library(patchwork)
library(modeest)
library(DescTools)
library(knitr)
A study was conducted to evaluate how different dietary treatments fed to sows influenced litter weight gain and the sows' reproductive performance at the following farrowing cycle (Henman et al., 2023). I chose this dataset because the relationship between the average piglet weight variables and the treatment variable was interesting. Examining the relationship between these variables will provide insight into which treatment is associated with a heavier litter.
I wish to investigate whether the diet fed to the sow affects the weight of her piglets over time. I am also interested in which feed produces the heaviest litter and how that compares with a litter from a sow on a normal diet.
Therefore I would like to know:
To address these questions, I will analyse the following variables:
Treatment = Categorical (Nominal)
Average weight kg = Numerical
Average weight D3 kg = Numerical
Average weight D14 = Numerical
Average wean weight = Numerical
The treatment variable will help me to compare the treatments that each sow receives and therefore connect that with how much weight a piglet gains.
A = Normal diet (control)
B = Additional 10 kg of protein feed
C = Additional 20 kg of protein feed
The average weight variable is the initial weight of the piglets at birth.
The average weight D3 and D14 variables allow me to see how the litter weight has increased over time.
The average wean weight is the last variable provided, showing the final recorded weight.
sows_data <- read_csv('/Users/sabrinagarcia/Desktop/envx/assignment/Describing Data Report/sows.csv')
sows_data$Treatment <- factor(sows_data$Treatment)
ggplot(sows_data, aes(x = Average_weight_kg)) +
geom_histogram (bins = 25 , fill = "rosybrown2" , color = "rosybrown" ) +
labs(
title = "Average Litter Weight Upon Birth",
x = "Weight (Kg)",
y = "Litter Count"
)
summary(sows_data$Average_weight_kg)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.9533  1.4138  1.5375  1.5662  1.6917  2.2750
mfv(sows_data$Average_weight_kg)
[1] 1.4 1.5 1.6
This graph shows that weights range from about 0.95kg to 2.28kg, with outliers present at both ends of the distribution. The mean (1.5662kg) is slightly higher than the median (1.5375kg), indicating that the data has a slight positive skew. Furthermore, the average weight is within the expected range for piglet birth weight (Pork Checkoff, n.d.). The overall range for this dataset is 1.3217kg.
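The range and skew statements above can be checked directly from the printed summary. This is a minimal sketch that uses only the values shown in the `summary()` output for `Average_weight_kg`; the vector name `birth_summary` is introduced here purely for illustration.

```r
# Values copied from the summary() output for Average_weight_kg above
birth_summary <- c(min = 0.9533, q1 = 1.4138, median = 1.5375,
                   mean = 1.5662, q3 = 1.6917, max = 2.2750)

# Range = maximum - minimum
birth_range <- unname(birth_summary["max"] - birth_summary["min"])

# A positive mean - median gap is consistent with a positive (right) skew
mean_minus_median <- unname(birth_summary["mean"] - birth_summary["median"])

# Interquartile range = Q3 - Q1
birth_iqr <- unname(birth_summary["q3"] - birth_summary["q1"])

round(c(range = birth_range,
        mean_minus_median = mean_minus_median,
        iqr = birth_iqr), 4)
```

The gap between the mean and the median is small compared with the IQR, which fits the description of the skew as slight.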
p1 <- ggplot(sows_data, aes(x = Average_weight_D3_kg)) +
geom_histogram (bins = 25 , fill = "rosybrown2" , color = "rosybrown" ) +
labs(
title = "Average Litter Weight, 3 Days After Birth",
x = "Weight (Kg)",
y = "Litter Count"
)
p2 <- ggplot(sows_data, aes(x = Average_weight_D14_kg)) +
geom_histogram (bins = 25 , fill = "rosybrown2" , color = "rosybrown" ) +
labs(
title = "Average Litter Weight, 14 Days After Birth",
x = "Weight (Kg)",
y = "Litter Count"
)
p1 + p2
# Get summary statistics
summary_stats <- summary(sows_data$Average_weight_D3_kg)
mfv_value <- mfv(sows_data$Average_weight_D3_kg, na_rm = TRUE)
# Extract only the numeric values (removes NA count if present)
summary_vals <- as.numeric(summary_stats[1:6])
# Ensure mfv_value is a single number
mfv_single <- as.numeric(mfv_value)[1]
# Create a data frame for the table
results_table <- data.frame(
Statistic = c("Min", "Q1 (25th percentile)", "Median", "Mean",
"Q3 (75th percentile)", "Max", "Most Frequent Value (MFV)"),
Value = c(summary_vals, mfv_single)
)
# Display as a nice table
kable(results_table, digits = 2, caption = "Average Weight D3 (kg) - Summary Statistics")
| Statistic | Value |
|---|---|
| Min | 1.11 |
| Q1 (25th percentile) | 1.84 |
| Median | 2.00 |
| Mean | 2.00 |
| Q3 (75th percentile) | 2.21 |
| Max | 2.94 |
| Most Frequent Value (MFV) | 1.70 |
In Figure 2, the data is non-symmetric and appears to have multiple modes, indicated by the several peaks; the most frequent weights fall between 1.7kg and 1.9kg. Several outliers are visible on the right-hand side of the graph. The mean weight has increased from 1.57kg at birth to 2.00kg, and both the minimum and maximum values have increased. The range for this dataset is 1.833kg.
summary(sows_data$Average_weight_D14_kg)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  2.900   4.418   4.941   4.888   5.367   6.425
mfv(sows_data$Average_weight_D14_kg)
[1] 5.06
Figure 3 has a negative skew, meaning the median is greater than the mean; the summary supports this, with a median of 4.941kg and a mean of 4.888kg. The peak in the graph suggests that the most common weight is around 5kg. Compared with day 3, the maximum weight has increased substantially (by about 3.5kg), while the minimum weight has increased by only about 1.8kg. The range for this dataset is 3.525kg.
ggplot(sows_data, aes(x = Average_wean_weight_kg)) +
geom_histogram (bins = 25 , fill = "rosybrown2" , color = "rosybrown" ) +
labs(
title = "Average Litter weight, when Weaned",
x = "Weight (Kg)",
y = "Litter Count"
)
summary(sows_data$Average_wean_weight_kg)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  4.917   7.378   7.893   7.995   8.647  11.475
mfv(sows_data$Average_wean_weight_kg)
[1] 7.400000 7.933333
The distribution is positively skewed, with the mean (7.995kg) higher than the median (7.893kg). The two peaks in the graph, together with the two modes, suggest that the most frequent weights are 7.4kg and 7.933kg. Compared with day 14, the minimum value increased by 2.017kg, while the maximum increased by 5.05kg. The range for this dataset is 6.558kg.
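The changes in the minimum and maximum between time points can be worked out from the summaries reported so far. The sketch below assumes only the printed values; the day 3 figures are the rounded ones from the table, so results involving day 3 are approximate. The vector names are illustrative.

```r
# Minimum and maximum litter-average weights (kg), copied from the outputs above.
# Day 3 values are the 2-decimal figures from the kable table, so they are rounded.
mins <- c(birth = 0.9533, d3 = 1.11, d14 = 2.900, wean = 4.917)
maxs <- c(birth = 2.2750, d3 = 2.94, d14 = 6.425, wean = 11.475)

# diff() gives the change from each time point to the next one
min_change <- diff(mins)
max_change <- diff(maxs)

# Range (max - min) at each time point
ranges <- maxs - mins

round(rbind(min_change, max_change), 3)
round(ranges, 3)
```

At every step the maximum rises by more than the minimum, which is why the range widens from birth to weaning.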
The box plots below compare the average weights of piglets across the different diets. Notably, piglets from sows on diet B have a larger IQR at birth, indicating higher variability within that group. Furthermore, across all graphs, treatment A had the smallest IQR of the three treatments, as well as the most outliers.
p1 <- ggplot(sows_data, aes(x = Treatment, y = Average_weight_kg)) +
geom_boxplot( fill = "rosybrown1" , color = "rosybrown3") +
labs(
title = "Avg. Birthweight of Piglets",
x = "Treatment Types",
y = "Weight (Kg)"
)
#| label: box-avg-weight-d3
p2 <- ggplot(sows_data, aes(x = Treatment, y = Average_weight_D3_kg)) +
geom_boxplot( fill = "rosybrown2" , color = "rosybrown") +
labs(
title = "Avg. Weight of Piglets, D3",
x = "Treatment",
y = "Weight (Kg)"
)
#| label: box-avg-weight-d14
p3 <- ggplot(sows_data, aes(x = Treatment, y = Average_weight_D14_kg)) +
geom_boxplot( fill = "rosybrown3" , color = "rosybrown4") +
labs(
title = "Avg. Weight of Piglets, D14",
x = "Treatment",
y = "Weight (Kg)"
)
#| label: box-avg-weight-weaned
p4 <- ggplot(sows_data, aes(x = Treatment, y = Average_wean_weight_kg)) +
geom_boxplot( fill = "rosybrown" , color = "rosybrown4") +
labs(
title = "Avg. Weight of Piglets, Weaned",
x = "Treatment Types",
y = "Weight (Kg)"
)
(p1 + p2) / (p3 + p4)
sows_long <- sows_data |>
select(Treatment,
Average_weight_kg,
Average_weight_D3_kg,
Average_weight_D14_kg,
Average_wean_weight_kg) |>
pivot_longer(
cols = -Treatment,
names_to = "Timepoint",
values_to = "Weight_kg"
) |>
mutate(
Day = case_when(
Timepoint == "Average_weight_kg" ~ 0,
Timepoint == "Average_weight_D3_kg" ~ 3,
Timepoint == "Average_weight_D14_kg" ~ 14,
Timepoint == "Average_wean_weight_kg" ~ 28
),
Treatment = factor(Treatment, levels = c("A", "B", "C"))
)
p <- ggplot(sows_long, aes(x = Day, y = Weight_kg, colour = Treatment)) +
geom_jitter(alpha = 0.40, size = 2, width = 1) +
stat_summary(fun = mean, geom = "line", linewidth = 1.1, aes(group = Treatment)) +
stat_summary(fun = mean, geom = "point", size = 3.5, shape = 18) +
scale_x_continuous(
breaks = c(0, 3, 14, 28),
labels = c("Day 0\n(Birth)", "Day 3", "Day 14", "Day 26\n(Weaning)")
) +
scale_colour_manual(
values = c("A" = "rosybrown1", "B" = "rosybrown", "C" = "rosybrown4"),
name = "Treatment"
) +
labs(
title = "Piglet Growth from Birth to Weaning by Diet",
subtitle = "Points = individual litter means | Diamond + line = treatment mean",
x = "Days",
y = "Average Weight (kg)"
) +
theme_bw(base_size = 13) +
theme(
plot.title = element_text(face = "bold", size = 14),
plot.subtitle = element_text(colour = "grey40", size = 10),
legend.position = "right",
panel.grid.minor = element_blank()
)
p
As shown in the graph above, piglets from sows fed treatment C had the highest overall mean weight. The litter weights start out fairly clustered together; as the days increase, the weights increase substantially and become more spread out, with more outliers.
The piglets from sows fed treatment C had the highest overall mean of all the treatments (fig. 8). This is expected, as treatment C consisted of an additional 20kg of protein feed. Piglets from sows fed treatment A grew the least of the three treatments, producing the lightest pigs. This outcome is expected as A is the control group. It also had the smallest IQR of all the treatments, indicating low variability and high consistency.
Treatment B usually had a large IQR and long whiskers, indicating a wide spread of data points, as seen in Figure 8.
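The comparison of means and IQRs across treatments could be tabulated as in the sketch below. The data frame `toy` is hypothetical: its numbers are made up for illustration and are not taken from `sows.csv`; only the column names follow the dataset.

```r
# Hypothetical toy data: NOT values from sows.csv, only the same column layout
toy <- data.frame(
  Treatment = factor(rep(c("A", "B", "C"), each = 5)),
  Average_wean_weight_kg = c(7.1, 7.4, 7.5, 7.6, 7.9,   # A: tightly clustered
                             6.5, 7.2, 8.0, 8.8, 9.5,   # B: widely spread
                             7.6, 8.0, 8.3, 8.6, 9.0)   # C: highest mean
)

# Mean and interquartile range of weaning weight within each treatment
group_mean <- tapply(toy$Average_wean_weight_kg, toy$Treatment, mean)
group_iqr  <- tapply(toy$Average_wean_weight_kg, toy$Treatment, IQR)

round(cbind(mean = group_mean, IQR = group_iqr), 2)
```

With the real data, replacing `toy` with `sows_data` should give the per-treatment means and IQRs that the box plots display visually.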
The study only recorded weights until the piglets were weaned. It would have been useful to see the long-term effects of the diet, such as the change in weight after weaning. Furthermore, it would be useful to increase the sample size for both the parity and control groups to reduce the margin of error and make the results more representative of each treatment's effects.
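The effect of sample size on the margin of error can be illustrated with the usual approximation for a group mean, t × s / √n. This is a rough sketch: the standard deviation of 1kg and the sample sizes are hypothetical values chosen for illustration, and `margin_of_error` is a helper defined here, not a function from any package.

```r
# Approximate margin of error for a group mean: t * s / sqrt(n)
# s = 1 kg is a hypothetical standard deviation, not estimated from the data
margin_of_error <- function(s, n, level = 0.95) {
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)  # two-sided critical value
  t_crit * s / sqrt(n)
}

# Hypothetical numbers of litters per treatment group
n_values <- c(10, 20, 40, 80)
moe <- sapply(n_values, function(n) margin_of_error(s = 1, n = n))

round(setNames(moe, paste0("n=", n_values)), 3)
```

Because the margin shrinks roughly with the square root of n, quadrupling the number of litters only about halves it.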
To further understand how what a sow is fed affects how heavy her litter is, it would be useful to introduce new treatments consisting of different feeds and observe their effects.
Anthropic. (2023). Claude. Claude.ai. https://claude.ai
Henman, D., Lean, I. J., Block, E., & Golder, H. M. (2023). Data on the effects of the anionic protein meal BioChlorⓇ on sows before and after farrowing. Data in Brief, 48, 109168. https://doi.org/10.1016/j.dib.2023.109168
Pork Checkoff. (n.d.). Life Cycle of a Market Pig. Pork Checkoff. https://porkcheckoff.org/pork-branding/facts-statistics/life-cycle-of-a-market-pig/
I used Claude for troubleshooting errors in my code (specifically the YAML), to help create the scatterplot, as I wanted to visualise the data in a way that went beyond the functions covered in the practicals, and to assist with the coding for the table. Furthermore, I gave Claude a prompt to provide feedback on my task based on the criteria, ensuring that it did not give me a direct response. I relied on Claude as a trusted source as it has strong reasoning when providing responses, and due to the volume of people who use it.
Henman et al. (2023) was the source of my dataset and helped me understand it further, as I initially thought that the piglets were being fed the treatment directly. This article is reliable as it was written by those who conducted the study, was published by Elsevier, a well-known publishing company, and provides an extensive amount of data.
I briefly used information from Pork Checkoff (n.d.) to compare the piglet weight with the industry standard. This web page is reliable as it was created by Pork Checkoff, a US-mandated program that supports research and education for pork farmers.