Trial 1 - Tick serology (antibodies)

Analysis steps

Below is a list of steps for testing differences in means or medians across more than two groups:

  1. Check for Normality:

    • For each tick serology parameter and each vaccine group, assess whether the data is approximately normally distributed. Use the Shapiro-Wilk test.

    • Note: With only 4 values per group, interpret the Shapiro-Wilk test results cautiously, as the test has low power. Supplement with visual inspection (histograms, Q-Q plots) if feasible, but understand that visual inspection can also be unreliable with such small samples

  2. Check for Equality of Variances:

    • For each tick serology parameter, assess whether the variances are approximately equal across the vaccine groups. Use Levene’s test.

    • Note: Similar to normality, the test for equal variances also has limited power with small sample sizes.

  3. Choose the appropriate test:

    • Case 1: Approximately Normal Data & Equal Variances: Use ANOVA followed by a post-hoc test (Tukey’s HSD if comparing all pairs, Dunnett’s test if comparing to a control group).

    • Case 2: Data Not Approximately Normal: Use the Kruskal-Wallis test followed by a post-hoc test (Dunn’s test or Mann-Whitney U with Bonferroni correction).

    • Case 3: Approximately Normal Data & Unequal Variances: Use Welch’s ANOVA followed by Games-Howell.

  4. Perform the test.

  5. Interpret the results:

    • Examine the p-values associated with the main test (ANOVA, Kruskal-Wallis, Welch’s ANOVA). A p-value below your chosen significance level (e.g., 0.05) indicates a statistically significant difference between at least some of the groups.

    • If the main test is significant, examine the p-values from the post-hoc test to determine which specific groups differ significantly from each other.

  6. Report the findings:

    • Clearly state which statistical tests were used (including the normality and equal variance tests).

    • Report the test statistics (e.g., F-statistic for ANOVA, chi-squared statistic for Kruskal-Wallis), degrees of freedom, and p-values for both the main test and the post-hoc tests.

    • Clearly state your conclusions, being careful not to overstate the significance of your findings given the small sample size.

    • Acknowledge the limitations of your analysis due to the small number of replicates per group.

Important Considerations for Your Data:

  • Small Sample Size: You only have 4 animals per group. This limits the statistical power of your tests (makes it harder to find significant differences).

  • High Variability: The high variability within groups further reduces statistical power.

  • Note: Non-parametric tests are generally more robust to violations of assumptions when sample sizes are small (e.g. Kruskal Wallis rather than ANOVA). But then also perform the non-parametric ad-hoc test (e.g. Dunn’s test).

Install and load libraries

# Install necessary packages (if you haven't already)
#install.packages("tidyverse")  # Include ggplot2 (for plotting) and dplyr (for data manipulation)
#install.packages("rstatix") # For Shapiro-Wilk test (more robust version)
#install.packages("gridExtra") # For creating multiple plots on a grid
#install.packages("car") # For using Levene's Test (to test for equal variances) 
#install.packages("dunn.test") # For using Dunn's Test (non-parameteric post-hoc test) 

# Load necessary libraries
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(rstatix)
## 
## Attaching package: 'rstatix'
## 
## The following object is masked from 'package:stats':
## 
##     filter
library(gridExtra)
## 
## Attaching package: 'gridExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     combine
library(car)
## Loading required package: carData
## 
## Attaching package: 'car'
## 
## The following object is masked from 'package:dplyr':
## 
##     recode
## 
## The following object is masked from 'package:purrr':
## 
##     some
library(dunn.test)

Read in the data

Note that I have transposed and cleaned up the data so that it is easy to use in R.

setwd("/Users/nanette/Documents/Stats_BIF_support/Christian_2025/")
data <- read_tsv("Trial1_all_data_per_Bovine.txt")
## Rows: 16 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (3): Tag, Group, Rep
## dbl (9): Number_females, Weight_females, Egg_weight, Efficacy, Bm86_IgG, OD_...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Exploratory data analysis

column_name <- "OD_Bm86_IgG"
# Boxplot of sample
boxplot(get(column_name) ~ Group, data = data,
        main = "Number of Females by Group",
        xlab = "Group",
        ylab = column_name)

column_name <- "OD_Bm86_IgG1"
# Boxplot of sample
boxplot(get(column_name) ~ Group, data = data,
        main = "Number of Females by Group",
        xlab = "Group",
        ylab = column_name)

column_name <- "OD_Bm86_IgG2"
# Boxplot of sample
boxplot(get(column_name) ~ Group, data = data,
        main = "Number of Females by Group",
        xlab = "Group",
        ylab = column_name)

column_name <- "OD_Bm86_IgG1/IgG2"
# Boxplot of sample
boxplot(get(column_name) ~ Group, data = data,
        main = "Number of Females by Group",
        xlab = "Group",
        ylab = column_name)

Conclusion: Boxplots suggest that the data within groups are not always normally distributed, and group variances may differ. Use the Shapiro-Wilk test to assess normality and Levene’s test to evaluate the equality of variances.

Shapiro-Wilk Test for normality

Interpretation: If the p-value is less than 0.05 (or your chosen alpha level), you reject the null hypothesis of normality, suggesting the data is likely not normally distributed.

column_name <- "OD_Bm86_IgG"
data %>%
  group_by(Group) %>%
  shapiro_test(column_name)
## # A tibble: 4 × 4
##   Group     variable    statistic      p
##   <chr>     <chr>           <dbl>  <dbl>
## 1 Bm86      OD_Bm86_IgG     0.962 0.789 
## 2 Combi     OD_Bm86_IgG     0.965 0.813 
## 3 Combi.CpG OD_Bm86_IgG     0.700 0.0117
## 4 control   OD_Bm86_IgG     0.756 0.0435

Conclusion: For “OD_Bm86_IgG”, Combi.CpG values are not normally distributed.

column_name <- "OD_Bm86_IgG1"
data %>%
  group_by(Group) %>%
  shapiro_test(column_name)
## # A tibble: 4 × 4
##   Group     variable     statistic     p
##   <chr>     <chr>            <dbl> <dbl>
## 1 Bm86      OD_Bm86_IgG1     0.974 0.866
## 2 Combi     OD_Bm86_IgG1     0.818 0.140
## 3 Combi.CpG OD_Bm86_IgG1     0.950 0.717
## 4 control   OD_Bm86_IgG1     0.985 0.933

Conclusion: “OD_Bm86_IgG1” values are normally distributed.

column_name <- "OD_Bm86_IgG2"
data %>%
  group_by(Group) %>%
  shapiro_test(column_name)
## # A tibble: 4 × 4
##   Group     variable     statistic     p
##   <chr>     <chr>            <dbl> <dbl>
## 1 Bm86      OD_Bm86_IgG2     0.854 0.240
## 2 Combi     OD_Bm86_IgG2     0.889 0.377
## 3 Combi.CpG OD_Bm86_IgG2     0.848 0.218
## 4 control   OD_Bm86_IgG2     0.947 0.695

Conclusion: “OD_Bm86_IgG2” values are normally distributed.

column_name <- "OD_Bm86_IgG1/IgG2"
data %>%
  filter(Group != "control") %>% # Exclude rows where Group is "control"
  group_by(Group) %>%
  shapiro_test(column_name)
## # A tibble: 3 × 4
##   Group     variable          statistic      p
##   <chr>     <chr>                 <dbl>  <dbl>
## 1 Bm86      OD_Bm86_IgG1/IgG2     0.705 0.0131
## 2 Combi     OD_Bm86_IgG1/IgG2     0.930 0.592 
## 3 Combi.CpG OD_Bm86_IgG1/IgG2     0.893 0.397

Conclusion: For “OD_Bm86_IgG”, Bm86 values are not normally distributed.

Levene’s Test for equality of variances

Note: This test is not very sensitive to the assumption of normality (this is good).

Interpretation: If the p-value is less than 0.05 (or your chosen alpha level), you reject the null hypothesis of equal variances, suggesting the variances between groups are likely significantly different.

column_name <- "OD_Bm86_IgG"
#Assuming you already have the column and variable selected.
data %>%
   leveneTest(get(column_name) ~ Group, data = .)
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)  
## group  3  4.9385 0.01847 *
##       12                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion: Variances between groups vary for “OD_Bm86_IgG”.

column_name <- "OD_Bm86_IgG1"
#Assuming you already have the column and variable selected.
data %>%
   leveneTest(get(column_name) ~ Group, data = .)
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value   Pr(>F)   
## group  3  9.1712 0.001978 **
##       12                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion: Variances between groups vary for “OD_Bm86_IgG1”.

column_name <- "OD_Bm86_IgG2"
#Assuming you already have the column and variable selected.
data %>%
   leveneTest(get(column_name) ~ Group, data = .)
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  2.2148 0.1391
##       12

Conclusion: For “OD_Bm86_IgG2”, groups have equal variances.

column_name <- "OD_Bm86_IgG1/IgG2"
#Assuming you already have the column and variable selected.

data %>%
  leveneTest(get(column_name) ~ Group, data = .)
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.9579  0.444
##       12

Conclusion: For “OD_Bm86_IgG1/IgG2”, groups have equal variances.

Kruskal-Wallis test

As the serology parameters do not all meet the assumptions of normality and equal variances, non-parametric tests should be used for statistical analysis.

# Specify the column to analyze
column_name <- "OD_Bm86_IgG"

# Perform Kruskal-Wallis test
kruskal_results <- kruskal.test(get(column_name) ~ Group, data = data)
print(kruskal_results)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  get(column_name) by Group
## Kruskal-Wallis chi-squared = 5.1197, df = 3, p-value = 0.1632
# Perform Dunn's post-hoc test (if Kruskal-Wallis is significant or of interest)
dunn_results <- dunn.test(data[[column_name]], g = data$Group,
                       method = "bonferroni")  #or other correction methods
##   Kruskal-Wallis rank sum test
## 
## data: x and group
## Kruskal-Wallis chi-squared = 5.1197, df = 3, p-value = 0.16
## 
## 
##                            Comparison of x by group                            
##                                  (Bonferroni)                                  
## Col Mean-|
## Row Mean |       Bm86      Combi   Combi.Cp
## ---------+---------------------------------
##    Combi |   1.040420
##          |     0.8944
##          |
## Combi.Cp |  -0.408736  -1.449156
##          |     1.0000     0.4419
##          |
##  control |   1.597788   0.557367   2.006524
##          |     0.3303     1.0000     0.1344
## 
## alpha = 0.05
## Reject Ho if p <= alpha/2
print(dunn_results)
## $chi2
## [1] 5.119661
## 
## $Z
## [1]  1.0404202 -0.4087365 -1.4491567  1.5977882  0.5573680  2.0065247
## 
## $P
## [1] 0.14907236 0.34136652 0.07364691 0.05504506 0.28863802 0.02240015
## 
## $P.adjusted
## [1] 0.8944341 1.0000000 0.4418815 0.3302704 1.0000000 0.1344009
## 
## $comparisons
## [1] "Bm86 - Combi"        "Bm86 - Combi.CpG"    "Combi - Combi.CpG"  
## [4] "Bm86 - control"      "Combi - control"     "Combi.CpG - control"
# Find indices where the adjusted p-value is significant
significant_indices <- which(dunn_results$P.adjusted < 0.05)

# Extract the corresponding comparisons
significant_comparisons <- dunn_results$comparisons[significant_indices]

# Print the significant comparisons
print(significant_comparisons)
## character(0)

Conclusion: There are no significant differences between medians of groups for “OD_Bm86_IgG”.

# Specify the column to analyze
column_name <- "OD_Bm86_IgG1"

# Perform Kruskal-Wallis test
kruskal_results <- kruskal.test(get(column_name) ~ Group, data = data)
print(kruskal_results)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  get(column_name) by Group
## Kruskal-Wallis chi-squared = 7.3015, df = 3, p-value = 0.06289
# Perform Dunn's post-hoc test (if Kruskal-Wallis is significant or of interest)
dunn_results <- dunn.test(data[[column_name]], g = data$Group,
                       method = "bonferroni")  #or other correction methods
##   Kruskal-Wallis rank sum test
## 
## data: x and group
## Kruskal-Wallis chi-squared = 7.3015, df = 3, p-value = 0.06
## 
## 
##                            Comparison of x by group                            
##                                  (Bonferroni)                                  
## Col Mean-|
## Row Mean |       Bm86      Combi   Combi.Cp
## ---------+---------------------------------
##    Combi |   0.074261
##          |     1.0000
##          |
## Combi.Cp |   2.376354   2.302093
##          |     0.0525     0.0640
##          |
##  control |   0.816871   0.742610  -1.559482
##          |     1.0000     1.0000     0.3566
## 
## alpha = 0.05
## Reject Ho if p <= alpha/2
print(dunn_results)
## $chi2
## [1] 7.301471
## 
## $Z
## [1]  0.07426107  2.37635410  2.30209304  0.81687172  0.74261066 -1.55948238
## 
## $P
## [1] 0.470401328 0.008742337 0.010664963 0.207000869 0.228858717 0.059441126
## 
## $P.adjusted
## [1] 1.00000000 0.05245402 0.06398978 1.00000000 1.00000000 0.35664676
## 
## $comparisons
## [1] "Bm86 - Combi"        "Bm86 - Combi.CpG"    "Combi - Combi.CpG"  
## [4] "Bm86 - control"      "Combi - control"     "Combi.CpG - control"
# Find indices where the adjusted p-value is significant
significant_indices <- which(dunn_results$P.adjusted < 0.07)

# Extract the corresponding comparisons
significant_comparisons <- dunn_results$comparisons[significant_indices]

# Print the significant comparisons
print(significant_comparisons)
## [1] "Bm86 - Combi.CpG"  "Combi - Combi.CpG"

Conclusion: Strictly speaking, there are no statistically significant differences between the group medians for “OD_Bm86_IgG1”. However, the Kruskal-Wallis p-value of 0.06 suggests a possible trend. In support of this, Dunn’s test showed that the comparisons “Bm86 - Combi.CpG” and “Combi - Combi.CpG” had adjusted p-values of 0.052 and 0.064, respectively—just above the conventional significance threshold.

# Specify the column to analyze
column_name <- "OD_Bm86_IgG2"

# Perform Kruskal-Wallis test
kruskal_results <- kruskal.test(get(column_name) ~ Group, data = data)
print(kruskal_results)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  get(column_name) by Group
## Kruskal-Wallis chi-squared = 5.2798, df = 3, p-value = 0.1524
# Perform Dunn's post-hoc test (if Kruskal-Wallis is significant or of interest)
dunn_results <- dunn.test(data[[column_name]], g = data$Group,
                       method = "bonferroni")  #or other correction methods
##   Kruskal-Wallis rank sum test
## 
## data: x and group
## Kruskal-Wallis chi-squared = 5.2798, df = 3, p-value = 0.15
## 
## 
##                            Comparison of x by group                            
##                                  (Bonferroni)                                  
## Col Mean-|
## Row Mean |       Bm86      Combi   Combi.Cp
## ---------+---------------------------------
##    Combi |  -2.006524
##          |     0.1344
##          |
## Combi.Cp |  -0.371578   1.634946
##          |     1.0000     0.3062
##          |
##  control |  -1.486314   0.520210  -1.114735
##          |     0.4116     1.0000     0.7949
## 
## alpha = 0.05
## Reject Ho if p <= alpha/2
print(dunn_results)
## $chi2
## [1] 5.279823
## 
## $Z
## [1] -2.0065247 -0.3715786  1.6349461 -1.4863146  0.5202101 -1.1147359
## 
## $P
## [1] 0.02240015 0.35510329 0.05103018 0.06859796 0.30145857 0.13248180
## 
## $P.adjusted
## [1] 0.1344009 1.0000000 0.3061811 0.4115878 1.0000000 0.7948908
## 
## $comparisons
## [1] "Bm86 - Combi"        "Bm86 - Combi.CpG"    "Combi - Combi.CpG"  
## [4] "Bm86 - control"      "Combi - control"     "Combi.CpG - control"
# Find indices where the adjusted p-value is significant
significant_indices <- which(dunn_results$P.adjusted < 0.1)

# Extract the corresponding comparisons
significant_comparisons <- dunn_results$comparisons[significant_indices]

# Print the significant comparisons
print(significant_comparisons)
## character(0)

Conclusion: There are no significant differences between medians of groups for “OD_Bm86_IgG2”.

# Specify the column to analyze
column_name <- "OD_Bm86_IgG1/IgG2"

# Perform Kruskal-Wallis test
kruskal_results <- kruskal.test(get(column_name) ~ Group, data = data)
print(kruskal_results)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  get(column_name) by Group
## Kruskal-Wallis chi-squared = 11.051, df = 3, p-value = 0.01145
# Perform Dunn's post-hoc test (if Kruskal-Wallis is significant or of interest)
dunn_results <- dunn.test(data[[column_name]], g = data$Group,
                       method = "bonferroni")  #or other correction methods
##   Kruskal-Wallis rank sum test
## 
## data: x and group
## Kruskal-Wallis chi-squared = 11.0515, df = 3, p-value = 0.01
## 
## 
##                            Comparison of x by group                            
##                                  (Bonferroni)                                  
## Col Mean-|
## Row Mean |       Bm86      Combi   Combi.Cp
## ---------+---------------------------------
##    Combi |   2.153570
##          |     0.0938
##          |
## Combi.Cp |   3.267486   1.113915
##          |    0.0033*     0.7959
##          |
##  control |   1.708004  -0.445566  -1.559482
##          |     0.2629     1.0000     0.3566
## 
## alpha = 0.05
## Reject Ho if p <= alpha/2
print(dunn_results)
## $chi2
## [1] 11.05147
## 
## $Z
## [1]  2.1535709  3.2674869  1.1139160  1.7080045 -0.4455664 -1.5594824
## 
## $P
## [1] 0.0156369190 0.0005425345 0.1326576154 0.0438177527 0.3279552450
## [6] 0.0594411258
## 
## $P.adjusted
## [1] 0.093821514 0.003255207 0.795945692 0.262906516 1.000000000 0.356646755
## 
## $comparisons
## [1] "Bm86 - Combi"        "Bm86 - Combi.CpG"    "Combi - Combi.CpG"  
## [4] "Bm86 - control"      "Combi - control"     "Combi.CpG - control"
# Find indices where the adjusted p-value is significant
significant_indices <- which(dunn_results$P.adjusted < 0.05)

# Extract the corresponding comparisons
significant_comparisons <- dunn_results$comparisons[significant_indices]

# Print the significant comparisons
print(significant_comparisons)
## [1] "Bm86 - Combi.CpG"

Conclusion: There is a significant difference between medians of groups for “OD_Bm86_IgG1/IgG2”. And the specific comparison where a difference was detected was “Bm86 - Combi.CpG”.

Summary of conclusions

Shapiro-Wilk test was used to assess normality and Levene’s test to evaluate the equality of variances (visual inspection confirmed the results). Since the serology parameters did not all meet the assumptions of normality and equal variances, non-parametric tests were used for statistical analysis.

Conclusions:

  • There are no significant differences between medians of groups for “OD_Bm86_IgG”.

  • Strictly speaking, there are no statistically significant differences between the group medians for “OD_Bm86_IgG1”. However, the Kruskal-Wallis p-value of 0.06 suggests a possible trend. In support of this, Dunn’s test showed that the comparisons “Bm86 - Combi.CpG” and “Combi - Combi.CpG” had adjusted p-values of 0.052 and 0.064, respectively—just above the conventional significance threshold.

  • There are no significant differences between medians of groups for “OD_Bm86_IgG2”.

  • There is a significant difference between medians of groups for “OD_Bm86_IgG1/IgG2”. And the specific comparison where a difference was detected was “Bm86 - Combi.CpG”.