Trial 1 - Tick survival parameters

Analysis steps

Below is a list of steps for testing differences in means or medians across more than two groups:

  1. Check for Normality:

    • For each tick survival parameter and each vaccine group, assess whether the data is approximately normally distributed. Use the Shapiro-Wilk test.

    • Note: With only 4 values per group, interpret the Shapiro-Wilk test results cautiously, as the test has low power. Supplement with visual inspection (histograms, Q-Q plots) if feasible, but understand that visual inspection can also be unreliable with such small samples

  2. Check for Equality of Variances:

    • For each tick survival parameter, assess whether the variances are approximately equal across the vaccine groups. Use Levene’s test.

    • Note: Similar to normality, the test for equal variances also has limited power with small sample sizes.

  3. Choose the appropriate test:

    • Case 1: Approximately Normal Data & Equal Variances: Use ANOVA followed by a post-hoc test (Tukey’s HSD if comparing all pairs, Dunnett’s test if comparing to a control group).

    • Case 2: Data Not Approximately Normal: Use the Kruskal-Wallis test followed by a post-hoc test (Dunn’s test or Mann-Whitney U with Bonferroni correction).

    • Case 3: Approximately Normal Data & Unequal Variances: Use Welch’s ANOVA followed by Games-Howell.

  4. Perform the test.

  5. Interpret the results:

    • Examine the p-values associated with the main test (ANOVA, Kruskal-Wallis, Welch’s ANOVA). A p-value below your chosen significance level (e.g., 0.05) indicates a statistically significant difference between at least some of the groups.

    • If the main test is significant, examine the p-values from the post-hoc test to determine which specific groups differ significantly from each other.

  6. Report the findings:

    • Clearly state which statistical tests were used (including the normality and equal variance tests).

    • Report the test statistics (e.g., F-statistic for ANOVA, chi-squared statistic for Kruskal-Wallis), degrees of freedom, and p-values for both the main test and the post-hoc tests.

    • Clearly state your conclusions, being careful not to overstate the significance of your findings given the small sample size.

    • Acknowledge the limitations of your analysis due to the small number of replicates per group.

Important Considerations for Your Data:

  • Small Sample Size: You only have 4 animals per group. This limits the statistical power of your tests (makes it harder to find significant differences).

  • High Variability: The high variability within groups further reduces statistical power.

  • Note: Non-parametric tests are generally more robust to violations of assumptions when sample sizes are small (e.g. Kruskal Wallis rather than ANOVA). But then also perform the non-parametric ad-hoc test (e.g. Dunn’s test).

Install and load libraries

# Install necessary packages (if you haven't already)
#install.packages("tidyverse")  # Include ggplot2 (for plotting) and dplyr (for data manipulation)
#install.packages("rstatix") # For Shapiro-Wilk test (more robust version)
#install.packages("gridExtra") # For creating multiple plots on a grid
#install.packages("car") # For using Levene's Test (to test for equal variances) 
#install.packages("dunn.test") # For using Dunn's Test (non-parameteric post-hoc test) 

# Load necessary libraries
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(rstatix)
## 
## Attaching package: 'rstatix'
## 
## The following object is masked from 'package:stats':
## 
##     filter
library(gridExtra)
## 
## Attaching package: 'gridExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     combine
library(car)
## Loading required package: carData
## 
## Attaching package: 'car'
## 
## The following object is masked from 'package:dplyr':
## 
##     recode
## 
## The following object is masked from 'package:purrr':
## 
##     some
library(dunn.test)

Read in the data

Note that I have transposed and cleaned up the data so that it is easy to use in R.

setwd("/Users/nanette/Documents/Stats_BIF_support/Christian_2025/")
data <- read_tsv("Trial1_all_data_per_Bovine.txt")
## Rows: 16 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (3): Tag, Group, Rep
## dbl (9): Number_females, Weight_females, Egg_weight, Efficacy, Bm86_IgG, OD_...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Visual inspection: QQPlots

qqplot_ggplot <- function(data, group_name, column_name) {
  # Calculate theoretical quantiles (normal distribution)
  theoretical_quantiles <- qnorm(ppoints(nrow(data)))
  
  # Order the sample data
  sample_quantiles <- sort(data[[column_name]])
  
  # Create a data frame for plotting
  qq_data <- data.frame(theoretical = theoretical_quantiles,
                        sample = sample_quantiles)
  
  # Create the ggplot QQ plot
  ggplot(qq_data, aes(x = theoretical, y = sample)) +
    geom_point() +
    geom_abline(intercept = mean(sample_quantiles), slope = sd(sample_quantiles), color = "red") +  # Add a line to show the distribution
    labs(title = paste("QQ Plot for Group", group_name),
         x = "Theoretical Quantiles (Normal)",
         y = "Sample Quantiles") +
    theme_bw()
}
column_name <- "Number_females"

# Get unique group names
group_names <- unique(data$Group)

# Create a list to store the QQ plots
qq_plots <- list()

# Loop through groups and create QQ plots
for (i in seq_along(group_names)) {
  group_name <- group_names[i]
  
  # Subset the data for the current group
  group_data <- data %>% filter(Group == group_name)
  
  # Create the QQ plot using the custom function
  qq_plot <- qqplot_ggplot(group_data, group_name, column_name)
  
  # Store the plot in the list
  qq_plots[[i]] <- qq_plot
}

# Arrange the plots in a grid
grid.arrange(grobs = qq_plots, nrow = 2, ncol = 2)  # Adjust nrow and ncol as needed

Conclusion: “Total # engorged female ticks collected” likely normally distributed.

column_name <- "Weight_females"

# Get unique group names
group_names <- unique(data$Group)

# Create a list to store the QQ plots
qq_plots <- list()

# Loop through groups and create QQ plots
for (i in seq_along(group_names)) {
  group_name <- group_names[i]
  
  # Subset the data for the current group
  group_data <- data %>% filter(Group == group_name)
  
  # Create the QQ plot using the custom function
  qq_plot <- qqplot_ggplot(group_data, group_name, column_name)
  
  # Store the plot in the list
  qq_plots[[i]] <- qq_plot
}

# Arrange the plots in a grid
grid.arrange(grobs = qq_plots, nrow = 2, ncol = 2)  # Adjust nrow and ncol as needed

Conclusion: “Mean weight engorged females” likely normally distributed.

column_name <- "Egg_weight"

# Get unique group names
group_names <- unique(data$Group)

# Create a list to store the QQ plots
qq_plots <- list()

# Loop through groups and create QQ plots
for (i in seq_along(group_names)) {
  group_name <- group_names[i]
  
  # Subset the data for the current group
  group_data <- data %>% filter(Group == group_name)
  
  # Create the QQ plot using the custom function
  qq_plot <- qqplot_ggplot(group_data, group_name, column_name)
  
  # Store the plot in the list
  qq_plots[[i]] <- qq_plot
}

# Arrange the plots in a grid
grid.arrange(grobs = qq_plots, nrow = 2, ncol = 2)  # Adjust nrow and ncol as needed

Conclusion: “Mean egg weight” likely normally distributed.

column_name <- "Efficacy"

# Get unique group names
group_names <- unique(data$Group)[-1]

# Create a list to store the QQ plots
qq_plots <- list()

# Loop through groups and create QQ plots
for (i in seq_along(group_names)) {
  group_name <- group_names[i]
  
  # Subset the data for the current group
  group_data <- data %>% filter(Group == group_name)
  
  # Create the QQ plot using the custom function
  qq_plot <- qqplot_ggplot(group_data, group_name, column_name)
  
  # Store the plot in the list
  qq_plots[[i]] <- qq_plot
}

# Arrange the plots in a grid
grid.arrange(grobs = qq_plots, nrow = 2, ncol = 2)  # Adjust nrow and ncol as needed

Conclusion: “Calculated efficacy %” likely normally distributed.

Shapiro-Wilk Test for normality

Interpretation: If the p-value is less than 0.05 (or your chosen alpha level), you reject the null hypothesis of normality, suggesting the data is likely not normally distributed.

column_name <- "Number_females"
data %>%
  group_by(Group) %>%
  shapiro_test(column_name)
## # A tibble: 4 × 4
##   Group     variable       statistic     p
##   <chr>     <chr>              <dbl> <dbl>
## 1 Bm86      Number_females     0.873 0.311
## 2 Combi     Number_females     0.868 0.290
## 3 Combi.CpG Number_females     0.925 0.563
## 4 control   Number_females     0.887 0.368

Conclusion: “Total # engorged female ticks collected” likely normally distributed.

column_name <- "Weight_females"
data %>%
  group_by(Group) %>%
  shapiro_test(column_name)
## # A tibble: 4 × 4
##   Group     variable       statistic     p
##   <chr>     <chr>              <dbl> <dbl>
## 1 Bm86      Weight_females     0.995 0.982
## 2 Combi     Weight_females     0.928 0.581
## 3 Combi.CpG Weight_females     0.880 0.340
## 4 control   Weight_females     0.998 0.994

Conclusion: “Mean weight engorged females” likely normally distributed.

column_name <- "Egg_weight"
data %>%
  group_by(Group) %>%
  shapiro_test(column_name)
## # A tibble: 4 × 4
##   Group     variable   statistic     p
##   <chr>     <chr>          <dbl> <dbl>
## 1 Bm86      Egg_weight     0.998 0.995
## 2 Combi     Egg_weight     0.910 0.484
## 3 Combi.CpG Egg_weight     0.939 0.651
## 4 control   Egg_weight     0.875 0.319

Conclusion: “Mean egg weight” likely normally distributed.

column_name <- "Efficacy"
data %>%
  filter(Group != "control") %>% # Exclude rows where Group is "control"
  group_by(Group) %>%
  shapiro_test(column_name)
## # A tibble: 3 × 4
##   Group     variable statistic     p
##   <chr>     <chr>        <dbl> <dbl>
## 1 Bm86      Efficacy     0.812 0.124
## 2 Combi     Efficacy     0.900 0.432
## 3 Combi.CpG Efficacy     0.945 0.687

Conclusion: “Calculated efficacy %” likely normally distributed.

Levene’s Test for equality of variances

Note: This test is not very sensitive to the assumption of normality (this is good).

Interpretation: If the p-value is less than 0.05 (or your chosen alpha level), you reject the null hypothesis of equal variances, suggesting the variances between groups are likely significantly different.

column_name <- "Number_females"
#Assuming you already have the column and variable selected.
data %>%
   leveneTest(get(column_name) ~ Group, data = .)
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3   1.327 0.3114
##       12

Conclusion: For “Total # engorged female ticks collected”, groups have equal variances.

column_name <- "Weight_females"
#Assuming you already have the column and variable selected.
data %>%
   leveneTest(get(column_name) ~ Group, data = .)
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.8778 0.4798
##       12

Conclusion: For “Mean weight engorged females”, groups have equal variances.

column_name <- "Egg_weight"
#Assuming you already have the column and variable selected.
data %>%
   leveneTest(get(column_name) ~ Group, data = .)
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  1.8154  0.198
##       12

Conclusion: For “Mean egg weight”, groups have equal variances.

column_name <- "Efficacy"
#Assuming you already have the column and variable selected.

data %>%
  filter(Group != "control") %>% # Exclude rows where Group is "control"
  leveneTest(get(column_name) ~ Group, data = .)
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.2593 0.7772
##        9

Conclusion: For “Calculated efficacy %”, groups have equal variances.

One-way ANOVA

# Specify the tick survival parameter
column_name <- "Number_females"

# Perform ANOVA
anova_results <- aov(get(column_name) ~ Group, data = data)
summary(anova_results)
##             Df Sum Sq Mean Sq F value Pr(>F)
## Group        3 332163  110721   1.941  0.177
## Residuals   12 684669   57056
# Perform Post-Hoc Test (if ANOVA is significant)
TukeyHSD(anova_results)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = get(column_name) ~ Group, data = data)
## 
## $Group
##                     diff       lwr      upr     p adj
## Combi-Bm86        -342.0 -843.4532 159.4532 0.2325104
## Combi.CpG-Bm86    -337.0 -838.4532 164.4532 0.2428334
## control-Bm86      -135.5 -636.9532 365.9532 0.8521070
## Combi.CpG-Combi      5.0 -496.4532 506.4532 0.9999901
## control-Combi      206.5 -294.9532 707.9532 0.6251502
## control-Combi.CpG  201.5 -299.9532 702.9532 0.6424212
# Graphing (Optional but Recommended)
# Boxplot of sample
boxplot(get(column_name) ~ Group, data = data,
        main = "Number of Females by Group",
        xlab = "Group",
        ylab = column_name)

Conclusion: There are no significant differences between the group means for “Total # engorged female ticks collected”.

# Specify the tick survival parameter
column_name <- "Weight_females"

# Perform ANOVA
anova_results <- aov(get(column_name) ~ Group, data = data)
summary(anova_results)
##             Df Sum Sq Mean Sq F value Pr(>F)
## Group        3   1064   354.5   0.699  0.571
## Residuals   12   6088   507.3
# Perform Post-Hoc Test (if ANOVA is significant)
TukeyHSD(anova_results)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = get(column_name) ~ Group, data = data)
## 
## $Group
##                      diff       lwr      upr     p adj
## Combi-Bm86        -19.725 -67.01081 27.56081 0.6158937
## Combi.CpG-Bm86     -0.200 -47.48581 47.08581 0.9999992
## control-Bm86      -10.125 -57.41081 37.16081 0.9184606
## Combi.CpG-Combi    19.525 -27.76081 66.81081 0.6232239
## control-Combi       9.600 -37.68581 56.88581 0.9292425
## control-Combi.CpG  -9.925 -57.21081 37.36081 0.9226645
# Graphing (Optional but Recommended)
# Boxplot of sample
boxplot(get(column_name) ~ Group, data = data,
        main = "Number of Females by Group",
        xlab = "Group",
        ylab = column_name)

Conclusion: There are no significant differences between the group means for “Mean weight engorged females”.

# Specify the tick survival parameter
column_name <- "Egg_weight"

# Perform ANOVA
anova_results <- aov(get(column_name) ~ Group, data = data)
summary(anova_results)
##             Df Sum Sq Mean Sq F value Pr(>F)
## Group        3    298   99.37   0.361  0.782
## Residuals   12   3301  275.05
# Perform Post-Hoc Test (if ANOVA is significant)
TukeyHSD(anova_results)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = get(column_name) ~ Group, data = data)
## 
## $Group
##                     diff       lwr      upr     p adj
## Combi-Bm86        -6.825 -41.64159 27.99159 0.9356108
## Combi.CpG-Bm86    -5.400 -40.21659 29.41659 0.9662505
## control-Bm86       3.975 -30.84159 38.79159 0.9859293
## Combi.CpG-Combi    1.425 -33.39159 36.24159 0.9993210
## control-Combi     10.800 -24.01659 45.61659 0.7943865
## control-Combi.CpG  9.375 -25.44159 44.19159 0.8533786
# Graphing (Optional but Recommended)
# Boxplot of sample
boxplot(get(column_name) ~ Group, data = data,
        main = "Number of Females by Group",
        xlab = "Group",
        ylab = column_name)

Conclusion: There are no significant differences between the group means for “Mean egg weight”.

# Specify the tick survival parameter
column_name <- "Efficacy"

data_no_control <- data %>%
  filter(Group != "control") # Exclude rows where Group is "control"

# Perform ANOVA
anova_results <- aov(get(column_name) ~ Group, data = data_no_control)
summary(anova_results)
##             Df Sum Sq Mean Sq F value Pr(>F)
## Group        2   4144  2071.8   2.557  0.132
## Residuals    9   7293   810.3
# Perform Post-Hoc Test (if ANOVA is significant)
TukeyHSD(anova_results)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = get(column_name) ~ Group, data = data_no_control)
## 
## $Group
##                    diff       lwr      upr     p adj
## Combi-Bm86      41.7075 -14.49112 97.90612 0.1508811
## Combi.CpG-Bm86  36.6400 -19.55862 92.83862 0.2174754
## Combi.CpG-Combi -5.0675 -61.26612 51.13112 0.9657947
# Graphing (Optional but Recommended)
# Boxplot of sample
boxplot(get(column_name) ~ Group, data = data,
        main = "Number of Females by Group",
        xlab = "Group",
        ylab = column_name)

Conclusion: There are no significant differences between the group means for “Calculated efficacy %”.

Kruskal-Wallis test

# Specify the column to analyze
column_name <- "Number_females"

# Perform Kruskal-Wallis test
kruskal_results <- kruskal.test(get(column_name) ~ Group, data = data)
print(kruskal_results)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  get(column_name) by Group
## Kruskal-Wallis chi-squared = 4.6324, df = 3, p-value = 0.2008
# Perform Dunn's post-hoc test (if Kruskal-Wallis is significant or of interest)
dunn_results <- dunn.test(data[[column_name]], g = data$Group,
                       method = "bonferroni")  #or other correction methods
##   Kruskal-Wallis rank sum test
## 
## data: x and group
## Kruskal-Wallis chi-squared = 4.6324, df = 3, p-value = 0.2
## 
## 
##                            Comparison of x by group                            
##                                  (Bonferroni)                                  
## Col Mean-|
## Row Mean |       Bm86      Combi   Combi.Cp
## ---------+---------------------------------
##    Combi |   1.633743
##          |     0.3069
##          |
## Combi.Cp |   1.782265   0.148522
##          |     0.2241     1.0000
##          |
##  control |   0.445566  -1.188177  -1.336699
##          |     1.0000     0.7043     0.5440
## 
## alpha = 0.05
## Reject Ho if p <= alpha/2
print(dunn_results)
## $chi2
## [1] 4.632353
## 
## $Z
## [1]  1.6337434  1.7822656  0.1485221  0.4455664 -1.1881771 -1.3366992
## 
## $P
## [1] 0.05115637 0.03735297 0.44096536 0.32795525 0.11738183 0.09066042
## 
## $P.adjusted
## [1] 0.3069382 0.2241178 1.0000000 1.0000000 0.7042910 0.5439625
## 
## $comparisons
## [1] "Bm86 - Combi"        "Bm86 - Combi.CpG"    "Combi - Combi.CpG"  
## [4] "Bm86 - control"      "Combi - control"     "Combi.CpG - control"

Conclusion: There are no significant differences between the group medians for “Total # engorged female ticks collected”.

# Specify the column to analyze
column_name <- "Weight_females"

# Perform Kruskal-Wallis test
kruskal_results <- kruskal.test(get(column_name) ~ Group, data = data)
print(kruskal_results)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  get(column_name) by Group
## Kruskal-Wallis chi-squared = 2.2721, df = 3, p-value = 0.5179
# Perform Dunn's post-hoc test (if Kruskal-Wallis is significant or of interest)
dunn_results <- dunn.test(data[[column_name]], g = data$Group,
                       method = "bonferroni")  #or other correction methods
##   Kruskal-Wallis rank sum test
## 
## data: x and group
## Kruskal-Wallis chi-squared = 2.2721, df = 3, p-value = 0.52
## 
## 
##                            Comparison of x by group                            
##                                  (Bonferroni)                                  
## Col Mean-|
## Row Mean |       Bm86      Combi   Combi.Cp
## ---------+---------------------------------
##    Combi |   1.039654
##          |     0.8955
##          |
## Combi.Cp |  -0.371305  -1.410960
##          |     1.0000     0.4748
##          |
##  control |   0.519827  -0.519827   0.891132
##          |     1.0000     1.0000     1.0000
## 
## alpha = 0.05
## Reject Ho if p <= alpha/2
print(dunn_results)
## $chi2
## [1] 2.272059
## 
## $Z
## [1]  1.0396549 -0.3713053 -1.4109602  0.5198275 -0.5198275  0.8911328
## 
## $P
## [1] 0.14925013 0.35520506 0.07912817 0.30159192 0.30159192 0.18642897
## 
## $P.adjusted
## [1] 0.8955008 1.0000000 0.4747690 1.0000000 1.0000000 1.0000000
## 
## $comparisons
## [1] "Bm86 - Combi"        "Bm86 - Combi.CpG"    "Combi - Combi.CpG"  
## [4] "Bm86 - control"      "Combi - control"     "Combi.CpG - control"

Conclusion: There are no significant differences between the group medians for “Mean weight engorged females”.

# Specify the column to analyze
column_name <- "Egg_weight"

# Perform Kruskal-Wallis test
kruskal_results <- kruskal.test(get(column_name) ~ Group, data = data)
print(kruskal_results)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  get(column_name) by Group
## Kruskal-Wallis chi-squared = 1.1912, df = 3, p-value = 0.7551
# Perform Dunn's post-hoc test (if Kruskal-Wallis is significant or of interest)
dunn_results <- dunn.test(data[[column_name]], g = data$Group,
                       method = "bonferroni")  #or other correction methods
##   Kruskal-Wallis rank sum test
## 
## data: x and group
## Kruskal-Wallis chi-squared = 1.1912, df = 3, p-value = 0.76
## 
## 
##                            Comparison of x by group                            
##                                  (Bonferroni)                                  
## Col Mean-|
## Row Mean |       Bm86      Combi   Combi.Cp
## ---------+---------------------------------
##    Combi |   0.594088
##          |     1.0000
##          |
## Combi.Cp |   0.594088   0.000000
##          |     1.0000     1.0000
##          |
##  control |  -0.297044  -0.891132  -0.891132
##          |     1.0000     1.0000     1.0000
## 
## alpha = 0.05
## Reject Ho if p <= alpha/2
print(dunn_results)
## $chi2
## [1] 1.191176
## 
## $Z
## [1]  0.5940885  0.5940885  0.0000000 -0.2970443 -0.8911328 -0.8911328
## 
## $P
## [1] 0.2762265 0.2762265 0.5000000 0.3832164 0.1864290 0.1864290
## 
## $P.adjusted
## [1] 1 1 1 1 1 1
## 
## $comparisons
## [1] "Bm86 - Combi"        "Bm86 - Combi.CpG"    "Combi - Combi.CpG"  
## [4] "Bm86 - control"      "Combi - control"     "Combi.CpG - control"

Conclusion: There are no significant differences between the group medians for “Mean egg weight”.

# Specify the column to analyze
column_name <- "Efficacy"

# Perform Kruskal-Wallis test
kruskal_results <- kruskal.test(get(column_name) ~ Group, data = data_no_control)
print(kruskal_results)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  get(column_name) by Group
## Kruskal-Wallis chi-squared = 4.1538, df = 2, p-value = 0.1253
# Perform Dunn's post-hoc test (if Kruskal-Wallis is significant or of interest)
dunn_results <- dunn.test(data[[column_name]], g = data$Group,
                       method = "bonferroni")  #or other correction methods
##   Kruskal-Wallis rank sum test
## 
## data: x and group
## Kruskal-Wallis chi-squared = 4.1538, df = 2, p-value = 0.13
## 
## 
##                            Comparison of x by group                            
##                                  (Bonferroni)                                  
## Col Mean-|
## Row Mean |       Bm86      Combi
## ---------+----------------------
##    Combi |  -1.765045
##          |     0.1163
##          |
## Combi.Cp |  -1.765045   0.000000
##          |     0.1163     1.0000
## 
## alpha = 0.05
## Reject Ho if p <= alpha/2
print(dunn_results)
## $chi2
## [1] 4.153846
## 
## $Z
## [1] -1.765045 -1.765045  0.000000
## 
## $P
## [1] 0.03877808 0.03877808 0.50000000
## 
## $P.adjusted
## [1] 0.1163343 0.1163343 1.0000000
## 
## $comparisons
## [1] "Bm86 - Combi"      "Bm86 - Combi.CpG"  "Combi - Combi.CpG"

Conclusion: There are no significant differences between the group medians for “Calculated efficacy %”.

Summary of conclusions

The Shapiro-Wilk test was used to assess normality, and Levene’s test to evaluate the equality of variances, with visual inspection supporting the results. Since all survival parameters met the assumptions of normality and homogeneity of variances, ANOVA followed by Tukey’s HSD post-hoc test was used for statistical analysis. However, given the small sample size per group (n = 4), non-parametric tests were also performed as a precaution, and these yielded consistent conclusions.

Conclusions:

  • There were no statistically significant differences between group means or medians for any of the tick survival parameters in Trial 1: total number of engorged female ticks collected, mean weight of engorged females, mean egg weight, and calculated efficacy (%).

  • A non-significant p-value does not necessarily imply the absence of a true difference between vaccine groups. Instead, it reflects that this study — with its limited sample size and potentially high variability among biological replicates — may not have had sufficient power to detect such a difference, if it exists. In these cases, it is useful to review descriptive statistics (e.g., mean, median, standard deviation) and visualizations (e.g., boxplots) to identify potential trends.