Statistical analysis of cadmium accumulation in the grain of Oryza sativa varieties in the Philippines

Introduction

In this analysis, we will conduct a Kruskal-Wallis test to determine if there are significant differences among Cadmium (Cd) accumulation measurements of different rice varieties. If a significant difference is found, we will perform Dunn’s post hoc test to identify which groups differ.

Cd Data

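# Packages assumed from the functions called in this report
library(dplyr)    # group_by(), select()
library(tidyr)    # pivot_longer()
library(rstatix)  # kruskal_test(), kruskal_effsize()
library(FSA)      # dunnTest()
library(gt)       # table formatting

# Display the raw experiment data
gt(Cd_data) |>
  tab_header(title = "Experiment data") |>
  cols_label(VAR = "Rice Variety",
             REP = "Replicate") |>
  opt_table_lines()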

Experiment data
Rice Variety	Replicate	Unpolished	Polished	BAC_Un	BAC_Po
414	1	0.10	0.00	0.010	0.000
414	2	0.10	0.00	0.010	0.000
414	3	0.10	0.00	0.010	0.000
482	1	0.35	0.11	0.035	0.011
482	2	0.40	0.09	0.040	0.009
482	3	0.45	0.10	0.045	0.010
528	1	0.50	0.10	0.050	0.010
528	2	0.35	0.09	0.035	0.009
528	3	0.35	0.11	0.035	0.011
604	1	0.55	0.10	0.055	0.010
604	2	0.60	0.10	0.060	0.010
604	3	0.65	0.10	0.065	0.010
222	1	0.10	0.00	0.010	0.000
222	2	0.10	0.00	0.010	0.000
222	3	0.10	0.00	0.010	0.000
508	1	0.10	0.00	0.010	0.000
508	2	0.10	0.00	0.010	0.000
508	3	0.10	0.00	0.010	0.000
512	1	0.55	0.10	0.055	0.010
512	2	0.45	0.10	0.045	0.010
512	3	0.50	0.10	0.050	0.010
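
The report does not show how Cd_data is loaded. As a minimal sketch, the table above can be rebuilt as a data frame so that the later code has something to run on. Storing VAR as a factor and computing the BAC columns as one tenth of the concentrations are assumptions read off the table, not the author's import code.

```r
# Hypothetical reconstruction of Cd_data from the printed table.
# Assumption: VAR is a factor; the original column type is not shown.
Cd_data <- data.frame(
  VAR = factor(rep(c(414, 482, 528, 604, 222, 508, 512), each = 3)),
  REP = rep(1:3, times = 7),
  Unpolished = c(0.10, 0.10, 0.10,   # 414
                 0.35, 0.40, 0.45,   # 482
                 0.50, 0.35, 0.35,   # 528
                 0.55, 0.60, 0.65,   # 604
                 0.10, 0.10, 0.10,   # 222
                 0.10, 0.10, 0.10,   # 508
                 0.55, 0.45, 0.50),  # 512
  Polished = c(0.00, 0.00, 0.00,     # 414
               0.11, 0.09, 0.10,     # 482
               0.10, 0.09, 0.11,     # 528
               0.10, 0.10, 0.10,     # 604
               0.00, 0.00, 0.00,     # 222
               0.00, 0.00, 0.00,     # 508
               0.10, 0.10, 0.10)     # 512
)

# Assumption: in the table every BAC value equals the concentration divided by 10.
Cd_data$BAC_Un <- Cd_data$Unpolished / 10
Cd_data$BAC_Po <- Cd_data$Polished / 10

str(Cd_data)
```

The column order (Unpolished, Polished, BAC_Un, BAC_Po) matters because the later code selects the range Unpolished:BAC_Po.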

Kruskal-Wallis Test

# Reshape to long format and run one Kruskal-Wallis test per response variable
kruskal_test1 <- Cd_data |>
  pivot_longer(cols = Unpolished:BAC_Po,
               names_to = "Type",
               values_to = "value") |>
  group_by(Type) |>
  kruskal_test(value ~ VAR)

# Format the test results as a table
kruskal_test1 |>
  select(-c(.y., method)) |>
  gt() |>
  tab_header(title = "Kruskal-Wallis Test Results") |>
  fmt_number(columns = c(statistic, p), decimals = 4) |>
  cols_label(Type = "Response variable",
             statistic = "Test Statistic",
             p = "p-value") |>
  opt_table_lines()

Kruskal-Wallis Test Results
Response variable	n	Test Statistic	df	p-value
BAC_Po	21	17.0015	6	0.0093
BAC_Un	21	18.9384	6	0.0043
Polished	21	17.0015	6	0.0093
Unpolished	21	18.9384	6	0.0043

Interpretation

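The Kruskal-Wallis test checks whether at least one variety differs from the others in the distribution of its measurements, which is commonly summarized as a difference in medians. If the p-value is below 0.05, we reject the null hypothesis and proceed to post hoc testing.

In the test results above, there is sufficient evidence at the 5% level of significance to reject the null hypothesis (\(H_0\)) that the median measurement is equal across all varieties. This holds for both the Unpolished and Polished data, so the median measurement is significantly different for at least one variety in the experiment. The BAC_Un and BAC_Po rows reproduce the Unpolished and Polished results exactly because, in the data table, each BAC value is one tenth of the corresponding concentration, and a rank-based test is unaffected by such rescaling.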
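
As a cross-check, the Unpolished statistic can be recomputed with base R's kruskal.test(), used here in place of the rstatix wrapper. The sketch below copies the Unpolished values from the data table and also runs the test on the values divided by 10, to illustrate that a rank-based test does not change under rescaling. The object names are illustrative only.

```r
# Unpolished Cd values copied from the data table, three replicates per variety
variety <- factor(rep(c(414, 482, 528, 604, 222, 508, 512), each = 3))
unpolished <- c(0.10, 0.10, 0.10,
                0.35, 0.40, 0.45,
                0.50, 0.35, 0.35,
                0.55, 0.60, 0.65,
                0.10, 0.10, 0.10,
                0.10, 0.10, 0.10,
                0.55, 0.45, 0.50)

# Kruskal-Wallis test on the raw values and on the values divided by 10
kw_raw    <- kruskal.test(x = unpolished, g = variety)
kw_scaled <- kruskal.test(x = unpolished / 10, g = variety)

# Rescaling keeps the ordering and the ties, so the ranks and H are unchanged
data.frame(
  input     = c("Unpolished", "Unpolished / 10"),
  statistic = c(kw_raw$statistic, kw_scaled$statistic),
  df        = c(kw_raw$parameter, kw_scaled$parameter),
  p         = c(kw_raw$p.value, kw_scaled$p.value)
)
```

Both rows should agree with the statistic of 18.9384 on 6 degrees of freedom reported for Unpolished and BAC_Un.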

The magnitude of this difference can be assessed further by estimating Kruskal-Wallis effect sizes.

Kruskal-Wallis effect size

# Effect size of the Kruskal-Wallis test for each response variable
kruskal_effsize_G <- Cd_data |>
  pivot_longer(cols = Unpolished:BAC_Po,
               names_to = "Type",
               values_to = "value") |>
  group_by(Type) |>
  kruskal_effsize(value ~ VAR)

# Format the effect sizes as a table
kruskal_effsize_G |>
  select(-c(.y., method)) |>
  gt() |>
  tab_header(title = "Kruskal-Wallis Effect Size") |>
  fmt_number(columns = effsize, decimals = 4) |>
  cols_label(Type = "Response variable",
             effsize = "Effect size") |>
  opt_table_lines()

Kruskal-Wallis Effect Size
Response variable	n	Effect size	magnitude
BAC_Po	21	0.7858	large
BAC_Un	21	0.9242	large
Polished	21	0.7858	large
Unpolished	21	0.9242	large

Interpretation

The effect size estimates, all classified as large, show that the differences among the varieties are substantial. The significant Kruskal-Wallis result therefore reflects a meaningful overall difference in the distributions of the varieties, not only a statistically detectable one. Neither the test nor the effect size indicates which varieties differ, so Dunn’s test is needed to identify them.
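
The reported effect sizes are consistent with the eta-squared measure based on the H statistic, (H - k + 1) / (n - k), where H is the Kruskal-Wallis statistic, k the number of groups and n the total number of observations. A minimal sketch using the rounded statistics from the results table follows; the helper name kw_eta_squared is hypothetical.

```r
# Eta-squared based on the Kruskal-Wallis H statistic:
#   H = test statistic, k = number of groups, n = total number of observations
kw_eta_squared <- function(H, k, n) {
  (H - k + 1) / (n - k)
}

# Statistics from the Kruskal-Wallis results table (7 varieties, 21 observations)
eta_unpolished <- kw_eta_squared(H = 18.9384, k = 7, n = 21)
eta_polished   <- kw_eta_squared(H = 17.0015, k = 7, n = 21)

round(c(Unpolished = eta_unpolished, Polished = eta_polished), 4)
```

The rounded values are 0.9242 and 0.7858, matching the effect size table.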

Dunn’s post hoc test (Unpolished)

# Dunn's test with Benjamini-Hochberg adjustment for each unpolished response
dunn_test_UP <- dunnTest(Unpolished ~ VAR, data = Cd_data, method = "bh")
dunn_test_UP2 <- dunnTest(BAC_Un ~ VAR, data = Cd_data, method = "bh")

gt(dunn_test_UP$res) |>
  tab_header(title = "Dunn's Post Hoc Test Results (Cadmium accumulation)") |>
  fmt_number(columns = c(Z, P.unadj, P.adj), decimals = 4) |>
  cols_label(Comparison = "Varietal pairwise comparisons",
             Z = "Z-Statistic",
             P.unadj = "Unadjusted p-value",
             P.adj = "Adjusted p-value") |>
  # Shade adjusted p-values in [0, 0.05]; `fn` replaces the deprecated `colors`
  data_color(
    columns = P.adj,
    fn = scales::col_numeric(
      palette = c("white", "lightpink"),
      domain = c(0, 0.05)
    )
  ) |>
  opt_table_lines()

Warning: Some values were outside the color scale and will be treated as NA

Dunn's Post Hoc Test Results (Cadmium accumulation)
Varietal pairwise comparisons	Z-Statistic	Unadjusted p-value	Adjusted p-value
222 - 414	0.0000	1.0000	1.0000
222 - 482	−1.6142	0.1065	0.1864
414 - 482	−1.6142	0.1065	0.2033
222 - 508	0.0000	1.0000	1.0000
414 - 508	0.0000	1.0000	1.0000
482 - 508	1.6142	0.1065	0.2236
222 - 512	−2.3697	0.0178	0.0623
414 - 512	−2.3697	0.0178	0.0748
482 - 512	−0.7556	0.4499	0.5905
508 - 512	−2.3697	0.0178	0.0935
222 - 528	−1.6142	0.1065	0.2485
414 - 528	−1.6142	0.1065	0.2795
482 - 528	0.0000	1.0000	1.0000
508 - 528	−1.6142	0.1065	0.3195
512 - 528	0.7556	0.4499	0.6299
222 - 604	−3.0566	0.0022	0.0157
414 - 604	−3.0566	0.0022	0.0235
482 - 604	−1.4425	0.1492	0.2238
508 - 604	−3.0566	0.0022	0.0470
512 - 604	−0.6869	0.4922	0.6080
528 - 604	−1.4425	0.1492	0.2410
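
The Z-statistics in the table above can be reproduced by hand from the mean ranks of the varieties. The sketch below does this for the 222 - 604 comparison with the tie-corrected standard error of Dunn's test and a two-sided normal p-value. It is an illustration in base R under those assumptions, not the internal code of dunnTest(), and the helper dunn_z is hypothetical.

```r
# Unpolished Cd values copied from the data table, three replicates per variety
variety <- factor(rep(c(414, 482, 528, 604, 222, 508, 512), each = 3))
unpolished <- c(0.10, 0.10, 0.10,
                0.35, 0.40, 0.45,
                0.50, 0.35, 0.35,
                0.55, 0.60, 0.65,
                0.10, 0.10, 0.10,
                0.10, 0.10, 0.10,
                0.55, 0.45, 0.50)

N         <- length(unpolished)
ranks     <- rank(unpolished)               # midranks for tied values
mean_rank <- tapply(ranks, variety, mean)   # mean rank per variety
n_per_var <- tapply(ranks, variety, length) # replicates per variety

# Tie correction: sum of (t^3 - t) over groups of tied values
tie_sizes <- as.numeric(table(unpolished))
tie_term  <- sum(tie_sizes^3 - tie_sizes) / (12 * (N - 1))

# Z-statistic for the difference in mean ranks between varieties a and b
dunn_z <- function(a, b) {
  se <- sqrt((N * (N + 1) / 12 - tie_term) *
               (1 / as.numeric(n_per_var[a]) + 1 / as.numeric(n_per_var[b])))
  (as.numeric(mean_rank[a]) - as.numeric(mean_rank[b])) / se
}

z_222_604 <- dunn_z("222", "604")
p_222_604 <- 2 * pnorm(-abs(z_222_604))     # two-sided unadjusted p-value

round(c(Z = z_222_604, P.unadj = p_222_604), 4)
```

The result should agree with the 222 - 604 row of the table (Z-statistic of about −3.0566, unadjusted p-value of about 0.0022).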

# dunn_test_UP2 (BAC_Un) was computed alongside dunn_test_UP above
gt(dunn_test_UP2$res) |>
  tab_header(title = "Dunn's Post Hoc Test Results (BAC Values)") |>
  fmt_number(columns = c(Z, P.unadj, P.adj), decimals = 4) |>
  cols_label(Comparison = "Varietal pairwise comparisons",
             Z = "Z-Statistic",
             P.unadj = "Unadjusted p-value",
             P.adj = "Adjusted p-value") |>
  # Shade adjusted p-values in [0, 0.05]; `fn` replaces the deprecated `colors`
  data_color(
    columns = P.adj,
    fn = scales::col_numeric(
      palette = c("white", "lightpink"),
      domain = c(0, 0.05)
    )
  ) |>
  opt_table_lines()

Warning: Some values were outside the color scale and will be treated as NA

Dunn's Post Hoc Test Results (BAC Values)
Varietal pairwise comparisons	Z-Statistic	Unadjusted p-value	Adjusted p-value
222 - 414	0.0000	1.0000	1.0000
222 - 482	−1.6142	0.1065	0.1864
414 - 482	−1.6142	0.1065	0.2033
222 - 508	0.0000	1.0000	1.0000
414 - 508	0.0000	1.0000	1.0000
482 - 508	1.6142	0.1065	0.2236
222 - 512	−2.3697	0.0178	0.0623
414 - 512	−2.3697	0.0178	0.0748
482 - 512	−0.7556	0.4499	0.5905
508 - 512	−2.3697	0.0178	0.0935
222 - 528	−1.6142	0.1065	0.2485
414 - 528	−1.6142	0.1065	0.2795
482 - 528	0.0000	1.0000	1.0000
508 - 528	−1.6142	0.1065	0.3195
512 - 528	0.7556	0.4499	0.6299
222 - 604	−3.0566	0.0022	0.0157
414 - 604	−3.0566	0.0022	0.0235
482 - 604	−1.4425	0.1492	0.2238
508 - 604	−3.0566	0.0022	0.0470
512 - 604	−0.6869	0.4922	0.6080
528 - 604	−1.4425	0.1492	0.2410

Dunn’s post hoc test (Polished)

# Dunn's test with Benjamini-Hochberg adjustment for each polished response
dunn_test_Po <- dunnTest(Polished ~ VAR, data = Cd_data, method = "bh")
dunn_test_Po2 <- dunnTest(BAC_Po ~ VAR, data = Cd_data, method = "bh")

gt(dunn_test_Po$res) |>
  tab_header(title = "Dunn's Post Hoc Test Results (Cadmium Accumulation)") |>
  fmt_number(columns = c(Z, P.unadj, P.adj), decimals = 4) |>
  cols_label(Comparison = "Varietal pairwise comparisons",
             Z = "Z-Statistic",
             P.unadj = "Unadjusted p-value",
             P.adj = "Adjusted p-value") |>
  # Shade adjusted p-values in [0, 0.05]; `fn` replaces the deprecated `colors`
  data_color(
    columns = P.adj,
    fn = scales::col_numeric(
      palette = c("white", "lightpink"),
      domain = c(0, 0.05)
    )
  ) |>
  opt_table_lines()

Warning: Some values were outside the color scale and will be treated as NA

Dunn's Post Hoc Test Results (Cadmium Accumulation)
Varietal pairwise comparisons	Z-Statistic	Unadjusted p-value	Adjusted p-value
222 - 414	0.0000	1.0000	1.0000
222 - 482	−2.2268	0.0260	0.0454
414 - 482	−2.2268	0.0260	0.0496
222 - 508	0.0000	1.0000	1.0000
414 - 508	0.0000	1.0000	1.0000
482 - 508	2.2268	0.0260	0.0545
222 - 512	−2.2268	0.0260	0.0606
414 - 512	−2.2268	0.0260	0.0681
482 - 512	0.0000	1.0000	1.0000
508 - 512	−2.2268	0.0260	0.0779
222 - 528	−2.2268	0.0260	0.0909
414 - 528	−2.2268	0.0260	0.1090
482 - 528	0.0000	1.0000	1.0000
508 - 528	−2.2268	0.0260	0.1363
512 - 528	0.0000	1.0000	1.0000
222 - 604	−2.2268	0.0260	0.1817
414 - 604	−2.2268	0.0260	0.2726
482 - 604	0.0000	1.0000	1.0000
508 - 604	−2.2268	0.0260	0.5451
512 - 604	0.0000	1.0000	1.0000
528 - 604	0.0000	1.0000	1.0000

gt(dunn_test_Po2$res) |>
  tab_header(title = "Dunn's Post Hoc Test Results (BAC Values)") |>
  fmt_number(columns = c(Z, P.unadj, P.adj), decimals = 4) |>
  cols_label(Comparison = "Varietal pairwise comparisons",
             Z = "Z-Statistic",
             P.unadj = "Unadjusted p-value",
             P.adj = "Adjusted p-value") |>
  # Shade adjusted p-values in [0, 0.05]; `fn` replaces the deprecated `colors`
  data_color(
    columns = P.adj,
    fn = scales::col_numeric(
      palette = c("white", "lightpink"),
      domain = c(0, 0.05)
    )
  ) |>
  opt_table_lines()

Warning: Some values were outside the color scale and will be treated as NA

Dunn's Post Hoc Test Results (BAC Values)
Varietal pairwise comparisons	Z-Statistic	Unadjusted p-value	Adjusted p-value
222 - 414	0.0000	1.0000	1.0000
222 - 482	−2.2268	0.0260	0.0454
414 - 482	−2.2268	0.0260	0.0496
222 - 508	0.0000	1.0000	1.0000
414 - 508	0.0000	1.0000	1.0000
482 - 508	2.2268	0.0260	0.0545
222 - 512	−2.2268	0.0260	0.0606
414 - 512	−2.2268	0.0260	0.0681
482 - 512	0.0000	1.0000	1.0000
508 - 512	−2.2268	0.0260	0.0779
222 - 528	−2.2268	0.0260	0.0909
414 - 528	−2.2268	0.0260	0.1090
482 - 528	0.0000	1.0000	1.0000
508 - 528	−2.2268	0.0260	0.1363
512 - 528	0.0000	1.0000	1.0000
222 - 604	−2.2268	0.0260	0.1817
414 - 604	−2.2268	0.0260	0.2726
482 - 604	0.0000	1.0000	1.0000
508 - 604	−2.2268	0.0260	0.5451
512 - 604	0.0000	1.0000	1.0000
528 - 604	0.0000	1.0000	1.0000

Interpretation

Dunn’s test performs pairwise comparisons between varieties. An adjusted p-value below 0.05 indicates that the two compared varieties have significantly different medians. Use the adjusted p-values (Benjamini-Hochberg, method = "bh"), because they correct for multiple comparisons. The results for the BAC values are identical to those for the corresponding concentrations.

For Unpolished measurements, Dunn’s test reveals that variety 604 has significantly higher median values than varieties 222, 414, and 508. In contrast, for Polished measurements, the only significant difference observed is that variety 482 exhibits higher median values than both 222 and 414 at the 5% level of significance.
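
In the tables above, comparisons that share the same unadjusted p-value receive different adjusted p-values. Base R's p.adjust() handles ties differently: with method = "BH" it gives tied p-values the same adjusted value. As a hedged cross-check, the sketch below applies it to the Polished unadjusted p-values as rounded in the table (twelve comparisons at 0.0260 and nine at 1.0000). It illustrates the base R convention only and does not replace the reported results.

```r
# Unadjusted p-values for the 21 Polished comparisons, as rounded in the table
p_unadj <- c(rep(0.0260, 12), rep(1, 9))

# Benjamini-Hochberg adjustment as implemented in base R
p_bh <- p.adjust(p_unadj, method = "BH")

# Tied unadjusted p-values share one adjusted value under this convention
data.frame(P.unadj = unique(p_unadj),
           P.adj_BH = round(unique(p_bh), 4))
```

Because the conclusion for a tied comparison can depend on how ties are ranked, adjusted p-values close to 0.05 are worth re-checking before drawing firm conclusions.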