By:

Adam Rivers
USDA-ARS Genomics and Bioinformatics Research Unit
January 29, 2021

For:

Phillip Wadl

Import Data

library(tidyverse)
dat <- read.csv("~/Desktop/2019 Sweetpotato Trial_2 Data.csv", header=TRUE, sep=",", na.strings=c("#VALUE!","."))
dat2 <- tibble(dat)
dat2.tot <- filter(dat2, Grade=='Total')
dat2.tot

Check data distributions

ggplot(dat2.tot, aes(x=WDS_Index)) + geom_histogram()

These results are not normally distributed but may be close enough for an initial examination with standard models.

Power calculation for WDS index

library('pwr')
library('effectsize')

k = 2 not k = n_varieties is used because we are not interested in testing if all the varieties are the same, but if a specific variety is different from the mean.

# Calculate number of sweet potato varieties
n_varieties <- length(levels(factor(dat2.tot$Entry)))
# estimate f
# f = (mu1 - mu2) /sigma
#A difference of 0.5 from mean is desired
mu1 <- mean(dat2.tot$WDS_Index, na.rm = TRUE)
sd1 <- sd(dat2.tot$WDS_Index, na.rm = TRUE)
f <-0.5/sd1
# comparison between two type s of sweet potatoes
pwr.anova.test(k=2, n=NULL, f=f, power=0.8, sig.level=0.05)

     Balanced one-way analysis of variance power calculation 

              k = 2
              n = 7.695865
              f = 0.7707629
      sig.level = 0.05
          power = 0.8

NOTE: n is number in each group

Multiple comparison correction

The goal is to detect differences between each variety and the average sweet potato for every variety, which means multiple comparisons are made so false discovery rate must be controlled. I will account for this first by applying conservative Bonferroni correction to the P value.

#estimate f
# f = (mu1 - mu2) /sigma
#A difference of 0.5 from mean is desired
mu1 <- mean(dat2.tot$WDS_Index, na.rm = TRUE)
sd1 <- sd(dat2.tot$WDS_Index, na.rm = TRUE)
f <-0.5/0.64
# comparison for two 
pwr.anova.test(k=2, n=NULL, f=f, power=0.8, sig.level=0.05/n_varieties)

     Balanced one-way analysis of variance power calculation 

              k = 2
              n = 16.51077
              f = 0.78125
      sig.level = 0.001111111
          power = 0.8

NOTE: n is number in each group

Work so far

This is just a starting point for discussion not an end point of our analysis. I like to check in with people our unit is helping as I go so we can clarify needs and research questions.

Interim conclusions for WDS index

  1. The distribution of WDS Index is not normal so different power calculation methods may need to be applied.
  2. How is WDS index generated? Understanding that would help us select the correct distribution to model.
  3. The minimum number to detect and absolute WDS effect size of 0.5 given the standard deviation of the data is about 8 samples in the single case or 16 in the Bonferroni corrected case.
  4. You were interested in reducing sample size to save money. There are two levels you are sampling at: 10 plants/storage roots per plot * 4 plots. I don’t think you can reduce the number of plots you sample with your current design but you could potentially reduce the number of plants you sample in a plot if you know the variance within a plot. Do to that we would need to know the plant/storage root level variance within a plot. Do you have that data?

Interim thoughts on the clean roots comparsison

I have not done this comparison yet but would likely use a prop test or chi square test. the results will depend on the total number or roots which will go down if sample size is reduced.

LS0tCnRpdGxlOiAiUG93ZXIgYW5hbHlzaXMgb2Ygc3dlZXQgcG90YXRvIGRpc2Vhc2Ugc2FtcGxpbmciCm91dHB1dDogaHRtbF9ub3RlYm9vawotLS0KIyMgQnk6CkFkYW0gUml2ZXJzICAKVVNEQS1BUlMgR2Vub21pY3MgYW5kIEJpb2luZm9ybWF0aWNzIFJlc2VhcmNoIFVuaXQgIApKYW51YXJ5IDI5LCAyMDIxICAKCiMjIEZvcjoKUGhpbGxpcCBXYWRsCgojIyBJbXBvcnQgRGF0YQpgYGB7cn0KbGlicmFyeSh0aWR5dmVyc2UpCmRhdCA8LSByZWFkLmNzdigifi9EZXNrdG9wLzIwMTkgU3dlZXRwb3RhdG8gVHJpYWxfMiBEYXRhLmNzdiIsIGhlYWRlcj1UUlVFLCBzZXA9IiwiLCBuYS5zdHJpbmdzPWMoIiNWQUxVRSEiLCIuIikpCmRhdDIgPC0gdGliYmxlKGRhdCkKZGF0Mi50b3QgPC0gZmlsdGVyKGRhdDIsIEdyYWRlPT0nVG90YWwnKQpgYGAKCmBgYHtyfQpkYXQyLnRvdApgYGAKIyMgQ2hlY2sgZGF0YSBkaXN0cmlidXRpb25zCgpgYGB7cn0KZ2dwbG90KGRhdDIudG90LCBhZXMoeD1XRFNfSW5kZXgpKSArIGdlb21faGlzdG9ncmFtKCkKYGBgClRoZXNlIHJlc3VsdHMgYXJlIG5vdCBub3JtYWxseSBkaXN0cmlidXRlZCBidXQgbWF5IGJlIGNsb3NlIGVub3VnaCBmb3IgYW4gaW5pdGlhbCBleGFtaW5hdGlvbiB3aXRoIHN0YW5kYXJkIG1vZGVscy4KCiMjIFBvd2VyIGNhbGN1bGF0aW9uIGZvciBXRFMgaW5kZXgKCmBgYHtyfQpsaWJyYXJ5KCdwd3InKQpsaWJyYXJ5KCdlZmZlY3RzaXplJykKYGBgCgprID0gMiBub3QgayA9IG5fdmFyaWV0aWVzIGlzIHVzZWQgYmVjYXVzZSB3ZSBhcmUgbm90IGludGVyZXN0ZWQgaW4gdGVzdGluZyBpZiBhbGwgdGhlIHZhcmlldGllcyBhcmUgdGhlIHNhbWUsIGJ1dCBpZiBhIHNwZWNpZmljIHZhcmlldHkgaXMgZGlmZmVyZW50IGZyb20gdGhlIG1lYW4uCmBgYHtyfQojIENhbGN1bGF0ZSBudW1iZXIgb2Ygc3dlZXQgcG90YXRvIHZhcmlldGllcwpuX3ZhcmlldGllcyA8LSBsZW5ndGgobGV2ZWxzKGZhY3RvcihkYXQyLnRvdCRFbnRyeSkpKQojIGVzdGltYXRlIGYKIyBmID0gKG11MSAtIG11MikgL3NpZ21hCiNBIGRpZmZlcmVuY2Ugb2YgMC41IGZyb20gbWVhbiBpcyBkZXNpcmVkCm11MSA8LSBtZWFuKGRhdDIudG90JFdEU19JbmRleCwgbmEucm0gPSBUUlVFKQpzZDEgPC0gc2QoZGF0Mi50b3QkV0RTX0luZGV4LCBuYS5ybSA9IFRSVUUpCmYgPC0wLjUvc2QxCiMgY29tcGFyaXNvbiBiZXR3ZWVuIHR3byB0eXBlIHMgb2Ygc3dlZXQgcG90YXRvZXMKcHdyLmFub3ZhLnRlc3Qoaz0yLCBuPU5VTEwsIGY9ZiwgcG93ZXI9MC44LCBzaWcubGV2ZWw9MC4wNSkKYGBgCgoKIyMgTXVsdGlwbGUgY29tcGFyaXNvbiBjb3JyZWN0aW9uCgpUaGUgZ29hbCBpcyB0byBkZXRlY3QgZGlmZmVyZW5jZXMgYmV0d2VlbiBlYWNoIHZhcmlldHkgYW5kIHRoZSBhdmVyYWdlIHN3ZWV0IHBvdGF0byBmb3IgZXZlcnkgdmFyaWV0eSwgd2hpY2ggbWVhbnMgbXVsdGlwbGUgY29tcGFyaXNvbnMgYXJlIG1hZGUgc28gZmFsc2UgZGlzY292ZXJ5IHJhdGUgbXVzdCBiZSBjb250cm9sbGVkLiBJIHdpbGwgYWNjb3VudCBmb3IgdGhpcyBmaXJzdCBieSBhcHBseWluZyBjb25zZXJ2YXRpdmUgQm9uZmVycm9uaSBjb3JyZWN0aW9uIHRvIHRoZSBQIHZhbHVlLgpgYGB7cn0KI2VzdGltYXRlIGYKIyBmID0gKG11MSAtIG11MikgL3NpZ21hCiNBIGRpZmZlcmVuY2Ugb2YgMC41IGZyb20gbWVhbiBpcyBkZXNpcmVkCm11MSA8LSBtZWFuKGRhdDIudG90JFdEU19JbmRleCwgbmEucm0gPSBUUlVFKQpzZDEgPC0gc2QoZGF0Mi50b3QkV0RTX0luZGV4LCBuYS5ybSA9IFRSVUUpCmYgPC0wLjUvMC42NAojIGNvbXBhcmlzb24gZm9yIHR3byAKcHdyLmFub3ZhLnRlc3Qoaz0yLCBuPU5VTEwsIGY9ZiwgcG93ZXI9MC44LCBzaWcubGV2ZWw9MC4wNS9uX3ZhcmlldGllcykKYGBgCgojIyBXb3JrIHNvIGZhcgoKVGhpcyBpcyBqdXN0IGEgc3RhcnRpbmcgcG9pbnQgZm9yIGRpc2N1c3Npb24gbm90IGFuIGVuZCBwb2ludCBvZiBvdXIgYW5hbHlzaXMuICBJIGxpa2UgdG8gY2hlY2sgaW4gd2l0aCBwZW9wbGUgb3VyIHVuaXQgaXMgaGVscGluZyBhcyBJIGdvIHNvIHdlIGNhbiBjbGFyaWZ5IG5lZWRzIGFuZCByZXNlYXJjaCBxdWVzdGlvbnMuCgojIyBJbnRlcmltIGNvbmNsdXNpb25zIGZvciBXRFMgaW5kZXgKCjEuIFRoZSBkaXN0cmlidXRpb24gb2YgV0RTIEluZGV4IGlzIG5vdCBub3JtYWwgc28gZGlmZmVyZW50IHBvd2VyIGNhbGN1bGF0aW9uIG1ldGhvZHMgbWF5IG5lZWQgdG8gYmUgYXBwbGllZC4gCjIuIEhvdyBpcyBXRFMgaW5kZXggZ2VuZXJhdGVkPyAgVW5kZXJzdGFuZGluZyB0aGF0IHdvdWxkIGhlbHAgdXMgc2VsZWN0IHRoZSBjb3JyZWN0IGRpc3RyaWJ1dGlvbiB0byBtb2RlbC4KMy4gVGhlIG1pbmltdW0gbnVtYmVyIHRvIGRldGVjdCBhbmQgYWJzb2x1dGUgV0RTIGVmZmVjdCBzaXplIG9mIDAuNSBnaXZlbiB0aGUgc3RhbmRhcmQgZGV2aWF0aW9uIG9mIHRoZSBkYXRhIGlzIGFib3V0IDggc2FtcGxlcyBpbiB0aGUgc2luZ2xlIGNhc2Ugb3IgMTYgaW4gdGhlIEJvbmZlcnJvbmkgY29ycmVjdGVkIGNhc2UuCjQuIFlvdSB3ZXJlIGludGVyZXN0ZWQgaW4gcmVkdWNpbmcgc2FtcGxlIHNpemUgdG8gc2F2ZSBtb25leS4gVGhlcmUgYXJlIHR3byBsZXZlbHMgeW91IGFyZSBzYW1wbGluZyBhdDogMTAgcGxhbnRzL3N0b3JhZ2Ugcm9vdHMgcGVyIHBsb3QgKiA0IHBsb3RzLiAgSSBkb24ndCB0aGluayB5b3UgY2FuIHJlZHVjZSB0aGUgbnVtYmVyIG9mIHBsb3RzIHlvdSBzYW1wbGUgd2l0aCB5b3VyIGN1cnJlbnQgZGVzaWduIGJ1dCB5b3UgY291bGQgcG90ZW50aWFsbHkgcmVkdWNlIHRoZSBudW1iZXIgb2YgcGxhbnRzIHlvdSBzYW1wbGUgaW4gYSBwbG90IGlmIHlvdSBrbm93IHRoZSB2YXJpYW5jZSB3aXRoaW4gYSBwbG90LiAgRG8gdG8gdGhhdCB3ZSB3b3VsZCBuZWVkIHRvIGtub3cgdGhlIHBsYW50L3N0b3JhZ2Ugcm9vdCBsZXZlbCB2YXJpYW5jZSB3aXRoaW4gYSBwbG90LiAgRG8geW91IGhhdmUgdGhhdCBkYXRhPwoKCiMjIEludGVyaW0gdGhvdWdodHMgb24gdGhlIGNsZWFuIHJvb3RzIGNvbXBhcnNpc29uCgpJIGhhdmUgbm90IGRvbmUgdGhpcyBjb21wYXJpc29uIHlldCBidXQgd291bGQgbGlrZWx5IHVzZSBhIHByb3AgdGVzdCBvciBjaGkgc3F1YXJlIHRlc3QuIHRoZSByZXN1bHRzIHdpbGwgZGVwZW5kIG9uIHRoZSB0b3RhbCBudW1iZXIgb3Igcm9vdHMgd2hpY2ggd2lsbCBnbyBkb3duIGlmIHNhbXBsZSBzaXplIGlzIHJlZHVjZWQuCg==