By:
Adam Rivers
USDA-ARS Genomics and Bioinformatics Research Unit
January 29, 2021
Import Data
library(tidyverse)
dat <- read.csv("~/Desktop/2019 Sweetpotato Trial_2 Data.csv", header=TRUE, sep=",", na.strings=c("#VALUE!","."))
dat2 <- tibble(dat)
dat2.tot <- filter(dat2, Grade=='Total')
dat2.tot
Check data distributions
ggplot(dat2.tot, aes(x=WDS_Index)) + geom_histogram()

These results are not normally distributed but may be close enough for an initial examination with standard models.
Power calculation for WDS index
library('pwr')
library('effectsize')
k = 2 not k = n_varieties is used because we are not interested in testing if all the varieties are the same, but if a specific variety is different from the mean.
# Calculate number of sweet potato varieties
n_varieties <- length(levels(factor(dat2.tot$Entry)))
# estimate f
# f = (mu1 - mu2) /sigma
#A difference of 0.5 from mean is desired
mu1 <- mean(dat2.tot$WDS_Index, na.rm = TRUE)
sd1 <- sd(dat2.tot$WDS_Index, na.rm = TRUE)
f <-0.5/sd1
# comparison between two type s of sweet potatoes
pwr.anova.test(k=2, n=NULL, f=f, power=0.8, sig.level=0.05)
Balanced one-way analysis of variance power calculation
k = 2
n = 7.695865
f = 0.7707629
sig.level = 0.05
power = 0.8
NOTE: n is number in each group
Multiple comparison correction
The goal is to detect differences between each variety and the average sweet potato for every variety, which means multiple comparisons are made so false discovery rate must be controlled. I will account for this first by applying conservative Bonferroni correction to the P value.
#estimate f
# f = (mu1 - mu2) /sigma
#A difference of 0.5 from mean is desired
mu1 <- mean(dat2.tot$WDS_Index, na.rm = TRUE)
sd1 <- sd(dat2.tot$WDS_Index, na.rm = TRUE)
f <-0.5/0.64
# comparison for two
pwr.anova.test(k=2, n=NULL, f=f, power=0.8, sig.level=0.05/n_varieties)
Balanced one-way analysis of variance power calculation
k = 2
n = 16.51077
f = 0.78125
sig.level = 0.001111111
power = 0.8
NOTE: n is number in each group
Work so far
This is just a starting point for discussion not an end point of our analysis. I like to check in with people our unit is helping as I go so we can clarify needs and research questions.
Interim conclusions for WDS index
- The distribution of WDS Index is not normal so different power calculation methods may need to be applied.
- How is WDS index generated? Understanding that would help us select the correct distribution to model.
- The minimum number to detect and absolute WDS effect size of 0.5 given the standard deviation of the data is about 8 samples in the single case or 16 in the Bonferroni corrected case.
- You were interested in reducing sample size to save money. There are two levels you are sampling at: 10 plants/storage roots per plot * 4 plots. I don’t think you can reduce the number of plots you sample with your current design but you could potentially reduce the number of plants you sample in a plot if you know the variance within a plot. Do to that we would need to know the plant/storage root level variance within a plot. Do you have that data?
Interim thoughts on the clean roots comparsison
I have not done this comparison yet but would likely use a prop test or chi square test. the results will depend on the total number or roots which will go down if sample size is reduced.
LS0tCnRpdGxlOiAiUG93ZXIgYW5hbHlzaXMgb2Ygc3dlZXQgcG90YXRvIGRpc2Vhc2Ugc2FtcGxpbmciCm91dHB1dDogaHRtbF9ub3RlYm9vawotLS0KIyMgQnk6CkFkYW0gUml2ZXJzICAKVVNEQS1BUlMgR2Vub21pY3MgYW5kIEJpb2luZm9ybWF0aWNzIFJlc2VhcmNoIFVuaXQgIApKYW51YXJ5IDI5LCAyMDIxICAKCiMjIEZvcjoKUGhpbGxpcCBXYWRsCgojIyBJbXBvcnQgRGF0YQpgYGB7cn0KbGlicmFyeSh0aWR5dmVyc2UpCmRhdCA8LSByZWFkLmNzdigifi9EZXNrdG9wLzIwMTkgU3dlZXRwb3RhdG8gVHJpYWxfMiBEYXRhLmNzdiIsIGhlYWRlcj1UUlVFLCBzZXA9IiwiLCBuYS5zdHJpbmdzPWMoIiNWQUxVRSEiLCIuIikpCmRhdDIgPC0gdGliYmxlKGRhdCkKZGF0Mi50b3QgPC0gZmlsdGVyKGRhdDIsIEdyYWRlPT0nVG90YWwnKQpgYGAKCmBgYHtyfQpkYXQyLnRvdApgYGAKIyMgQ2hlY2sgZGF0YSBkaXN0cmlidXRpb25zCgpgYGB7cn0KZ2dwbG90KGRhdDIudG90LCBhZXMoeD1XRFNfSW5kZXgpKSArIGdlb21faGlzdG9ncmFtKCkKYGBgClRoZXNlIHJlc3VsdHMgYXJlIG5vdCBub3JtYWxseSBkaXN0cmlidXRlZCBidXQgbWF5IGJlIGNsb3NlIGVub3VnaCBmb3IgYW4gaW5pdGlhbCBleGFtaW5hdGlvbiB3aXRoIHN0YW5kYXJkIG1vZGVscy4KCiMjIFBvd2VyIGNhbGN1bGF0aW9uIGZvciBXRFMgaW5kZXgKCmBgYHtyfQpsaWJyYXJ5KCdwd3InKQpsaWJyYXJ5KCdlZmZlY3RzaXplJykKYGBgCgprID0gMiBub3QgayA9IG5fdmFyaWV0aWVzIGlzIHVzZWQgYmVjYXVzZSB3ZSBhcmUgbm90IGludGVyZXN0ZWQgaW4gdGVzdGluZyBpZiBhbGwgdGhlIHZhcmlldGllcyBhcmUgdGhlIHNhbWUsIGJ1dCBpZiBhIHNwZWNpZmljIHZhcmlldHkgaXMgZGlmZmVyZW50IGZyb20gdGhlIG1lYW4uCmBgYHtyfQojIENhbGN1bGF0ZSBudW1iZXIgb2Ygc3dlZXQgcG90YXRvIHZhcmlldGllcwpuX3ZhcmlldGllcyA8LSBsZW5ndGgobGV2ZWxzKGZhY3RvcihkYXQyLnRvdCRFbnRyeSkpKQojIGVzdGltYXRlIGYKIyBmID0gKG11MSAtIG11MikgL3NpZ21hCiNBIGRpZmZlcmVuY2Ugb2YgMC41IGZyb20gbWVhbiBpcyBkZXNpcmVkCm11MSA8LSBtZWFuKGRhdDIudG90JFdEU19JbmRleCwgbmEucm0gPSBUUlVFKQpzZDEgPC0gc2QoZGF0Mi50b3QkV0RTX0luZGV4LCBuYS5ybSA9IFRSVUUpCmYgPC0wLjUvc2QxCiMgY29tcGFyaXNvbiBiZXR3ZWVuIHR3byB0eXBlIHMgb2Ygc3dlZXQgcG90YXRvZXMKcHdyLmFub3ZhLnRlc3Qoaz0yLCBuPU5VTEwsIGY9ZiwgcG93ZXI9MC44LCBzaWcubGV2ZWw9MC4wNSkKYGBgCgoKIyMgTXVsdGlwbGUgY29tcGFyaXNvbiBjb3JyZWN0aW9uCgpUaGUgZ29hbCBpcyB0byBkZXRlY3QgZGlmZmVyZW5jZXMgYmV0d2VlbiBlYWNoIHZhcmlldHkgYW5kIHRoZSBhdmVyYWdlIHN3ZWV0IHBvdGF0byBmb3IgZXZlcnkgdmFyaWV0eSwgd2hpY2ggbWVhbnMgbXVsdGlwbGUgY29tcGFyaXNvbnMgYXJlIG1hZGUgc28gZmFsc2UgZGlzY292ZXJ5IHJhdGUgbXVzdCBiZSBjb250cm9sbGVkLiBJIHdpbGwgYWNjb3VudCBmb3IgdGhpcyBmaXJzdCBieSBhcHBseWluZyBjb25zZXJ2YXRpdmUgQm9uZmVycm9uaSBjb3JyZWN0aW9uIHRvIHRoZSBQIHZhbHVlLgpgYGB7cn0KI2VzdGltYXRlIGYKIyBmID0gKG11MSAtIG11MikgL3NpZ21hCiNBIGRpZmZlcmVuY2Ugb2YgMC41IGZyb20gbWVhbiBpcyBkZXNpcmVkCm11MSA8LSBtZWFuKGRhdDIudG90JFdEU19JbmRleCwgbmEucm0gPSBUUlVFKQpzZDEgPC0gc2QoZGF0Mi50b3QkV0RTX0luZGV4LCBuYS5ybSA9IFRSVUUpCmYgPC0wLjUvMC42NAojIGNvbXBhcmlzb24gZm9yIHR3byAKcHdyLmFub3ZhLnRlc3Qoaz0yLCBuPU5VTEwsIGY9ZiwgcG93ZXI9MC44LCBzaWcubGV2ZWw9MC4wNS9uX3ZhcmlldGllcykKYGBgCgojIyBXb3JrIHNvIGZhcgoKVGhpcyBpcyBqdXN0IGEgc3RhcnRpbmcgcG9pbnQgZm9yIGRpc2N1c3Npb24gbm90IGFuIGVuZCBwb2ludCBvZiBvdXIgYW5hbHlzaXMuICBJIGxpa2UgdG8gY2hlY2sgaW4gd2l0aCBwZW9wbGUgb3VyIHVuaXQgaXMgaGVscGluZyBhcyBJIGdvIHNvIHdlIGNhbiBjbGFyaWZ5IG5lZWRzIGFuZCByZXNlYXJjaCBxdWVzdGlvbnMuCgojIyBJbnRlcmltIGNvbmNsdXNpb25zIGZvciBXRFMgaW5kZXgKCjEuIFRoZSBkaXN0cmlidXRpb24gb2YgV0RTIEluZGV4IGlzIG5vdCBub3JtYWwgc28gZGlmZmVyZW50IHBvd2VyIGNhbGN1bGF0aW9uIG1ldGhvZHMgbWF5IG5lZWQgdG8gYmUgYXBwbGllZC4gCjIuIEhvdyBpcyBXRFMgaW5kZXggZ2VuZXJhdGVkPyAgVW5kZXJzdGFuZGluZyB0aGF0IHdvdWxkIGhlbHAgdXMgc2VsZWN0IHRoZSBjb3JyZWN0IGRpc3RyaWJ1dGlvbiB0byBtb2RlbC4KMy4gVGhlIG1pbmltdW0gbnVtYmVyIHRvIGRldGVjdCBhbmQgYWJzb2x1dGUgV0RTIGVmZmVjdCBzaXplIG9mIDAuNSBnaXZlbiB0aGUgc3RhbmRhcmQgZGV2aWF0aW9uIG9mIHRoZSBkYXRhIGlzIGFib3V0IDggc2FtcGxlcyBpbiB0aGUgc2luZ2xlIGNhc2Ugb3IgMTYgaW4gdGhlIEJvbmZlcnJvbmkgY29ycmVjdGVkIGNhc2UuCjQuIFlvdSB3ZXJlIGludGVyZXN0ZWQgaW4gcmVkdWNpbmcgc2FtcGxlIHNpemUgdG8gc2F2ZSBtb25leS4gVGhlcmUgYXJlIHR3byBsZXZlbHMgeW91IGFyZSBzYW1wbGluZyBhdDogMTAgcGxhbnRzL3N0b3JhZ2Ugcm9vdHMgcGVyIHBsb3QgKiA0IHBsb3RzLiAgSSBkb24ndCB0aGluayB5b3UgY2FuIHJlZHVjZSB0aGUgbnVtYmVyIG9mIHBsb3RzIHlvdSBzYW1wbGUgd2l0aCB5b3VyIGN1cnJlbnQgZGVzaWduIGJ1dCB5b3UgY291bGQgcG90ZW50aWFsbHkgcmVkdWNlIHRoZSBudW1iZXIgb2YgcGxhbnRzIHlvdSBzYW1wbGUgaW4gYSBwbG90IGlmIHlvdSBrbm93IHRoZSB2YXJpYW5jZSB3aXRoaW4gYSBwbG90LiAgRG8gdG8gdGhhdCB3ZSB3b3VsZCBuZWVkIHRvIGtub3cgdGhlIHBsYW50L3N0b3JhZ2Ugcm9vdCBsZXZlbCB2YXJpYW5jZSB3aXRoaW4gYSBwbG90LiAgRG8geW91IGhhdmUgdGhhdCBkYXRhPwoKCiMjIEludGVyaW0gdGhvdWdodHMgb24gdGhlIGNsZWFuIHJvb3RzIGNvbXBhcnNpc29uCgpJIGhhdmUgbm90IGRvbmUgdGhpcyBjb21wYXJpc29uIHlldCBidXQgd291bGQgbGlrZWx5IHVzZSBhIHByb3AgdGVzdCBvciBjaGkgc3F1YXJlIHRlc3QuIHRoZSByZXN1bHRzIHdpbGwgZGVwZW5kIG9uIHRoZSB0b3RhbCBudW1iZXIgb3Igcm9vdHMgd2hpY2ggd2lsbCBnbyBkb3duIGlmIHNhbXBsZSBzaXplIGlzIHJlZHVjZWQuCg==