PhD Analysis

## Last updated: Wed Jan 22 19:44:41 2014

NOTE: Always verify figures by manually calculating a subset in Excel.

Contents

  1. BQCV All data
    1.1 Barplots
    1.2 Stats
    1.3 Correlation

  2. Nosema All data
    2.1 Barplots
    2.2 Stats

  3. My data
    3.1 Barplots
    3.2 Stats

  4. Pan Dudek
    4.1 Barplots
    4.2 Stats

  5. Pulawy Olsztyn Nosema
    5.1 Barplots
    5.2 Stats
    5.3 Summaries


1. BQCV All Data

Data structure

## 'data.frame':    239 obs. of  8 variables:
##  $ well     : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ days     : int  7 7 10 12 13 15 16 16 17 20 ...
##  $ bqcv     : num  15800 218000 12500 46100 158000 15800 30000 50000 71500 3800000 ...
##  $ spores   : num  0.6 0.71 0.64 1.3 0.97 1.61 1.53 0.5 1.14 2 ...
##  $ cage     : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
##  $ virus    : Factor w/ 2 levels "with BQCV","without BQCV": 1 1 1 1 1 1 1 1 1 1 ...
##  $ nosema   : Factor w/ 2 levels "N.apis","N.ceranae": 1 1 1 1 1 1 1 1 1 1 ...
##  $ log10bqcv: num  4.2 5.34 4.1 4.66 5.2 ...
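
The log10bqcv column appears to be the base-10 logarithm of bqcv. A minimal sketch of that check, using only the first five bqcv values printed above (the assumption is that no other transformation was applied):

```r
# First five bqcv values copied from the str() output above
bqcv <- c(15800, 218000, 12500, 46100, 158000)

# Assumed transformation: plain log10 of the copy number
log10bqcv <- log10(bqcv)

# Rounded to two decimals these agree with the printed column: 4.2 5.34 4.1 4.66 5.2
print(round(log10bqcv, 2))
```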

1.1 Barplots

Barplots for Nosema spores, log10 BQCV copies and Survival days summarised by Nosema species, BQCV presence and cages. The Mean and SD are computed over all individuals in each cage.
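
The per-cage mean and SD behind these barplots can be checked by hand on a subset, in the spirit of the note at the top. The sketch below uses a made-up toy data frame in place of b1 (the values are illustrative only) and compares the grouped result with a manual calculation for one cage.

```r
# Toy data standing in for b1; values are invented for illustration only
toy <- data.frame(
  cage   = factor(c("1", "1", "1", "2", "2", "2")),
  spores = c(0.6, 0.7, 0.8, 1.0, 1.5, 2.0)
)

# Mean and SD per cage with base R
cage_mean <- tapply(toy$spores, toy$cage, mean)
cage_sd   <- tapply(toy$spores, toy$cage, sd)

# Manual check for cage 1:
# mean = sum / n, SD = sqrt(sum of squared deviations / (n - 1))
x <- toy$spores[toy$cage == "1"]
manual_mean <- sum(x) / length(x)
manual_sd   <- sqrt(sum((x - manual_mean)^2) / (length(x) - 1))

print(data.frame(cage = names(cage_mean),
                 mean = as.numeric(cage_mean),
                 sd   = as.numeric(cage_sd)))
```

If the grouped and manual values disagree, the grouping variable or the handling of missing values is the first thing to look at.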

plot of chunk unnamed-chunk-5 (3 plots)

Barplot combining Nosema spores, BQCV titres and survival days. Mean and SD computed over all individuals in all cages.

plot of chunk unnamed-chunk-6

1.2 Stats


# Stats: Wilcoxon rank sum tests are used to check whether values differ significantly between groups. A boxplot is shown to compare the groups visually, and the
# test results are shown underneath. The blue dot denotes the mean. Red + symbols denote the individual data points.
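
The helper functions fn1 and fn3 used in the chunks below are not defined anywhere in this report. The sketch below gives hypothetical stand-ins, assuming fn1 returns the group mean (the blue dot) and fn3 the number of non-missing observations (the counts printed under the axis). The real definitions may differ, for example in how they treat NA values.

```r
# Hypothetical stand-ins: the real fn1 and fn3 are not shown in this report

# Assumed: group mean, ignoring missing values
fn1 <- function(x) mean(x, na.rm = TRUE)

# Assumed: number of non-missing observations in the group
fn3 <- function(x) sum(!is.na(x))

# Quick check on a small vector with one missing value
x <- c(1, 2, NA)
print(fn1(x))
print(fn3(x))
```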

# 1.2.1 Is there a significant difference in Nosema spores between colonies with and without BQCV? Yes (p = 0.0008672). This is surprising, since the difference is
# not visible on the boxplot.

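# Count per group (fn3 is a helper defined outside this chunk)
cnt <- ddply(b1, .(virus), summarise, count = fn3(spores))$count
boxplot(b1$spores ~ b1$virus)
# Overlay the individual data points as red + symbols
stripchart(b1$spores ~ b1$virus, method = "jitter", jitter = 0.3, add = TRUE, col = "red", vertical = TRUE, pch = "+")
# Overlay the group summary from fn1 (helper defined outside this chunk) as blue dots
dp1 <- ddply(b1, .(virus), summarise, spores = fn1(spores))
points(x = 1:2, y = dp1$spores, pch = 16, col = "steelblue", cex = 2)
# Print the group counts under the x-axis
axis(at = seq_along(cnt), labels = cnt, side = 1, line = 1, cex.axis = 0.8, tick = FALSE)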

plot of chunk unnamed-chunk-7

wilcox.test(spores ~ virus, data = b1)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  spores by virus
## W = 5360, p-value = 0.0008672
## alternative hypothesis: true location shift is not equal to 0
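
A rank sum test can be significant even when the boxes overlap heavily, because it uses the ranks of all observations rather than only the quartiles drawn on the plot. One way to put a number on the size of the difference is to look at the group medians and at the location-shift estimate that wilcox.test returns when conf.int = TRUE. The sketch below runs on simulated toy data, not on b1, so the printed values say nothing about the real samples.

```r
# Simulated toy data standing in for b1 (illustration only)
set.seed(1)
toy <- data.frame(
  virus  = factor(rep(c("with BQCV", "without BQCV"), each = 30)),
  spores = c(rexp(30, rate = 1), rexp(30, rate = 1) + 0.4)
)

# Group medians, for comparison with the boxplot centre lines
med <- tapply(toy$spores, toy$virus, median)

# conf.int = TRUE adds an estimate of the location shift and its confidence interval
wt <- wilcox.test(spores ~ virus, data = toy, conf.int = TRUE)

print(med)
print(wt$estimate)
print(wt$conf.int)
```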


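# 1.2.2 Is there a significant difference in Nosema spores between colonies with and without BQCV within each Nosema species? In samples with N. ceranae, those
# without BQCV have significantly higher spore counts (p = 0.001195). In samples with N. apis the difference is not significant (p = 0.2183).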

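# Count per nosema x virus group (fn3 is a helper defined outside this chunk)
cnt <- ddply(b1, .(nosema, virus), summarise, count = fn3(spores))$count
boxplot(b1$spores ~ b1$virus + b1$nosema)
# Overlay the individual data points as red + symbols
stripchart(b1$spores ~ b1$virus + b1$nosema, method = "jitter", jitter = 0.3, add = TRUE, col = "red", vertical = TRUE, pch = "+")
# Overlay the group summary from fn1 (helper defined outside this chunk) as blue dots
dp1 <- ddply(b1, .(nosema, virus), summarise, spores = fn1(spores))
points(x = 1:4, y = dp1$spores, pch = 16, col = "steelblue", cex = 2)

# Print the group counts under the x-axis
axis(at = seq_along(cnt), labels = cnt, side = 1, line = 1, cex.axis = 0.8, tick = FALSE)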

plot of chunk unnamed-chunk-7

b1.1 <- subset(b1, as.character(b1$nosema) == "N.apis")
b1.1$nosema <- b1.1$nosema[drop = T]
b1.2 <- subset(b1, as.character(b1$nosema) == "N.ceranae")
b1.2$nosema <- b1.2$nosema[drop = T]
wilcox.test(spores ~ virus, data = b1.1)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  spores by virus
## W = 1565, p-value = 0.2183
## alternative hypothesis: true location shift is not equal to 0
wilcox.test(spores ~ virus, data = b1.2)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  spores by virus
## W = 1160, p-value = 0.001195
## alternative hypothesis: true location shift is not equal to 0
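
The per-species tests can also be written without the temporary objects b1.1 and b1.2, which are reused later with a different meaning. The sketch below is one possible alternative, using split() and lapply() on simulated toy data. The data frame and its values are invented for illustration.

```r
# Simulated toy data with the same column names as b1 (illustration only)
set.seed(2)
toy <- data.frame(
  nosema = factor(rep(c("N.apis", "N.ceranae"), each = 20)),
  virus  = factor(rep(rep(c("with BQCV", "without BQCV"), each = 10), times = 2)),
  spores = runif(40, min = 0, max = 3)
)

# One data frame per Nosema species
by_species <- split(toy, toy$nosema)

# Run the same rank sum test within each species
tests <- lapply(by_species, function(d) wilcox.test(spores ~ virus, data = d))

# Collect the p-values into a named vector
p_values <- sapply(tests, function(t) t$p.value)
print(p_values)
```

The result is named by species, so it is clear which p-value belongs to which subset.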

# 1.2.3 Is there a significant difference in BQCV titres between the two Nosema species? No (p = 0.072).
boxplot(b1$log10bqcv ~ b1$nosema)

plot of chunk unnamed-chunk-7

wilcox.test(log10bqcv ~ nosema, data = b1)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  log10bqcv by nosema
## W = 1431, p-value = 0.072
## alternative hypothesis: true location shift is not equal to 0

# 1.2.4 Is there a significant difference in survival days between samples with and without BQCV? Yes (p = 0.01185). Samples with BQCV survive significantly fewer days.
boxplot(b1$days ~ b1$virus)

plot of chunk unnamed-chunk-7

wilcox.test(days ~ virus, data = b1)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  days by virus
## W = 5801, p-value = 0.01185
## alternative hypothesis: true location shift is not equal to 0

# 1.2.5 Is there a significant difference in survival days between Nosema species? Yes (p = 5.757e-06). Samples with N. ceranae survive significantly fewer days.
boxplot(b1$days ~ b1$nosema)

plot of chunk unnamed-chunk-7

wilcox.test(days ~ nosema, data = b1)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  days by nosema
## W = 9552, p-value = 5.757e-06
## alternative hypothesis: true location shift is not equal to 0

# 1.2.6 Is there a significant difference in survival days between samples with and without BQCV within each Nosema species? In N. apis infected samples, survival
# was significantly shorter with BQCV (p = 0.0396). In samples without BQCV, N. ceranae had significantly shorter survival than N. apis (p = 1.743e-05).
cnt <- ddply(b1, .(nosema, virus), summarise, count = fn3(days))$count
boxplot(b1$days ~ b1$virus + b1$nosema)
axis(at = 1:length(cnt), labels = cnt, side = 1, line = 1, cex.axis = 0.8, tick = F)

plot of chunk unnamed-chunk-7

# b1.1 (N.apis) and b1.2 (N.ceranae) from 1.2.2 must exist before running the tests below
wilcox.test(days ~ virus, data = b1.1)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  days by virus
## W = 1420, p-value = 0.0396
## alternative hypothesis: true location shift is not equal to 0
wilcox.test(days ~ virus, data = b1.2)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  days by virus
## W = 1542, p-value = 0.2253
## alternative hypothesis: true location shift is not equal to 0

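# Re-split by BQCV status. This overwrites b1.1 and b1.2: b1.1 is now with BQCV, b1.2 is without BQCV
b1.1 <- subset(b1, as.character(b1$virus) == "with BQCV")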
b1.1$virus <- b1.1$virus[drop = T]
b1.2 <- subset(b1, as.character(b1$virus) == "without BQCV")
b1.2$virus <- b1.2$virus[drop = T]

wilcox.test(days ~ nosema, data = b1.1)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  days by nosema
## W = 2095, p-value = 0.08307
## alternative hypothesis: true location shift is not equal to 0
wilcox.test(days ~ nosema, data = b1.2)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  days by nosema
## W = 2614, p-value = 1.743e-05
## alternative hypothesis: true location shift is not equal to 0
# summary(glm(spores~virus,data=b1))
# wilcox.test(spores~virus,data=b1)
# t.test(spores~virus,data=b1)

1.3 Correlation

# Correlation Nosema vs Days in all samples
ggplot(data = b1, aes(x = days, y = spores)) + geom_point() + facet_wrap(~nosema) + stat_smooth(method = lm) + theme_bw() + ggtitle("Nosema vs days in all samples")

plot of chunk unnamed-chunk-8


# Correlation test
cor.test(b1$days, b1$spores)
## 
##  Pearson's product-moment correlation
## 
## data:  b1$days and b1$spores
## t = 19.51, df = 237, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7309 0.8293
## sample estimates:
##   cor 
## 0.785
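
The t statistic reported by cor.test follows from the correlation and the degrees of freedom as t = r * sqrt(df) / sqrt(1 - r^2). The sketch below recomputes it from the rounded values printed above as a consistency check. The function name t_from_r is introduced here for illustration only.

```r
# t statistic of a Pearson correlation r with df = n - 2 degrees of freedom
# (t_from_r is a name made up for this sketch)
t_from_r <- function(r, df) r * sqrt(df) / sqrt(1 - r^2)

# Values copied from the cor.test output above: cor = 0.785, df = 237
t_days_spores <- t_from_r(0.785, 237)

# Agrees with the reported t = 19.51 up to rounding of r
print(round(t_days_spores, 2))
```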

# Correlation BQCV vs Nosema in samples with BQCV
ggplot(data = b1, aes(x = spores, y = log10bqcv)) + geom_point() + stat_smooth(method = lm) + facet_wrap(~nosema) + theme_bw() + guides(size = FALSE) + ggtitle("BQCV vs spores in samples with BQCV")

plot of chunk unnamed-chunk-8


# Correlation test
cor.test(b1$log10bqcv, b1$spores)
## 
##  Pearson's product-moment correlation
## 
## data:  b1$log10bqcv and b1$spores
## t = 12.71, df = 117, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.6741 0.8281
## sample estimates:
##    cor 
## 0.7617

# Correlation BQCV vs Days in samples with BQCV
ggplot(data = b1, aes(x = days, y = log10bqcv)) + geom_point() + stat_smooth(method = lm) + facet_wrap(~nosema) + theme_bw() + ggtitle("BQCV vs days in samples with BQCV")

plot of chunk unnamed-chunk-8


# Correlation test
cor.test(b1$days, b1$log10bqcv)
## 
##  Pearson's product-moment correlation
## 
## data:  b1$days and b1$log10bqcv
## t = 9.878, df = 117, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.5626 0.7619
## sample estimates:
##    cor 
## 0.6743

# Correlation after splitting samples into with and without BQCV. Requires b1.1 (with BQCV) and b1.2 (without BQCV) from the second split in 1.2.6

# Correlation Nosema vs Days in samples with BQCV
ggplot(data = b1.1, aes(x = days, y = spores)) + geom_point() + stat_smooth(method = lm) + facet_wrap(~nosema) + theme_bw() + ggtitle("Nosema vs days in samples with BQCV")

plot of chunk unnamed-chunk-8


# Correlation test
cor.test(b1.1$days, b1.1$spores)
## 
##  Pearson's product-moment correlation
## 
## data:  b1.1$days and b1.1$spores
## t = 11.94, df = 117, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.6475 0.8127
## sample estimates:
##    cor 
## 0.7411

# Correlation Nosema vs Days in samples without BQCV
ggplot(data = b1.2, aes(x = days, y = spores)) + geom_point() + stat_smooth(method = lm) + facet_wrap(~nosema) + theme_bw() + ggtitle("Nosema vs days in samples without BQCV")

plot of chunk unnamed-chunk-8


# Correlation test
cor.test(b1.2$days, b1.2$spores)
## 
##  Pearson's product-moment correlation
## 
## data:  b1.2$days and b1.2$spores
## t = 16.58, df = 118, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7731 0.8832
## sample estimates:
##    cor 
## 0.8364

2. Nosema All data

Data structure

## 'data.frame':    238 obs. of  7 variables:
##  $ well       : int  4 5 6 7 8 9 10 11 12 13 ...
##  $ copies     : num  1.22e+06 1.38e+07 9.77e+06 1.64e+07 3.47e+07 1.11e+07 1.67e+06 2.04e+08 3.32e+07 2.39e+08 ...
##  $ days       : int  10 11 12 14 15 15 16 17 21 21 ...
##  $ cage       : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
##  $ nosema     : Factor w/ 2 levels "N.apis","N.ceranae": 1 1 1 1 1 1 1 1 1 1 ...
##  $ type       : Factor w/ 2 levels "mixed","single": 2 2 2 2 2 2 2 2 2 2 ...
##  $ log10copies: num  6.09 7.14 6.99 7.21 7.54 ...

2.1 Barplots

Barplots of Nosema spores and survival days showing Mean and SD.

plot of chunk unnamed-chunk-10 (2 plots)

2.2 Stats

# 2.2.1 Is there a significant difference in Nosema copies between single and mixed infections? No (p = 0.07065).
cnt <- ddply(n1, .(type), summarise, count = fn3(log10copies))$count
boxplot(n1$log10copies ~ n1$type)
axis(at = 1:length(cnt), labels = cnt, side = 1, line = 1, cex.axis = 0.8, tick = F)

plot of chunk unnamed-chunk-11

wilcox.test(log10copies ~ type, data = n1)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  log10copies by type
## W = 8040, p-value = 0.07065
## alternative hypothesis: true location shift is not equal to 0

# 2.2.2 Is there a significant difference in Nosema copies between single and mixed infections within each Nosema species? In samples infected with N. apis, mixed
# infections had significantly higher N. apis copies (p = 4.624e-11). In single infections, N. ceranae had significantly higher copies than N. apis
# (p = 1.014e-15). In mixed infections, N. ceranae still had significantly higher copies than N. apis (p = 3.138e-05).

cnt <- ddply(n1, .(nosema, type), summarise, count = fn3(log10copies))$count
boxplot(n1$log10copies ~ n1$type + n1$nosema)
axis(at = 1:length(cnt), labels = cnt, side = 1, line = 1, cex.axis = 0.8, tick = F)

plot of chunk unnamed-chunk-11

n1.1 <- subset(n1, as.character(n1$nosema) == "N.apis")
n1.1$nosema <- n1.1$nosema[drop = T]
n1.2 <- subset(n1, as.character(n1$nosema) == "N.ceranae")
n1.2$nosema <- n1.2$nosema[drop = T]
wilcox.test(log10copies ~ type, data = n1.1)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  log10copies by type
## W = 3009, p-value = 4.624e-11
## alternative hypothesis: true location shift is not equal to 0
wilcox.test(log10copies ~ type, data = n1.2)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  log10copies by type
## W = 1287, p-value = 0.01033
## alternative hypothesis: true location shift is not equal to 0

n1.1 <- subset(n1, as.character(n1$type) == "single")
n1.1$type <- n1.1$type[drop = T]
n1.2 <- subset(n1, as.character(n1$type) == "mixed")
n1.2$type <- n1.2$type[drop = T]
wilcox.test(log10copies ~ nosema, data = n1.1)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  log10copies by nosema
## W = 270.5, p-value = 1.014e-15
## alternative hypothesis: true location shift is not equal to 0
wilcox.test(log10copies ~ nosema, data = n1.2)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  log10copies by nosema
## W = 966.5, p-value = 3.138e-05
## alternative hypothesis: true location shift is not equal to 0
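
Section 2.2.2 runs four tests on the same dataset. This report does not correct for multiple comparisons. If a correction were wanted, one option is a Holm adjustment with p.adjust, sketched below on the four p-values printed above. The vector names are labels made up for this sketch.

```r
# p-values copied from the four wilcox.test outputs in 2.2.2
# (names are made-up labels for this sketch)
p_raw <- c(
  apis_single_vs_mixed    = 4.624e-11,
  ceranae_single_vs_mixed = 0.01033,
  single_apis_vs_ceranae  = 1.014e-15,
  mixed_apis_vs_ceranae   = 3.138e-05
)

# Holm adjustment (an assumption, not something the original analysis applies)
p_holm <- p.adjust(p_raw, method = "holm")
print(p_holm)
```

With these values all four comparisons stay below 0.05 after adjustment.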

3. My data

Data structure

## 'data.frame':    42 obs. of  7 variables:
##  $ colony : Factor w/ 3 levels "Colony 1","Colony 3",..: 1 2 3 1 2 3 1 2 3 1 ...
##  $ spores : num  3.6 33.8 17.6 0 15.2 12.3 7.8 4 5.7 1.4 ...
##  $ perinf : int  60 100 26 3 33 20 87 60 47 17 ...
##  $ nosema : Factor w/ 2 levels "Mixed","N.ceranae": 2 2 2 2 2 2 2 2 2 2 ...
##  $ nosema1: Factor w/ 3 levels "Mixed","N.ceranae",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ season : Factor w/ 3 levels "spring","autumn",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ year   : Factor w/ 6 levels "2007","2008",..: 3 3 3 4 4 4 5 5 5 6 ...

3.1 Barplots

plot of chunk unnamed-chunk-13 (2 plots)

3.2 Stats

Which statistical tests should be run on this dataset?


4. Pan Dudek

Data structure

## 'data.frame':    30 obs. of  6 variables:
##  $ colony: Factor w/ 6 levels "3","5","6","7",..: 1 2 3 4 5 6 1 2 3 4 ...
##  $ spores: num  8.1 0.5 3.1 10.7 4.9 2.3 0.1 NA 0.4 0.8 ...
##  $ year  : Factor w/ 2 levels "2010","2011": 1 1 1 1 1 1 1 1 1 1 ...
##  $ season: Factor w/ 3 levels "spring","autumn",..: 1 1 1 1 1 1 2 2 2 2 ...
##  $ nosema: Factor w/ 2 levels "Mixed","N.ceranae": 1 1 1 1 1 1 2 NA 1 1 ...
##  $ perinf: num  83 56.6 70 96.6 93.3 56.6 35 NA 13.3 16.6 ...

4.1 Barplots

plot of chunk unnamed-chunk-16 (2 plots)

4.2 Stats

Which statistical tests should be run on this dataset?

5. Pulawy Olsztyn Nosema

Data structure

## 'data.frame':    840 obs. of  12 variables:
##  $ colony  : Factor w/ 105 levels "100","101","103",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ spores  : num  0.3 2.25 3.25 0 0 5.2 5.6 0 0.5 0 ...
##  $ perinf  : num  3 15 35 0 0 20 5 0 9 0 ...
##  $ nosema  : Factor w/ 3 levels "Mixed","N.apis",..: 1 2 1 NA NA 3 1 NA 1 NA ...
##  $ nosema1 : Factor w/ 5 levels "Mixed","N.apis",..: 1 2 1 NA NA 3 1 NA 1 NA ...
##  $ season  : Factor w/ 3 levels "Summer","Autumn",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ year    : Factor w/ 3 levels "2009","2010",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ seasyear: Factor w/ 8 levels "Summer 2009",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ status  : Factor w/ 2 levels "Alive","Dead": 1 1 1 1 1 1 1 1 1 1 ...
##  $ status1 : Factor w/ 8 levels "Dead in winter 2009",..: 8 3 7 4 2 2 2 2 2 2 ...
##  $ status2 : Factor w/ 4 levels "2009","2010",..: 4 2 3 3 2 2 2 2 2 2 ...
##  $ status3 : Factor w/ 2 levels "Alive","Dead": 1 2 2 2 2 2 2 2 2 2 ...

5.1 Barplots

Note: Y-Axis not fixed
General barplots

plot of chunk unnamed-chunk-19

plot of chunk unnamed-chunk-20

Barplots for survival status

plot of chunk unnamed-chunk-21 (3 plots)

5.2 Stats

Which statistical tests should be run on this dataset?

5.3 Summaries

5.3.1 Nosema types over season and years

##        Season    Nosema Counts
## 1 Summer 2009     Mixed     56
## 2 Summer 2009    N.apis      3
## 3 Summer 2009 N.ceranae      3
## 4 Summer 2010     Mixed     14
## 5 Summer 2010 N.ceranae     22
## 6 Summer 2011     Mixed      3
## 7 Summer 2011 N.ceranae     20
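
A count table of this shape can be built with table() and as.data.frame(). The sketch below is one way to do it, assuming the counts come from the seasyear and nosema columns and that rows with a missing Nosema type are left out. It runs on a small made-up data frame, not on the real data.

```r
# Toy data with the same column names as the dataset above (values invented)
toy <- data.frame(
  seasyear = c("Summer 2009", "Summer 2009", "Summer 2009", "Summer 2010", "Summer 2010"),
  nosema   = c("Mixed", "Mixed", "N.apis", "N.ceranae", NA)
)

# Cross-tabulate season and Nosema type; table() drops NA values by default
counts <- as.data.frame(
  table(Season = toy$seasyear, Nosema = toy$nosema),
  responseName = "Counts"
)

# Keep only combinations that actually occur, ordered by season then type
counts <- subset(counts, Counts > 0)
counts <- counts[order(counts$Season, counts$Nosema), ]
rownames(counts) <- NULL
print(counts)
```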

plot of chunk unnamed-chunk-22


End of Document.