R Week 3 Assignment:

Background: The codling moth (Cydia pomonella) is a pest that infests apples and many other fruit crops. It has not yet been detected in Japan. New Zealand grows and exports many apple cultivars, and the fruit is fumigated with methyl bromide to kill codling moth before export to Japan.

In 1988–1989 ‘Braeburn’, ‘Fuji’, ‘Granny Smith’, ‘Red Delicious’, ‘Royal Gala’, ‘Gala’ and ‘Splendour’ apples were infested with freshly laid eggs of codling moth, Cydia pomonella (L), and were fumigated at a range of methyl bromide doses.

A World Trade Organization report, “Japan - Measures Affecting Agricultural Products” (WTO, 1998, p. 30), questions whether the efficacy of pest control treatment varies by variety (cultivar).

Meaningful question for analysis: For a given treatment dose of methyl bromide, does codling moth mortality differ among apple cultivars?

Sample Dataset:

Fumigation experiments with methyl bromide (MeBr) were carried out in New Zealand over several seasons. The dataset used here comes from the 1988–1989 trials described in the background above.

Data are from trials that studied the mortality response of codling moth to fumigation with methyl bromide.

The research that generated these data was in part funded by New Zealand pipfruit growers.

# Read  csv file
# BONUS: Place the original .csv in a github file and have R read from the link.  This will be a very useful skill as you progress in your data science and career.

theURL <- "https://raw.githubusercontent.com/CUNYSPS-RickRN/CUNYSPS-Bridge/master/codling.csv"
Apple_MB_dose_df <- read.table(theURL, header=TRUE, sep = ",")
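
As a minimal sketch of what the read step does: `read.csv()` is `read.table()` with `header = TRUE` and `sep = ","` preset, so either call should parse the file the same way. The example uses the `text` argument with three rows copied from the `head()` output of this dataset instead of the URL, so it runs offline. The names `csv_text`, `via_table`, and `via_csv` are illustrative and not part of the original script.

```r
# Three rows copied from the head() output of this dataset (illustrative subset)
csv_text <- "dose,tot,dead,pobs,Cultivar
5,866,246,0.2841,ROYAL
8,911,220,0.2415,ROYAL
12,906,360,0.3974,ROYAL"

# read.table() needs the header and separator spelled out
via_table <- read.table(text = csv_text, header = TRUE, sep = ",")

# read.csv() presets header = TRUE and sep = ","
via_csv <- read.csv(text = csv_text)

# Compare the two parsed data frames
all.equal(via_table, via_csv)
```

Replacing the `text` argument with the URL string reads the hosted file in the same way.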

Data Structure Exploration:

dim(Apple_MB_dose_df)
## [1] 99 11
head(Apple_MB_dose_df)
##   X dose  tot dead   pobs     cm       ct Cultivar gp year numcm
## 1 1    5  866  246 0.2841 0.2178 15.59417    ROYAL  1 1988  1676
## 2 2    8  911  220 0.2415 0.2178 20.26042    ROYAL  1 1988  1676
## 3 3   12  906  360 0.3974 0.2178 28.60292    ROYAL  1 1988  1676
## 4 4   16  712  271 0.3806 0.2178 32.68833    ROYAL  1 1988  1676
## 5 5   20  582  414 0.7113 0.2178 45.42708    ROYAL  1 1988  1676
## 6 6   24 1183  742 0.6272 0.2178 45.44292    ROYAL  1 1988  1676
tail(Apple_MB_dose_df)
##       X dose  tot dead   pobs     cm       ct  Cultivar gp year numcm
## 94 3529   30 2097 1968 0.9385 0.1879 62.51875 Splendour 16 1989  1474
## 95 3831   12 1430  531 0.3713 0.2481 34.92292 Splendour 17 1989  2648
## 96 3932   16  558  273 0.4892 0.2481 42.51292 Splendour 17 1989  2648
## 97 4033   23 1094  911 0.8327 0.2481 54.66375 Splendour 17 1989  2648
## 98 4134   24 1156  937 0.8106 0.2481 53.87000 Splendour 17 1989  2648
## 99 4235   30  795  788 0.9912 0.2481 64.13375 Splendour 17 1989  2648
str(Apple_MB_dose_df)   # examine data.frame structure
## 'data.frame':    99 obs. of  11 variables:
##  $ X       : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ dose    : int  5 8 12 16 20 24 5 8 12 16 ...
##  $ tot     : int  866 911 906 712 582 1183 603 640 627 788 ...
##  $ dead    : int  246 220 360 271 414 742 154 168 180 240 ...
##  $ pobs    : num  0.284 0.241 0.397 0.381 0.711 ...
##  $ cm      : num  0.218 0.218 0.218 0.218 0.218 ...
##  $ ct      : num  15.6 20.3 28.6 32.7 45.4 ...
##  $ Cultivar: chr  "ROYAL" "ROYAL" "ROYAL" "ROYAL" ...
##  $ gp      : int  1 1 1 1 1 1 2 2 2 2 ...
##  $ year    : int  1988 1988 1988 1988 1988 1988 1988 1988 1988 1988 ...
##  $ numcm   : int  1676 1676 1676 1676 1676 1676 1597 1597 1597 1597 ...
class(Apple_MB_dose_df)
## [1] "data.frame"
#Rename some columns
names(Apple_MB_dose_df)
##  [1] "X"        "dose"     "tot"      "dead"     "pobs"     "cm"      
##  [7] "ct"       "Cultivar" "gp"       "year"     "numcm"
colnames(Apple_MB_dose_df) <- c("Observation", "Dose_gm","T_Inchamber","Dead", 
                        "PCT_dying","ControlMortality","ConcentrationTime",
                        "Cultivar","gp_factor","year_factor","T_ctrl_insects")

names(Apple_MB_dose_df)  # see renamed columns
##  [1] "Observation"       "Dose_gm"           "T_Inchamber"      
##  [4] "Dead"              "PCT_dying"         "ControlMortality" 
##  [7] "ConcentrationTime" "Cultivar"          "gp_factor"        
## [10] "year_factor"       "T_ctrl_insects"
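
The renamed PCT_dying column holds a proportion between 0 and 1 rather than a percentage. One quick check, sketched here on the six Royal rows printed by `head()` above, is to recompute it as Dead / T_Inchamber. The data frame `royal_rows` and the column `recomputed` are illustrative names, not part of the original script.

```r
# Six Royal rows copied from the head() output (renamed columns)
royal_rows <- data.frame(
  Dose_gm     = c(5, 8, 12, 16, 20, 24),
  T_Inchamber = c(866, 911, 906, 712, 582, 1183),
  Dead        = c(246, 220, 360, 271, 414, 742),
  PCT_dying   = c(0.2841, 0.2415, 0.3974, 0.3806, 0.7113, 0.6272)
)

# Recompute the proportion dying, rounded to the four decimals stored in the file
royal_rows$recomputed <- round(royal_rows$Dead / royal_rows$T_Inchamber, 4)

# TRUE when every recomputed value agrees with the stored column
all(abs(royal_rows$recomputed - royal_rows$PCT_dying) < 1e-4)
```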

Data Exploration:

cat("Number of Observations: ", nrow(Apple_MB_dose_df))  # nrow() is base R, so no extra package is needed
## Number of Observations:  99
summary(Apple_MB_dose_df)
##   Observation        Dose_gm       T_Inchamber          Dead     
##  Min.   :   1.0   Min.   : 5.00   Min.   : 239.0   Min.   :  96  
##  1st Qu.:  25.5   1st Qu.:12.00   1st Qu.: 451.0   1st Qu.: 201  
##  Median :  50.0   Median :16.00   Median : 603.0   Median : 330  
##  Mean   : 721.3   Mean   :18.13   Mean   : 734.8   Mean   : 426  
##  3rd Qu.: 599.5   3rd Qu.:24.00   3rd Qu.: 881.5   3rd Qu.: 533  
##  Max.   :4235.0   Max.   :30.00   Max.   :2965.0   Max.   :1968  
##    PCT_dying      ControlMortality ConcentrationTime   Cultivar        
##  Min.   :0.2232   Min.   :0.1506   Min.   :15.59     Length:99         
##  1st Qu.:0.3503   1st Qu.:0.1978   1st Qu.:32.85     Class :character  
##  Median :0.5461   Median :0.2160   Median :42.51     Mode  :character  
##  Mean   :0.5853   Mean   :0.2178   Mean   :43.30                       
##  3rd Qu.:0.8279   3rd Qu.:0.2368   3rd Qu.:53.70                       
##  Max.   :0.9973   Max.   :0.2765   Max.   :65.00                       
##    gp_factor       year_factor   T_ctrl_insects
##  Min.   : 1.000   Min.   :1988   Min.   :1067  
##  1st Qu.: 5.000   1st Qu.:1988   1st Qu.:1474  
##  Median : 9.000   Median :1988   Median :1773  
##  Mean   : 8.919   Mean   :1988   Mean   :2066  
##  3rd Qu.:13.000   3rd Qu.:1989   3rd Qu.:2333  
##  Max.   :17.000   Max.   :1989   Max.   :4177
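
`summary()` reports only Length, Class, and Mode for Cultivar because the column is stored as character. Calling `table()`, or converting the column to a factor first, gives counts per cultivar instead. The sketch below uses the 17 Cultivar values from the rows at the 24 g m-3 dose listed later in this report, so the counts describe that subset only. The names `cultivar_24` and `cultivar_counts` are illustrative.

```r
# Cultivar values copied from the 24 g m-3 rows of this report (17 trials)
cultivar_24 <- c("ROYAL", "ROYAL", "BRAEBURN", "BRAEBURN", "FUJI", "FUJI",
                 "GRANNY", "GRANNY", "Red Delicious", "Red Delicious",
                 "Red Delicious", "Gala", "Gala", "Red Delicious",
                 "Red Delicious", "Splendour", "Splendour")

# Count the trials per cultivar
cultivar_counts <- table(cultivar_24)
cultivar_counts

# summary() on a factor reports the same counts
summary(factor(cultivar_24))
```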

Graphing Histograms:

This set of histograms graphs the frequency distribution of the proportion dying (PCT_dying).

library(ggplot2)  # needed for the ggplot() calls below
ggplot(data=Apple_MB_dose_df) + aes(x=PCT_dying, fill=Cultivar) + geom_histogram( bins=10) + labs(x="Percent Dying")

ggplot(data=Apple_MB_dose_df) + aes(x=PCT_dying, fill=Cultivar) + geom_histogram( bins=10) + labs(x="Percent Dying") +
  facet_wrap(~Cultivar)

hist(Apple_MB_dose_df$PCT_dying, main="Percent Dying Histogram", xlab="PCT_dying")

ggplot(data=Apple_MB_dose_df) + geom_histogram(aes(x=PCT_dying), fill="grey50",bins=10)

Graphing: Scatterplot featuring Cultivar

These graphs show, for each apple cultivar, first the control mortality and then the observed mortality (PCT_dying) in relation to dose.

Most cultivars approached complete mortality at the highest doses; the Royal cultivar did not.

#Scatter plot by Cultivar ControlMortality
# Store the faceted, colored plot in g so that printing g shows it
g <- ggplot(Apple_MB_dose_df, aes(x=ControlMortality,y=Dose_gm)) +
  geom_point(aes(color=Cultivar)) + facet_wrap(~Cultivar)

g # display graph

#Scatter plot by Cultivar
# Store the faceted, colored plot in g so that printing g shows it
g <- ggplot(Apple_MB_dose_df, aes(x=PCT_dying,y=Dose_gm)) +
  geom_point(aes(color=Cultivar)) + facet_wrap(~Cultivar)

g # display graph

Graphing: Boxplot by Cultivar

This boxplot shows the range of doses (g m-3) tested on each cultivar. Because it plots dose alone, it describes the experimental design rather than mortality. In the rows printed above, Royal was fumigated at 5 to 24 g m-3, while Splendour was taken up to 30 g m-3.

ggplot(Apple_MB_dose_df, aes(x=Cultivar, y=Dose_gm)) + geom_boxplot()

Graphing: Boxplot by Dose

This boxplot shows the proportion dying at each dose. At a dose of 24 g m-3, the observed proportion dying ranges from about 0.55 to 0.99, with a median of about 0.81.

ggplot(Apple_MB_dose_df, aes(x = Dose_gm,y = PCT_dying, group = Dose_gm)) + geom_boxplot()

Graphing: Simple Linear Regression

# Simple Linear Regression

# color (not fill) is the aesthetic that colors points and fitted lines by cultivar
ggplot(Apple_MB_dose_df, aes(x=Dose_gm, y=PCT_dying, color=Cultivar)) + geom_point() + geom_smooth(method="lm") + labs(x="Dose", y="PCT Dying")
## `geom_smooth()` using formula 'y ~ x'
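
`geom_smooth(method = "lm")` draws an ordinary least-squares line. To read the fitted slope as a number, the same kind of model can be fitted directly with `lm()`. This is a minimal sketch on the six Royal rows from the `head()` output above, not a fit to the full dataset. The names `royal_rows` and `fit` are illustrative.

```r
# Dose and proportion dying for the six Royal rows shown by head()
royal_rows <- data.frame(
  Dose_gm   = c(5, 8, 12, 16, 20, 24),
  PCT_dying = c(0.2841, 0.2415, 0.3974, 0.3806, 0.7113, 0.6272)
)

# Ordinary least-squares fit of proportion dying on dose
fit <- lm(PCT_dying ~ Dose_gm, data = royal_rows)

# Intercept and slope of the fitted line
coef(fit)
```

A positive Dose_gm coefficient means the observed proportion dying rises with dose. Because the response is a proportion bounded by 0 and 1, a straight line is only a rough description over this dose range.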

Closer inspection of data at the 24 g m-3 dose.

Graphing: Scatterplot by Cultivar.

The first scatterplot shows control mortality by cultivar. The second shows the observed proportion dying at the 24 g m-3 dose. At this dose, every other cultivar showed higher mortality than Royal.

# Closer inspection of data at 24 gm dose

Dose_24gm <- Apple_MB_dose_df[Apple_MB_dose_df$Dose_gm == 24, ]
Dose_24gm
##    Observation Dose_gm T_Inchamber Dead PCT_dying ControlMortality
## 6            6      24        1183  742    0.6272           0.2178
## 12          12      24        1137  628    0.5523           0.2160
## 16          17      24         472  417    0.8835           0.2641
## 22          23      24         462  423    0.9156           0.2765
## 28          29      24         524  402    0.7672           0.2126
## 34          35      24         522  345    0.6609           0.1997
## 40          41      24         743  584    0.7860           0.2141
## 46          47      24         743  569    0.7658           0.2368
## 52          53      24         405  381    0.9407           0.1779
## 58          59      24         429  413    0.9627           0.1506
## 64          65      24         446  412    0.9238           0.1978
## 70         655      24         278  274    0.9856           0.2353
## 76        1311      24         650  535    0.8231           0.2523
## 82        2016      24         601  421    0.7005           0.1946
## 87        2722      24        1019  873    0.8567           0.2357
## 93        3428      24        2510 1729    0.6888           0.1879
## 98        4134      24        1156  937    0.8106           0.2481
##    ConcentrationTime      Cultivar gp_factor year_factor T_ctrl_insects
## 6           45.44292         ROYAL         1        1988           1676
## 12          53.71208         ROYAL         2        1988           1597
## 16          53.24750      BRAEBURN         3        1988           1662
## 22          54.53625      BRAEBURN         4        1988           2123
## 28          51.48250          FUJI         5        1988           1392
## 34          53.94625          FUJI         6        1988           1773
## 40          54.77875        GRANNY         7        1988           2284
## 46          53.51708        GRANNY         8        1988           3277
## 52          51.77208 Red Delicious         9        1988           1147
## 58          54.03728 Red Delicious        10        1988           1122
## 64          53.76875 Red Delicious        11        1988           1067
## 70          53.31667          Gala        12        1989           2214
## 76          53.68266          Gala        13        1989           3226
## 82          54.32658 Red Delicious        14        1989           4177
## 87          56.20417 Red Delicious        15        1989           2333
## 93          51.92708     Splendour        16        1989           1474
## 98          53.87000     Splendour        17        1989           2648
#Scatter plot by Cultivar of Control Mortality
ggplot(Dose_24gm,  aes(x=Cultivar, y=ControlMortality)) +
  geom_point(aes(color=Cultivar))

#Scatter plot by Cultivar of PCT_dying
ggplot(Dose_24gm, aes(x=Cultivar, y=PCT_dying)) +
  geom_point(aes(color=Cultivar))
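
To put numbers on what the second scatterplot shows, the proportion dying at the 24 g m-3 dose can be averaged within each cultivar. The sketch below rebuilds only the two needed columns from the Dose_24gm listing above so that it runs on its own. The names `dose24` and `mean_by_cultivar` are illustrative, and a plain mean ignores the differing numbers of insects per trial.

```r
# Cultivar and proportion dying copied from the Dose_24gm listing (17 trials)
dose24 <- data.frame(
  Cultivar  = c("ROYAL", "ROYAL", "BRAEBURN", "BRAEBURN", "FUJI", "FUJI",
                "GRANNY", "GRANNY", "Red Delicious", "Red Delicious",
                "Red Delicious", "Gala", "Gala", "Red Delicious",
                "Red Delicious", "Splendour", "Splendour"),
  PCT_dying = c(0.6272, 0.5523, 0.8835, 0.9156, 0.7672, 0.6609,
                0.7860, 0.7658, 0.9407, 0.9627, 0.9238, 0.9856,
                0.8231, 0.7005, 0.8567, 0.6888, 0.8106)
)

# Unweighted mean proportion dying per cultivar
mean_by_cultivar <- aggregate(PCT_dying ~ Cultivar, data = dose24, FUN = mean)

# Sort from lowest to highest mean mortality
mean_by_cultivar[order(mean_by_cultivar$PCT_dying), ]
```

Sorting the result puts the cultivar with the lowest mean mortality in the first row, which makes the comparison with Royal easy to read off.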

Conclusion:

In this controlled experiment on methyl bromide fumigation against the codling moth, Cydia pomonella (L), mortality at a given dose differed among apple cultivars. The Royal cultivar in particular responded differently from the others, with lower mortality at the 24 g m-3 dose. These exploratory results suggest that product-by-product testing is warranted for codling moth control in apples exported to Japan (WTO, 1998, p. 30), and that export approval should be considered cultivar by cultivar, based on data.

References:

Maindonald, J. H., Waddell, B. C., & Petry, R. J. (2001). Apple cultivar effects on codling moth (Lepidoptera: Tortricidae) egg mortality following fumigation with methyl bromide. Postharvest Biology and Technology, 22(2), 99-110. Retrieved from https://doi.org/10.1016/S0925-5214(01)00082-5

World Trade Organization (WTO). (1998). Japan – Measures Affecting Agricultural Products - Report of the Panel. Retrieved from http://www.worldtradelaw.net/reports/wtopanelsfull/japan-agproducts(panel)(full).pdf.download