Analyzing the Raw Data

The petri dishes show the growth of bacteria on plates with and without antibiotics (in this case, the antibiotic ampicillin). Each plate is overlaid by a grid that limits measurement error when counting colonies. Each colony intersected by a grid line should be counted.

Work in pairs to quantify the number of colonies per grid, enter the data into Excel and save it as a comma-separated values (.csv) file, and visualize the data in a manner that clearly illustrates the results relative to the hypothesis being tested.

DATA AND ANALYSIS

  1. In Excel, calculate the frequency of antibiotic resistance at each site, as described in the lab hand-out:

Number of colonies on experimental plate / number of colonies on control plate

Store this data in the column “Fabr”, in the rows corresponding to Treatment=Ab_plus. We’ll use this for our first hypothesis, about the frequency of antibiotic resistance.
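
As a quick check of the ratio above, the sketch below recomputes it from the totalBact counts shown for the first five sites in the str() output further down. The vector names control_counts, ab_plus_counts and fabr_check are made up for illustration and are not columns in the data file.

```r
# Colony totals for sites 1-5, copied from the totalBact column
# (odd rows are Control plates, even rows are Ab_plus plates).
control_counts <- c(11, 38, 50, 25, 51)
ab_plus_counts <- c(6, 30, 22, 14, 15)

# Frequency of resistance = experimental count / control count
fabr_check <- ab_plus_counts / control_counts
round(fabr_check, 3)
## [1] 0.545 0.789 0.440 0.560 0.294
```

These values agree with the Fabr column that Excel produced for the same sites.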

  2. In Excel, calculate the diversity of bacteria at each site, for each treatment. We’ve given you Excel formulas for the first row. Check that they match the equation in the hand-out, then drag them down to the other rows.

  3. Save your data file as a .csv file.
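
The column names in the data file (pi_Type1, pi.ln.pi._Type1, Diversity) suggest that the diversity measure is the Shannon index, H = -sum(p_i * ln(p_i)). The hand-out equation is not reproduced here, so treat that as an assumption. Under it, a minimal sketch of the same calculation in R, using a hypothetical helper named shannon(), might look like this:

```r
# Shannon diversity from a vector of colony counts per bacterial type.
# shannon() is an illustrative helper, not part of the lab materials.
shannon <- function(counts) {
  total <- sum(counts)
  if (total == 0) return(NA_real_)  # no colonies: diversity is undefined
  p <- counts / total               # proportion of each type (pi)
  p <- p[p > 0]                     # types with zero colonies contribute 0
  -sum(p * log(p))                  # log() is the natural log in R
}

# Site 1 control plate: Type_1..Type_4 = 0, 0, 10, 1
round(shannon(c(0, 0, 10, 1)), 3)
## [1] 0.305
```

The result matches the first Diversity value in the spreadsheet, which is a useful way to confirm that the Excel formulas were dragged down correctly.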

Load the data and attach it so it’s searchable.

getwd()
## [1] "/Users/hannahreyes/Desktop/EVOLUTION"
ab_data <- read.csv("dataab.csv")

Look at your data. Did it import as you expected? What class is it?

str(ab_data)
## 'data.frame':    12 obs. of  20 variables:
##  $ Site           : int  1 1 2 2 3 3 4 4 5 5 ...
##  $ Distance_kms   : int  0 0 11 11 23 23 35 35 49 49 ...
##  $ Treatment      : Factor w/ 2 levels "Ab_plus","Control": 2 1 2 1 2 1 2 1 2 1 ...
##  $ Type_1         : int  0 0 8 0 23 0 0 0 2 0 ...
##  $ Type_2         : int  0 2 20 5 17 16 25 13 22 13 ...
##  $ Type_3         : int  10 0 3 0 4 0 0 0 4 2 ...
##  $ Type_4         : int  1 4 7 25 6 6 0 1 23 0 ...
##  $ Fabr           : num  0.545 0.545 0.789 0.789 0.44 0.44 0.56 0.56 0.294 0.294 ...
##  $ totalBact      : int  11 6 38 30 50 22 25 14 51 15 ...
##  $ pi_Type1       : num  0 0 0.211 0 0.46 ...
##  $ pi_Type2       : num  0 0.333 0.526 0.167 0.34 ...
##  $ pi_Type3       : num  0.9091 0 0.0789 0 0.08 ...
##  $ pi_Type4       : num  0.0909 0.6667 0.1842 0.8333 0.12 ...
##  $ pi.ln.pi._Type1: num  0 0 -0.328 0 -0.357 ...
##  $ pi.ln.pi._Type2: num  0 -0.366 -0.338 -0.299 -0.367 ...
##  $ pi.ln.pi._Type3: num  -0.0866 0 -0.2004 0 -0.2021 ...
##  $ pi.ln.pi._Type4: num  -0.218 -0.27 -0.312 -0.152 -0.254 ...
##  $ Diversity      : num  0.305 0.637 1.178 0.451 1.18 ...
##  $ Type           : int  1 2 3 4 NA NA NA NA NA NA ...
##  $ Description    : Factor w/ 5 levels "","orange_diffuse",..: 2 4 5 3 1 1 1 1 1 1 ...
attach(ab_data)
#attach() adds ab_data to R's search path, so its columns (e.g. Distance_kms) can be referenced by name

Make a vector with 6 distances. Why do we have to do this, rather than using the full Distance_kms variable as-is, for our plots below?

distance <- Distance_kms[Treatment=="Ab_plus"]

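Distance_kms has 12 entries, two per site (one Control row and one Ab_plus row), but we have only one resistance frequency per site. Subsetting on Treatment=="Ab_plus" pulls out exactly one distance for each of the 6 sites, so the x and y vectors in our plots have the same length.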
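
The effect of the subsetting can be seen on a small copy of the data. The sketch below rebuilds only the first ten rows visible in the str() output (sites 1 to 5); the names toy and toy_distance are made up for illustration.

```r
# Toy copy of the first ten rows shown in the str() output above
toy <- data.frame(
  Distance_kms = c(0, 0, 11, 11, 23, 23, 35, 35, 49, 49),
  Treatment    = rep(c("Control", "Ab_plus"), times = 5)
)

length(toy$Distance_kms)   # one entry per plate
## [1] 10

# Keep only the rows for the antibiotic plates: one entry per site
toy_distance <- toy$Distance_kms[toy$Treatment == "Ab_plus"]
toy_distance
## [1]  0 11 23 35 49
```

Referencing columns with toy$ also works without attach(), which avoids name clashes when several data frames are loaded.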

Similarly, make a vector of the frequency of antibiotic resistance.

ab_res <- Fabr[Treatment=="Ab_plus"]

Create a caption for the plot below. Include the output from your statistical test run below.

Construct a graph that clearly communicates the relationship between the frequency of antibiotic resistance and spatial location along the watershed.

plot(distance, ab_res, xlab = "Distance in Km", ylab = "AB resistant bacteria frequency", main = "Distance vs AB Resistant Bacteria Frequency") #scatterplot with axis labels and a title

This plot shows the relationship between the frequency of AB resistant bacteria and distance along Boulder Creek. Spearman’s rho was -0.657 and the p-value was 0.175.

Use Spearman’s rank correlation (a non-parametric test of association) to test the relationship plotted above.

#The basic structure is cor.test(x,y,method="spearman"). 
cor.test(distance, ab_res, method = "spearman")
## 
##  Spearman's rank correlation rho
## 
## data:  distance and ab_res
## S = 58, p-value = 0.175
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##        rho 
## -0.6571429
#(###YOUR CODE HERE, method="spearman") #Fill in the appropriate x and y variable names
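
When there are no ties, the S statistic reported by cor.test() is the sum of squared differences between the two sets of ranks, and rho follows from it as rho = 1 - 6S / (n(n^2 - 1)). A small sketch that recovers the rho above from S = 58 and n = 6 sites (the name rho_check is made up for illustration):

```r
S <- 58   # from the cor.test() output above
n <- 6    # number of sites

# Spearman's rho from the sum of squared rank differences
rho_check <- 1 - 6 * S / (n * (n^2 - 1))
rho_check
## [1] -0.6571429
```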

Make an evidence-based claim about the frequency of antibiotic resistant bacteria along the Boulder Creek watershed.

My rho value (about -0.657) indicates a fairly strong negative trend: as distance increases, the frequency of AB resistant bacteria decreases. However, our p-value (0.175) is greater than 0.05, which lets us know that the relationship between distance and antibiotic resistance isn’t statistically significant, so we fail to reject the null hypothesis.

Now we’ll look at hypothesis 2, about the diversity of bacteria found at each site. Relativize your measure of diversity - divide what you found on the antibiotic plates by what you found on the control plates.

#Relative diversity = treatment / control

Diversity[2] / Diversity[1] #Gives you the relativized diversity at site 1
## [1] 2.089425
#We want to repeat this type of calculation, so we can use a for-loop
rel_diversity <- rep(NA, 6) #make an empty bin to store one value for each site

for(i in 1:6){
  rel_diversity[i] <- Diversity[(2*i)] / Diversity[(2*i-1)]
}

rel_diversity #Look at the output
## [1] 2.0894246 0.3825064 0.4963646       Inf 0.3745189 0.0000000
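
The Inf at site 4 arises because the control plate there contained a single bacterial type (25 of 25 colonies were Type_2), so its diversity is 0 and the ratio divides by zero. The sketch below shows a vectorized alternative to the for-loop and a way to flag such values. The data frame toy is made up for illustration: the first two pairs are rounded Diversity values from the str() output, and the third pair is an invented example with a zero control.

```r
# Rows alternate Control, Ab_plus, as in the real data file
toy <- data.frame(
  Treatment = rep(c("Control", "Ab_plus"), times = 3),
  Diversity = c(0.305, 0.637, 1.178, 0.451, 0, 0.257)
)

# Same ratio as the loop, computed for all sites at once
ab_plus <- toy$Diversity[toy$Treatment == "Ab_plus"]
control <- toy$Diversity[toy$Treatment == "Control"]
toy_rel <- ab_plus / control

# FALSE marks sites where the ratio is Inf, -Inf, NaN, or NA
is.finite(toy_rel)
## [1]  TRUE  TRUE FALSE
```

Spearman’s test only uses ranks, so an Inf is simply treated as the largest value, but it is worth noting in the caption because it cannot be drawn on the plot.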

Why would this step (creating a relative measure of diversity) be important?

We need the relative diversity because it expresses the diversity on the treated plates as a proportion of the diversity on the control plates. The control gives us a baseline for each site, i.e. how much bacterial diversity occurs there naturally. This proportion lets us compare the diversity of antibiotic resistant bacteria across sites that differ in their background diversity.

Create a caption for the plot below. Include the output from your statistical test run below.

Construct a graph that clearly communicates the relationship between the (relative) diversity of antibiotic resistant bacteria and spatial location along the watershed.

###YOUR CODE HERE - modify the plot code you used previously. Include axis labels.

plot(distance, rel_diversity, xlab = "Distance in Km", ylab = "Relative diversity of AB resistant bacteria", main = "Distance vs Relative Diversity of AB Resistant Bacteria")

This plot shows the relationship between the relative diversity of AB resistant bacteria and distance along Boulder Creek. Spearman’s rho was -0.6 and the p-value was 0.2417.

Use Spearman’s rank correlation (a non-parametric test of association) to test the relationship plotted above.

#The basic structure is cor.test(x,y,method="spearman"). 
cor.test(distance, rel_diversity, method = "spearman")
## 
##  Spearman's rank correlation rho
## 
## data:  distance and rel_diversity
## S = 56, p-value = 0.2417
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##  rho 
## -0.6
#(###YOUR CODE HERE, method="spearman") #Fill in the appropriate x and y variable names
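
To see where S = 56 and rho = -0.6 come from, the sketch below ranks the six relative diversity values printed earlier and compares them with the distance ranks. It assumes the sites are numbered in order of increasing distance, which matches the distances visible in the str() output. Note that rank() places Inf highest. The names rel_div, dist_rank, S_check and rho_check are made up for illustration.

```r
# Relative diversity for sites 1-6, copied from the loop output
rel_div   <- c(2.0894246, 0.3825064, 0.4963646, Inf, 0.3745189, 0)
dist_rank <- 1:6                 # assumed: site number increases with distance

d <- dist_rank - rank(rel_div)   # rank difference at each site
S_check <- sum(d^2)
S_check
## [1] 56

n <- length(rel_div)
rho_check <- 1 - 6 * S_check / (n * (n^2 - 1))
rho_check
## [1] -0.6
```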

Make an evidence-based claim about the diversity of antibiotic resistant bacteria along the Boulder Creek watershed.

My rho value shows a fairly strong negative trend (rho = -0.6), which lets us know that relative bacterial diversity tends to decrease as distance increases. However, since the p-value is high (0.2417), the relationship between distance and diversity isn’t significant enough to reject the null hypothesis.

Propose a hypothesis that explains why there were antibiotic resistant bacteria at relatively pristine sites high above agricultural and municipal environments (assuming there are no antibiotics in that water).

Antibiotic resistant bacteria can be spread in several ways. One reason for their presence in areas where we wouldn’t usually expect them could be transfer by animals. For instance, a bird living near an agricultural site could pick up AB resistant bacteria and then fly north for the summer. Its waste could make its way into the waterways, and for this reason AB resistant bacteria can be found in the places we least expect.