This R Markdown document details the methods I used in R to analyze data collected for my plant ecology lab at Smith College. I began by importing the data as usual:
rm(list=ls(all=TRUE))
BallGall <- read.csv("/Users/matthewhecking/Documents/Ball Gall Data 2.csv")
attach(BallGall)
head(BallGall)
##    PARASITE NPHight Npflowermass Phight Pflowermass BallGallWidth
## 1 ball gall     147         25.5    122        15.5          2.41
## 2 ball gall     122         26.5    132        15.5          2.65
## 3 ball gall     142         14.5    145        15.0          0.79
## 4 ball gall     135          6.3    137        16.0          1.01
## 5 ball gall     123          2.0     93         0.0          0.81
## 6 ball gall     126         10.0    112         0.0          2.43
The first command removes every object from the environment, so that values left over from earlier work cannot be confused with the new ones. The second command reads in the revised data set, and attach() makes its columns available by name.
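As an alternative to attach(), columns can be referenced explicitly, which avoids name clashes when more than one data set is loaded. The sketch below is not part of the original analysis: it rebuilds two columns of the six rows printed by head() as a small data frame (the name `sample_rows` is hypothetical) and reads a column in two equivalent ways.

```r
# Six rows copied from the head() output; sample_rows is a hypothetical name
sample_rows <- data.frame(
  PARASITE    = rep("ball gall", 6),
  Phight      = c(122, 132, 145, 137, 93, 112),
  Pflowermass = c(15.5, 15.5, 15.0, 16.0, 0.0, 0.0)
)

# Refer to a column with $ instead of attaching the data frame
mean_dollar <- mean(sample_rows$Phight)

# with() evaluates an expression using the data frame's columns
mean_with <- with(sample_rows, mean(Phight))

print(c(mean_dollar, mean_with))
```

Both calls return the same value, and nothing is added to the search path.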
names(BallGall)
## [1] "PARASITE" "NPHight" "Npflowermass" "Phight"
## [5] "Pflowermass" "BallGallWidth"
mean(Phight)
## [1] 117.9649
sd(Phight)
## [1] 28.9778
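sd(Phight)/sqrt(57)
The commands above show how to generate the mean, standard deviation, and standard error from the data set. The standard error is the standard deviation divided by the square root of the sample size (57 plants here), which comes to about 3.84 for Phight. This was done for each variable (excluding “PARASITE”).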
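Repeating these commands for every variable can be condensed into one call. The following is a minimal sketch, assuming the standard error is defined as the standard deviation divided by the square root of the number of observations. The helper `std_error` and the data frame `sample_rows` (the six rows printed by head()) are hypothetical names used only for illustration.

```r
# Six rows copied from the head() output; sample_rows is a hypothetical name
sample_rows <- data.frame(
  PARASITE      = rep("ball gall", 6),
  NPHight       = c(147, 122, 142, 135, 123, 126),
  Npflowermass  = c(25.5, 26.5, 14.5, 6.3, 2.0, 10.0),
  Phight        = c(122, 132, 145, 137, 93, 112),
  Pflowermass   = c(15.5, 15.5, 15.0, 16.0, 0.0, 0.0),
  BallGallWidth = c(2.41, 2.65, 0.79, 1.01, 0.81, 2.43)
)

# Hypothetical helper: standard error of the mean
std_error <- function(x) sd(x) / sqrt(length(x))

# Keep only the numeric columns, which drops PARASITE
numeric_cols <- sample_rows[sapply(sample_rows, is.numeric)]

# One column of results per variable; rows are mean, sd, and se
stats <- sapply(numeric_cols, function(x) {
  c(mean = mean(x), sd = sd(x), se = std_error(x))
})

print(round(stats, 3))
```

Using length(x) rather than a typed-in sample size keeps the result correct if rows are added or removed. Columns with missing values would need na.rm = TRUE and a count of the non-missing entries.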
Next, I wanted to display the data graphically, so I used the following commands to overlay data from different variables on one plot:
plot(Pflowermass, col="blue", ylim=c(0,50), xlab="Plant Number", ylab="Flower Weight (g)", main="Average Flower Weight (Ball)")
par(new=TRUE)
plot(Npflowermass, col="red", ylim=c(0,50), xlab="", ylab="")
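Make sure that the ylim values are identical in both plot() calls when doing this; otherwise the two sets of points are drawn on different scales and cannot be compared.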
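Another way to overlay two variables is to draw the second one with points(), which adds to the existing plot and so shares its axes automatically. This is a sketch rather than the method used above; the vectors hold the six flower-mass values printed by head(), and `p_mass`, `np_mass`, and `y_limits` are hypothetical names.

```r
# Flower masses copied from the head() output (hypothetical vector names)
p_mass  <- c(15.5, 15.5, 15.0, 16.0, 0.0, 0.0)   # Pflowermass
np_mass <- c(25.5, 26.5, 14.5, 6.3, 2.0, 10.0)   # Npflowermass

# One set of limits that covers both variables and starts at zero
y_limits <- range(0, p_mass, np_mass)

plot(p_mass, col = "blue", ylim = y_limits,
     xlab = "Plant Number", ylab = "Flower Weight (g)",
     main = "Average Flower Weight (Ball)")

# points() draws on the current plot, so no par(new=TRUE) is needed
points(np_mass, col = "red")

legend("topright", legend = c("Pflowermass", "Npflowermass"),
       col = c("blue", "red"), pch = 1)
```

Because the axes are drawn only once, there is no second set of limits to keep in sync.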
Other graphs I made included best fit lines. The commands for a best fit line are as follows:
# run these after the scatter plot has been drawn; lm() alone fits the line
fit <- lm(Phight ~ Pflowermass)
co <- coef(fit)
abline(fit, lwd = 2, col = "blue")
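Since abline() only adds a line to an existing plot, the scatter plot has to be drawn first. The sketch below shows the full sequence on the six rows printed by head(). The vector names `height` and `flower_mass` and the axis labels are assumptions chosen for illustration.

```r
# Values copied from the head() output (hypothetical vector names)
flower_mass <- c(15.5, 15.5, 15.0, 16.0, 0.0, 0.0)   # Pflowermass
height      <- c(122, 132, 145, 137, 93, 112)        # Phight

# Draw the scatter plot first: predictor on x, response on y
plot(flower_mass, height,
     xlab = "Flower Weight (g)", ylab = "Plant Height")

# Fit height as a linear function of flower mass
fit <- lm(height ~ flower_mass)

# abline() reads the intercept and slope from the fitted model
abline(fit, lwd = 2, col = "blue")

# Intercept and slope of the fitted line
print(coef(fit))
```

The formula height ~ flower_mass puts the response on the left, so the plot() call lists flower_mass first to match the fitted line.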
If you want to get the R squared values as well, you can use:
Pmodel = lm(Phight ~ Pflowermass)
summary(Pmodel)$r.squared
## [1] 0.26891
NPmodel = lm(NPHight ~ Npflowermass)
summary(NPmodel)$r.squared
## [1] 0.1154893
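For a regression with a single predictor, R squared equals the squared Pearson correlation between the two variables, which gives a quick way to check the value reported by summary(). The sketch below shows this on the six rows printed by head(); `height`, `flower_mass`, `r2_model`, and `r2_cor` are hypothetical names.

```r
# Values copied from the head() output (hypothetical vector names)
flower_mass <- c(15.5, 15.5, 15.0, 16.0, 0.0, 0.0)   # Pflowermass
height      <- c(122, 132, 145, 137, 93, 112)        # Phight

# R squared as reported by the fitted model
r2_model <- summary(lm(height ~ flower_mass))$r.squared

# Squared Pearson correlation between the same two variables
r2_cor <- cor(height, flower_mass)^2

# The two agree up to floating-point rounding
print(all.equal(r2_model, r2_cor))
```

This identity holds only for simple linear regression with an intercept; with several predictors, R squared has to come from the model summary.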
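Here is another graph made using the same commands with a different data set. Attaching the second data set produced the masking message below, which is followed by the R squared values of its two fits: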
## The following objects are masked from BallGall:
##
## NPHight, PARASITE, Pflowermass
## [1] 0.4040706
## [1] 0.4652524