Things that need to be completed:
- Plot data
- Check for normality (using qqnorm, qqline, and hist)
- Check for homoscedasticity (qqnorm and qqline only show normality; equal variance is usually judged from a plot of residuals against fitted values)
- Transform data if need be
- Plot each variable against each other
- Run linear models on the 5 variables and find out which pairs have the strongest relationships (lowest p-values)
- Multiple linear regression
Things to do tonight:
- Plot data
- Check for normality using qqnorm, qqline, and histogram
- Worry about the linear models tomorrow, and the same goes for working out which variables are connected
###
Plot data
Check for normality using qqnorm, qqline, and hist for each variable, and check for homoscedasticity :)
Tips: import the file, then look at the distribution of each raw variable
Plot
garlicMustard<-read.csv("/Users/jonathan/Downloads/ENVS 286 Garlic Mustard Data.csv")
summary(garlicMustard)
##     transect   plantHeight       flowerNo         leafNo      
##  0m     : 1   Min.   :42.00   Min.   : 0.00   Min.   :  7.00  
##  10m    : 1   1st Qu.:56.25   1st Qu.: 1.00   1st Qu.: 31.00  
##  11m    : 1   Median :59.75   Median : 4.00   Median : 47.00  
##  12m    : 1   Mean   :63.65   Mean   : 7.50   Mean   : 62.64  
##  13m    : 1   3rd Qu.:73.30   3rd Qu.:10.25   3rd Qu.: 75.75  
##  14m    : 1   Max.   :84.00   Max.   :30.00   Max.   :176.00  
##  (Other):16                                                   
##    seedPodNo        stalkNo   
##  Min.   :  4.0   Min.   :1.0  
##  1st Qu.: 45.5   1st Qu.:2.0  
##  Median : 78.0   Median :3.0  
##  Mean   :111.9   Mean   :3.5  
##  3rd Qu.:162.5   3rd Qu.:4.0  
##  Max.   :297.0   Max.   :7.0  
##
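The "look at the distribution of each raw variable" step can be done in one loop instead of one call per column. This is only a sketch: the `demo` data frame below is made-up stand-in data (the real values live in the CSV), and the names `demo` and `numericCols` are not from the original notes.

```r
# Made-up stand-in data; the real values come from the CSV read above.
set.seed(1)
demo <- data.frame(
  plantHeight = rnorm(22, mean = 64, sd = 12),
  flowerNo    = rpois(22, lambda = 7),
  leafNo      = rpois(22, lambda = 60)
)

# Pick out the numeric columns (a column like transect would be skipped).
numericCols <- names(demo)[sapply(demo, is.numeric)]

# One row of plots per variable: histogram on the left, Q-Q plot on the right.
op <- par(mfrow = c(length(numericCols), 2), mar = c(4, 4, 2, 1))
for (v in numericCols) {
  hist(demo[[v]], main = v, xlab = v)
  qqnorm(demo[[v]], main = paste("Q-Q:", v))
  qqline(demo[[v]])
}
par(op)  # restore the previous plot layout
```

Swapping `demo` for `garlicMustard` should give the same panels for the real columns.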
qqnorm(garlicMustard$plantHeight) ### all good
qqline(garlicMustard$plantHeight)
qqnorm(garlicMustard$flowerNo)
qqline(garlicMustard$flowerNo)
qqnorm(garlicMustard$leafNo)
qqline(garlicMustard$leafNo)
qqnorm(garlicMustard$seedPodNo)
qqline(garlicMustard$seedPodNo)
qqnorm(garlicMustard$stalkNo)
qqline(garlicMustard$stalkNo)
### Log transforms
garlicMustard$flowerNoLog<-log10(garlicMustard$flowerNo+0.0001)
hist(garlicMustard$flowerNoLog)
qqnorm(garlicMustard$flowerNoLog)
qqline(garlicMustard$flowerNoLog)
garlicMustard$leafNoLog<-log10(garlicMustard$leafNo+0.0001) #### yes
hist(garlicMustard$leafNoLog)
qqnorm(garlicMustard$leafNoLog)
qqline(garlicMustard$leafNoLog)
garlicMustard$seedPodNoLog<-log10(garlicMustard$seedPodNo+0.0001)
hist(garlicMustard$seedPodNoLog)
qqnorm(garlicMustard$seedPodNoLog)
qqline(garlicMustard$seedPodNoLog)
garlicMustard$stalkNoLog<-log10(garlicMustard$stalkNo+0.0001) ### yes
hist(garlicMustard$stalkNoLog)
qqnorm(garlicMustard$stalkNoLog)
qqline(garlicMustard$stalkNoLog)
### Square-root transforms
garlicMustard$flowerNoSqrt<-sqrt(garlicMustard$flowerNo) #### this one
hist(garlicMustard$flowerNoSqrt)
qqnorm(garlicMustard$flowerNoSqrt)
qqline(garlicMustard$flowerNoSqrt)
garlicMustard$leafNoSqrt<-sqrt(garlicMustard$leafNo)
hist(garlicMustard$leafNoSqrt)
qqnorm(garlicMustard$leafNoSqrt)
qqline(garlicMustard$leafNoSqrt)
garlicMustard$seedPodNoSqrt<-sqrt(garlicMustard$seedPodNo) ### this one
hist(garlicMustard$seedPodNoSqrt)
qqnorm(garlicMustard$seedPodNoSqrt)
qqline(garlicMustard$seedPodNoSqrt)
garlicMustard$stalkNoSqrt<-sqrt(garlicMustard$stalkNo)
hist(garlicMustard$stalkNoSqrt)
qqnorm(garlicMustard$stalkNoSqrt)
qqline(garlicMustard$stalkNoSqrt)
### Squared transforms
garlicMustard$flowerNoSqrd<-(garlicMustard$flowerNo)^2
hist(garlicMustard$flowerNoSqrd)
qqnorm(garlicMustard$flowerNoSqrd)
qqline(garlicMustard$flowerNoSqrd)
garlicMustard$leafNoSqrd<-(garlicMustard$leafNo)^2
hist(garlicMustard$leafNoSqrd)
qqnorm(garlicMustard$leafNoSqrd)
qqline(garlicMustard$leafNoSqrd)
garlicMustard$seedPodNoSqrd<-(garlicMustard$seedPodNo)^2
hist(garlicMustard$seedPodNoSqrd)
qqnorm(garlicMustard$seedPodNoSqrd)
qqline(garlicMustard$seedPodNoSqrd)
garlicMustard$stalkNoSqrd<-(garlicMustard$stalkNo)^2
hist(garlicMustard$stalkNoSqrd)
qqnorm(garlicMustard$stalkNoSqrd)
qqline(garlicMustard$stalkNoSqrd)
Variables used: garlicMustard$plantHeight, garlicMustard$leafNoLog, garlicMustard$stalkNoLog, garlicMustard$flowerNoSqrt, garlicMustard$seedPodNoSqrt
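One thing to watch with the log transform is the tiny offset: any zero count turns into log10(0.0001) = -4, far from the rest of the values. The sketch below shows this on made-up counts and puts a number next to each candidate transform. The `+ 1` offset and the Shapiro-Wilk statistic are assumptions added for illustration; neither is used elsewhere in these notes.

```r
# Made-up right-skewed counts with zeros, loosely like flowerNo.
counts <- c(0, 0, 1, 1, 2, 4, 4, 5, 7, 10, 12, 18, 30)

logSmall <- log10(counts + 0.0001)  # offset used in the notes
logPlus1 <- log10(counts + 1)       # assumed alternative offset
rootX    <- sqrt(counts)

range(logSmall)  # zeros land at -4
range(logPlus1)  # zeros land at 0

# Shapiro-Wilk W for each version (closer to 1 means closer to normal).
candidates <- list(raw = counts, logSmall = logSmall,
                   logPlus1 = logPlus1, sqrt = rootX)
wStats <- sapply(candidates, function(x) unname(shapiro.test(x)$statistic))
round(wStats, 3)
```

The W values are only a rough guide alongside the histograms and Q-Q plots, not a replacement for looking at them.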
plot(garlicMustard$plantHeight~garlicMustard$leafNoLog)
plot(garlicMustard$plantHeight~garlicMustard$stalkNoLog)
plot(garlicMustard$leafNoLog~garlicMustard$stalkNoLog)
plot(garlicMustard$leafNoLog~garlicMustard$flowerNoSqrt)
plot(garlicMustard$stalkNoLog~garlicMustard$flowerNoSqrt)
plot(garlicMustard$stalkNoLog~garlicMustard$seedPodNoSqrt)
plot(garlicMustard$flowerNoSqrt~garlicMustard$plantHeight)
plot(garlicMustard$flowerNoSqrt~garlicMustard$seedPodNoSqrt)
plot(garlicMustard$seedPodNoSqrt~garlicMustard$leafNoLog)
plot(garlicMustard$seedPodNoSqrt~garlicMustard$plantHeight)
Pairs to note: seedPodNo and leafNo; seedPodNo and plantHeight
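The ten plots above cover every pairing of the five chosen variables (5 choose 2 = 10). As a sketch, `pairs()` draws the same scatterplots in one grid, and `cor()` gives a quick numeric companion. The `demo` data frame is simulated stand-in data with the same column names, not the real measurements.

```r
# Simulated stand-in for the five transformed columns (values are made up).
set.seed(2)
demo <- data.frame(
  plantHeight   = rnorm(22, mean = 64, sd = 12),
  leafNoLog     = rnorm(22, mean = 1.7, sd = 0.3),
  stalkNoLog    = rnorm(22, mean = 0.5, sd = 0.25),
  flowerNoSqrt  = abs(rnorm(22, mean = 2.3, sd = 1.6)),
  seedPodNoSqrt = abs(rnorm(22, mean = 9.7, sd = 4.4))
)

nPairs <- choose(ncol(demo), 2)  # number of distinct variable pairs

pairs(demo)                      # scatterplot matrix of every pair
corTable <- round(cor(demo), 2)  # Pearson correlations for the same pairs
corTable
```

With the real data, `pairs(garlicMustard[, c("plantHeight", "leafNoLog", "stalkNoLog", "flowerNoSqrt", "seedPodNoSqrt")])` should produce the equivalent grid.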
one.LM<-lm(garlicMustard$plantHeight~garlicMustard$leafNoLog) ## p-value=0.005, adjusted R^2=0.2953, significant
summary(one.LM)
##
## Call:
## lm(formula = garlicMustard$plantHeight ~ garlicMustard$leafNoLog)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.247 -6.984 -3.161 6.310 27.573
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 26.700 12.005 2.224 0.03782 *
## garlicMustard$leafNoLog 21.830 6.974 3.130 0.00527 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.25 on 20 degrees of freedom
## Multiple R-squared: 0.3288, Adjusted R-squared: 0.2953
## F-statistic: 9.799 on 1 and 20 DF, p-value: 0.00527
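The to-do list also asks for a homoscedasticity check. A common way to do that for a fitted model is to plot residuals against fitted values and look for a fan or funnel shape; a Q-Q plot of the residuals covers the normality side. This is a sketch on simulated data, and `x`, `y`, and `fit` are placeholder names, not objects from this analysis.

```r
# Simulated predictor and response (made-up numbers, 22 rows like the data set).
set.seed(3)
x <- runif(22, min = 1, max = 2.3)
y <- 27 + 22 * x + rnorm(22, sd = 10)
fit <- lm(y ~ x)

op <- par(mfrow = c(1, 2))
# Equal variance: the vertical spread should look similar across fitted values.
plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
# Normality of residuals: points should follow the line.
qqnorm(resid(fit))
qqline(resid(fit))
par(op)
```

The same two plots should work for any of the models here by replacing `fit` with, for example, `one.LM`.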
plot(garlicMustard$plantHeight~garlicMustard$leafNoLog)
abline(one.LM)
two.LM<-lm(garlicMustard$plantHeight~garlicMustard$stalkNoLog) ## p-value=0.0188, adjusted R^2=0.2086, significant
summary(two.LM)
##
## Call:
## lm(formula = garlicMustard$plantHeight ~ garlicMustard$stalkNoLog)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.1611 -8.6117 -0.6111 7.5752 24.1389
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 53.109 4.730 11.229 4.35e-10 ***
## garlicMustard$stalkNoLog 22.427 8.774 2.556 0.0188 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.86 on 20 degrees of freedom
## Multiple R-squared: 0.2462, Adjusted R-squared: 0.2086
## F-statistic: 6.534 on 1 and 20 DF, p-value: 0.01883
plot(garlicMustard$plantHeight~garlicMustard$stalkNoLog)
abline(two.LM)
three.LM<-lm(garlicMustard$leafNoLog~garlicMustard$stalkNoLog) ## p-value<0.001, adjusted R^2=0.6575, significant
summary(three.LM)
##
## Call:
## lm(formula = garlicMustard$leafNoLog ~ garlicMustard$stalkNoLog)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.42337 -0.08211 0.05785 0.10236 0.24250
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.23458 0.08172 15.107 2.11e-12 ***
## garlicMustard$stalkNoLog 0.97452 0.15160 6.428 2.86e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1877 on 20 degrees of freedom
## Multiple R-squared: 0.6738, Adjusted R-squared: 0.6575
## F-statistic: 41.32 on 1 and 20 DF, p-value: 2.863e-06
plot(garlicMustard$leafNoLog~garlicMustard$stalkNoLog)
abline(three.LM)
four.LM<-lm(garlicMustard$leafNoLog~garlicMustard$flowerNoSqrt) ## p-value=0.0566, adjusted R^2=0.1284, not significant
summary(four.LM)
##
## Call:
## lm(formula = garlicMustard$leafNoLog ~ garlicMustard$flowerNoSqrt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.75018 -0.17481 -0.00405 0.16245 0.67800
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.51513 0.10848 13.967 8.9e-12 ***
## garlicMustard$flowerNoSqrt 0.08016 0.03961 2.024 0.0566 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2995 on 20 degrees of freedom
## Multiple R-squared: 0.17, Adjusted R-squared: 0.1284
## F-statistic: 4.095 on 1 and 20 DF, p-value: 0.05658
plot(garlicMustard$leafNoLog~garlicMustard$flowerNoSqrt)
abline(four.LM)
five.LM<-lm(garlicMustard$stalkNoLog~garlicMustard$flowerNoSqrt) ## p-value=0.0518, adjusted R^2=0.1351, not significant
summary(five.LM)
##
## Call:
## lm(formula = garlicMustard$stalkNoLog ~ garlicMustard$flowerNoSqrt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.48614 -0.16640 0.04827 0.14679 0.52735
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.31776 0.09103 3.491 0.0023 **
## garlicMustard$flowerNoSqrt 0.06876 0.03324 2.069 0.0518 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2513 on 20 degrees of freedom
## Multiple R-squared: 0.1762, Adjusted R-squared: 0.1351
## F-statistic: 4.279 on 1 and 20 DF, p-value: 0.05176
plot(garlicMustard$stalkNoLog~garlicMustard$flowerNoSqrt)
abline(five.LM)
six.LM<-lm(garlicMustard$stalkNoLog~garlicMustard$seedPodNoSqrt) ## p-value<0.001, adjusted R^2=0.6874, significant
summary(six.LM)
##
## Call:
## lm(formula = garlicMustard$stalkNoLog ~ garlicMustard$seedPodNoSqrt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.42633 -0.07971 -0.00597 0.08853 0.23965
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.030372 0.079645 -0.381 0.707
## garlicMustard$seedPodNoSqrt 0.051716 0.007529 6.869 1.13e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1511 on 20 degrees of freedom
## Multiple R-squared: 0.7023, Adjusted R-squared: 0.6874
## F-statistic: 47.18 on 1 and 20 DF, p-value: 1.128e-06
plot(garlicMustard$stalkNoLog~garlicMustard$seedPodNoSqrt)
abline(six.LM)
seven.LM<-lm(garlicMustard$flowerNoSqrt~garlicMustard$plantHeight) ## p-value=0.0329, adjusted R^2=0.1685, significant
summary(seven.LM)
##
## Call:
## lm(formula = garlicMustard$flowerNoSqrt ~ garlicMustard$plantHeight)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.1600 -0.6871 0.0240 1.0864 2.5000
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.70842 1.74087 -0.981 0.3381
## garlicMustard$plantHeight 0.06163 0.02688 2.292 0.0329 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.504 on 20 degrees of freedom
## Multiple R-squared: 0.2081, Adjusted R-squared: 0.1685
## F-statistic: 5.255 on 1 and 20 DF, p-value: 0.03286
plot(garlicMustard$flowerNoSqrt~garlicMustard$plantHeight)
abline(seven.LM)
eight.LM<-lm(garlicMustard$flowerNoSqrt~garlicMustard$seedPodNoSqrt) ## p-value=0.0863, adjusted R^2=0.09693, not significant
summary(eight.LM)
##
## Call:
## lm(formula = garlicMustard$flowerNoSqrt ~ garlicMustard$seedPodNoSqrt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.2794 -0.6691 0.1808 1.1853 2.2704
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.85032 0.82657 1.029 0.3159
## garlicMustard$seedPodNoSqrt 0.14095 0.07814 1.804 0.0863 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.568 on 20 degrees of freedom
## Multiple R-squared: 0.1399, Adjusted R-squared: 0.09693
## F-statistic: 3.254 on 1 and 20 DF, p-value: 0.08633
plot(garlicMustard$flowerNoSqrt~garlicMustard$seedPodNoSqrt)
abline(eight.LM)
nine.LM<-lm(garlicMustard$seedPodNoSqrt~garlicMustard$leafNoLog) ## p-value<0.001, adjusted R^2=0.7904, significant
summary(nine.LM)
##
## Call:
## lm(formula = garlicMustard$seedPodNoSqrt ~ garlicMustard$leafNoLog)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.8641 -1.4178 0.1954 1.6161 3.8364
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -10.994 2.347 -4.684 0.000143 ***
## garlicMustard$leafNoLog 12.212 1.364 8.955 1.96e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.005 on 20 degrees of freedom
## Multiple R-squared: 0.8004, Adjusted R-squared: 0.7904
## F-statistic: 80.2 on 1 and 20 DF, p-value: 1.956e-08
plot(garlicMustard$seedPodNoSqrt~garlicMustard$leafNoLog)
abline(nine.LM)
ten.LM<-lm(garlicMustard$seedPodNoSqrt~garlicMustard$plantHeight) ## p-value=0.00534, adjusted R^2=0.2944, significant
summary(ten.LM)
##
## Call:
## lm(formula = garlicMustard$seedPodNoSqrt ~ garlicMustard$plantHeight)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.1633 -1.0625 0.1933 2.3699 4.4067
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.39428 4.25624 -0.797 0.43454
## garlicMustard$plantHeight 0.20533 0.06572 3.124 0.00534 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.678 on 20 degrees of freedom
## Multiple R-squared: 0.328, Adjusted R-squared: 0.2944
## F-statistic: 9.76 on 1 and 20 DF, p-value: 0.005344
plot(garlicMustard$seedPodNoSqrt~garlicMustard$plantHeight)
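abline(ten.LM)
Not significant list (p > 0.05):
#    P        Adjusted R^2
4    0.0566   0.1284
5    0.0518   0.1351
8    0.0863   0.09693   ## worst
Models 2 (p = 0.0188, adjusted R^2 = 0.2086) and 7 (p = 0.0329, adjusted R^2 = 0.1685) are significant at the 0.05 level according to their summary output, so they do not belong on this list.
Model 9 is the best (p < 0.001, adjusted R^2 = 0.7904).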
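Copying p-values by hand from ten summaries is easy to get wrong, so here is a sketch of fitting every pair in a loop and collecting the slope p-value and adjusted R^2 into one table. The data are simulated stand-ins with the same column names, and `pairsToFit` and `results` are names made up for this example. In a one-predictor regression the slope p-value is the same whichever variable is the response, so the direction picked by `combn()` does not change the ranking.

```r
# Simulated stand-in for the five transformed columns (values are made up).
set.seed(4)
demo <- data.frame(
  plantHeight   = rnorm(22, mean = 64, sd = 12),
  leafNoLog     = rnorm(22, mean = 1.7, sd = 0.3),
  stalkNoLog    = rnorm(22, mean = 0.5, sd = 0.25),
  flowerNoSqrt  = abs(rnorm(22, mean = 2.3, sd = 1.6)),
  seedPodNoSqrt = abs(rnorm(22, mean = 9.7, sd = 4.4))
)

pairsToFit <- combn(names(demo), 2)  # 2 x 10 matrix, one column per pair

results <- do.call(rbind, lapply(seq_len(ncol(pairsToFit)), function(i) {
  response  <- pairsToFit[1, i]
  predictor <- pairsToFit[2, i]
  # Build response ~ predictor and fit it with a data argument.
  fit <- lm(reformulate(predictor, response = response), data = demo)
  s <- summary(fit)
  data.frame(response  = response,
             predictor = predictor,
             p         = coef(s)[2, "Pr(>|t|)"],  # slope p-value
             adjR2     = s$adj.r.squared,
             stringsAsFactors = FALSE)
}))

results[order(results$p), ]  # smallest p-values first
```

Using `formula` plus `data =` also gives shorter coefficient names in the output than the `garlicMustard$...` form.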