#Step #1
Garlic <- read.csv("/Users/Amilian/Desktop/ENVS 286 Garlic Mustard Data.csv")
summary(Garlic)
## transect plantHeight flowerNo leafNo
## : 1 Min. : 42.0 Min. : 0.00 Min. : 7.0
## 0m : 1 1st Qu.: 56.5 1st Qu.: 1.00 1st Qu.: 32.0
## 10m : 1 Median : 60.0 Median : 4.00 Median : 53.0
## 11m : 1 Mean : 121.8 Mean : 14.35 Mean : 119.8
## 12m : 1 3rd Qu.: 75.6 3rd Qu.: 13.50 3rd Qu.: 84.0
## 13m : 1 Max. :1400.3 Max. :165.00 Max. :1378.0
## (Other):17
## seedPodNo stalkNo
## Min. : 4.0 Min. : 1.000
## 1st Qu.: 52.0 1st Qu.: 2.000
## Median : 78.0 Median : 3.000
## Mean : 214.1 Mean : 6.696
## 3rd Qu.: 182.0 3rd Qu.: 5.000
## Max. :2462.0 Max. :77.000
##
## X p.value.
## :12 Min. :0.000000
## 0 : 1 1st Qu.:0.000782
## Garlic$flowerNosqur~Garlic$leafNosqur : 1 Median :0.006986
## Garlic$flowerNosqur~Garlic$seedPodNosqur: 1 Mean :0.023168
## Garlic$flowerNosqur~Garlic$stalkNosqur : 1 3rd Qu.:0.039820
## Garlic$leafNosqur~Garlic$seedPodNosqur : 1 Max. :0.086330
## (Other) : 6 NA's :13
## Adjusted.R.squared. X.1
## Min. :0.09693 :21
## 1st Qu.:0.15485 least: 1
## Median :0.27850 most : 1
## Mean :0.37465
## 3rd Qu.:0.63040
## Max. :0.84220
## NA's :13
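# The summary above lists columns beyond the six plant measurements (X, p.value., Adjusted.R.squared., X.1), which suggests that notes or results were stored in the same spreadsheet. A minimal sketch of keeping only the measurement columns before modelling follows. Garlic_demo and its values are made up for illustration and stand in for the imported data; Garlic_clean and measure_cols are hypothetical names.

```r
# Hypothetical stand-in for the imported data frame (values are invented)
Garlic_demo <- data.frame(
  transect    = c("0m", "10m", "11m"),
  plantHeight = c(42.0, 60.0, 75.6),
  flowerNo    = c(0, 4, 13),
  leafNo      = c(7, 53, 84),
  seedPodNo   = c(4, 78, 182),
  stalkNo     = c(1, 3, 5),
  X           = c("", "0", ""),
  p.value.    = c(NA, 0.006986, NA),
  stringsAsFactors = FALSE
)

# Keep only the columns that hold plant measurements
measure_cols <- c("transect", "plantHeight", "flowerNo",
                  "leafNo", "seedPodNo", "stalkNo")
Garlic_clean <- Garlic_demo[, measure_cols]

summary(Garlic_clean)
```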
# Most correlated data: Leaves vs. Seed Pods
Garlic$leafNosqur <- sqrt(Garlic$leafNo)
qqnorm(Garlic$leafNosqur) # Transformed data
qqline(Garlic$leafNosqur)
QQ-Plot of sqrt-transformed data for Leaf
Garlic$seedPodNosqur <- sqrt(Garlic$seedPodNo)
qqnorm(Garlic$seedPodNosqur) # Transformed data
qqline(Garlic$seedPodNosqur)
QQ-Plot of sqrt-transformed data for Seed Pod
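# A QQ-plot is a visual check. As a possible supplement, the sketch below runs a Shapiro-Wilk test on square-root-transformed counts. This test is not part of the original analysis, and counts_demo is simulated data used only so the example runs on its own.

```r
set.seed(1)

# Simulated counts standing in for a column such as seedPodNo (invented data)
counts_demo <- rpois(23, lambda = 50)

# Same transformation as used for the QQ-plots
sqrt_demo <- sqrt(counts_demo)

# Shapiro-Wilk test: a small p-value suggests departure from normality
sw <- shapiro.test(sqrt_demo)
sw$statistic
sw$p.value
```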
Garlic7.LM <- lm(Garlic$leafNosqur ~ Garlic$seedPodNosqur, data = Garlic)
plot(Garlic$leafNosqur ~ Garlic$seedPodNosqur, data = Garlic)
abline(Garlic7.LM)
Distribution of variables: Leaf vs. Seed Pod
summary(Garlic7.LM)
##
## Call:
## lm(formula = Garlic$leafNosqur ~ Garlic$seedPodNosqur, data = Garlic)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.7074 -0.9178 -0.2942 0.8271 3.2590
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.71652 0.41654 1.72 0.1
## Garlic$seedPodNosqur 0.70484 0.02847 24.76 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.25 on 21 degrees of freedom
## Multiple R-squared: 0.9669, Adjusted R-squared: 0.9653
## F-statistic: 613 on 1 and 21 DF, p-value: < 2.2e-16
plot (Garlic7.LM)
Residual QQ-Plot
# The overall linear regression model for garlic mustard leaves vs. seed pods was significant (p-value < 0.0001; adjusted R2 = 0.97). A significant positive relationship exists between the number of leaves and seed pods in the sample (p-value < 0.001). Leaf number data and seed pod number data were square-root transformed to approximate normality and homogeneity of variance of the residuals in the model. Plants absorb light energy from the sun through their leaves, and photosynthesis converts that energy into a usable form. The seed pods store sugars and minerals and hold the seeds that become the plant's offspring. Because seed production depends on the energy supplied by photosynthesis, it was not unexpected that leaf and seed pod numbers were the most strongly correlated pair of variables.
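# The values quoted in a write-up can be pulled directly from the fitted model so that they match the printed summary. The sketch below shows one way to do this in base R. The data frame demo is simulated and is not the garlic mustard data; only the column names mirror the ones used above. The formula uses bare column names with data = demo, an alternative to writing Garlic$ inside the formula.

```r
set.seed(42)

# Simulated stand-in data (invented values, 23 rows like the original)
demo <- data.frame(seedPodNo = rpois(23, 100))
demo$leafNo <- rpois(23, 0.5 * demo$seedPodNo + 5)
demo$leafNosqur <- sqrt(demo$leafNo)
demo$seedPodNosqur <- sqrt(demo$seedPodNo)

fit <- lm(leafNosqur ~ seedPodNosqur, data = demo)
s <- summary(fit)

# Adjusted R-squared as printed by summary()
adj_r2 <- s$adj.r.squared

# Overall model p-value, computed from the F-statistic
f <- s$fstatistic
p_model <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)

# p-value for the slope, taken from the coefficient table
slope_p <- coef(s)["seedPodNosqur", "Pr(>|t|)"]

round(c(adj_r2 = adj_r2, p_model = unname(p_model), slope_p = slope_p), 4)
```

# With a single predictor, the overall model p-value and the slope p-value are the same quantity, which is why the two p-values in a simple regression summary agree.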
# Least correlated data: Flowers vs. Seed Pods
Garlic$flowerNosqur <- sqrt(Garlic$flowerNo)
qqnorm(Garlic$flowerNosqur) # Transformed data
qqline(Garlic$flowerNosqur)
QQ-Plot of sqrt-transformed data for Flower
Garlic$seedPodNosqur <- sqrt(Garlic$seedPodNo)
qqnorm(Garlic$seedPodNosqur) # Transformed data
qqline(Garlic$seedPodNosqur)
QQ-Plot of sqrt-transformed data for Seed Pod
Garlic5.LM <- lm(Garlic$flowerNosqur ~ Garlic$seedPodNosqur, data = Garlic)
summary(Garlic5.LM)
##
## Call:
## lm(formula = Garlic$flowerNosqur ~ Garlic$seedPodNosqur, data = Garlic)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0736 -0.4991 0.3038 0.8765 2.2688
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.06282 0.53501 -0.117 0.908
## Garlic$seedPodNosqur 0.24002 0.03657 6.564 1.68e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.606 on 21 degrees of freedom
## Multiple R-squared: 0.6723, Adjusted R-squared: 0.6567
## F-statistic: 43.09 on 1 and 21 DF, p-value: 1.68e-06
plot(Garlic$flowerNosqur ~ Garlic$seedPodNosqur, data = Garlic)
abline(Garlic5.LM)
Distribution of variables: Flower vs. Seed Pod
plot (Garlic5.LM)
Residual QQ-Plot
# The overall linear regression model for garlic mustard flowers vs. seed pods was significant (p-value < 0.0001; adjusted R2 = 0.66), but it explained less of the variation than the leaves vs. seed pods model. A significant positive relationship exists between the number of flowers and seed pods in the sample (p-value < 0.001). Flower number data and seed pod number data were square-root transformed to approximate normality and homogeneity of variance of the residuals in the model. More seed pods than flowers are expected on a plant because the seed pods hold the seeds that produce the next generation of flowering plants. Flower numbers increase slowly because only a small number of individuals are reproductive per unit of time.
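# Identifying the most and least correlated pairs means fitting the same model to every pair of variables. A rough sketch of how that comparison could be automated is given below. It is not the procedure used to produce the results above: demo is simulated data with invented values, and results is a hypothetical name.

```r
set.seed(7)

# Simulated stand-in counts (invented values, 23 rows like the original)
demo <- data.frame(
  flowerNo  = rpois(23, 10),
  leafNo    = rpois(23, 60),
  seedPodNo = rpois(23, 100),
  stalkNo   = rpois(23, 3)
)

# Square-root transform every column at once
sq <- sqrt(demo)

# All unordered pairs of variables: one pair per column of the matrix
pairs <- combn(names(sq), 2)

results <- data.frame(
  response  = pairs[1, ],
  predictor = pairs[2, ],
  adj_r2    = NA_real_,
  p_value   = NA_real_
)

# Fit one simple linear regression per pair and record its statistics
for (i in seq_len(ncol(pairs))) {
  fit <- lm(reformulate(pairs[2, i], response = pairs[1, i]), data = sq)
  s <- summary(fit)
  results$adj_r2[i]  <- s$adj.r.squared
  results$p_value[i] <- coef(s)[2, "Pr(>|t|)"]
}

# Strongest fit first, weakest fit last
results <- results[order(-results$adj_r2), ]
results
```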