NAME: Joey Kehoe Student ID: 920697315

In this assignment you will dive further into t.test in R. You will form different hypotheses and use a dataset (before_after.xlsx) to assess how biodiversity has changed due to a wildfire by comparing pre- and post-wildfire biodiversity indices for 30 forest plots.

#PART 1

  1. Import the data and check whether it is normally distributed (1 point)
  2. Visualize the data using a boxplot and briefly interpret the distribution and boxplot (2 points)
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
#load the dataset as csv
before_after <- read.csv("C:/Users/Joey/Downloads/before_after.csv")

#check the structure of the data and change class if needed 
str(before_after)
## 'data.frame':    30 obs. of  3 variables:
##  $ Plot  : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Before: num  1.89 1.82 2.01 1.93 1.94 1.52 1.74 2.23 2.43 2.1 ...
##  $ After : num  3.03 2.32 2.33 3.23 2.53 3.03 2.42 2.35 2.37 2.51 ...
#use histograms to check the distribution of the biodiversity index
hist(before_after$Before, xlab = "Biodiversity index", main = "Biodiversity before fire")

hist(before_after$After, xlab = "Biodiversity index", main = "Biodiversity after fire")
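
Histograms give only a visual impression of normality. As a complementary check, the sketch below runs a Shapiro-Wilk test. It is a minimal illustration under stated assumptions: the vectors hold only the first ten plots printed by str() above, and the names before, after, sw_before, sw_after and sw_diff are introduced here for illustration. Because the same plots are measured twice, the differences are tested as well.

```r
# First ten plots only, copied from the str() output above (illustrative subset,
# not the full 30-plot dataset).
before <- c(1.89, 1.82, 2.01, 1.93, 1.94, 1.52, 1.74, 2.23, 2.43, 2.10)
after  <- c(3.03, 2.32, 2.33, 3.23, 2.53, 3.03, 2.42, 2.35, 2.37, 2.51)

# Shapiro-Wilk test: H0 is that the sample comes from a normal distribution.
sw_before <- shapiro.test(before)
sw_after  <- shapiro.test(after)

# For a before/after design on the same plots, the per-plot differences
# are the quantity whose normality matters for a paired t-test.
sw_diff <- shapiro.test(after - before)

print(sw_before$p.value)
print(sw_after$p.value)
print(sw_diff$p.value)
```

A p-value above 0.05 means the test finds no evidence against normality. With the full dataset, before_after$Before and before_after$After would take the place of the two vectors.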

#boxplot
before_afterbp <- boxplot(before_after$Before, before_after$After, names = c("Before", "After"), ylab = "Biodiversity index")

#PART 2

Form proper hypotheses for each question and use the correct t.test to answer it.

###Forming hypotheses for each question receives 1 point; performing the t.test and answering each question receives 2 points.

  1. Has the recent wildfire significantly affected the biodiversity of the forest? (3 points)
#perform proper t.test to answer this question
#H0: mean biodiversity before = mean biodiversity after
#H1: mean biodiversity before is not equal to mean biodiversity after
t_test_2tailflowers <- t.test(before_after$Before, before_after$After,
                                   alternative = "two.sided", var.equal = FALSE)
print(t_test_2tailflowers)
## 
##  Welch Two Sample t-test
## 
## data:  before_after$Before and before_after$After
## t = -7.4366, df = 56.277, p-value = 6.391e-10
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.7996878 -0.4603122
## sample estimates:
## mean of x mean of y 
##  1.966333  2.596333
#Since the p-value (6.391e-10) is below the significance level (alpha = 0.05), we reject the null hypothesis: mean biodiversity differs before and after the fire
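
Because each of the 30 plots is measured both before and after the fire, the two columns are not independent samples, and a paired t-test may be a better fit than the Welch two-sample test used above. The following is a minimal sketch, assuming only the first ten plots printed by str() (so its numbers will differ from the full dataset); the names before, after, paired_two_sided and diff_test are introduced here for illustration.

```r
# First ten plots only, copied from the str() output above (illustrative subset).
before <- c(1.89, 1.82, 2.01, 1.93, 1.94, 1.52, 1.74, 2.23, 2.43, 2.10)
after  <- c(3.03, 2.32, 2.33, 3.23, 2.53, 3.03, 2.42, 2.35, 2.37, 2.51)

# Paired t-test: H0 is that the mean of the per-plot differences
# (before - after) is zero.
paired_two_sided <- t.test(before, after, paired = TRUE,
                           alternative = "two.sided")
print(paired_two_sided)

# The paired test is the same as a one-sample t-test on the differences.
diff_test <- t.test(before - after, mu = 0)
print(diff_test$p.value)
```

With the full data, the call would be t.test(before_after$Before, before_after$After, paired = TRUE). The degrees of freedom are then the number of plots minus one, rather than the Welch approximation.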

  2. Has the recent wildfire significantly increased the biodiversity of the forest? (3 points)

#perform proper t.test to answer this question
#H0: mean biodiversity before = mean biodiversity after
#H1: mean biodiversity before is less than mean biodiversity after (biodiversity increased)
t_test_increase <- t.test(before_after$Before, before_after$After,
                          alternative = "less", var.equal = FALSE)
print(t_test_increase)
## 
##  Welch Two Sample t-test
## 
## data:  before_after$Before and before_after$After
## t = -7.4366, df = 56.277, p-value = 3.196e-10
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##        -Inf -0.4883226
## sample estimates:
## mean of x mean of y 
##  1.966333  2.596333
#Since the p-value (3.196e-10) is below the significance level (alpha = 0.05), we reject the null hypothesis: biodiversity was significantly higher after the fire
  3. Has the recent wildfire significantly reduced the biodiversity of the forest? (3 points)
#perform proper t.test to answer this question
#H0: mean biodiversity before = mean biodiversity after
#H1: mean biodiversity before is greater than mean biodiversity after (biodiversity decreased)
t_test_decrease <- t.test(before_after$Before, before_after$After,
                          alternative = "greater", var.equal = FALSE)
print(t_test_decrease)
## 
##  Welch Two Sample t-test
## 
## data:  before_after$Before and before_after$After
## t = -7.4366, df = 56.277, p-value = 1
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  -0.7716774        Inf
## sample estimates:
## mean of x mean of y 
##  1.966333  2.596333
#Since the p-value (1) is above the significance level (alpha = 0.05), we fail to reject the null hypothesis: there is no evidence that the fire reduced biodiversity
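
The three tests share one t statistic, so their p-values are linked: the two one-sided p-values sum to 1, and the two-sided p-value is twice the smaller one. In the outputs above, 2 x 3.196e-10 matches 6.391e-10 up to rounding, and 1 - 3.196e-10 prints as 1. The sketch below checks this relationship; it assumes only the first ten plots printed by str(), and the names before, after, p_two, p_less and p_greater are introduced here for illustration.

```r
# First ten plots only, copied from the str() output above (illustrative subset).
before <- c(1.89, 1.82, 2.01, 1.93, 1.94, 1.52, 1.74, 2.23, 2.43, 2.10)
after  <- c(3.03, 2.32, 2.33, 3.23, 2.53, 3.03, 2.42, 2.35, 2.37, 2.51)

# Same Welch test, three different alternatives.
p_two     <- t.test(before, after, alternative = "two.sided")$p.value
p_less    <- t.test(before, after, alternative = "less")$p.value
p_greater <- t.test(before, after, alternative = "greater")$p.value

# The one-sided p-values are complementary.
print(p_less + p_greater)

# The two-sided p-value doubles the smaller one-sided p-value.
print(p_two / min(p_less, p_greater))
```

Only the alternative that matches the direction of the observed difference can yield a small p-value, which is why the direction of H1 has to be chosen to match the question being asked.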

#PART 3

What do you think the limitations of your analysis are? (3 points) We don’t know how long after the fire the post-fire data were collected, and we also don’t know the size of the sampled plots.