Problem 1

# Read the data from file (as_tibble() replaces the deprecated as.tibble())
library(tibble)
tbl_project <- as_tibble(read.csv("project.csv", header = TRUE))
# Keep only the first 10 columns
tbl_project <- tbl_project[, 1:10]
#colnames(tbl_project)
#head(tbl_project)
#summary(tbl_project)

From the Q-Q plot, we see that the sample x fits the normal distribution well.

Non-parametric test based on the median

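# Sign test (non-parametric test based on the median), H0: median = 0
B <- sum(data1 > 0)                     # number of positive observations
n <- length(data1)
Z_B <- (B - 0.5 * n) / sqrt(0.25 * n)   # normal approximation to Binomial(n, 0.5)
pvalue_mean <- 1 - pnorm(Z_B)           # one-sided (upper-tail) p-value of the sign test
# Wilcoxon signed rank test with a confidence interval for the (pseudo)median
wilcox.test(data1, mu = 0, conf.int = TRUE)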

    Wilcoxon signed rank test

data:  data1
V = 310, p-value = 8.166e-06
alternative hypothesis: true location is not equal to 0
95 percent confidence interval:
 0.7385 1.7095
sample estimates:
(pseudo)median 
        1.1845 

As the p-value (8.166e-06) is smaller than \(\alpha = 0.05\), we reject the null hypothesis that the location is 0.
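
The median-based test above (counting the observations greater than 0) relies on a normal approximation to the binomial distribution. As a cross-check, the sketch below compares that approximation with the exact binomial p-value from `binom.test`. The sample `x` is simulated and only stands in for `data1`, which is not shown here; dropping exact zeros from `n` is a common convention and an assumption of this sketch.

```r
# Hypothetical sample standing in for data1 (not the original data).
set.seed(1)
x <- rnorm(25, mean = 1, sd = 2)

# Sign test of H0: median = 0 against H1: median > 0.
B <- sum(x > 0)    # number of positive observations
n <- sum(x != 0)   # exact zeros are dropped (assumption of this sketch)

# Normal approximation, same form as Z_B above.
z_b <- (B - 0.5 * n) / sqrt(0.25 * n)
p_approx <- 1 - pnorm(z_b)

# Exact p-value from the Binomial(n, 0.5) distribution.
p_exact <- binom.test(B, n, p = 0.5, alternative = "greater")$p.value

c(approx = p_approx, exact = p_exact)
```

For a sample of about 25 observations the two p-values can differ noticeably, so the exact version is the safer one to report.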

t-test


    One Sample t-test

data:  data1
t = 1.9567, df = 24, p-value = 0.03106
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
 0.2927749       Inf
sample estimates:
mean of x 
  2.33088 

In the t-test, the p-value (0.03106) is smaller than the given \(\alpha = 0.05\), so we reject the null hypothesis in favor of a true mean greater than 0.
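
The one-sided p-value and the lower confidence bound can be recomputed from the summary numbers in the t-test output. This is a sketch only: the standard error `se` is not reported, so it is reconstructed here from the printed mean and t statistic, and the results should agree with the output only up to the rounding of those inputs.

```r
# Values copied from the One Sample t-test output above.
x_bar  <- 2.33088   # sample mean
t_stat <- 1.9567    # reported t statistic
df     <- 24        # degrees of freedom, so n = 25

# Standard error reconstructed from t = x_bar / se (inferred, not reported).
se <- x_bar / t_stat

# One-sided p-value for H1: true mean > 0.
p_value <- pt(t_stat, df, lower.tail = FALSE)

# Lower bound of the one-sided 95% confidence interval.
lower <- x_bar - qt(0.95, df) * se

c(p_value = p_value, lower = lower)
```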

Problem 2

Problem 2 (a)


    Wilcoxon rank sum test

data:  y1 and y2
W = 159, p-value = 0.002471
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -3.600 -0.536
sample estimates:
difference in location 
                -1.637 


    Welch Two Sample t-test

data:  y1 and y2
t = -3.095, df = 30.098, p-value = 0.002115
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
      -Inf -1.345468
sample estimates:
mean of x mean of y 
  1.68216   4.66104 

In our permutation test, pvalue_permute is well below 0.05, so we reject the null hypothesis.

In the t-test, the p-value (0.002115) is less than \(\alpha = 0.05\), so we reject the null hypothesis of equal means in favor of the mean of y1 being smaller than the mean of y2.
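
The code behind the permutation test is not shown, so the following is a minimal sketch of one common way to carry it out, using the difference in sample means as the test statistic. The samples are simulated stand-ins for y1 and y2, and `n_perm`, `perm_diff`, and `p_perm` are hypothetical names chosen for this illustration.

```r
# Hypothetical samples standing in for y1 and y2 (not the original data).
set.seed(42)
y1 <- rnorm(25, mean = 1.5, sd = 1)
y2 <- rnorm(25, mean = 4.5, sd = 3)

obs_diff <- mean(y1) - mean(y2)   # observed test statistic
pooled   <- c(y1, y2)
n1       <- length(y1)
n_perm   <- 5000                  # number of random relabellings

# Reassign group labels at random and recompute the statistic each time.
perm_diff <- replicate(n_perm, {
  idx <- sample(length(pooled), n1)
  mean(pooled[idx]) - mean(pooled[-idx])
})

# One-sided p-value for H1: mean(y1) < mean(y2).
# The +1 terms count the observed labelling as one of the permutations.
p_perm <- (sum(perm_diff <= obs_diff) + 1) / (n_perm + 1)
p_perm
```

Because the labels are sampled at random rather than enumerated, the p-value varies slightly between runs unless the seed is fixed.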

Problem 2 (b)


    F test to compare two variances

data:  y1 and y2
F = 0.12916, num df = 24, denom df = 24, p-value = 3.996e-06
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.05691824 0.29310725
sample estimates:
ratio of variances 
         0.1291633 

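After looking up the critical value in the table, we find that both U1 and U2 are greater than it, so we fail to reject the null hypothesis; that is, this test gives no evidence that the population variances differ.

In the F-test for equality of variances, by contrast, the p-value (3.996e-06) is far below \(\alpha = 0.05\) and the 95% confidence interval for the variance ratio (0.057 to 0.293) does not contain 1, so we reject the null hypothesis of equal variances. The two tests therefore disagree; the F-test relies on both samples being normally distributed.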
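
The F statistic, its two-sided p-value, and the confidence interval for the variance ratio can be reproduced by hand and compared with `var.test`. The sketch below uses simulated samples in place of y1 and y2; the decision rule is whether the interval for the ratio contains 1 (equivalently, whether the p-value is below \(\alpha\)).

```r
# Hypothetical samples standing in for y1 and y2 (not the original data).
set.seed(7)
y1 <- rnorm(25, sd = 1)
y2 <- rnorm(25, sd = 3)

# F statistic: ratio of the two sample variances.
f_stat <- var(y1) / var(y2)
df1 <- length(y1) - 1
df2 <- length(y2) - 1

# Two-sided p-value: twice the smaller tail probability.
p_lower <- pf(f_stat, df1, df2)
p_two_sided <- 2 * min(p_lower, 1 - p_lower)

# 95% confidence interval for the true ratio of variances.
ci <- c(f_stat / qf(0.975, df1, df2), f_stat / qf(0.025, df1, df2))

# H0 (ratio = 1) is rejected at the 5% level when 1 lies outside the interval.
reject_h0 <- ci[1] > 1 || ci[2] < 1

res <- var.test(y1, y2)   # built-in test for comparison
```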

Problem 3


    Two-sample Kolmogorov-Smirnov test

data:  z1 and z2
D = 0.4, p-value = 0.03561
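alternative hypothesis: two-sided

Since the p-value (0.03561) is smaller than \(\alpha = 0.05\), we reject the null hypothesis that z1 and z2 come from the same distribution.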
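
The D statistic of the two-sample Kolmogorov-Smirnov test is the largest vertical gap between the two empirical distribution functions. The sketch below computes it directly with `ecdf` and compares it with `ks.test`; the samples are simulated stand-ins for z1 and z2, and their sizes are an assumption.

```r
# Hypothetical samples standing in for z1 and z2 (not the original data).
set.seed(3)
z1 <- rnorm(20)
z2 <- rnorm(20, mean = 1)

# Evaluate both empirical CDFs at every pooled observation;
# D is the largest absolute difference between them.
grid <- sort(c(z1, z2))
d_manual <- max(abs(ecdf(z1)(grid) - ecdf(z2)(grid)))

res <- ks.test(z1, z2)   # built-in two-sample test for comparison
c(manual = d_manual, ks_test = unname(res$statistic))
```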

Problem 4

Permutation F-test

By using the permutation test, do we accept the hypothesis? TRUE 
                Df Sum Sq Mean Sq F value Pr(>F)
factor(data4$g)  3   6856    2285   0.592  0.622
Residuals       96 370815    3863               

Since the p-value (0.622) is greater than 0.05, we fail to reject the null hypothesis of equal group means. Both the permutation test and the ANOVA F-test support this conclusion.
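
The permutation code itself is not shown, so the following is a minimal sketch of a permutation F-test under stated assumptions: four groups of 25 observations (matching the 3 and 96 degrees of freedom in the table), simulated responses, and hypothetical names `f_value`, `n_perm`, and `p_perm`. The idea is to shuffle the responses across groups, recompute the one-way ANOVA F statistic each time, and see how often it is at least as large as the observed one.

```r
# Hypothetical data: 4 groups of 25 observations (not the original data4).
set.seed(11)
g <- factor(rep(1:4, each = 25))
y <- rnorm(100, mean = 50, sd = 60)

# One-way ANOVA F statistic for a given response vector and grouping.
f_value <- function(y, g) anova(lm(y ~ g))[["F value"]][1]

f_obs  <- f_value(y, g)
n_perm <- 1000   # number of random permutations

# Shuffle the responses across groups and recompute F each time.
f_perm <- replicate(n_perm, f_value(sample(y), g))

# Permutation p-value; the +1 terms count the observed arrangement.
p_perm <- (sum(f_perm >= f_obs) + 1) / (n_perm + 1)

# Parametric p-value from the ANOVA table, for comparison.
p_anova <- anova(lm(y ~ g))[["Pr(>F)"]][1]
c(permutation = p_perm, anova = p_anova)
```

When the ANOVA assumptions hold reasonably well, the two p-values are usually close.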

