Problem 1
# read the project data from file
library(tibble)
tbl_project <- as_tibble(read.csv("project.csv", header = TRUE))
# keep only the first 10 columns
tbl_project <- tbl_project[, 1:10]
#colnames(tbl_project)
#head(tbl_project)
#summary(tbl_project)



From the Q-Q plot, we see that the sample x fits the normal distribution well.
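As a rough illustration of how such a Q-Q check can be produced, the sketch below draws a normal Q-Q plot and also computes the correlation between theoretical and sample quantiles as a numeric summary of linearity. The vector `x` here is simulated stand-in data (an assumption, with 25 values to match the 24 degrees of freedom in the t-test output); it is not the sample from `project.csv`.

```r
# Hypothetical stand-in for the sample; the report's data come from project.csv.
set.seed(1)
x <- rnorm(25, mean = 2, sd = 6)

# Normal Q-Q plot with a reference line through the first and third quartiles.
qqnorm(x, main = "Normal Q-Q plot of x")
qqline(x, col = "red")

# The same coordinates without drawing, for a numeric check of linearity.
qq <- qqnorm(x, plot.it = FALSE)
qq_cor <- cor(qq$x, qq$y)  # close to 1 when the points fall near a straight line
```

Points that hug the reference line, or equivalently a `qq_cor` near 1, are what "fits the normal distribution well" refers to.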
t-test
One Sample t-test
data: data1
t = 1.9567, df = 24, p-value = 0.03106
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
0.2927749 Inf
sample estimates:
mean of x
2.33088
In the t-test, the p-value (0.03106) is smaller than the given \(\alpha\), so we reject the null hypothesis in favor of a true mean greater than 0.
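The reported p-value and confidence bound can be recomputed from the printed statistic alone. The sketch below uses only the numbers shown in the output; the original call is not shown in the report, and something like `t.test(data1, mu = 0, alternative = "greater")` is an assumption inferred from the output.

```r
# Values copied from the reported one-sample t-test output.
t_stat <- 1.9567
df     <- 24
x_bar  <- 2.33088

# One-sided p-value for the alternative "true mean is greater than 0".
p_value <- pt(t_stat, df, lower.tail = FALSE)

# Standard error implied by t = x_bar / se, then the one-sided 95% lower bound.
se    <- x_bar / t_stat
lower <- x_bar - qt(0.95, df) * se

round(c(p_value = p_value, lower = lower), 4)
```

Both values should agree with the output above up to rounding of the printed statistic.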
Problem 2
Problem 2 (a)
Wilcoxon rank sum test
data: y1 and y2
W = 159, p-value = 0.002471
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
-3.600 -0.536
sample estimates:
difference in location
-1.637

Welch Two Sample t-test
data: y1 and y2
t = -3.095, df = 30.098, p-value = 0.002115
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -1.345468
sample estimates:
mean of x mean of y
1.68216 4.66104
In our permutation test, pvalue_permute is well below 0.05, so we reject the null hypothesis.
In the t-test, the p-value (0.002115) is less than \(\alpha = 0.05\), so we reject the null hypothesis.
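The permutation code itself is not shown in the report, so the following is only a minimal sketch of one common way to compute a one-sided permutation p-value for a difference in means, alongside the Welch t-test. The data are simulated stand-ins (25 values per group, matching the degrees of freedom in the F-test output), and `perm_test_less` and `n_perm` are hypothetical names; only `pvalue_permute`, `y1`, and `y2` come from the report.

```r
# Hypothetical stand-ins for y1 and y2; the real samples are not reproduced here.
set.seed(42)
y1 <- rnorm(25, mean = 1.7, sd = 1.5)
y2 <- rnorm(25, mean = 4.7, sd = 4)

# One-sided permutation test for H1: mean(a) - mean(b) < 0.
perm_test_less <- function(a, b, n_perm = 5000) {
  observed <- mean(a) - mean(b)
  pooled   <- c(a, b)
  n_a      <- length(a)
  perm_diffs <- replicate(n_perm, {
    idx <- sample(length(pooled), n_a)      # random relabelling of group a
    mean(pooled[idx]) - mean(pooled[-idx])
  })
  # The +1 terms count the observed arrangement itself.
  (sum(perm_diffs <= observed) + 1) / (n_perm + 1)
}

pvalue_permute <- perm_test_less(y1, y2)

# Welch's t-test is the default in t.test (var.equal = FALSE).
welch <- t.test(y1, y2, alternative = "less")
```

Because the permutation p-value is estimated from random relabellings, it changes slightly from run to run unless the seed is fixed.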
Problem 2 (b)
F test to compare two variances
data: y1 and y2
F = 0.12916, num df = 24, denom df = 24, p-value = 3.996e-06
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.05691824 0.29310725
sample estimates:
ratio of variances
0.1291633
After looking up the critical value in the table, we find that U1 and U2 are both greater than it, so this test fails to reject the null hypothesis of equal variances. Failing to reject does not by itself show that the population variances are equal.
In the F-test for equality of variances, the p-value (3.996e-06) is far below 0.05 and the 95 percent confidence interval for the ratio of variances (0.0569, 0.2931) does not contain 1, so the F-test rejects the null hypothesis of equal variances.
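To make the reading of the F-test output concrete, the sketch below recomputes the two-sided p-value and the 95 percent confidence interval from the printed ratio and degrees of freedom. The decision rule at the end (compare the p-value with 0.05 and check whether the interval contains 1) is the standard one; the variable names are illustrative.

```r
# Values copied from the reported F-test output.
f_ratio <- 0.1291633
df1 <- 24
df2 <- 24

# Two-sided p-value: twice the smaller tail probability.
p_lower <- pf(f_ratio, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

# 95% confidence interval for the ratio of variances.
ci <- c(f_ratio / qf(0.975, df1, df2),
        f_ratio / qf(0.025, df1, df2))

# Equal variances are rejected when p < 0.05, i.e. when the interval excludes 1.
rejects_equal_variances <- p_value < 0.05 && (ci[2] < 1 || ci[1] > 1)
```

Note that the point estimate always lies inside its own confidence interval, so the relevant check is whether the interval contains 1, not whether it contains the F statistic.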
Problem 3

Two-sample Kolmogorov-Smirnov test
data: z1 and z2
D = 0.4, p-value = 0.03561
alternative hypothesis: two-sided
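As a sketch of what the statistic D measures, the code below runs a two-sample Kolmogorov-Smirnov test and recomputes D by hand as the largest vertical gap between the two empirical distribution functions. The samples are simulated stand-ins; their sizes (25 each) are an assumption, since the report does not state them.

```r
# Hypothetical stand-ins for z1 and z2; sample sizes are an assumption.
set.seed(7)
z1 <- rnorm(25)
z2 <- rnorm(25, mean = 1)

ks <- ks.test(z1, z2)  # two-sided by default

# D is the maximum absolute difference between the two empirical CDFs,
# which is attained at one of the pooled observations.
grid     <- sort(c(z1, z2))
d_manual <- max(abs(ecdf(z1)(grid) - ecdf(z2)(grid)))
```

Here `d_manual` matches `ks$statistic`, and `ks$p.value` is compared with the chosen significance level in the usual way.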
Problem 4
Permutation F-test
Using the permutation test, do we accept the hypothesis? TRUE
                Df Sum Sq Mean Sq F value Pr(>F)
factor(data4$g)  3   6856    2285   0.592  0.622
Residuals       96 370815    3863
Since the p-value (0.622) is greater than 0.05, we fail to reject the null hypothesis. The permutation test and the ANOVA F-test lead to the same conclusion.
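The permutation F-test code is not shown, so the following is a minimal sketch of one way to carry it out: compute the ANOVA F statistic, then recompute it after shuffling the response across groups. The data frame is simulated (four equal groups of 25 is an assumption consistent with 3 and 96 degrees of freedom), the column `y` and the helper `f_stat` are hypothetical names, and only `data4$g` appears in the report. The last line recomputes F from the sums of squares in the reported table.

```r
# Hypothetical stand-in for data4; group sizes and the column name y are assumptions.
set.seed(123)
data4 <- data.frame(
  y = rnorm(100, mean = 50, sd = 60),
  g = rep(1:4, each = 25)
)

# F statistic from a one-way ANOVA of y on the grouping factor.
f_stat <- function(y, g) summary(aov(y ~ factor(g)))[[1]][["F value"]][1]

f_obs  <- f_stat(data4$y, data4$g)

# Shuffle the response to break any link with the groups, then recompute F.
f_perm <- replicate(999, f_stat(sample(data4$y), data4$g))
p_perm <- (sum(f_perm >= f_obs) + 1) / (length(f_perm) + 1)

# F recomputed from the reported table: (Sum Sq / Df) for groups over residuals.
f_table <- (6856 / 3) / (370815 / 96)
```

A large `p_perm` means the observed F is typical of what random relabelling produces, which is the sense in which the permutation test and the ANOVA table agree.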
