t-test: testing the difference between the means of 2 GROUPS for 1 CONTINUOUS VARIABLE

library(readxl)
## Warning: package 'readxl' was built under R version 4.0.3
# Read the Excel sheet: three text columns followed by five numeric columns
Data_for_statistic_1_2_3_anova <- read_excel(
  "C:/Users/Admin/Desktop/My Project/R/Data for statistic 1,2,3 anova.xlsx",
  col_types = c("text", "text", "text", "numeric", "numeric", "numeric", "numeric", "numeric")
)
attach(Data_for_statistic_1_2_3_anova)
require(ggplot2)
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.0.3

Visualize the data set with a boxplot (to inspect the median, outliers, and quartiles).

Initial observations

1. Body weight of females is lower on average than that of males (whether the difference is statistically significant is not yet clear).
2. There are no outliers in either gender.

# Boxplot of body weight by gender
ttest=ggplot(Data_for_statistic_1_2_3_anova,aes(x=Gender,y=`Body Weight (g)`))
ttest+geom_boxplot()+theme_bw()
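
The "no outliers" reading of a boxplot can also be checked numerically. The following is a minimal sketch, assuming the default 1.5 x IQR whisker rule used by geom_boxplot(); the `weights` vector is hypothetical and is not the data set above.

```r
# Hypothetical body weights in grams (illustrative values only)
weights <- c(1200, 1500, 1650, 1800, 1900, 2000, 2100, 2300, 2500, 5000)

# First and third quartiles, without the "25%" / "75%" names
q <- unname(quantile(weights, c(0.25, 0.75)))
iqr <- q[2] - q[1]            # interquartile range

# Points beyond 1.5 * IQR from the quartiles are drawn as outliers
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr
outliers <- weights[weights < lower | weights > upper]

print(outliers)
```

An empty result for each gender would correspond to a boxplot with no isolated points beyond the whiskers.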

Assumption 1: All samples are independent (no repeated measurements on one individual). At sampling, each fish was weighed individually, and no fish was weighed twice.

Assumption 2: Test whether the samples have equal variances.

Ho: the variances of the samples are the same. If p-value > 0.05 => fail to reject Ho => the variances can be treated as equal. In this case, p = 0.9048 and the estimated ratio of variances is 0.9288.

var.test(`Body Weight (g)`~Gender)
## 
##  F test to compare two variances
## 
## data:  Body Weight (g) by Gender
## F = 0.92886, num df = 11, denom df = 11, p-value = 0.9048
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.267399 3.226593
## sample estimates:
## ratio of variances 
##          0.9288637
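
As a cross-check, the p-value and confidence interval can be recomputed from the F statistic and degrees of freedom printed above. This is a rough sketch using only base R's `pf()` and `qf()`; the variable names are illustrative, and the inputs are the rounded values from the output, so the results agree only up to rounding.

```r
# Values copied from the var.test() output above
f_stat <- 0.92886   # F = ratio of the two sample variances
df1 <- 11           # numerator degrees of freedom
df2 <- 11           # denominator degrees of freedom

# Two-sided p-value: twice the smaller tail probability of the F distribution
p_lower <- pf(f_stat, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

# 95% confidence interval for the true ratio of variances
ci <- c(f_stat / qf(0.975, df1, df2), f_stat / qf(0.025, df1, df2))

print(p_value)
print(ci)
```

Because the interval contains 1, it is consistent with failing to reject Ho.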

Assumption 3: Test for normal distribution. Both p-values are greater than 0.05, so the normality assumption is acceptable.

male=subset(Data_for_statistic_1_2_3_anova,Gender=="Male", select = `Body Weight (g)`)
female=subset(Data_for_statistic_1_2_3_anova,Gender=="Female", select = `Body Weight (g)`)
shapiro.test(male$`Body Weight (g)`)
## 
##  Shapiro-Wilk normality test
## 
## data:  male$`Body Weight (g)`
## W = 0.92639, p-value = 0.3434
shapiro.test(female$`Body Weight (g)`)
## 
##  Shapiro-Wilk normality test
## 
## data:  female$`Body Weight (g)`
## W = 0.88494, p-value = 0.1014
qqnorm(male$`Body Weight (g)`)
qqline(male$`Body Weight (g)`)

qqnorm(female$`Body Weight (g)`)
qqline(female$`Body Weight (g)`)
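
The per-group Shapiro-Wilk tests can also be run in one call instead of subsetting each gender by hand. The sketch below assumes a hypothetical data frame `fish` with columns `gender` and `weight`, filled with simulated values; it is not the data set used above.

```r
set.seed(1)  # make the simulated data reproducible

# Hypothetical data frame: 12 fish per gender, weights in grams
fish <- data.frame(
  gender = rep(c("Female", "Male"), each = 12),
  weight = c(rnorm(12, mean = 1850, sd = 550),
             rnorm(12, mean = 1925, sd = 570))
)

# Apply shapiro.test() to each gender and keep only the p-value
p_values <- tapply(fish$weight, fish$gender,
                   function(w) shapiro.test(w)$p.value)

print(p_values)
```

Each p-value above 0.05 would mean no evidence against normality for that group.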

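Run the t-test. p = 0.75 => fail to reject Ho (Ho = no significant difference) => no significant difference in body weight between the 2 genders. Note that t.test() runs the Welch variant by default, which does not require equal variances.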

t.test(`Body Weight (g)`~Gender)
## 
##  Welch Two Sample t-test
## 
## data:  Body Weight (g) by Gender
## t = -0.318, df = 21.97, p-value = 0.7535
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -562.9118  413.2452
## sample estimates:
## mean in group Female   mean in group Male 
##             1850.917             1925.750
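
To see where the reported p-value and confidence interval come from, they can be recomputed from the printed t statistic, degrees of freedom, and group means. This is a rough sketch with base R's `pt()` and `qt()`; the variable names are illustrative, and because the inputs are rounded the results match the output only approximately.

```r
# Values copied from the t.test() output above
t_stat <- -0.318
df <- 21.97
mean_female <- 1850.917
mean_male <- 1925.750

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# Standard error recovered from the (rounded) t statistic
diff_means <- mean_female - mean_male
se <- diff_means / t_stat

# 95% confidence interval for the difference in means (Female minus Male)
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se

print(p_value)
print(ci)
```

The interval spans 0, which agrees with failing to reject Ho.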