library(ggplot2)  # needed for the ggplot() calls below
t=file.choose()
pisa=read.csv(t)
head(pisa)
    School SchoolSize ClassSize STratio SchoolType  Area Region   Age Gender PARED HISCED  WEALTH INSTSCIE JOYSCIE  ICTRES
1 70400001        883        18  22.075          3 URBAN  SOUTH 15.58   Boys     9      2 -2.0697   0.9798  2.1635 -1.5244
2 70400001        883        18  22.075          3 URBAN  SOUTH 15.92   Boys    12      4 -1.7903   1.7359  2.1635 -1.9305
3 70400001        883        18  22.075          3 URBAN  SOUTH 15.42  Girls     9      2 -2.1942  -0.2063 -0.1808 -1.6093
4 70400001        883        18  22.075          3 URBAN  SOUTH 15.58  Girls     5      1 -2.0301  -0.3115 -0.4318 -1.6250
5 70400001        883        18  22.075          3 URBAN  SOUTH 15.92  Girls     9      2 -1.0522   0.7648  1.3031 -0.5305
6 70400001        883        18  22.075          3 URBAN  SOUTH 16.25  Girls     5      1 -3.0570   0.3708  0.5094 -2.5873
     Math    Read Science
1 439.923 412.290 475.612
2 406.251 409.598 450.320
3 414.369 384.307 405.787
4 468.801 459.104 462.968
5 355.432 402.435 453.736
6 458.955 483.885 529.866
ggplot(data=pisa, aes(x=Gender, y=Math, fill=Gender))+geom_boxplot()

t.test(pisa$Math ~ pisa$Gender)

Welch Two Sample t-test

data:  pisa$Math by pisa$Gender
t = 1.4324, df = 5699.7, p-value = 0.1521
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.131851  7.273642
sample estimates:
 mean in group Boys mean in group Girls
           497.6832            494.6123
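
The fractional df in the output above comes from the Welch correction, which does not assume equal variances in the two groups. As a minimal sketch of what t.test() computes by default, the block below rebuilds the statistic and the Welch-Satterthwaite degrees of freedom by hand. The vectors x and y are simulated stand-ins (an assumption, not the PISA scores), and the names t_manual and df_manual are only illustrative.

```r
# Simulated stand-in data (NOT the PISA scores): two groups with unequal sizes
set.seed(1)
x <- rnorm(50, mean = 500, sd = 90)
y <- rnorm(60, mean = 495, sd = 85)

# Per-group variance of the mean
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch t statistic: difference in means over its standard error
t_manual <- (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite degrees of freedom (usually not an integer)
df_manual <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# t.test() uses the Welch version unless var.equal = TRUE is set
res <- t.test(x, y)
all.equal(unname(res$statistic), t_manual)
all.equal(unname(res$parameter), df_manual)
```

Both all.equal() calls should report TRUE, since this is the same formula t.test() applies when var.equal is left at its default.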

t.test(pisa$Sience ~ pisa$Gender)
Error in model.frame.default(formula = pisa$Sience ~ pisa$Gender) :
  invalid type (NULL) for variable 'pisa$Sience'
#The column is named Science; the misspelled pisa$Sience is NULL, hence the error
t.test(pisa$Read ~ pisa$Gender)

Welch Two Sample t-test

data:  pisa$Read by pisa$Gender
t = -11.397, df = 5665.5, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -24.56452 -17.35398
sample estimates:
 mean in group Boys mean in group Girls
           478.9437            499.9029

ggplot(data=pisa, aes(x=Gender, y=Read, fill=Gender))+geom_boxplot()
#Analysis of Variance (ANOVA)
ggplot(data=pisa, aes(x=Area, y=Math, fill=Area))+geom_boxplot()
t.test(pisa$Math ~ pisa$Area)
Error in t.test.formula(pisa$Math ~ pisa$Area) :
  grouping factor must have exactly 2 levels
#t.test cannot compare more than two groups; use aov() followed by TukeyHSD() instead
av=aov(pisa$Math~pisa$Area)
TurkeyHSD(av)
Error in TurkeyHSD(av) : could not find function "TurkeyHSD"
#The function is spelled TukeyHSD, not TurkeyHSD
TukeyHSD(av)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = pisa$Math ~ pisa$Area)

$`pisa$Area`
                   diff       lwr       upr     p adj
RURAL-REMOTE 49.9364556 39.834658 60.038253 0.0000000
URBAN-REMOTE 49.1036840 39.169587 59.037781 0.0000000
URBAN-RURAL  -0.8327716 -6.005936  4.340393 0.9245069

> library(table1)  # needed for table1()
> table1(~Area|Region,data=pisa)
> tt=table(pisa$Area,pisa$Region)
> tt

         CENTRAL NORTH SOUTH
  REMOTE     198   148    64
  RURAL      857   764   747
  URBAN      951  1046  1051
> chisq.test(tt)

Pearson's Chi-squared test

data:  tt
X-squared = 77.219, df = 4, p-value = 6.76e-16
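
To see where the X-squared value comes from, the Pearson statistic can be rebuilt from the printed counts. This is a sketch that re-enters the Area-by-Region table by hand instead of reading the PISA file. The names expected, x2 and dof are only illustrative.

```r
# Counts copied from the printed Area-by-Region table
tt <- matrix(c(198,  148,   64,
               857,  764,  747,
               951, 1046, 1051),
             nrow = 3, byrow = TRUE,
             dimnames = list(Area   = c("REMOTE", "RURAL", "URBAN"),
                             Region = c("CENTRAL", "NORTH", "SOUTH")))

# Expected count per cell under independence:
# row total * column total / grand total
expected <- outer(rowSums(tt), colSums(tt)) / sum(tt)

# Pearson statistic, degrees of freedom and p-value
x2  <- sum((tt - expected)^2 / expected)
dof <- (nrow(tt) - 1) * (ncol(tt) - 1)
p   <- pchisq(x2, dof, lower.tail = FALSE)

# Compare with the built-in test (no continuity correction for tables larger than 2x2)
res <- chisq.test(tt)
all.equal(unname(res$statistic), x2)

# Standardized residuals indicate which cells depart most from independence
round(res$stdres, 2)
```

The standardized residuals are a useful follow-up because the test itself only says that Area and Region are associated, not which combinations are over- or under-represented.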