hw1

Author

劉庭瑜

Load the dataset

setwd("C:/02.Rstatistics/hw1")
df<-read.csv("C:/02.Rstatistics/hw1/StudentsPerformance.csv")
str(df)
'data.frame':   1000 obs. of  8 variables:
 $ gender        : chr  "female" "female" "female" "male" ...
 $ race.ethnicity: chr  "group B" "group C" "group B" "group A" ...
 $ parents       : chr  "bachelor's degree" "some college" "master's degree" "associate's degree" ...
 $ lunch         : chr  "standard" "standard" "standard" "free/reduced" ...
 $ preparation   : chr  "none" "completed" "none" "none" ...
 $ math          : int  72 69 90 47 76 71 88 40 64 38 ...
 $ reading       : int  72 90 95 57 78 83 95 43 64 60 ...
 $ writing       : int  74 88 93 44 75 78 92 39 67 50 ...

Q1. Compute the mean and standard deviation of writing scores separately by student gender

female_student<-subset(df,gender=="female")
male_student<-subset(df,gender=="male")
mean(female_student$writing)
[1] 72.46718
sd(female_student$writing)
[1] 14.84484
mean(male_student$writing)
[1] 63.3112
sd(male_student$writing)
[1] 14.11383
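
The same per-group summary can be produced without creating one subset per gender. The following is a minimal sketch using `tapply()`; the `toy` data frame is a hypothetical stand-in for the real CSV, so its numbers are not the homework results.

```r
# Hypothetical toy data standing in for StudentsPerformance.csv
toy <- data.frame(
  gender  = c("female", "female", "female", "male", "male", "male"),
  writing = c(74, 88, 93, 44, 75, 78)
)

# Apply mean() and sd() to writing scores within each gender
writing_mean <- tapply(toy$writing, toy$gender, mean)
writing_sd   <- tapply(toy$writing, toy$gender, sd)

# Combine both statistics into one table, one row per gender
by_gender <- cbind(mean = writing_mean, sd = writing_sd)
print(by_gender)
```

With the real data, replacing `toy` by `df` should give the same four values as the separate `mean()` and `sd()` calls above.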

Q2. Draw a boxplot of math scores by race/ethnicity group and a histogram of math scores.
Give each plot a title (e.g., boxplot, histogram) and label the X and Y axes.

boxplot(math~race.ethnicity,data=df,
        xlab = "race.ethnicity",ylab="math score",
        main="Student Math Score by Race")

hist(df$math,labels = TRUE,
     xlab = "Math",main = "Histogram of Math Score")

Q3. Among students with a reading score of 70 or higher (>= 70), what proportions have test preparation status "none" and "completed"?

df1<-subset(df,reading>=70)
prop.table(table(df1$preparation))

completed      none 
0.4561404 0.5438596 
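
As a small illustration of how `prop.table()` turns counts into shares, here is a sketch on hypothetical toy data (the `toy` data frame and its values are assumptions, not the homework data). It also shows that the share of one category equals the mean of a logical comparison.

```r
# Hypothetical toy data standing in for StudentsPerformance.csv
toy <- data.frame(
  reading     = c(72, 90, 95, 57, 78, 83, 43, 64),
  preparation = c("none", "completed", "none", "none",
                  "none", "completed", "none", "completed")
)

# Keep only students with a reading score of at least 70
high_readers <- subset(toy, reading >= 70)

# Counts per preparation status, then converted to proportions
counts <- table(high_readers$preparation)
shares <- prop.table(counts)
print(shares)

# Equivalent check for a single category: mean of a logical vector
completed_share <- mean(high_readers$preparation == "completed")
```

The proportions always sum to 1 because both categories are computed over the same filtered set of students.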

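Q4. 1. Use the ggplot2 package to draw a scatter plot of students' math scores against writing scores, with appropriate labels. 2. Show and compute the correlation coefficients among students' math, writing, and reading scores, and interpret the relationships.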

library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.5.3
figure1<-ggplot(df,aes(x=math,y=writing))+geom_point()+
  labs(x="Math score",y="Writing score",title="Math Score vs. Writing Score");figure1

df2<-df[,6:8]
cor(df2)
             math   reading   writing
math    1.0000000 0.8175797 0.8026420
reading 0.8175797 1.0000000 0.9545981
writing 0.8026420 0.9545981 1.0000000

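All three scores are strongly and positively correlated. Students with higher writing scores generally also have higher reading scores (r ≈ 0.95), students with high reading scores generally also do well in math (r ≈ 0.82), and math and writing scores are likewise highly correlated (r ≈ 0.80).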
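
One way to make these strengths more concrete is to square each coefficient, since r² is the share of variance in one score that a linear relationship with the other accounts for. The sketch below re-enters the coefficients reported above by hand (the object names `r` and `r_squared` are chosen here for illustration).

```r
# Correlation coefficients copied from the cor(df2) output above
scores <- c("math", "reading", "writing")
r <- matrix(c(1.0000000, 0.8175797, 0.8026420,
              0.8175797, 1.0000000, 0.9545981,
              0.8026420, 0.9545981, 1.0000000),
            nrow = 3, byrow = TRUE,
            dimnames = list(scores, scores))

# Squared correlations: proportion of shared linear variance
r_squared <- r^2
print(round(r_squared, 3))
```

On this reading, reading and writing share roughly 91% of their variance, while math shares about two thirds of its variance with each of the other two scores.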

Q5. What proportion of all students both passed (>= 60) and have parents whose education level is "master's degree"?

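# Both conditions go in one filter joined by &; a second comma-separated argument to subset() is read as `select`
df3<-subset(df,writing>=60 & parents=="master's degree")
nrow(df3)/nrow(df)
# For reference: writing>=60 alone covers 0.719 of all students, so the share that also has master's-degree parents must be smaller.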
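
When two conditions must hold at the same time, they have to be joined with `&` inside a single filter expression. The sketch below uses hypothetical toy data (not the homework data) to show the combined filter and what happens when the second condition is passed as a separate argument to `subset()`, where it is interpreted as `select`.

```r
# Hypothetical toy data standing in for StudentsPerformance.csv
toy <- data.frame(
  writing = c(74, 88, 93, 44, 75, 55),
  parents = c("bachelor's degree", "master's degree", "master's degree",
              "master's degree", "some college", "high school")
)

# Logical vector: TRUE only where both conditions hold
both <- toy$writing >= 60 & toy$parents == "master's degree"

# The mean of a logical vector is the proportion of TRUE values
share <- mean(both)
print(share)

# Pitfall: the third argument of subset() is `select` (columns), not a second
# row filter, so only writing >= 60 is applied to the rows here
wrong <- subset(toy, writing >= 60, parents == "master's degree")
print(nrow(wrong))
```

In this toy example two of six students meet both conditions, while the comma-separated call keeps all four rows with a writing score of at least 60.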