Load the dataset
setwd("C:/02.Rstatistics/hw1")
df<-read.csv("C:/02.Rstatistics/hw1/StudentsPerformance.csv")
str(df)
'data.frame': 1000 obs. of 8 variables:
$ gender : chr "female" "female" "female" "male" ...
$ race.ethnicity: chr "group B" "group C" "group B" "group A" ...
$ parents : chr "bachelor's degree" "some college" "master's degree" "associate's degree" ...
$ lunch : chr "standard" "standard" "standard" "free/reduced" ...
$ preparation : chr "none" "completed" "none" "none" ...
$ math : int 72 69 90 47 76 71 88 40 64 38 ...
$ reading : int 72 90 95 57 78 83 95 43 64 60 ...
$ writing : int 74 88 93 44 75 78 92 39 67 50 ...
Q1. Compute the mean and standard deviation of writing scores separately for each student gender.
female_student<-subset(df,gender=="female")
male_student<-subset(df,gender=="male")
mean(female_student$writing)
sd(female_student$writing)
mean(male_student$writing)
sd(male_student$writing)
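The per-gender statistics can also be computed in one step instead of building a subset for each group. The following is only a sketch: `toy` and its values are hypothetical illustration data, not rows from StudentsPerformance.csv, and `writing_stats` is a name introduced here.

```r
# Hypothetical toy data (NOT taken from StudentsPerformance.csv).
toy <- data.frame(
  gender  = c("female", "female", "female", "male", "male"),
  writing = c(70, 80, 90, 60, 80)
)

# Split the writing scores by gender, then compute the mean and
# standard deviation within each group.
writing_stats <- sapply(
  split(toy$writing, toy$gender),
  function(x) c(mean = mean(x), sd = sd(x))
)

# Rows are the statistics, columns are the gender levels.
print(writing_stats)
```

Replacing `toy` with `df` should give the same four numbers as the separate `mean()` and `sd()` calls, with less room to forget one of them.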
Q2. Draw a box plot of math scores by race/ethnicity, and a histogram of math scores.
Label each figure with its title (e.g., box plot, histogram) and with X and Y axis labels.
boxplot(math~race.ethnicity,data=df,
xlab = "race.ethnicity",ylab="math score",
main="Student Math Score by Race")
hist(df$math,labels = TRUE,
xlab = "Math",main = "Histogram of Math Score")
Q3. Among students with a reading score >= 70, what proportions have a test preparation status of "none" versus "completed"?
df1<-subset(df,reading>=70)
prop.table(table(df1$preparation))
completed none
0.4561404 0.5438596
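1. Use the ggplot2 package to draw a scatter plot of students' math scores against writing scores, with appropriate labels.
2. Compute and display the correlation coefficients among students' math, writing, and reading scores, and interpret the relationships.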
library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.5.3
figure1<-ggplot(df,aes(x=math,y=writing))+geom_point()+
labs(x="Math score",y="Writing score",title="Math Score vs. Writing Score");figure1
math reading writing
math 1.0000000 0.8175797 0.8026420
reading 0.8175797 1.0000000 0.9545981
writing 0.8026420 0.9545981 1.0000000
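The call that produced this matrix is not shown in the report. A matrix of this shape is what `cor()` returns for the three score columns, so the sketch below assumes that approach. To stay self-contained it uses only the first ten values of each column as printed by `str(df)`, so its coefficients will differ from the full-data values above; `sample_scores` and `score_cor` are names introduced here.

```r
# First ten values of each score column, copied from the str(df) output.
sample_scores <- data.frame(
  math    = c(72, 69, 90, 47, 76, 71, 88, 40, 64, 38),
  reading = c(72, 90, 95, 57, 78, 83, 95, 43, 64, 60),
  writing = c(74, 88, 93, 44, 75, 78, 92, 39, 67, 50)
)

# cor() on a numeric data frame returns the pairwise correlation matrix
# (Pearson correlation by default).
score_cor <- cor(sample_scores)
print(round(score_cor, 3))
```

On the full data, the equivalent call would be `cor(df[, c("math", "reading", "writing")])`.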
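Students with higher writing scores usually also have higher reading scores (r ≈ 0.95). Students with high reading scores also tend to do fairly well in math (r ≈ 0.82), and math and writing scores are likewise highly correlated (r ≈ 0.80).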
What proportion of all students both "passed" (>= 60) and have parents whose education level is "master's degree"?
df3<-subset(df,writing>=60 & parents=="master's degree")
nrow(df3)/nrow(df)
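One pitfall with `subset()` is worth spelling out: its third positional argument is `select`, which picks columns, so a second row condition has to be joined to the first with `&` rather than passed after a comma. The sketch below illustrates the difference on the first four rows shown in the `str(df)` output; `first_rows`, `both`, and `rows_only` are names introduced here.

```r
# First four values of parents and writing, copied from the str(df) output.
first_rows <- data.frame(
  parents = c("bachelor's degree", "some college",
              "master's degree", "associate's degree"),
  writing = c(74, 88, 93, 44)
)

# Both conditions combined with `&` in the row filter.
both <- subset(first_rows, writing >= 60 & parents == "master's degree")

# Second condition passed as the third argument: it is interpreted as
# `select` (column choice), so only `writing >= 60` filters the rows.
rows_only <- subset(first_rows, writing >= 60, parents == "master's degree")

nrow(both) / nrow(first_rows)       # share meeting both conditions
nrow(rows_only) / nrow(first_rows)  # share with writing >= 60 only
```

In this four-row sample, one row meets both conditions while three rows have a writing score of at least 60, so the two ratios differ.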