library(ggplot2)
head(mpg)
getwd()
[1] "/cloud/project/statprogramming"
setwd("/cloud/project/statprogramming")
write.csv(mpg, "mpg.csv", row.names = FALSE)
Load the data and use head() to check that it was read in correctly.
mpg <- read.csv("mpg.csv")
head(mpg)
str(mpg)
'data.frame': 234 obs. of 14 variables:
$ manufacturer: Factor w/ 15 levels "audi","chevrolet",..: 1 1 1 1 1 1 1 1 1 1 ...
$ model : Factor w/ 38 levels "4runner 4wd",..: 2 2 2 2 2 2 2 3 3 3 ...
$ displ : num 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
$ year : int 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
$ cyl : int 4 4 4 4 6 6 6 4 4 4 ...
$ trans : Factor w/ 10 levels "auto(av)","auto(l3)",..: 4 9 10 1 4 9 1 9 4 10 ...
$ drv : Factor w/ 3 levels "4","f","r": 2 2 2 2 2 2 2 1 1 1 ...
$ cty : int 18 21 20 21 16 18 18 18 16 20 ...
$ hwy : int 29 29 31 30 26 26 27 26 25 28 ...
$ fl : Factor w/ 5 levels "c","d","e","p",..: 4 4 4 4 4 4 4 4 4 4 ...
$ class : Factor w/ 7 levels "2seater","compact",..: 2 2 2 2 2 2 2 2 2 2 ...
$ total : int 47 50 51 51 42 44 45 44 41 48 ...
$ grade : Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
$ test : Factor w/ 1 level "pass": 1 1 1 1 1 1 1 1 1 1 ...
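The str() output above lists the text columns as factors, which is what read.csv did by default before R 4.0. In newer versions the same call returns character columns, so a minimal sketch of how to ask for factors explicitly might look like this. The small data frame and the temporary file are made up for illustration and stand in for mpg.csv.

```r
# Hypothetical stand-in for mpg: one text column, one integer column
df <- data.frame(drv = c("f", "4", "r"), cty = c(18L, 15L, 14L))

# Write it to a temporary CSV file so the example is self-contained
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)

# Request factor conversion explicitly instead of relying on the default
as_fct <- read.csv(path, stringsAsFactors = TRUE)
str(as_fct)
```

With stringsAsFactors = TRUE the result should not depend on which R version runs the script.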
Since the mpg data will be used repeatedly, let's use attach so that its columns can be accessed quickly.
attach(mpg)
The following object is masked _by_ .GlobalEnv:
test
The following objects are masked from mpg (pos = 3):
class, cty, cyl, displ, drv, fl, grade, hwy, manufacturer, model,
test, total, trans, year
The following objects are masked from mpg (pos = 4):
class, cty, cyl, displ, drv, fl, grade, hwy, manufacturer, model,
total, trans, year
The following objects are masked from mpg (pos = 5):
class, cty, cyl, displ, drv, fl, grade, hwy, manufacturer, model,
total, trans, year
The following objects are masked from mpg (pos = 6):
class, cty, cyl, displ, drv, fl, hwy, manufacturer, model, total,
trans, year
The following objects are masked from mpg (pos = 7):
class, cty, cyl, displ, drv, fl, hwy, manufacturer, model, trans,
year
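The masking messages above pile up because attach was run several times and because a variable named test already exists in the global environment. One way to avoid this is to refer to columns explicitly or to use with(). The sketch below uses a toy data frame named cars, which is an assumption for illustration and not part of the original data.

```r
# Toy data frame standing in for mpg (values are illustrative only)
cars <- data.frame(cty = c(18L, 21L, 20L), hwy = c(29L, 29L, 31L))

# Explicit column references: no attach() needed
cars$total <- cars$cty + cars$hwy

# with() evaluates the expression using the columns of the data frame
# as it is at the moment of the call, so it always sees the new column
avg_total <- with(cars, mean(total))
avg_total
```

Unlike attach, with() does not leave a copy of the data on the search path, so later changes to the data frame are never hidden behind a stale copy.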
mpg$total <- cty + hwy
head(mpg[, c("cty", "hwy", "total")])
The mean is about 40.
mean(total)
[1] 40.29915
if (mean(total) >= 20) {
  cat("good")
} else {
  cat("normal")
}
good
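The if statement above works because mean(total) is a single number. To label every row separately, the vectorised ifelse is the usual tool. A small sketch of the difference, using made-up scores rather than the mpg totals:

```r
# Illustrative scores, not taken from the mpg data
scores <- c(47, 18, 25)

# `if` needs a single TRUE/FALSE, so it suits a summary such as the mean
overall <- if (mean(scores) >= 20) "good" else "normal"

# `ifelse` is vectorised: it returns one result per element
per_item <- ifelse(scores >= 20, "pass", "fail")

overall
per_item
```

Here overall is one label for the whole vector, while per_item has the same length as scores.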
# Label each car by its own total (mean() of a factor is NA, so compare total directly)
test <- ifelse(total >= 20, "pass", "fail")
test <- factor(test, levels = c("pass", "fail"))
table(test)
test
pass fail
234 0
204 cars receive an A, 25 receive a B, and 5 receive a C.
mpg$grade <- ifelse(mpg$total >= 30, "A",
             ifelse(mpg$total >= 25, "B",
             ifelse(mpg$total >= 20, "C", "D")))
mpg$grade <- factor(mpg$grade, levels = c("A", "B", "C", "D"))
table(mpg$grade)
A B C D
204 25 5 0
mpg_data <- data.frame(total,
                       test,
                       grade = mpg$grade)
head(mpg_data)
str(mpg_data)
'data.frame': 234 obs. of 3 variables:
$ total: int 47 50 51 51 42 44 45 44 41 48 ...
$ test : Factor w/ 2 levels "pass","fail": 1 1 1 1 1 1 1 1 1 1 ...
$ grade: Factor w/ 4 levels "A","B","C","D": 1 1 1 1 1 1 1 1 1 1 ...
write.csv(mpg_data, "mpg_data.csv", row.names = FALSE)
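The nested ifelse calls used for the grades get harder to read as more cut-offs are added. As an alternative, the same thresholds (30, 25, 20) can be passed to cut() in one call. This is only a sketch on made-up totals; the vector name totals is an assumption for illustration.

```r
# Illustrative totals, not taken from the mpg data
totals <- c(47, 29, 25, 22, 19)

# With right = FALSE the intervals are closed on the left:
# [-Inf,20) -> D, [20,25) -> C, [25,30) -> B, [30,Inf) -> A
grade <- cut(totals,
             breaks = c(-Inf, 20, 25, 30, Inf),
             labels = c("D", "C", "B", "A"),
             right = FALSE)

# Reorder the levels so that tables list A first
grade <- factor(grade, levels = c("A", "B", "C", "D"))
table(grade)
```

Setting right = FALSE matters here: it makes a total of exactly 25 fall into B and exactly 20 into C, matching the >= comparisons in the ifelse version.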
psy <- function(x) {
  a <- abs(x)  # absolute value of x
  ifelse(a <= 1, a^2, 2 * a - 1)  # a^2 when |x| <= 1, otherwise 2|x| - 1
}
psy(0)
[1] 0
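Because psy is built from vectorised operations, it can also be applied to a whole vector at once. The sketch below restates the function with base abs() so that the block runs on its own; the input values are chosen for illustration.

```r
# Same piecewise rule as above: x^2 when |x| <= 1, otherwise 2|x| - 1
psy <- function(x) {
  a <- abs(x)
  ifelse(a <= 1, a^2, 2 * a - 1)
}

# Apply the function to several inputs in one call
values <- psy(c(-2, -0.5, 0, 1, 3))
values
# [1] 3.00 0.25 0.00 1.00 5.00
```

Both branches give 1 at |x| = 1, so the function has no jump where the rule changes.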