Find out whether cars with a displ (engine displacement) of 4 or less or cars with a displ of 5 or more have the higher average hwy (highway fuel economy).
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
mpg <- as.data.frame(ggplot2::mpg)
summary(mpg)
## manufacturer model displ year
## Length:234 Length:234 Min. :1.600 Min. :1999
## Class :character Class :character 1st Qu.:2.400 1st Qu.:1999
## Mode :character Mode :character Median :3.300 Median :2004
## Mean :3.472 Mean :2004
## 3rd Qu.:4.600 3rd Qu.:2008
## Max. :7.000 Max. :2008
## cyl trans drv cty
## Min. :4.000 Length:234 Length:234 Min. : 9.00
## 1st Qu.:4.000 Class :character Class :character 1st Qu.:14.00
## Median :6.000 Mode :character Mode :character Median :17.00
## Mean :5.889 Mean :16.86
## 3rd Qu.:8.000 3rd Qu.:19.00
## Max. :8.000 Max. :35.00
## hwy fl class
## Min. :12.00 Length:234 Length:234
## 1st Qu.:18.00 Class :character Class :character
## Median :24.00 Mode :character Mode :character
## Mean :23.44
## 3rd Qu.:27.00
## Max. :44.00
a <- mpg %>% filter(displ <= 4)
b <- mpg %>% filter(displ >= 5)
mean(a$hwy)
## [1] 25.96319
mean(b$hwy)
## [1] 18.07895
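Therefore, cars with a displ (engine displacement) of 4 or less have a higher average highway fuel economy (25.96319) than cars with a displ of 5 or more (18.07895).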
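The same comparison can also be written as a single grouped pipeline, so both means appear side by side. This is only a sketch: the column name displ_group, its labels, and the object name hwy_by_displ are made up for illustration and are not part of the exercise.

```r
library(dplyr)

mpg <- as.data.frame(ggplot2::mpg)

# Keep only the two displacement bands of interest;
# cars with 4 < displ < 5 belong to neither band and are dropped.
hwy_by_displ <- mpg %>%
  filter(displ <= 4 | displ >= 5) %>%
  # displ_group is a hypothetical helper column used only for grouping
  mutate(displ_group = ifelse(displ <= 4, "4 or less", "5 or more")) %>%
  group_by(displ_group) %>%
  summarise(mean_hwy = mean(hwy), n = n())

hwy_by_displ
```

The n column shows how many cars fall into each band, which helps judge how much weight each mean deserves.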
Find out which manufacturer (car maker), “audi” or “toyota”, has the higher average cty (city fuel economy).
library("dplyr")
audi <- mpg %>% filter(manufacturer == "audi")
toyota <- mpg %>% filter(manufacturer == "toyota")
mean(audi$cty)
## [1] 17.61111
mean(toyota$cty)
## [1] 18.52941
Therefore, “toyota” has a higher average city fuel economy (18.52941) than “audi” (17.61111).
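As an alternative to building one data frame per manufacturer, the two means can be computed in one step with group_by() and summarise(). A minimal sketch, assuming the same mpg data; the name cty_by_maker is illustrative only.

```r
library(dplyr)

mpg <- as.data.frame(ggplot2::mpg)

# Average city fuel economy for the two manufacturers, highest first
cty_by_maker <- mpg %>%
  filter(manufacturer %in% c("audi", "toyota")) %>%
  group_by(manufacturer) %>%
  summarise(mean_cty = mean(cty)) %>%
  arrange(desc(mean_cty))

cty_by_maker
```

Sorting with arrange(desc(mean_cty)) puts the manufacturer with the higher average in the first row.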
We want to find the average highway fuel economy of “chevrolet”, “ford”, and “honda” cars. Extract the data for these manufacturers and compute the overall mean of hwy.
library("dplyr")
Z <- mpg %>% filter(manufacturer %in% c("chevrolet", "ford", "honda"))
summary(Z)
## manufacturer model displ year
## Length:53 Length:53 Min. :1.600 Min. :1999
## Class :character Class :character 1st Qu.:3.600 1st Qu.:1999
## Mode :character Mode :character Median :4.600 Median :1999
## Mean :4.245 Mean :2003
## 3rd Qu.:5.400 3rd Qu.:2008
## Max. :7.000 Max. :2008
## cyl trans drv cty
## Min. :4.000 Length:53 Length:53 Min. :11.00
## 1st Qu.:6.000 Class :character Class :character 1st Qu.:13.00
## Median :8.000 Mode :character Mode :character Median :15.00
## Mean :6.679 Mean :16.13
## 3rd Qu.:8.000 3rd Qu.:18.00
## Max. :8.000 Max. :28.00
## hwy fl class
## Min. :14.00 Length:53 Length:53
## 1st Qu.:17.00 Class :character Class :character
## Median :21.00 Mode :character Mode :character
## Mean :22.51
## 3rd Qu.:26.00
## Max. :36.00
mean(Z$hwy)
## [1] 22.50943
Therefore, the overall average highway fuel economy of “chevrolet”, “ford”, and “honda” cars is 22.50943.
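The %in% operator used in the filter above is shorthand for a chain of == comparisons joined with |. The sketch below, with illustrative object names Z_in and Z_or, shows both forms selecting the same rows.

```r
library(dplyr)

mpg <- as.data.frame(ggplot2::mpg)

makers <- c("chevrolet", "ford", "honda")

# Form 1: membership test with %in%
Z_in <- mpg %>% filter(manufacturer %in% makers)

# Form 2: the equivalent chain of == joined with |
Z_or <- mpg %>%
  filter(manufacturer == "chevrolet" |
           manufacturer == "ford" |
           manufacturer == "honda")

# Both forms should keep the same rows and give the same mean
nrow(Z_in) == nrow(Z_or)
mean(Z_in$hwy) == mean(Z_or$hwy)

# Caution: manufacturer == makers is NOT equivalent. == compares
# element by element and recycles the shorter vector, so it would
# keep only some of the matching rows.
```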
Extract the class (vehicle type) and cty (city fuel economy) variables to create a new data set. Print part of the new data to confirm that it consists of only those two variables.
mpg_class_cty <- as.data.frame(mpg %>% select(class, cty))
head(mpg_class_cty)
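Besides head(), the column names and column count can be checked directly, which confirms the structure without reading printed rows. A small sketch under the same assumptions as the code above:

```r
library(dplyr)

mpg <- as.data.frame(ggplot2::mpg)
mpg_class_cty <- mpg %>% select(class, cty)

names(mpg_class_cty)  # "class" "cty"
ncol(mpg_class_cty)   # 2
```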
Find out whether cars whose class (vehicle type) is “suv” or cars whose class is “compact” have the higher average cty (city fuel economy).
library(dplyr)
suv <- mpg_class_cty %>% filter(class == "suv")
compact <- mpg_class_cty %>% filter(class == "compact")
mean(suv$cty)
## [1] 13.5
mean(compact$cty)
## [1] 20.12766
Therefore, the average cty (city fuel economy) of “compact” cars (20.12766) is higher than that of “suv” cars (13.5).
We want to find out which cars have a high hwy (highway fuel economy). Among the cars produced by “audi”, print the data for those ranked 1st through 5th in hwy.
library(dplyr)
audi_hwy <- mpg %>% filter(manufacturer == "audi") %>% select(model, hwy)
audi_hwy %>% arrange(desc(hwy)) %>% head(5)
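The arrange(desc()) plus head() pattern above can also be expressed with slice_max(), which is available in dplyr 1.0.0 and later. This is a sketch of an alternative, not the method the exercise asks for; the name audi_top5 is illustrative.

```r
library(dplyr)

mpg <- as.data.frame(ggplot2::mpg)

# Top five audi rows by hwy. with_ties = FALSE guarantees exactly
# five rows; the default (TRUE) may return more if the fifth value is tied.
audi_top5 <- mpg %>%
  filter(manufacturer == "audi") %>%
  select(model, hwy) %>%
  slice_max(hwy, n = 5, with_ties = FALSE)

audi_top5
```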
Add a 'combined fuel economy' variable that is the sum of cty and hwy.
library(dplyr)
mpg_copy <- as.data.frame(mpg)
mpg_copy <- mpg_copy %>% mutate(total = cty + hwy)
summary(mpg_copy)
## manufacturer model displ year
## Length:234 Length:234 Min. :1.600 Min. :1999
## Class :character Class :character 1st Qu.:2.400 1st Qu.:1999
## Mode :character Mode :character Median :3.300 Median :2004
## Mean :3.472 Mean :2004
## 3rd Qu.:4.600 3rd Qu.:2008
## Max. :7.000 Max. :2008
## cyl trans drv cty
## Min. :4.000 Length:234 Length:234 Min. : 9.00
## 1st Qu.:4.000 Class :character Class :character 1st Qu.:14.00
## Median :6.000 Mode :character Mode :character Median :17.00
## Mean :5.889 Mean :16.86
## 3rd Qu.:8.000 3rd Qu.:19.00
## Max. :8.000 Max. :35.00
## hwy fl class total
## Min. :12.00 Length:234 Length:234 Min. :21.0
## 1st Qu.:18.00 Class :character Class :character 1st Qu.:31.0
## Median :24.00 Mode :character Mode :character Median :41.0
## Mean :23.44 Mean :40.3
## 3rd Qu.:27.00 3rd Qu.:47.0
## Max. :44.00 Max. :79.0
library(dplyr)
mpg_copy <- mpg_copy %>% mutate(mean = total / 2)
summary(mpg_copy)
## manufacturer model displ year
## Length:234 Length:234 Min. :1.600 Min. :1999
## Class :character Class :character 1st Qu.:2.400 1st Qu.:1999
## Mode :character Mode :character Median :3.300 Median :2004
## Mean :3.472 Mean :2004
## 3rd Qu.:4.600 3rd Qu.:2008
## Max. :7.000 Max. :2008
## cyl trans drv cty
## Min. :4.000 Length:234 Length:234 Min. : 9.00
## 1st Qu.:4.000 Class :character Class :character 1st Qu.:14.00
## Median :6.000 Mode :character Mode :character Median :17.00
## Mean :5.889 Mean :16.86
## 3rd Qu.:8.000 3rd Qu.:19.00
## Max. :8.000 Max. :35.00
## hwy fl class total
## Min. :12.00 Length:234 Length:234 Min. :21.0
## 1st Qu.:18.00 Class :character Class :character 1st Qu.:31.0
## Median :24.00 Mode :character Mode :character Median :41.0
## Mean :23.44 Mean :40.3
## 3rd Qu.:27.00 3rd Qu.:47.0
## Max. :44.00 Max. :79.0
## mean
## Min. :10.50
## 1st Qu.:15.50
## Median :20.50
## Mean :20.15
## 3rd Qu.:23.50
## Max. :39.50
mpg_copy %>% arrange(desc(mean)) %>% head(3)
Write and run a dplyr statement. Use the original mpg data rather than the copy.
mpg %>%
  mutate(total = cty + hwy,
         mean = total / 2) %>%
  arrange(desc(mean)) %>%
  head(3)
# Average cty for each class
mpg %>%
  group_by(class) %>%
  summarise(mean_cty = mean(cty))
## `summarise()` ungrouping output (override with `.groups` argument)
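The message printed above comes from dplyr 1.0.0 and later: summarise() is reporting how it handled the grouping of its result. Passing the .groups argument makes that choice explicit and silences the message. A minimal sketch, with cty_by_class as an illustrative name:

```r
library(dplyr)

mpg <- as.data.frame(ggplot2::mpg)

# .groups = "drop" returns an ungrouped result and suppresses the message
cty_by_class <- mpg %>%
  group_by(class) %>%
  summarise(mean_cty = mean(cty), .groups = "drop")

cty_by_class
```

With a single grouping variable the result is ungrouped either way, so here the argument only changes whether the message is shown.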
# Average cty for each class, highest first
mpg %>%
  group_by(class) %>%
  summarise(mean_cty = mean(cty)) %>%
  arrange(desc(mean_cty))
## `summarise()` ungrouping output (override with `.groups` argument)
# Average hwy for each manufacturer, highest first
mpg %>%
  group_by(manufacturer) %>%
  summarise(mean_hwy = mean(hwy)) %>%
  arrange(desc(mean_hwy))
## `summarise()` ungrouping output (override with `.groups` argument)
# Number of "compact" cars per manufacturer, most first
mpg %>%
  filter(class == "compact") %>%
  group_by(manufacturer) %>%
  summarise(count = n()) %>%
  arrange(desc(count))
## `summarise()` ungrouping output (override with `.groups` argument)
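The group_by() / summarise(n()) / arrange() sequence for counting rows per group has a shorter equivalent in dplyr's count(). The sketch below assumes the same mpg data; note that count() names its result column n rather than count, and compact_counts is an illustrative name.

```r
library(dplyr)

mpg <- as.data.frame(ggplot2::mpg)

# Count "compact" cars per manufacturer; sort = TRUE orders by n, descending
compact_counts <- mpg %>%
  filter(class == "compact") %>%
  count(manufacturer, sort = TRUE)

compact_counts
```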