I want to find out if small cars have better gas mileage than large cars. I’ve defined small cars to be either compacts or subcompacts and big cars to be either pickups or SUVs. I don’t care about other cars.
library(dplyr)
library(ggplot2)  # provides the mpg dataset
my_cars <- filter(mpg, class %in% c("suv", "pickup", "compact", "subcompact"))
Next, let’s group and count to see what we’ve got.
group_by(my_cars, class) %>%
  summarize(n())
That looks decent. Now I only need a few columns, so I’ll try select.
select(my_cars, class, hwy, cty)
Let’s add a column for whether it’s a big car or a small car.
select(my_cars, class, hwy, cty) %>%
  mutate(big = class %in% c("pickup", "suv")) %>%
  group_by(big) %>%
  summarize(n())
Now I can test my hypothesis:
select(my_cars, class, hwy, cty) %>%
  mutate(big = class %in% c("pickup", "suv")) %>%
  group_by(big) %>%
  summarize(n(), mean(hwy), mean(cty))
Whoopee! Small cars have better gas mileage than big cars. But the TRUE and FALSE values for the new “big” variable bug me. So I will make a function to assign a meaningful name.
# Map a single class name to a readable size label
size_for_class <- function(name) {
  if (name == "compact" || name == "subcompact") "small" else "big"
}
mutate(my_cars, size = mapply(size_for_class, class)) %>%
  group_by(size) %>%
  summarize(n(), mean(hwy), mean(cty))
That’s a winner!
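One possible shortcut, sketched here rather than taken from the steps above: dplyr's `if_else()` works on a whole column at once, so the labels can be assigned without a helper function or `mapply`. The names `sized`, `size_summary`, `mean_hwy`, and `mean_cty` are my own choices for illustration, and I'm assuming `mpg` is the dataset that ships with ggplot2.

```r
library(dplyr)
library(ggplot2)  # assumption: mpg is the dataset bundled with ggplot2

my_cars <- filter(mpg, class %in% c("suv", "pickup", "compact", "subcompact"))

# if_else() is vectorized, so it labels every row of class in a single call
sized <- mutate(
  my_cars,
  size = if_else(class %in% c("compact", "subcompact"), "small", "big")
)

# Naming the summary columns makes the result easier to read than n(), mean(hwy), ...
size_summary <- sized %>%
  group_by(size) %>%
  summarize(n = n(), mean_hwy = mean(hwy), mean_cty = mean(cty))

size_summary
```

This should give the same two groups as the `size_for_class` version, just with named columns.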