suppressPackageStartupMessages(library("tidyverse"))
suppressPackageStartupMessages(library("modelr"))
suppressPackageStartupMessages(library("lubridate"))
1. Create one plot on the fuel economy data with customized title, subtitle, caption, x, y, and color labels.
ggplot(
  data = mpg,
  mapping = aes(x = fct_reorder(class, hwy), y = hwy)
) +
  geom_boxplot() +
  coord_flip() +
  labs(
    title = "Compact Cars Have > 10 More Hwy MPG than Pickup Trucks",
    subtitle = "Comparing the median highway mpg in each class",
    caption = "Data from fueleconomy.gov",
    x = "Car Class",
    y = "Highway Miles per Gallon"
  )
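The plot above maps no variable to colour, so it has no colour legend to label. As a minimal sketch of how the colour label could be covered as well, the variant below assumes the drive train variable `drv` is mapped to colour; that mapping, the neutral title, and the legend text "Drive Train" are my own illustrative choices, not part of the original answer.

```r
library(ggplot2)  # ggplot(), labs(), and the mpg data set
library(forcats)  # fct_reorder()

# Same boxplot as above, with drive train mapped to colour so that
# labs(colour = ...) has a legend to label.
p <- ggplot(
  data = mpg,
  mapping = aes(x = fct_reorder(class, hwy), y = hwy, colour = drv)
) +
  geom_boxplot() +
  coord_flip() +
  labs(
    title = "Highway MPG by Car Class and Drive Train",
    subtitle = "Classes ordered by median highway mpg",
    caption = "Data from fueleconomy.gov",
    x = "Car Class",
    y = "Highway Miles per Gallon",
    colour = "Drive Train"  # legend title for the colour aesthetic
  )
p
```

Any other discrete variable in `mpg` could be mapped instead; the point is only that `labs()` sets the legend title through the argument named after the aesthetic.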

2. The geom_smooth() is somewhat misleading because the hwy for large engines is skewed upwards due to the inclusion of lightweight sports cars with big engines. Use your modeling tools to fit and display a better model.
First, I’ll plot the relationship between fuel efficiency and engine size (displacement) using all cars. The plot shows a strong negative relationship.
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    title = "Fuel Efficiency Decreases with Engine Size",
    caption = "Data from fueleconomy.gov",
    y = "Highway Miles per Gallon",
    x = "Engine Displacement"
  )
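The line drawn by geom_smooth(method = "lm") is an ordinary least squares fit of hwy on displ, so the same relationship can be inspected numerically. A small sketch, where the object name `pooled` is my own:

```r
library(ggplot2)  # provides the mpg data set

# Pooled linear model: highway mpg as a function of engine displacement,
# ignoring car class. This is the line geom_smooth(method = "lm") draws.
pooled <- lm(hwy ~ displ, data = mpg)

coef(pooled)               # intercept and slope (mpg per litre of displacement)
summary(pooled)$r.squared  # share of the variance in hwy explained by displ
```

A negative slope on displ corresponds to the downward-sloping line in the plot.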

However, if I disaggregate by car class, and plot the relationship between fuel efficiency and engine displacement within each class, I see a different relationship.
For all car classes except subcompact cars, there is no relationship or only a small negative relationship between fuel efficiency and engine size.
For subcompact cars, there is a strong negative relationship between fuel efficiency and engine size. As the question noted, this is because the subcompact car class includes both small, cheap cars and sports cars with large engines.
ggplot(mpg, aes(displ, hwy, colour = class)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    title = "Fuel Efficiency Mostly Varies by Car Class",
    subtitle = "Fuel efficiency of subcompact cars varies by engine size",
    caption = "Data from fueleconomy.gov",
    y = "Highway Miles per Gallon",
    x = "Engine Displacement"
  )
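The per-class lines in this plot can also be compared as numbers. The sketch below fits hwy ~ displ separately within each class and collects the slopes; the name `slopes` and the use of base R's split() and sapply() are my own choices for illustration.

```r
library(ggplot2)  # provides the mpg data set

# Fit a separate linear model of hwy on displ within each car class
# and keep only the displacement slope from each fit.
slopes <- sapply(
  split(mpg, mpg$class),
  function(d) coef(lm(hwy ~ displ, data = d))[["displ"]]
)

# One slope per class, sorted from most negative to least negative.
sort(slopes)
```

Some classes contain only a handful of cars, so their slopes rest on very few points and should be read with caution.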

Another way to model and visualize the relationship between fuel efficiency and engine displacement after accounting for car class is to regress fuel efficiency on car class, and then plot the residuals of that regression against engine displacement. The residuals of the first regression are the variation in fuel efficiency not explained by car class. The relationship between fuel efficiency and engine displacement is attenuated after accounting for car class.
mod <- lm(hwy ~ class, data = mpg)
mpg %>%
  add_residuals(mod) %>%
  ggplot(aes(x = displ, y = resid)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    title = "Engine size has little effect on fuel efficiency",
    subtitle = "After accounting for car class",
    caption = "Data from fueleconomy.gov",
    y = "Highway MPG Relative to Class Average",
    x = "Engine Displacement"
  )
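The y-axis label works because a regression on a single categorical predictor fits each class's mean, so every residual is a car's hwy minus the average hwy of its class. The sketch below checks that identity and then extracts the slope of the residuals on displacement; `class_dev` and `resid_fit` are names I introduce here for illustration.

```r
library(ggplot2)  # provides the mpg data set

mod <- lm(hwy ~ class, data = mpg)

# ave() returns, for each row, the mean hwy of that row's class.
class_dev <- mpg$hwy - ave(mpg$hwy, mpg$class)

# The residuals of hwy ~ class match the deviations from the class means.
all.equal(unname(resid(mod)), class_dev)

# Slope of the residuals on displacement: the line geom_smooth() draws above.
resid_fit <- lm(resid(mod) ~ mpg$displ)
coef(resid_fit)
```

This residual approach attributes any variation shared by class and displacement to class first; fitting lm(hwy ~ displ + class, data = mpg) would be an alternative that estimates both at once.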
