Execute the following cell to load the tidyverse
library:
library(tidyverse)
Execute the following cell to load the data. Refer to this
website http://archive.ics.uci.edu/ml/datasets/Auto+MPG for
details on the dataset:
autompg = read.table(
  "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data",
  quote = "\"",
  comment.char = "",
  stringsAsFactors = FALSE)
head(autompg, 20)
Question 1.1: print the structure of the unedited
dataset. How many samples (rows) and features (columns) are there?
str()
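As a minimal sketch of how to read the structure output, the example below uses a small made-up data frame (the name `toy` and its values are hypothetical, not part of the assignment data). `str()` lists one line per column with its type, and `dim()` returns the number of rows (samples) followed by the number of columns (features).

```r
# Hypothetical toy data frame, used only for illustration
toy = data.frame(
  V1 = c(18, 15, 16),
  V2 = c("a", "b", "c"),
  stringsAsFactors = FALSE)

str(toy)              # one line per column: name, type, first values
toy_dim = dim(toy)    # rows (samples) first, then columns (features)
toy_dim
```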
Execute the following cell to assign names to the columns of
the dataframe:
colnames(autompg) = c("mpg", "cyl", "disp", "hp", "wt", "acc", "year", "origin", "name")
Question 1.2: complete the code segment below to
remove samples whose horsepower (hp) value is missing. Missing values
are represented as “?” in the dataset.
autompg = autompg %>% filter()
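A sketch of the general idea on made-up data: `filter()` keeps the rows for which its condition is TRUE, so dropping a placeholder value means keeping every row that is not equal to it. The data frame `toy` and its values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: hp is read as character because of the "?" entries
toy = data.frame(
  mpg = c(25, 21, 40.9),
  hp = c("98", "?", "?"),
  stringsAsFactors = FALSE)

# Keep only rows whose hp is not the "?" placeholder
toy_clean = toy %>% filter(hp != "?")
nrow(toy_clean)
```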
Question 1.3: complete the code segment below to
remove samples with the name “plymouth reliant”.
autompg = autompg %>%
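One possible way to drop rows by name, sketched on made-up data, is to negate a `%in%` test inside `filter()`. This form extends naturally to several names at once. The data frame `toy` and the car names in it are hypothetical.

```r
library(dplyr)

# Hypothetical toy data
toy = data.frame(
  mpg = c(30, 27, 24),
  name = c("car a", "car b", "car a"),
  stringsAsFactors = FALSE)

# Keep rows whose name is NOT in the list of names to drop
toy_kept = toy %>% filter(!(name %in% c("car a")))
toy_kept$name
```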
Question 2.1: complete the code segment below to
select all features except ‘name’
autompg = autompg %>% select()
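As an illustration on made-up data, `select()` accepts a column name prefixed with a minus sign, which keeps every column except that one. The data frame `toy` is hypothetical.

```r
library(dplyr)

# Hypothetical toy data
toy = data.frame(
  mpg = c(18, 15),
  cyl = c(8, 8),
  label = c("x", "y"),
  stringsAsFactors = FALSE)

# Drop one column by name, keep the rest in their original order
toy_sel = toy %>% select(-label)
colnames(toy_sel)
```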
Execute the following cell to change the type of hp values
from character to numeric:
autompg$hp = as.numeric(autompg$hp)
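The short sketch below shows why the “?” rows have to be removed before this conversion: `as.numeric()` turns any string that is not a number into `NA` (with a warning). The vector `hp_raw` is a made-up example.

```r
# Hypothetical character vector with one non-numeric placeholder
hp_raw = c("130", "165", "?")

# The warning is suppressed here only to keep the example output clean
hp_num = suppressWarnings(as.numeric(hp_raw))
is.na(hp_num)   # FALSE FALSE TRUE
```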
Question 2.2: complete the code cell to recode the
‘origin’ column so that local models are 1 and international models are 0
autompg = autompg %>% mutate(origin = ifelse())
head(autompg, 20)
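A minimal sketch of recoding with `ifelse()` inside `mutate()`, on made-up data. It assumes that the raw column uses the codes 1, 2 and 3 and that code 1 marks the local models; check both assumptions against the dataset documentation before reusing the condition.

```r
library(dplyr)

# Hypothetical toy data; the codes 1, 2, 3 are an assumption
toy = data.frame(origin = c(1, 3, 2, 1))

# ifelse(test, yes, no): 1 where the test holds, 0 elsewhere
toy = toy %>% mutate(origin = ifelse(origin == 1, 1, 0))
toy$origin   # 1 0 0 1
```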
Question 2.3: print the structure of the dataframe.
What types are the columns ‘cyl’ and ‘origin’?
str()
Question 3.1: complete the code segment below to
change the types of ‘cyl’ and ‘origin’ columns to factor
catcols = c()
autompg[catcols] =
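One common pattern for converting several columns at once, sketched on made-up data, is to apply `factor` to each selected column with `lapply()` and assign the result back to the same columns. The data frame `toy` and the vector `cols` are hypothetical names.

```r
# Hypothetical toy data
toy = data.frame(
  mpg = c(18, 27, 15),
  cyl = c(8, 4, 8),
  origin = c(1, 0, 1))

# Columns to treat as categorical
cols = c("cyl", "origin")

# lapply returns a list of factors, one per selected column
toy[cols] = lapply(toy[cols], factor)
str(toy)
```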
Question 3.2: complete the code segment below to
create a scatter plot of mpg vs. displacement, with the points color-coded
by origin (local or international). Add axis labels and a
title to the plot. Comment on what you observe:
p = ggplot(data = , aes(x = , y = , color = )) +
p
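For reference, a sketch of the layers such a plot usually needs, built on made-up data: `geom_point()` draws the scatter, and `labs()` sets the axis labels, title and legend title. The data frame `toy`, its values and the label text are hypothetical. "mpg vs. displacement" is read here as mpg on the y axis and displacement on the x axis.

```r
library(ggplot2)

# Hypothetical toy data; origin is a factor so it gets a discrete color scale
toy = data.frame(
  disp = c(307, 97, 350, 113),
  mpg = c(18, 27, 15, 24),
  origin = factor(c(1, 0, 1, 0)))

p = ggplot(data = toy, aes(x = disp, y = mpg, color = origin)) +
  geom_point() +
  labs(
    x = "Displacement",
    y = "Miles per gallon",
    title = "Toy example: mpg vs. displacement",
    color = "Origin")
p
```

Keeping the grouping column as a factor matters here: a numeric column would be mapped to a continuous color gradient instead of one color per group.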