Execute the following cell to load the tidyverse library:
library(tidyverse)
Execute the following cell to load the data (see http://archive.ics.uci.edu/ml/datasets/Auto+MPG for
details on the dataset):
autompg = read.table(
  "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data",
  quote = "\"",
  comment.char = "",
  stringsAsFactors = FALSE)
head(autompg, 20)
Task 1: print the structure of the unedited data
set. How many samples and features are there?
str()
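As a reminder of how to read the output, here is a minimal sketch on a small made-up data frame. The name `toy` and its columns are hypothetical and not part of the activity. The header line printed by `str()` reports the number of observations (rows, i.e. samples) and variables (columns, i.e. features); `nrow()`, `ncol()` and `dim()` return the same counts directly.

```r
# Hypothetical toy data frame, used only for illustration
toy = data.frame(
  V1 = c(18, 15, 16),
  V2 = c("a", "b", "c"),
  stringsAsFactors = FALSE)

str(toy)                 # header line reports observations and variables
n_samples = nrow(toy)    # number of rows (samples)
n_features = ncol(toy)   # number of columns (features)
dim(toy)                 # both counts as a vector: rows, then columns
```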
Execute the following cell to assign names to the columns of the
dataframe:
colnames(autompg) = c("mpg", "cyl", "disp", "hp", "wt", "acc", "year", "origin", "name")
Task 2: complete the code segment below to remove
samples whose horsepower (hp) value is missing. The dataset marks
missing values with a “?”.
autompg = autompg %>% filter(hp )
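The general pattern for dropping rows by value can be sketched on made-up data. The data frame `toy` and the result `toy_clean` below are hypothetical names chosen for illustration. `filter()` keeps the rows for which its condition is TRUE, so a "not equal" comparison removes the matching rows.

```r
library(dplyr)

# Hypothetical toy data: hp is character because of the "?" entry
toy = data.frame(
  hp = c("130", "?", "150"),
  name = c("car a", "car b", "car c"),
  stringsAsFactors = FALSE)

# Keep only the rows whose hp is not the "?" placeholder
toy_clean = toy %>% filter(hp != "?")
toy_clean
```

The same comparison works on any character column, for example to drop rows with a particular name.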
Task 3: complete the code segment below to remove
samples with the name “plymouth reliant”.
autompg = autompg %>%
Task 4: complete the code segment below to select
all features except ‘name’.
autompg = autompg %>% select()
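A minimal sketch of excluding a column with `select()`, again on made-up data (the names `toy`, `toy_small` and `label` are hypothetical). Prefixing a column name with a minus sign keeps every column except that one.

```r
library(dplyr)

# Hypothetical toy data frame with one text column we want to drop
toy = data.frame(
  mpg = c(18, 15),
  hp = c(130, 165),
  label = c("x", "y"),
  stringsAsFactors = FALSE)

# A leading minus sign tells select() to drop that column and keep the rest
toy_small = toy %>% select(-label)
names(toy_small)
```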
Execute the following cell to change the type of hp values from
character to numeric:
autompg$hp = as.numeric(autompg$hp)
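The short sketch below shows why the “?” rows are removed before this conversion. It uses a made-up character vector (`raw` and `converted` are hypothetical names). `as.numeric()` cannot parse “?”, so it returns NA for that element and issues a warning; the warning is suppressed here only to keep the output tidy.

```r
# Hypothetical character vector containing a "?" placeholder
raw = c("130", "?", "150")

# "?" is not a number, so it becomes NA (normally with a coercion warning)
converted = suppressWarnings(as.numeric(raw))
converted
is.na(converted)
```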
Execute the following code cell to recode the ‘origin’ column so that
origin 1 becomes 'local' and origins 2 and 3 become 'international':
autompg = autompg %>% mutate(origin = ifelse(!(origin %in% c(2, 3)), 'local', 'international'))
head(autompg, 20)
Task 5: print the structure of the dataframe. What
types are the columns ‘cyl’ and ‘origin’?
str()
Task 6: complete the code segment below to convert
the ‘cyl’ and ‘origin’ columns to factors.
catcols = c('cyl', 'origin')
autompg[catcols] = lapply(, )
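For reference, here is a sketch of converting several columns at once on made-up data (the names `toy`, `cols`, `gears` and `region` are hypothetical). `lapply()` takes a list, here the selected columns of the data frame, and a function to apply to each element; assigning the result back replaces those columns in place.

```r
# Hypothetical toy data frame with two categorical columns and one numeric column
toy = data.frame(
  gears = c(4, 6, 4),
  region = c("north", "south", "north"),
  wt = c(2.1, 3.4, 2.8),
  stringsAsFactors = FALSE)

cols = c("gears", "region")              # columns to treat as categorical
toy[cols] = lapply(toy[cols], factor)    # apply factor() to each selected column

sapply(toy, class)                       # check the resulting column types
```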
Task 7: complete the code segment below to create a
scatter plot of mpg vs. displacement, color coding the points
according to the origin (local or international). Comment on what you
observe:
p = ggplot(data = , aes(x = , y = , color = )) +
p
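The structure of a color-coded scatter plot can be sketched on made-up data (the names `toy`, `p_toy`, `size`, `score` and `group` are hypothetical). `aes()` maps columns to the x axis, the y axis and the point color, and a `geom_point()` layer added with `+` draws the points. In a plot described as "A vs. B", A conventionally goes on the y axis and B on the x axis.

```r
library(ggplot2)

# Hypothetical toy data: two numeric columns and one factor used for color
toy = data.frame(
  size = c(1.0, 2.0, 3.0, 4.0),
  score = c(9, 7, 4, 2),
  group = factor(c("a", "a", "b", "b")))

# aes() maps columns to axes and color; geom_point() draws the scatter layer
p_toy = ggplot(data = toy, aes(x = size, y = score, color = group)) +
  geom_point()
p_toy
```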