Exercise 9.5.1

Download from GitHub the data file Example_5.xls. Open it in Excel and figure out which sheet of data we should import into R. At the same time, figure out how many initial rows need to be skipped. Import the data set into a data frame and show the structure of the imported data using the str() command. Make sure that your data has n=31 observations and that the three columns are appropriately named. If you make any modifications to the data file, comment on those modifications.

data.5 <- read_excel(path='Example_5.xls', col_names=TRUE, sheet='RawData', skip=4)

str(data.5) 
## tibble [31 × 3] (S3: tbl_df/tbl/data.frame)
##  $ Girth : num [1:31] 8.3 8.6 8.8 10.5 10.7 10.8 11 11 11.1 11.2 ...
##  $ Height: num [1:31] 70 65 63 72 81 83 66 75 80 75 ...
##  $ Volume: num [1:31] 10.3 10.3 10.2 16.4 18.8 19.7 15.6 18.2 22.6 19.9 ...
# Reading the file in as above initially produced two additional columns filled entirely with NA, likely because some cells in the sheet had been cleared but not fully deleted. I went back, deleted the surrounding unused cells in the Excel file, and re-imported to make sure that only the three variable columns transferred to the data frame.
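
The stray all-NA columns described above could also be dropped in R instead of editing the spreadsheet. The following is only a minimal sketch under stated assumptions: `raw` is a toy stand-in for the imported data, and the column names `extra1` and `extra2` are hypothetical placeholders for whatever names the empty columns actually received on import.

```r
# Toy stand-in for the imported sheet: three real columns plus two stray
# all-NA columns (extra1/extra2 are hypothetical names for illustration).
raw <- data.frame(
  Girth  = c(8.3, 8.6, 8.8),
  Height = c(70, 65, 63),
  Volume = c(10.3, 10.3, 10.2),
  extra1 = NA,
  extra2 = NA
)

# A column is kept only if it holds at least one non-missing value.
has_data <- colSums(!is.na(raw)) > 0
trimmed  <- raw[, has_data, drop = FALSE]

str(trimmed)
```

This leaves the original Excel file untouched, so the cleanup is recorded in the script rather than done by hand.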

Exercise 9.5.2

Download from GitHub the data file Example_3.xls. Import the data set into a data frame and show the structure of the imported data using the tail() command, which shows the last few rows of a data table. Make sure the Tesla values are NA where appropriate and that both -9999 and NA are imported as NA values. If you make any modifications to the data file, comment on those modifications.

data.3 <- read_excel(path='Example_3.xls', col_names=TRUE, sheet='data', na=c("NA", "-9999"))

tail(data.3) %>% kable()
|model              |  mpg| cyl|  disp|  hp| drat|    wt|  qsec| vs| am| gear| carb|
|:------------------|----:|---:|-----:|---:|----:|-----:|-----:|--:|--:|----:|----:|
|Lotus Europa       | 30.4|   4|  95.1| 113| 3.77| 1.513| 16.90|  1|  1|    5|    2|
|Ford Pantera L     | 15.8|   8| 351.0| 264| 4.22| 3.170| 14.50|  0|  1|    5|    4|
|Ferrari Dino       | 19.7|   6| 145.0| 175| 3.62| 2.770| 15.50|  0|  1|    5|    6|
|Maserati Bora      | 15.0|   8| 301.0| 335| 3.54| 3.570| 14.60|  0|  1|    5|    8|
|Volvo 142E         | 21.4|   4| 121.0| 109| 4.11| 2.780| 18.60|  1|  1|    4|    2|
|Tesla ModelS P100D | 98.0|  NA|    NA| 778|   NA| 4.941| 10.41| NA|  0|    1|   NA|
# The sheet produced many extra columns and rows. I removed them with a series of deletes in the Excel file, but I wonder whether there is a more efficient process.
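
One possible answer to the question above is to remove the empty rows and columns after import rather than by hand in Excel. The sketch below is an illustration and not a tested solution for Example_3.xls: `drop_empty()` is a hypothetical helper, `messy` is invented toy data, and the approach assumes the stray cells arrive as NA rather than as empty strings.

```r
# Hypothetical helper: drop rows and columns that are entirely NA.
# Rows or columns with at least one real value are kept, so genuine
# missing values inside the data (like the Tesla NAs) survive.
drop_empty <- function(df) {
  keep_rows <- rowSums(!is.na(df)) > 0
  keep_cols <- colSums(!is.na(df)) > 0
  df[keep_rows, keep_cols, drop = FALSE]
}

# Toy stand-in for a messy import (values are illustrative only).
messy <- data.frame(
  model = c("Car A", NA, "Car B", NA),
  mpg   = c(21.4, NA, NA, NA),
  junk  = NA,
  stringsAsFactors = FALSE
)

clean <- drop_empty(messy)
print(clean)
```

Another option worth checking is the `range` argument of `read_excel()`, which restricts the import to a given cell rectangle. The right range depends on where the data sits in the sheet, so none is suggested here.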