Download from GitHub the data file Example_5.xls. Open it in Excel and figure out which sheet of data we should import into R. At the same time figure out how many initial rows need to be skipped. Import the data set into a data frame and show the structure of the imported data using the str() command. Make sure that your data has n=31 observations and the three columns are appropriately named. If you make any modifications to the data file, comment on those modifications.
library(readxl)  # provides read_excel()
data.5 <- read_excel(path = 'Example_5.xls', col_names = TRUE, sheet = 'RawData', skip = 4)
str(data.5)
## tibble [31 × 3] (S3: tbl_df/tbl/data.frame)
## $ Girth : num [1:31] 8.3 8.6 8.8 10.5 10.7 10.8 11 11 11.1 11.2 ...
## $ Height: num [1:31] 70 65 63 72 81 83 66 75 80 75 ...
## $ Volume: num [1:31] 10.3 10.3 10.2 16.4 18.8 19.7 15.6 18.2 22.6 19.9 ...
# Reading the file in as above initially produced two extra columns filled entirely with NA, likely because some cells in the sheet had not been fully deleted. I went back, deleted the surplus columns in the Excel file, and reimported so that only the three variable columns reach the data frame.
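An alternative to editing the workbook is to keep only the expected columns after import and then assert the shape the exercise asks for. The sketch below is a minimal illustration, not the required solution. It assumes the values shown above correspond to R's built-in `trees` data set, which is used here only as a stand-in for the imported sheet, and the column names `extra1` and `extra2` are hypothetical placeholders for the stray NA columns.

```r
# Stand-in for the raw import: built-in `trees` plus two stray all-NA columns.
# (`extra1` and `extra2` are hypothetical names, not taken from the real file.)
raw <- cbind(trees, extra1 = NA, extra2 = NA)

# Keep only the three variables the exercise expects, by name.
wanted <- c("Girth", "Height", "Volume")
clean <- raw[, wanted]

# Fail loudly if the import does not have the required shape.
stopifnot(
  nrow(clean) == 31,
  identical(names(clean), wanted)
)

str(clean)
```

Selecting by name rather than by position means the check still holds if the stray columns appear somewhere other than the right-hand edge of the sheet.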
Download from GitHub the data file Example_3.xls. Import the data set into a data frame and show the last few rows of the imported data using the tail() command. Make sure the Tesla values are NA where appropriate and that both -9999 and NA are imported as NA values. If you make any modifications to the data file, comment on those modifications.
library(dplyr)  # provides the %>% pipe
library(knitr)  # provides kable()
data.3 <- read_excel(path = 'Example_3.xls', col_names = TRUE, sheet = 'data', na = c("NA", "-9999"))
tail(data.3) %>% kable()
| model | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Lotus Europa | 30.4 | 4 | 95.1 | 113 | 3.77 | 1.513 | 16.90 | 1 | 1 | 5 | 2 |
| Ford Pantera L | 15.8 | 8 | 351.0 | 264 | 4.22 | 3.170 | 14.50 | 0 | 1 | 5 | 4 |
| Ferrari Dino | 19.7 | 6 | 145.0 | 175 | 3.62 | 2.770 | 15.50 | 0 | 1 | 5 | 6 |
| Maserati Bora | 15.0 | 8 | 301.0 | 335 | 3.54 | 3.570 | 14.60 | 0 | 1 | 5 | 8 |
| Volvo 142E | 21.4 | 4 | 121.0 | 109 | 4.11 | 2.780 | 18.60 | 1 | 1 | 4 | 2 |
| Tesla ModelS P100D | 98.0 | NA | NA | 778 | NA | 4.941 | 10.41 | NA | 0 | 1 | NA |
# The sheet contained many extra columns and rows. I removed them with a series of manual deletes in Excel, but I'm wondering whether there is a more efficient process.
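One possibly more efficient process is to leave the workbook untouched and trim the padding in R after import. The sketch below defines a hypothetical helper, `drop_empty()`, that removes every row and every column consisting entirely of NA. It runs on a stand-in built from the last rows of R's built-in `mtcars` rather than on Example_3.xls itself, so the padded row and column names are assumptions. `read_excel()` also accepts a `range` argument that restricts the import to a given cell rectangle, which may avoid the problem at the source when the extent of the data is known.

```r
# Hypothetical helper: drop rows and columns that contain nothing but NA.
drop_empty <- function(df) {
  not_na <- !is.na(df)  # logical matrix, TRUE where a value is present
  df[rowSums(not_na) > 0, colSums(not_na) > 0, drop = FALSE]
}

# Stand-in for a messy import: three real rows padded with one empty
# column and two empty rows (all padding names are made up).
padded <- cbind(tail(mtcars, 3), extra = NA)
padded[c("blank1", "blank2"), ] <- NA

trimmed <- drop_empty(padded)

# The padding is gone and the original rows and columns survive.
stopifnot(identical(dim(trimmed), dim(tail(mtcars, 3))))
print(trimmed)
```

A row with genuine missing values, such as the Tesla entry, is kept because it still holds at least one non-NA value; only fully empty rows and columns are removed.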