1. Purpose.

This notebook illustrates how to read .csv files into R with the readr package, and Excel spreadsheets with the readxl package.

2. Load libraries.

library(readr)

3. Read a clean csv into R.

df <- readr::read_csv("test_data_1.csv")
## Parsed with column specification:
## cols(
##   fruit = col_character(),
##   count = col_integer()
## )
df
## # A tibble: 3 x 2
##   fruit  count
##   <chr>  <int>
## 1 Apple      1
## 2 Banana     2
## 3 Carrot    NA
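
The column specification message above can be avoided by stating the column types up front. The following is a minimal sketch, assuming a file with the same contents as test_data_1.csv (recreated here in a temporary file, since the original file is not shown). Newer readr releases may guess `count` as double rather than integer, so being explicit also keeps the result stable. The name `df_typed` is illustrative.

```r
library(readr)

# Assumed contents of test_data_1.csv, written to a temporary file.
path <- tempfile(fileext = ".csv")
writeLines(c("fruit,count", "Apple,1", "Banana,2", "Carrot,NA"), path)

# Declare the column types explicitly instead of relying on guessing.
df_typed <- read_csv(
  path,
  col_types = cols(
    fruit = col_character(),
    count = col_integer()
  )
)
df_typed
```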

4. Read a messy csv into R.

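When reading a messy file, make sure the header row is taken from the right line (skip), footer rows are not read as observations (n_max), and non-blank missing-value codes such as "C" are treated as NA (na). Any further cleaning can be done with stringr and dplyr.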

readr::read_csv("test_data_2.csv")
## Warning: Missing column names filled in: 'X2' [2]
## Parsed with column specification:
## cols(
##   `A dataset proudly brought to you by StatsNZ` = col_character(),
##   X2 = col_character()
## )
## # A tibble: 7 x 2
##   `A dataset proudly brought to you by StatsNZ` X2   
##   <chr>                                         <chr>
## 1 Jun-18                                        <NA> 
## 2 fruit                                         count
## 3 Apple                                         1    
## 4 Banana                                        2    
## 5 Carrot                                        C    
## 6 <NA>                                          <NA> 
## 7 C means confidential                          <NA>
readr::read_csv("test_data_2.csv", skip = 2)
## Parsed with column specification:
## cols(
##   fruit = col_character(),
##   count = col_character()
## )
## # A tibble: 5 x 2
##   fruit                count
##   <chr>                <chr>
## 1 Apple                1    
## 2 Banana               2    
## 3 Carrot               C    
## 4 <NA>                 <NA> 
## 5 C means confidential <NA>
readr::read_csv("test_data_2.csv", skip = 2, n_max = 3)
## Parsed with column specification:
## cols(
##   fruit = col_character(),
##   count = col_character()
## )
## # A tibble: 3 x 2
##   fruit  count
##   <chr>  <chr>
## 1 Apple  1    
## 2 Banana 2    
## 3 Carrot C
readr::read_csv("test_data_2.csv", skip = 2, n_max = 3, na = c("C"))
## Parsed with column specification:
## cols(
##   fruit = col_character(),
##   count = col_integer()
## )
## # A tibble: 3 x 2
##   fruit  count
##   <chr>  <int>
## 1 Apple      1
## 2 Banana     2
## 3 Carrot    NA
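
The three arguments used above can be tried in a reproducible sketch. The file contents below are an assumption reconstructed from the printed output, not the actual test_data_2.csv, and `col_types` is added so that the result does not depend on readr's type guessing.

```r
library(readr)

# Assumed layout of the messy file:
# title line, date line, header, three data rows, blank row, footnote.
messy <- c(
  "A dataset proudly brought to you by StatsNZ,",
  "Jun-18,",
  "fruit,count",
  "Apple,1",
  "Banana,2",
  "Carrot,C",
  ",",
  "C means confidential,"
)
path <- tempfile(fileext = ".csv")
writeLines(messy, path)

clean <- read_csv(
  path,
  skip = 2,   # drop the title and date lines above the header
  n_max = 3,  # stop before the blank row and the footnote
  na = "C",   # treat the confidentiality code as missing
  col_types = cols(fruit = col_character(), count = col_integer())
)
clean
```

Note that passing `na = "C"` replaces the default missing-value strings; use `na = c("", "NA", "C")` if blank cells and literal NA should also be read as missing.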

5. Read an Excel spreadsheet into R.

readxl::excel_sheets("test_data_1.xlsx")
## [1] "fruit"      "vegetables"
readxl::read_excel("test_data_1.xlsx", sheet = 1)
## # A tibble: 3 x 2
##   fruit  count
##   <chr>  <dbl>
## 1 Apple      1
## 2 Banana     2
## 3 Carrot    NA
readxl::read_excel("test_data_1.xlsx", sheet = "vegetables")
## # A tibble: 3 x 2
##   vegetable count
##   <chr>     <dbl>
## 1 Tomato        5
## 2 Beans         2
## 3 Spinach       1
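
When a workbook has several sheets, it can be convenient to read them all in one step. The following is a minimal sketch, assuming the example workbook bundled with readxl (located with `readxl_example()`) as a stand-in for test_data_1.xlsx. The names `sheets` and `all_sheets` are illustrative.

```r
library(readxl)

# Example workbook shipped with readxl, used here in place of test_data_1.xlsx.
path <- readxl_example("datasets.xlsx")

# Read every sheet into a named list of tibbles, one element per sheet.
sheets <- excel_sheets(path)
all_sheets <- lapply(
  setNames(sheets, sheets),
  function(s) read_excel(path, sheet = s)
)

names(all_sheets)
```

`read_excel()` also accepts `skip`, `n_max`, and `na`, so the clean-up steps shown for `read_csv()` carry over to spreadsheets.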