readr training

1. Purpose.

The purpose of this notebook is to illustrate how the readr package can be used to read .csv files into R, and how the readxl package can be used to read Excel spreadsheets.

2. Load libraries.

library(readr)

3. Read a clean csv into R.

df <- readr::read_csv("test_data_1.csv")

## Parsed with column specification:
## cols(
##   fruit = col_character(),
##   count = col_integer()
## )

df

## # A tibble: 3 x 2
##   fruit  count
##   <chr>  <int>
## 1 Apple      1
## 2 Banana     2
## 3 Carrot    NA
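
The call above relies on readr guessing the column types and printing the specification it chose. A minimal sketch of declaring the types explicitly instead is given below. The contents of test_data_1.csv are not shown in this notebook, so the lines written to the temporary file are an assumption reconstructed from the printed tibble.

```r
library(readr)

# Assumed contents of test_data_1.csv, reconstructed from the printed tibble.
csv_lines <- c(
  "fruit,count",
  "Apple,1",
  "Banana,2",
  "Carrot,NA"
)

# Write the assumed contents to a temporary file so the example is self-contained.
path <- tempfile(fileext = ".csv")
writeLines(csv_lines, path)

# Declaring col_types up front skips type guessing and the parsing message.
df <- read_csv(
  path,
  col_types = cols(
    fruit = col_character(),
    count = col_integer()
  )
)

df
```

Stating the types in the call means a change in the file (for example, text appearing in the count column) surfaces as a parsing problem rather than as a silently different column type.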

4. Read a messy csv into R.

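When reading a messy file, make sure the header row is in the right place, that footer rows are not read in as observations, and that non-blank missing-value codes (such as "C" for confidential) are identified as NA. Any other data cleaning can be done with stringr and dplyr.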

readr::read_csv("test_data_2.csv")

## Warning: Missing column names filled in: 'X2' [2]

## Parsed with column specification:
## cols(
##   `A dataset proudly brought to you by StatsNZ` = col_character(),
##   X2 = col_character()
## )

## # A tibble: 7 x 2
##   `A dataset proudly brought to you by StatsNZ` X2   
##   <chr>                                         <chr>
## 1 Jun-18                                        <NA> 
## 2 fruit                                         count
## 3 Apple                                         1    
## 4 Banana                                        2    
## 5 Carrot                                        C    
## 6 <NA>                                          <NA> 
## 7 C means confidential                          <NA>

readr::read_csv("test_data_2.csv", skip = 2)

## Parsed with column specification:
## cols(
##   fruit = col_character(),
##   count = col_character()
## )

## # A tibble: 5 x 2
##   fruit                count
##   <chr>                <chr>
## 1 Apple                1    
## 2 Banana               2    
## 3 Carrot               C    
## 4 <NA>                 <NA> 
## 5 C means confidential <NA>

readr::read_csv("test_data_2.csv", skip = 2, n_max = 3)

## Parsed with column specification:
## cols(
##   fruit = col_character(),
##   count = col_character()
## )

## # A tibble: 3 x 2
##   fruit  count
##   <chr>  <chr>
## 1 Apple  1    
## 2 Banana 2    
## 3 Carrot C

readr::read_csv("test_data_2.csv", skip = 2, n_max = 3, na = c("C"))

## Parsed with column specification:
## cols(
##   fruit = col_character(),
##   count = col_integer()
## )

## # A tibble: 3 x 2
##   fruit  count
##   <chr>  <int>
## 1 Apple      1
## 2 Banana     2
## 3 Carrot    NA
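
The steps above, followed by some cleaning with stringr and dplyr, can be sketched end to end as follows. The file contents are an assumption reconstructed from the output of the first, unskipped read, and the cleaning steps (lower-casing the names and dropping confidential rows) are purely illustrative choices, not part of the original workflow.

```r
library(readr)
library(dplyr)
library(stringr)

# Assumed contents of test_data_2.csv, reconstructed from the unskipped read above.
messy_lines <- c(
  "A dataset proudly brought to you by StatsNZ,",
  "Jun-18,",
  "fruit,count",
  "Apple,1",
  "Banana,2",
  "Carrot,C",
  ",",
  "C means confidential,"
)

path <- tempfile(fileext = ".csv")
writeLines(messy_lines, path)

# Skip the two title lines, stop before the footer, and treat "C" as missing.
raw <- read_csv(
  path,
  skip = 2,
  n_max = 3,
  na = "C",
  col_types = cols(fruit = col_character(), count = col_integer())
)

# Illustrative cleaning: lower-case the names and drop confidential rows.
clean <- raw %>%
  mutate(fruit = str_to_lower(fruit)) %>%
  filter(!is.na(count))

clean
```

Note that setting na = "C" replaces the default missing-value strings, so if blank cells should also be read as NA, something like na = c("", "NA", "C") would be needed.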

5. Read an Excel spreadsheet into R.

readxl::excel_sheets("test_data_1.xlsx")

## [1] "fruit"      "vegetables"

readxl::read_excel("test_data_1.xlsx", sheet = 1)

## # A tibble: 3 x 2
##   fruit  count
##   <chr>  <dbl>
## 1 Apple      1
## 2 Banana     2
## 3 Carrot    NA

readxl::read_excel("test_data_1.xlsx", sheet = "vegetables")

## # A tibble: 3 x 2
##   vegetable count
##   <chr>     <dbl>
## 1 Tomato        5
## 2 Beans         2
## 3 Spinach       1
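
When a workbook has several sheets, the two calls above can be combined to read them all in one go. The sketch below is one way to do this with base R's lapply. Because test_data_1.xlsx is not distributed with this notebook, it assumes the example workbook datasets.xlsx that ships with readxl as a stand-in.

```r
library(readxl)

# readxl_example() returns the path to a workbook bundled with readxl;
# it stands in here for test_data_1.xlsx.
path <- readxl_example("datasets.xlsx")

sheet_names <- excel_sheets(path)

# Read each sheet into its own tibble and name the list elements after the sheets.
all_sheets <- lapply(sheet_names, function(s) read_excel(path, sheet = s))
names(all_sheets) <- sheet_names

# One tibble per sheet, accessible by sheet name.
names(all_sheets)
```

read_excel also accepts skip, n_max and na arguments, so messy spreadsheets can be handled in much the same way as the messy csv in section 4.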

David Hodge

June 2018