Learning to Import Data

Load Packages

haven lets you load foreign data formats (SAS, SPSS, and Stata) into R by wrapping the ReadStat C library written by Evan Miller.

https://www.rdocumentation.org/packages/haven/versions/0.2.0

if (!require(haven)){
  install.packages("haven", dependencies = TRUE)
  require(haven)
}
Loading required package: haven

The tidyverse makes it easy for us to tidy, clean, manipulate, and rearrange our data.

https://www.rdocumentation.org/packages/tidyverse/versions/1.3.2

if (!require(tidyverse)){
  install.packages("tidyverse", dependencies = TRUE)
  require(tidyverse)
}
Loading required package: tidyverse
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0      ✔ purrr   1.0.1 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.2.1      ✔ stringr 1.5.0 
✔ readr   2.1.3      ✔ forcats 0.5.2 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

The openxlsx package simplifies the creation of .xlsx files by providing a high-level interface for writing, styling, and editing worksheets. It can also read existing .xlsx files, which is how it is used here.

https://www.rdocumentation.org/packages/openxlsx/versions/4.2.5.1

if (!require(openxlsx)){
  install.packages("openxlsx", dependencies = TRUE)
  require(openxlsx)
}
Loading required package: openxlsx
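
The three install-if-missing blocks above follow the same pattern, so they could be folded into one small helper. This is only a sketch: `load_or_install` is a hypothetical name, not a function from any of the packages used here.

```r
# Hypothetical helper: install a package only when it is missing, then attach it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, dependencies = TRUE)
  }
  library(pkg, character.only = TRUE)
}

# "tools" ships with R, so this call never triggers an install.
load_or_install("tools")
```

For this tutorial the equivalent call would be `invisible(lapply(c("haven", "tidyverse", "openxlsx"), load_or_install))`.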

Import Data

dataset.csv <- read_csv("Harry Potter Data.csv")
Rows: 124 Columns: 90
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (90): StartDate, EndDate, Status, IPAddress, Progress, Duration (in seco...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
dataset.xls <- read.xlsx("Harry Potter Data.xlsx")
dataset.sav <- read_spss("Harry Potter Data.sav")
dataset.spss.web <- read_sav("https://osf.io/kd4ej/download")
dataset.spss.web <- read_sav("https://osf.io/download/kd4ej/")
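
SPSS files usually carry value labels, and haven keeps them as labelled vectors rather than factors. The sketch below shows one way to turn them into factors after import. The `house` column and its labels are made-up stand-ins, not columns known to exist in the Harry Potter data, and the file is written to a temporary path so the example runs without the real .sav file.

```r
library(haven)

# Made-up example data: a numeric column with SPSS-style value labels.
survey <- tibble::tibble(
  house = labelled(c(1, 2, 1), labels = c(Gryffindor = 1, Slytherin = 2))
)

# Round-trip through a temporary .sav file to mimic an SPSS import.
path <- tempfile(fileext = ".sav")
write_sav(survey, path)
imported <- read_sav(path)

# The imported column is a labelled vector, not a factor.
class(imported$house)

# as_factor() converts labelled columns to factors using the value labels.
converted <- as_factor(imported)
levels(converted$house)
```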

Bonus Points

CSV from the web

dataset.csv.web <- read_csv("https://osf.io/download/wtghz/")
Rows: 124 Columns: 90
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (90): StartDate, EndDate, Status, IPAddress, Progress, Duration (in seco...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
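
The readr message above offers two ways to quiet the column specification: set `show_col_types = FALSE` or supply the column types yourself. The sketch below shows both, plus `type_convert()` for re-guessing types when every column arrives as character, as happens in the output above. The inline CSV text and its column names are made-up stand-ins for the survey file.

```r
library(readr)

# Made-up stand-in for the survey file; I() marks the string as literal data.
csv_text <- I("Progress,Status,Score\n100,Complete,4.5\n80,Partial,3\n")

# Option 1: let readr guess the types, but suppress the message.
quiet <- read_csv(csv_text, show_col_types = FALSE)

# Option 2: state the column types explicitly (this also silences the message).
explicit <- read_csv(
  csv_text,
  col_types = cols(
    Progress = col_double(),
    Status   = col_character(),
    Score    = col_double()
  )
)

# If everything was read as character, type_convert() re-guesses the types.
raw   <- read_csv(csv_text, col_types = cols(.default = col_character()))
typed <- type_convert(raw)
```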

Excel from the web

dataset.xlsx.web <- read.xlsx("https://osf.io/download/7fz89/")
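
read.xlsx() accepts the URL directly, as above. If a direct read fails, one possible fallback is to download the file to a temporary path first and read it from there. The helper name `read_xlsx_via_tempfile` is hypothetical, and this is a sketch rather than a required step.

```r
library(openxlsx)

# Hypothetical helper: download an .xlsx file to a temporary path, then read it.
read_xlsx_via_tempfile <- function(url) {
  destfile <- tempfile(fileext = ".xlsx")
  # Remove the temporary copy once the data has been read.
  on.exit(unlink(destfile), add = TRUE)
  # mode = "wb" keeps the binary file intact on Windows.
  download.file(url, destfile, mode = "wb", quiet = TRUE)
  read.xlsx(destfile)
}
```

With the link from this section, the call would be `read_xlsx_via_tempfile("https://osf.io/download/7fz89/")`.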