Welcome to this guide on importing data into R! This handout will walk you through importing CSV, Excel, and Stata (dta) files into R, as well as provide tips for addressing issues with other data formats.
Comma-separated values (CSV) files are one of the most common data formats. You can import them with base R's read.csv() or with read_csv() from the readr package, which is loaded as part of the tidyverse.
Using Base R:
# Import a CSV file
my_data <- read.csv("path/to/your/file.csv")
Using the tidyverse package:
# Install and load the package
install.packages("tidyverse")
library(tidyverse)
# Import a CSV file
my_data <- read_csv("path/to/your/file.csv")
Note: You only need one of these approaches for CSV imports, not both. Base R's read.csv() requires no installation, while read_csv() requires the tidyverse (or just the readr package). You can also install packages through RStudio’s point-and-click interface under the “Packages” tab.
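To try the base R route without a real file, the following minimal sketch writes a small CSV to a temporary location and reads it back. The column names and values are made up for illustration; in practice you would replace tmp with the path to your own file.

```r
# Write a tiny example CSV to a temporary file (illustrative data only)
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name,score",
             "1,Ana,90",
             "2,Ben,NA",
             "3,Cy,75"), tmp)

# Read it back; na.strings tells R which text values mean "missing"
my_data <- read.csv(tmp, na.strings = "NA", stringsAsFactors = FALSE)

# Inspect the result: column types and the first rows
str(my_data)
head(my_data)
```

Checking the output of str() right after an import is a quick way to confirm that each column was read with the type you expect.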
For Excel files, you can use the readxl or openxlsx packages.
Using the readxl package:
# Install and load the package
install.packages("readxl")
library(readxl)
# Import an Excel file
my_data <- read_excel("path/to/your/file.xlsx", sheet = 1)
Using the openxlsx package:
# Install and load the package
install.packages("openxlsx")
library(openxlsx)
# Import an Excel file
my_data <- read.xlsx("path/to/your/file.xlsx", sheet = 1)
Note: readxl reads both .xls and .xlsx files but cannot write files. openxlsx can both read and write, but it supports only .xlsx files, not the older .xls format. Install whichever package fits your needs.
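When a workbook has several sheets, it can help to list the sheet names before importing. The sketch below is a minimal illustration that assumes both openxlsx and readxl are installed; it writes a made-up data frame to a temporary .xlsx file so that it can run without an existing workbook.

```r
# Assumes openxlsx and readxl are installed; the sketch skips itself otherwise
if (requireNamespace("openxlsx", quietly = TRUE) &&
    requireNamespace("readxl", quietly = TRUE)) {

  # Write a small data frame to a temporary .xlsx file (illustrative data)
  tmp <- tempfile(fileext = ".xlsx")
  openxlsx::write.xlsx(data.frame(id = 1:3, score = c(90, 85, 75)), tmp)

  # List the sheet names, then read the first sheet by name
  sheets <- readxl::excel_sheets(tmp)
  my_data <- readxl::read_excel(tmp, sheet = sheets[1])
  print(my_data)
}
```

Referring to a sheet by name rather than by position tends to be more robust if someone later reorders the sheets in the workbook.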
Stata files have the .dta extension. The haven and foreign packages are popular options.
Using the haven package:
# Install and load the package
install.packages("haven")
library(haven)
# Import a Stata file
my_data <- read_dta("path/to/your/file.dta")
Using the foreign package:
# Install and load the package
install.packages("foreign")
library(foreign)
# Import a Stata file
my_data <- read.dta("path/to/your/file.dta")
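Tip: The haven package preserves variable and value labels, making it ideal for social science data. It also reads files saved by recent Stata versions, whereas read.dta() from the foreign package only supports files from Stata versions 5 through 12.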
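To see what preserved labels look like in practice, here is a minimal sketch that assumes haven is installed. The data set, the variable name sex, and its labels are hypothetical; the example creates its own temporary .dta file so that it does not depend on a real Stata file.

```r
# Assumes haven is installed; the sketch skips itself otherwise
if (requireNamespace("haven", quietly = TRUE)) {

  # Build a small labelled data set (illustrative; not from a real survey)
  survey <- data.frame(id = 1:3)
  survey$sex <- haven::labelled(c(1, 2, 1),
                                labels = c(Male = 1, Female = 2),
                                label = "Respondent sex")

  # Round-trip through a temporary Stata file
  tmp <- tempfile(fileext = ".dta")
  haven::write_dta(survey, tmp)
  my_data <- haven::read_dta(tmp)

  # The variable label is stored as an attribute on the column
  var_label <- attr(my_data$sex, "label")
  print(var_label)

  # Convert the value labels into an ordinary R factor
  sex_factor <- haven::as_factor(my_data$sex)
  print(sex_factor)
}
```

Converting labelled columns with as_factor() is usually the step that makes them work smoothly with modelling and plotting functions.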
R has extensive support for various file types. Here are some general tips to handle potential issues:
Missing Packages: Install missing packages using install.packages(). You can also use RStudio’s “Packages” tab for easy installation.
File Encoding: For files with special characters, specify the encoding. Example: read.csv("file.csv", fileEncoding = "UTF-8").
Large Files: Use the data.table or vroom package for efficient handling of large datasets.
install.packages("data.table")
library(data.table)
large_data <- fread("path/to/large_file.csv")
For SPSS files: Use read_sav() from the haven package.
For JSON: Use fromJSON() from the jsonlite package.
For SQL Databases: Use the DBI package together with a database-specific backend, such as RSQLite for SQLite databases.
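The last two formats can be sketched as follows, assuming jsonlite, DBI, and RSQLite are installed. The JSON text, the table name scores, and the query are made up for illustration. fromJSON() also accepts a file path or URL in place of the JSON string, and the in-memory SQLite database stands in for whatever database you actually connect to.

```r
# JSON: parse a JSON array of records into a data frame
if (requireNamespace("jsonlite", quietly = TRUE)) {
  json_text <- '[{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ben"}]'
  json_data <- jsonlite::fromJSON(json_text)
  print(json_data)
}

# SQL: query an in-memory SQLite database through DBI
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

  # Create an example table (illustrative data only)
  DBI::dbWriteTable(con, "scores",
                    data.frame(id = 1:3, score = c(90, 85, 75)))

  # Pull only the rows you need into a data frame
  sql_data <- DBI::dbGetQuery(con,
                              "SELECT id, score FROM scores WHERE score > 80")

  # Always close the connection when finished
  DBI::dbDisconnect(con)
  print(sql_data)
}
```

Filtering in the SQL query, rather than after import, keeps the amount of data transferred into R small.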