The haven package allows us to import many file types, including .sav. First, you need to install the haven package:
install.packages("haven")
From here, we can use the haven package to import the .sav file:
dat_haven <- haven::read_sav(file.choose())
Now, we can use the head function to examine our data:
head(dat_haven)
## # A tibble: 6 x 13
## ID gender age class agen_1 agen_2 agen_3 achiev_1 achiev_2 achiev_3
## <dbl> <chr> <dbl> <chr> <dbl+lb> <dbl+lb> <dbl+lb> <dbl+lb> <dbl+lb> <dbl+lb>
## 1 1028 female 14 A 2 [Disa~ 2 [Disa~ 3 [Neit~ 3 [Neit~ 4 [Agre~ 3 [Neit~
## 2 1049 male 13 A 3 [Neit~ 3 [Neit~ 2 [Disa~ 2 [Disa~ 3 [Neit~ 3 [Neit~
## 3 1052 female 14 B 3 [Neit~ 3 [Neit~ 3 [Neit~ 2 [Disa~ 2 [Disa~ 3 [Neit~
## 4 1063 male 14 A 3 [Neit~ 3 [Neit~ 3 [Neit~ 2 [Disa~ 2 [Disa~ 3 [Neit~
## 5 1119 male 13 A 3 [Neit~ 3 [Neit~ 3 [Neit~ 3 [Neit~ 4 [Agre~ 4 [Agre~
## 6 1142 female 13 A 3 [Neit~ 3 [Neit~ 3 [Neit~ 3 [Neit~ 4 [Agre~ 3 [Neit~
## # ... with 3 more variables: motiv_1 <dbl+lbl>, motiv_2 <dbl+lbl>,
## # motiv_3 <dbl+lbl>
Looking at the output, we can see that our data was read in as a tibble (a type of data frame used in the tidyverse), and includes label information that was part of the original .sav file.
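If the label information gets in the way later, haven can convert labelled columns to factors or strip the labels entirely. The sketch below uses a small made-up tibble in place of dat_haven; the values and label set are assumptions for illustration, not the contents of the real file.

```r
library(haven)

# Hypothetical value labels, standing in for those stored in the .sav file.
agree_labels <- c("Disagree" = 2, "Neither Agree or Disagree" = 3, "Agree" = 4)

# Toy stand-in for dat_haven with one labelled scale item.
toy <- tibble::tibble(
  ID     = c(1028, 1049, 1052),
  agen_1 = labelled(c(2, 3, 3), labels = agree_labels)
)

# Option 1: turn labelled columns into factors that use the label text.
toy_factor <- as_factor(toy)

# Option 2: drop the value labels and keep only the numeric codes.
toy_numeric <- zap_labels(toy)

str(toy_factor$agen_1)
str(toy_numeric$agen_1)
```

Which option fits depends on the analysis: factors are convenient for tables and plots, while plain numeric codes are usually what modeling functions expect.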
Another package that can be used to import a .sav file is the foreign package. This package comes pre-installed with R as part of the default package list, so we do not need to install it.
By default, foreign reads the data as a list, but we can set the to.data.frame argument to have foreign convert the data into a data frame.
dat_foreign <- foreign::read.spss(file.choose(),
                                  to.data.frame = TRUE)  # TRUE is safer than T, which can be overwritten
## re-encoding from UTF-8
Now, we can use the head function again to examine our foreign data:
head(dat_foreign)
## ID gender age class agen_1 agen_2
## 1 1028 female 14 A Disagree Disagree
## 2 1049 male 13 A Neither Agree or Disagree Neither Agree or Disagree
## 3 1052 female 14 B Neither Agree or Disagree Neither Agree or Disagree
## 4 1063 male 14 A Neither Agree or Disagree Neither Agree or Disagree
## 5 1119 male 13 A Neither Agree or Disagree Neither Agree or Disagree
## 6 1142 female 13 A Neither Agree or Disagree Neither Agree or Disagree
## agen_3 achiev_1 achiev_2
## 1 Neither Agree or Disagree Neither Agree or Disagree Agree
## 2 Disagree Disagree Neither Agree or Disagree
## 3 Neither Agree or Disagree Disagree Disagree
## 4 Neither Agree or Disagree Disagree Disagree
## 5 Neither Agree or Disagree Neither Agree or Disagree Agree
## 6 Neither Agree or Disagree Neither Agree or Disagree Agree
## achiev_3 motiv_1 motiv_2
## 1 Neither Agree or Disagree Neither Agree or Disagree Disagree
## 2 Neither Agree or Disagree Neither Agree or Disagree Agree
## 3 Neither Agree or Disagree Disagree Neither Agree or Disagree
## 4 Neither Agree or Disagree Neither Agree or Disagree Neither Agree or Disagree
## 5 Agree Disagree Neither Agree or Disagree
## 6 Neither Agree or Disagree Disagree Agree
## motiv_3
## 1 Neither Agree or Disagree
## 2 Agree
## 3 Neither Agree or Disagree
## 4 Disagree
## 5 Disagree
## 6 Agree
Looking at this output, we can see that our data was imported as a data frame, and that some of the label information from the original .sav file was dropped. We also see that our scale variables were read in as factors, with the value labels used as the factor levels.
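If you need the scale items as numbers rather than factors, one base R approach is to convert each factor to its integer level codes. The sketch below uses a toy data frame in place of dat_foreign, and the five-level response scale is an assumption. Note that as.integer() returns the position of each level, which matches the original SPSS codes only when the levels are complete and in order. read.spss() also has a use.value.labels argument that can be set to FALSE to keep the numeric codes at import time.

```r
# Hypothetical response scale; the real levels come from the .sav file.
likert_levels <- c("Strongly Disagree", "Disagree", "Neither Agree or Disagree",
                   "Agree", "Strongly Agree")

# Toy stand-in for dat_foreign with one factor-coded scale item.
toy <- data.frame(
  ID     = c(1028, 1049, 1052),
  agen_1 = factor(c("Disagree", "Neither Agree or Disagree",
                    "Neither Agree or Disagree"),
                  levels = likert_levels)
)

# Convert the chosen item columns from factors to integer level codes.
item_cols <- "agen_1"
toy[item_cols] <- lapply(toy[item_cols], as.integer)

str(toy)
```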
The final option we’ll cover involves exporting a .csv file from SPSS and importing that .csv file directly into R. This option is only available if you have access to SPSS, of course. Once you have saved a .csv file, you can import it using the base R function read.csv().
dat_base <- read.csv(file.choose())
Now, we will once again use head() to examine our data:
head(dat_base)
## ï..ID gender age class agen_1 agen_2 agen_3 achiev_1 achiev_2 achiev_3
## 1 1028 female 14 A 2 2 3 3 4 3
## 2 1049 male 13 A 3 3 2 2 3 3
## 3 1052 female 14 B 3 3 3 2 2 3
## 4 1063 male 14 A 3 3 3 2 2 3
## 5 1119 male 13 A 3 3 3 3 4 4
## 6 1142 female 13 A 3 3 3 3 4 3
## motiv_1 motiv_2 motiv_3
## 1 3 2 3
## 2 3 4 4
## 3 2 3 3
## 4 3 3 2
## 5 2 3 2
## 6 2 4 4
We can see from the output that the data was read in as a data frame and that all of the original label information has been stripped. R assigns a class to each column based on the kind of data it contains, and our scale items are now read in as integers instead of factors (which is preferable for SEM). We also see that the first variable name has an artifact at the start, which is common for .csv files created by other programs; it typically comes from a byte order mark (BOM) written at the start of the file. We can fix this easily by renaming the first variable back to “ID”.
names(dat_base)
## [1] "ï..ID" "gender" "age" "class" "agen_1" "agen_2"
## [7] "agen_3" "achiev_1" "achiev_2" "achiev_3" "motiv_1" "motiv_2"
## [13] "motiv_3"
names(dat_base)[1] <- "ID"
names(dat_base)
## [1] "ID" "gender" "age" "class" "agen_1" "agen_2"
## [7] "agen_3" "achiev_1" "achiev_2" "achiev_3" "motiv_1" "motiv_2"
## [13] "motiv_3"
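As an alternative to renaming, the artifact can often be avoided at import time. Assuming it comes from a UTF-8 byte order mark, passing fileEncoding = "UTF-8-BOM" to read.csv() tells R to skip the mark. The sketch below writes a small temporary file with a BOM to demonstrate this; the file contents are made up for illustration.

```r
# Write a tiny .csv that starts with a UTF-8 byte order mark (EF BB BF).
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("ID,gender,age\n1028,female,14\n1049,male,13\n"), con)
close(con)

# Reading with the BOM-aware encoding keeps the first column name clean.
dat <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
names(dat)
```

With your own data, the same argument can be added to the original call, for example read.csv(file.choose(), fileEncoding = "UTF-8-BOM").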
If you have the option, I recommend saving a .csv from the program your data is saved in and using the read.csv() function to import the .csv into R. This prevents any of the extra label information from causing issues in analysis down the line.
If you have been cleaning and working with your data in R, but want to bring your data into Mplus for modeling, there is a simple way to do so using the MplusAutomation package. We’ll have to install this package first:
install.packages("MplusAutomation")
Next, we use this package to automatically convert our data into a .dat file ready for Mplus, as well as create a starter .inp file that includes your variable names!
MplusAutomation::prepareMplusData(dat_base,filename = "AAM_Mplus.dat")
##
## -----
## Factor: gender
## Conversion:
## level number
## female 1
## male 2
## -----
##
##
## -----
## Factor: class
## Conversion:
## level number
## A 1
## B 2
## -----
##
## TITLE: Your title goes here
## DATA: FILE = "AAM_Mplus.dat";
## VARIABLE:
## NAMES = ID gender age class agen_1 agen_2 agen_3 achiev_1 achiev_2 achiev_3 motiv_1
## motiv_2 motiv_3;
## MISSING=.;
This converts our gender and class variables into numeric values, and prints a starter .inp script to your R output. If you want MplusAutomation to create the .inp file in your working directory, you can use the inpfile argument:
MplusAutomation::prepareMplusData(dat_base, filename = "AAM_Mplus2.dat", inpfile = TRUE)
##
## -----
## Factor: gender
## Conversion:
## level number
## female 1
## male 2
## -----
##
##
## -----
## Factor: class
## Conversion:
## level number
## A 1
## B 2
## -----
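To get a feel for what ends up in the .dat file, here is a rough base R approximation: factors become integer codes, missing values are written as ".", and no header row is included. This is a sketch of the general idea on a toy data frame, not a reproduction of what MplusAutomation does internally, and the tab delimiter is an assumption.

```r
# Toy data with one factor and one missing value (made up for illustration).
toy <- data.frame(
  ID     = c(1028, 1049, 1052),
  gender = factor(c("female", "male", "female")),
  age    = c(14, NA, 14)
)

# Recode factor columns to their integer level codes (female = 1, male = 2).
is_fac <- vapply(toy, is.factor, logical(1))
toy[is_fac] <- lapply(toy[is_fac], as.integer)

# Write a header-free, tab-delimited file with "." marking missing values.
out_file <- tempfile(fileext = ".dat")
write.table(toy, out_file, sep = "\t", na = ".",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

dat_lines <- readLines(out_file)
dat_lines
```

Because the file has no header, the variable order listed under NAMES in the .inp file is what tells Mplus which column is which.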
Now you’re all set up to use your data to model in Mplus!