The qualtRics package is a huge timesaver and really easy to use once you get it set up. Ordinarily, you would have to open Qualtrics and download the csv file for each survey manually, but this package lets you pull the data directly from R.

Once they approve your request, you should be able to go to your Account Settings –> Qualtrics IDs and click “Generate Token” in the API box. Copy the token, because you will use it below when you authenticate your credentials.

But first, in your RMarkdown file, you’ll want to install and load the qualtRics package:

load libraries

#install.packages("qualtRics") # comment this out once you've installed
library(qualtRics)
library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.4     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(ggplot2)

In the code chunk below, you will authenticate your API credentials. Where it says “ENTER TOKEN HERE”, enter the token you copied from the API box in your Qualtrics account page. (install = TRUE stores the credentials in your .Renviron file, so you only have to do this once.)

authenticate api

qualtrics_api_credentials(api_key = "ENTER TOKEN HERE", base_url = "stanforduniversity.ca1.qualtrics.com", install = TRUE, overwrite = TRUE)

## Your original .Renviron will be backed up and stored in your R HOME directory if needed.

## Your Qualtrics key and base URL have been stored in your .Renviron.  
## To use now, restart R or run `readRenviron("~/.Renviron")`

# readRenviron("~/.Renviron")

Make sure to remove your api_key above before you share this Rmd with others or push it to GitHub.
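Once the credentials are installed, you can check that they are available without ever typing the token into the Rmd again. This is a minimal sketch, assuming the package stored them under the environment variable names QUALTRICS_API_KEY and QUALTRICS_BASE_URL in your .Renviron. It only reports whether the values are set and never prints them.

```r
# Check whether the credentials are visible to the current R session.
# Assumption: qualtRics stored them as QUALTRICS_API_KEY and QUALTRICS_BASE_URL.
has_key <- nzchar(Sys.getenv("QUALTRICS_API_KEY"))
has_url <- nzchar(Sys.getenv("QUALTRICS_BASE_URL"))

if (has_key && has_url) {
  message("Qualtrics credentials found in the environment.")
} else {
  message("Credentials not found. Restart R or run readRenviron(\"~/.Renviron\").")
}
```

If this reports that the credentials are found, the authentication chunk can be deleted from the file you share.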

You are all set up! Now you can actually download the data for any of your surveys. To find your survey ID, go to your survey and look at the link. The string that starts with “SV_” between two slashes is your survey ID, which you should enter in the field that says “SURVEY ID HERE”. (force_request = TRUE tells the fetch_survey function to fetch the survey again even if you’ve fetched it in the past. This is especially useful if you’re frequently downloading the most recent data, for example to pay participants promptly.)

load survey

qualtricsData <- fetch_survey(surveyID = "SURVEY ID HERE", force_request = TRUE)

## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |======================================================================| 100%

## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_character(),
##   StartDate = col_datetime(format = ""),
##   EndDate = col_datetime(format = ""),
##   Progress = col_double(),
##   `Duration (in seconds)` = col_double(),
##   Finished = col_logical(),
##   RecordedDate = col_datetime(format = ""),
##   RecipientLastName = col_logical(),
##   RecipientFirstName = col_logical(),
##   RecipientEmail = col_logical(),
##   ExternalReference = col_logical(),
##   LocationLatitude = col_double(),
##   LocationLongitude = col_double(),
##   `gender _3_TEXT` = col_logical(),
##   age = col_double(),
##   `employment-status _7_TEXT` = col_logical(),
##   `followup-other_2_TEXT` = col_logical(),
##   `resumes_randomization_DO_male-employee` = col_double(),
##   `resumes_randomization_DO_female-founder` = col_double(),
##   `resumes_randomization_DO_female-employee` = col_double(),
##   `resumes_randomization_DO_male-founder` = col_double()
##   # ... with 4 more columns
## )
## ℹ Use `spec()` for the full column specifications.

## Warning: 6 parsing failures.
## row col           expected actual
##   9  -- value in level set      1
##  10  -- value in level set      1
##  11  -- value in level set      1
##  12  -- value in level set      1
##  13  -- value in level set      1
## ... ... .................. ......
## See problems(...) for more details.

## Warning: 1 parsing failure.
## row col           expected actual
##   3  -- value in level set      1

Finally, you may want to save this data as a csv file (here, in the current working directory):

save data

write_csv(qualtricsData, "./qualtricsData.csv")
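If you would rather keep the file in a dedicated data folder, the folder has to exist before write_csv can write into it. The sketch below shows one way to do that. The folder name "data" and the toy data frame are assumptions for illustration; in practice you would pass qualtricsData instead.

```r
library(readr)

# Toy stand-in for the survey data (hypothetical values).
toy_data <- data.frame(id = 1:3, score = c(5, 7, 4))

# Assumed folder name; create it only if it is not there yet.
data_dir <- "data"
dir.create(data_dir, showWarnings = FALSE)

# file.path() builds the path in a platform-independent way.
out_file <- file.path(data_dir, "qualtricsData.csv")
write_csv(toy_data, out_file)
```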

preliminary analysis

First, I clean the data by excluding cases that are missing a response on the key variable.

d <- qualtricsData %>% 
  drop_na(`recommend-scale`)

I then recode the recommendation scale to a numerical scale.

d$recommend.scale <- recode(d$`recommend-scale`,
                            'Definitely\nwill not recommend' = 1,
                            'Not very probably recommend' = 2,
                            'Probably not recommend' = 3,
                            'Might or\nmight not recommend' = 4,
                            'Probably recommend' = 5,
                            'Very probably recommend' = 6,
                            'Definitely will recommend' = 7)
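To see what recode does with a label that is not in the mapping (a typo, or a label with an unexpected line break), here is a small sketch on made-up values. The vector and the "unexpected" label are hypothetical. Setting .default explicitly is my addition, so that unmatched labels become NA on purpose rather than by accident.

```r
library(dplyr)

# Hypothetical responses; the last label is deliberately not in the mapping.
labels <- c("Probably recommend", "Definitely will recommend", "Some unexpected label")

scores <- recode(labels,
                 "Probably recommend" = 5,
                 "Definitely will recommend" = 7,
                 .default = NA_real_)  # unmatched labels become NA

# How many responses failed to match a known label?
n_unmatched <- sum(is.na(scores))
```

A nonzero n_unmatched on the real data would suggest that a label in the mapping does not match the survey text exactly.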

I then create one variable for the resume condition (ex-employee vs. ex-founder) and one for the gender on the resume, because the survey is a 2 x 2 study.

d = d %>%
  select(recommend.scale, starts_with('resumes_randomization'), `attention-binary-qs`) %>%
  # a display-order column is non-missing only for the resume that was shown
  mutate(employee_condition = ifelse(!is.na(`resumes_randomization_DO_male-employee`) |
                                       !is.na(`resumes_randomization_DO_female-employee`),
                                     'employee', 'founder')) %>%
  # %in% returns FALSE instead of NA for missing values, so no second pass is needed
  mutate(gender = ifelse((`resumes_randomization_DO_female-employee` %in% 1) |
                           (`resumes_randomization_DO_female-founder` %in% 1),
                         'female', 'male'))
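A quick way to check the condition coding is to count the cells of the 2 x 2 design. The sketch below uses a made-up data frame with one participant per cell. It assumes that each display-order column is 1 for the resume that was shown and NA otherwise.

```r
library(dplyr)

# Hypothetical data: one participant per cell of the 2 x 2 design.
toy <- tibble(
  `resumes_randomization_DO_male-employee`   = c(1, NA, NA, NA),
  `resumes_randomization_DO_female-employee` = c(NA, 1, NA, NA),
  `resumes_randomization_DO_male-founder`    = c(NA, NA, 1, NA),
  `resumes_randomization_DO_female-founder`  = c(NA, NA, NA, 1)
)

toy <- toy %>%
  mutate(
    employee_condition = ifelse(
      !is.na(`resumes_randomization_DO_male-employee`) |
        !is.na(`resumes_randomization_DO_female-employee`),
      "employee", "founder"),
    gender = ifelse(
      !is.na(`resumes_randomization_DO_female-employee`) |
        !is.na(`resumes_randomization_DO_female-founder`),
      "female", "male")
  )

# One row per cell; every combination should appear.
cell_counts <- count(toy, employee_condition, gender)
```

Running the same count on d shows how many participants landed in each cell.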


#d = d %>%
#  select('recommend.scale', starts_with('resumes_randomization')) %>%
#  mutate(employee_condition = ifelse(('resumes_randomization_DO_male-employee'==1) | ('resumes_randomization_DO_female-employee'==1), 'employee', 'founder')) %>%
#  mutate(employee_condition = ifelse(is.na(employee_condition), 'founder', employee_condition)) %>%
#  mutate(gender = ifelse(('resumes_randomization_DO_female-employee'==1) | ('resumes_randomization_DO_female-founder'==1), 'female', 'male')) %>%
#  mutate(gender = ifelse(is.na(gender), 'male', 'female'))

#d$employee_condition <- ifelse((d$'resumes_randomization_DO_male-employee'==1 | d$'resumes_randomization_DO_female-employee' ==1), '1', '0')
#d$'employee_condition'[is.na(d$'employee_condition')] <- 0
#d$entp_condition <- ifelse((d$'resumes_randomization_DO_male-founder'==1 | d$'resumes_randomization_DO_female-founder' ==1), '1', '0')

I exclude cases that do not pass the attention check.

d$attention.check <- recode(d$`attention-binary-qs`, 'Yes, an ex-founder'= "founder",'No, not an ex-founder' = 'employee')

d = d %>% 
  filter(attention.check == employee_condition)

Let’s plot this out!

ggplot(d, aes(x = recommend.scale, fill=gender)) +
  geom_bar() +
  facet_grid(~employee_condition)

d %>%
  group_by(employee_condition, gender) %>%
  summarize(meanscore = mean(recommend.scale),
            sdscore = sd(recommend.scale)) %>%
  ggplot(mapping = aes(x = employee_condition, y = meanscore, fill = gender)) +
  geom_bar(stat = 'identity')

## `summarise()` has grouped output by 'employee_condition'. You can override using the `.groups` argument.
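Since the summary already computes sdscore, one possible next step is to draw it as error bars. This is only a sketch on made-up data. The toy values, the dodged (side-by-side) bars, and the choice of plus or minus one standard deviation are my assumptions, not part of the pilot analysis.

```r
library(dplyr)
library(ggplot2)

# Hypothetical ratings: two participants per cell so that sd() is defined.
toy <- tibble(
  employee_condition = rep(c("employee", "founder"), each = 4),
  gender = rep(c("female", "male"), times = 4),
  recommend.scale = c(5, 4, 6, 5, 4, 6, 5, 3)
)

summary_df <- toy %>%
  group_by(employee_condition, gender) %>%
  summarize(meanscore = mean(recommend.scale),
            sdscore = sd(recommend.scale),
            .groups = "drop")

# Dodge bars and error bars by the same width so they line up.
dodge <- position_dodge(width = 0.9)

p <- ggplot(summary_df,
            aes(x = employee_condition, y = meanscore, fill = gender,
                ymin = meanscore - sdscore, ymax = meanscore + sdscore)) +
  geom_col(position = dodge) +
  geom_errorbar(position = dodge, width = 0.2)
```

Printing p draws the plot. Dodging puts the two genders side by side, so each mean can be read directly off the y axis.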

t.test(d$recommend.scale[d$employee_condition=='founder'], 
       d$recommend.scale[d$employee_condition=='employee'])

## 
##  Welch Two Sample t-test
## 
## data:  d$recommend.scale[d$employee_condition == "founder"] and d$recommend.scale[d$employee_condition == "employee"]
## t = -0.343, df = 4.8983, p-value = 0.7458
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.847170  2.180504
## sample estimates:
## mean of x mean of y 
##  4.666667  5.000000

#t.test(c(1,2,3), c(1,2))
#t.test(c(1,2), c(1))
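The same Welch test can also be written with the formula interface, which avoids subsetting the vectors by hand. Here is a sketch on made-up ratings. Note that the formula version orders the groups alphabetically (employee first), so the sign of t flips while the p-value stays the same.

```r
# Hypothetical ratings, three per condition.
toy <- data.frame(
  recommend.scale = c(5, 4, 6, 5, 3, 6),
  employee_condition = c("founder", "founder", "founder",
                         "employee", "employee", "employee")
)

# Two-vector form, as used above: founder first, then employee.
two_vector <- t.test(toy$recommend.scale[toy$employee_condition == "founder"],
                     toy$recommend.scale[toy$employee_condition == "employee"])

# Formula form: outcome ~ grouping variable.
formula_fit <- t.test(recommend.scale ~ employee_condition, data = toy)
```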

Pilot study analysis

Seyeon Kim

11/11/2021
