Data Cleaning

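library(readr)       # read_csv()
library(dplyr)       # filter(), select(), if_else(), %>%
library(data.table)  # rbindlist()

setwd("pilot_data/")

# read all CSVs in the folder and bind them into one table
temp <- list.files(pattern = "\\.csv$")  # pattern is a regex, not a glob
data_files <- lapply(temp, read_csv)
data_files <- rbindlist(data_files, fill = TRUE)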


##### image analysis
# filter to get the needed values for image analysis
image_resp <- data_files %>%
  filter(test_part == "test") %>%
  select(image, response, subject_id, rt)


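# strip the path prefix and the file extension to get a short image code
# (fixed = TRUE matches the strings literally, so "." is not a regex wildcard)
image_code <- gsub("resources/face_", "", image_resp$image, fixed = TRUE)
image_code <- gsub(".jpg", "", image_code, fixed = TRUE)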


image_resp$image_code <- image_code

image_resp <- image_resp %>%
  select(subject_id, image, image_code, response)

# drop rows whose image path contains "mickey"
d <- image_resp %>% filter(!grepl("mickey", image))

# write.csv(d, "face_distance_image_responses.csv")

size_cond: size condition (1 = large, 2 = small)

d <- read_csv("face_distance_image_responses.csv") # same data as written above, with a size_cond label added for each image

## Warning: Missing column names filled in: 'X1' [1]

## 
## -- Column specification --------------------------------------------------------
## cols(
##   X1 = col_double(),
##   subject_id = col_character(),
##   image = col_character(),
##   image_code = col_character(),
##   response = col_double(),
##   size_cond = col_double()
## )

d$size_cond_str <- if_else(d$size_cond == 1, "large", "small")
d$size_cond_c <- if_else(d$size_cond == 1, .5,  -.5)
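
The ±0.5 coding above changes how the model coefficients read. As a minimal sketch on made-up toy data (the `toy` data frame and its values are hypothetical, and base `ifelse()` is used so the snippet runs without dplyr), the intercept becomes the midpoint of the two condition means and the slope becomes the large-minus-small difference:

```r
# Toy data, invented for illustration only: three "large" and two "small" ratings
toy <- data.frame(size_cond = c(1, 1, 1, 2, 2),
                  response  = c(50, 60, 70, 30, 40))

# Same centered coding as size_cond_c: large = 0.5, small = -0.5
toy$size_cond_c <- ifelse(toy$size_cond == 1, 0.5, -0.5)

fit <- lm(response ~ size_cond_c, data = toy)

mean_large <- mean(toy$response[toy$size_cond == 1])  # 60
mean_small <- mean(toy$response[toy$size_cond == 2])  # 35

coef(fit)
# (Intercept) = (mean_large + mean_small) / 2 = 47.5
# size_cond_c =  mean_large - mean_small      = 25
```

With the original 1/2 coding, the intercept would instead be the predicted response at `size_cond = 0`, which is not a condition that exists in the data.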

Fit the model. A more complex model with by-subject and by-item random intercepts will be needed soon.
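
One way the more complex model might look is sketched below. It assumes the lme4 package (not used elsewhere in this script), runs on simulated stand-in data rather than the pilot data, and treats `image_code` as the item grouping variable. Whether items should be grouped by image or by identity is a design decision that this sketch does not settle.

```r
library(lme4)  # assumed to be installed; not part of the original script

set.seed(42)

# Simulated stand-in for `d`: 20 subjects crossed with 10 images (all values invented)
sim <- expand.grid(subject_id = paste0("s", 1:20),
                   image_code = paste0("img", 1:10),
                   stringsAsFactors = FALSE)

# Random size condition per trial, coded as in the analysis (large = 0.5, small = -0.5)
sim$size_cond_c <- sample(c(0.5, -0.5), nrow(sim), replace = TRUE)

# Subject and item intercepts, looked up by name
subj_int <- setNames(rnorm(20, sd = 10), paste0("s", 1:20))
item_int <- setNames(rnorm(10, sd = 5),  paste0("img", 1:10))

sim$response <- 40 + 2 * sim$size_cond_c +
  subj_int[sim$subject_id] + item_int[sim$image_code] +
  rnorm(nrow(sim), sd = 15)

# Fixed effect of size, plus crossed random intercepts for subjects and items
m <- lmer(response ~ size_cond_c + (1 | subject_id) + (1 | image_code),
          data = sim)
summary(m)
```

On the real data, the same formula would be fit with `data = d`.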

# we are predicting response from size condition 
lm(response ~ size_cond_c, data = d) %>% summary()

## 
## Call:
## lm(formula = response ~ size_cond_c, data = d)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -43.288 -34.527  -1.527  29.712  58.473 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   42.408      1.308  32.413   <2e-16 ***
## size_cond_c    1.761      2.617   0.673    0.501    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 33.61 on 658 degrees of freedom
## Multiple R-squared:  0.0006875,  Adjusted R-squared:  -0.0008312 
## F-statistic: 0.4527 on 1 and 658 DF,  p-value: 0.5013

Preliminary Results: We ran a study manipulating face size in the visual field and capturing subjective familiarity ratings (sliding scale from “Not at all familiar” to “Extremely Familiar”). We recruited 40 participants from Amazon Mechanical Turk. Our stimuli were 40 face images from 20 identities. For each identity in the image test set, the same image was used, but the face appeared either large or small on the monitor display. The stimulus set was split into two sets for order counterbalancing.

We ran a linear model, regressing response value on face size condition (large or small). We found no significant effect of size, b = 1.76, p = .501. Our interim conclusion is that face size in the visual field does not appear to affect subjective familiarity: people do not judge faces to be more familiar when they are larger in the visual field. This has practical implications for face distance. We are surprised by this result, because there is evidence from a statistical “face experience” corpus (Oruc et al., 2020) that we often see familiar faces closer in our visual field than unfamiliar ones. This relationship does not appear to run in the reverse direction, though: a closer face is not judged to be a more familiar face.
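
As a rough check on how much the data constrain the size effect, the reported estimate and standard error can be turned into a 95% confidence interval. This is only a sketch that reuses the three numbers printed in the `lm()` summary above; the interval is not part of the original analysis, and it inherits the simple model's assumption that all ratings are independent.

```r
b  <- 1.761   # estimate for size_cond_c, from the lm() summary
se <- 2.617   # its standard error
df <- 658     # residual degrees of freedom

t_val <- b / se                            # matches the reported t value of 0.673
p_val <- 2 * pt(-abs(t_val), df)           # two-sided p value
ci    <- b + c(-1, 1) * qt(0.975, df) * se # 95% confidence interval for b

round(c(t = t_val, p = p_val, lower = ci[1], upper = ci[2]), 3)
```

The interval runs from roughly -3.4 to 6.9 rating units, so the pilot data are compatible with no effect as well as with a modest effect in either direction.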

Face Distance Familiarity Ratings

Ivette Colón

4/24/2021
