Creativity with AI (IRiSS sample)

IRiSS

Libraries

here() starts at /Users/sarawu/Git_Projects/idea_generation

Attaching package: 'renv'
The following objects are masked from 'package:stats':

    embed, update
The following objects are masked from 'package:utils':

    history, upgrade
The following objects are masked from 'package:base':

    autoload, load, remove, use
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
✖ purrr::modify() masks renv::modify()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
This is text (version 1.7.0).
Newer versions may have improved functions and updated defaults to reflect current understandings of the state-of-the-art.

MacOS detected: Setting OpenMP environment variables to avoid potential crash due to libomp conflicts. 
When using the L-BAM library, be aware that models may be downloaded from external sources. Using models may carry security risks, including the possibility of malicious code in RDS files. Always review and trust the source of any model you load.  
The text package is provided 'as is' without any warranty of any kind. 

For more information about the package see www.r-text.org and www.r-topics.org.


Attaching package: 'scales'


The following object is masked from 'package:purrr':

    discard


The following object is masked from 'package:readr':

    col_factor


Loading required package: Matrix


Attaching package: 'Matrix'


The following objects are masked from 'package:tidyr':

    expand, pack, unpack



Attaching package: 'lmerTest'


The following object is masked from 'package:lme4':

    lmer


The following object is masked from 'package:stats':

    step



Attaching package: 'simr'


The following object is masked from 'package:lme4':

    getData


The following object is masked from 'package:stringr':

    fixed



Attaching package: 'kableExtra'


The following object is masked from 'package:dplyr':

    group_rows


Loading required package: tinylabels

Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'

This is lavaan 0.6-20
lavaan is FREE software! Please report any bugs.


Attaching package: 'sjstats'


The following objects are masked from 'package:effectsize':

    cohens_f, cramers_v, phi


The following object is masked from 'package:papaja':

    se


The following object is masked from 'package:broom':

    bootstrap


Loading required package: MASS


Attaching package: 'MASS'


The following object is masked from 'package:gtsummary':

    select


The following object is masked from 'package:patchwork':

    area


The following object is masked from 'package:dplyr':

    select


Loading required package: msm

Loading required package: polycor

wCorr v1.9.8



Attaching package: 'rstatix'


The following object is masked from 'package:MASS':

    select


The following object is masked from 'package:sjstats':

    t_test


The following objects are masked from 'package:effectsize':

    cohens_d, eta_squared


The following object is masked from 'package:stats':

    filter


Loading required package: mvtnorm


Attaching package: 'mvtnorm'


The following object is masked from 'package:effectsize':

    standardize


Loading required package: sandwich

mediation: Causal Mediation Analysis
Version: 4.5.1


Loading required package: boot


Attaching package: 'boot'


The following object is masked from 'package:msm':

    cav



Attaching package: 'QuantPsyc'


The following object is masked from 'package:Matrix':

    norm


The following object is masked from 'package:base':

    norm

IRiSS Sample

iriss <- read_csv("data/iriss.csv") %>% 
  slice(-(1:2)) %>% 
  filter(attn_check == "4" & attn_check_4_TEXT == "8") %>% 
  rename(condition = FL_3_DO) %>% 
  relocate(condition, .after = attn_check_4_TEXT) %>% 
  mutate(condition = case_when(condition == "GuidedInstructions" ~ "high-agency",
                               condition == "UnguidedInstructions" ~ "low-agency",
                               condition == "ControlInstructions" ~ "Control")) %>% 
  filter(Progress=="100") %>% 
  rename(submitter_id = "ResponseId") %>% 
  mutate(age = as.numeric(age))
New names:
Rows: 155 Columns: 185
── Column specification
──────────────────────────────────────────────────────── Delimiter: "," chr
(185): StartDate, EndDate, Status, Progress, Duration (in seconds), Fini...
ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `timer_First Click` -> `timer_First Click...14`
• `timer_Last Click` -> `timer_Last Click...15`
• `timer_Page Submit` -> `timer_Page Submit...16`
• `timer_Click Count` -> `timer_Click Count...17`
• `timer_First Click` -> `timer_First Click...18`
• `timer_Last Click` -> `timer_Last Click...19`
• `timer_Page Submit` -> `timer_Page Submit...20`
• `timer_Click Count` -> `timer_Click Count...21`
• `timer_First Click` -> `timer_First Click...22`
• `timer_Last Click` -> `timer_Last Click...23`
• `timer_Page Submit` -> `timer_Page Submit...24`
• `timer_Click Count` -> `timer_Click Count...25`
iriss %>% 
  get_summary_stats(age, type="mean_sd")
# A tibble: 1 × 4
  variable     n  mean    sd
  <fct>    <dbl> <dbl> <dbl>
1 age        150  25.1  9.17
submitter_demographics <- iriss %>%
  mutate(gender = case_when(gender == "1" ~ "Male",
                            gender == "2" ~ "Female",
                            gender == "3" ~ "Non-binary",
                            gender == "4" ~ "Prefer to self-describe")) %>% 
  mutate(race = case_when(race == "1" ~ "Black or African-American",
                          race == "2" ~ "American Indian or Alaskan Native",
                          race == "3" ~ "White or Caucasian",
                          race == "4" ~ "Native Hawaiian",
                          race == "5" ~ "Hispanic, Latino or Spanish Origin",
                          race == "6" ~ "Asian-American",
                          race == "7" ~ "Multiracial or self-described")) %>% 
  mutate(race = fct_infreq(race)) %>% 
  dplyr::select(submitter_id, age, gender, race) 

submitter_demographics %>% 
  tbl_summary(include = c(age, gender, race),
              statistic = list(
                all_continuous() ~ "{mean} ({sd})",
                all_categorical() ~ "{n} / {N} ({p}%)"),
              digits = all_continuous() ~ 1,
              label = list(age = "Age",
                           gender = "Gender",
                           race = "Race")) %>% 
   modify_header(label ~ "**Demographic**")
Demographic N = 1501
Age 25.1 (9.2)
Gender
    Female 108 / 150 (72%)
    Male 36 / 150 (24%)
    Non-binary 2 / 150 (1.3%)
    Prefer to self-describe 4 / 150 (2.7%)
Race
    Asian-American 49 / 140 (35%)
    Hispanic, Latino or Spanish Origin 41 / 140 (29%)
    White or Caucasian 25 / 140 (18%)
    Multiracial or self-described 19 / 140 (14%)
    Black or African-American 5 / 140 (3.6%)
    Native Hawaiian 1 / 140 (0.7%)
    Unknown 10
1 Mean (SD); n / N (%)
iriss_memory_check <- iriss %>% 
  mutate(memory_check_condition = case_when(memory_check == "1" ~ "high-agency",
                                            memory_check == "2" ~ "low-agency",
                                            is.na(memory_check) ~ "Control")) %>% 
  relocate(memory_check_condition, .after = "memory_check") %>% 
  mutate(memory_check_result = if_else(condition == memory_check_condition, 1, 0)) %>% 
  relocate(memory_check_result, .after = "memory_check_condition")

# Check the % of participants who remembered what the instructions said
iriss_memory_check %>% 
  count(memory_check_result) %>% 
  mutate(prop=n/sum(n))
# A tibble: 2 × 3
  memory_check_result     n  prop
                <dbl> <int> <dbl>
1                   0    31 0.207
2                   1   119 0.793
iriss_memory_check %>% 
  count(condition)
# A tibble: 3 × 2
  condition       n
  <chr>       <int>
1 Control        50
2 high-agency    50
3 low-agency     50
# Convert to long format
iriss_trial_data <- iriss %>% 
  pivot_longer(names_to = "trial",
                values_to = "use",
                cols = starts_with(c("pilot", "pass", "control"))) %>% 
  mutate(trial = str_remove(trial, "open_")) %>% # fixed naming problem
  filter(grepl("^(pilot|pass|control)_(brick|frisbee|bubble)_[1-5]$", trial)) %>% 
  filter(!is.na(use)) %>% # Keep only the trials with responses
  relocate(trial, .after = "condition") %>% 
  relocate(use, .after = "trial") %>% 
  mutate(object = str_extract(trial, "(?<=_).*")) %>% 
  mutate(object = str_remove(object, "_.*")) %>% 
  relocate(object, .after = "trial") %>% 
  mutate(object = if_else(object == "bubble", "bubblewrap", object))

write_csv(iriss_trial_data, "data/iriss_trial_data.csv")
iriss_self_perceptions <- iriss %>% 
  pivot_longer(names_to = "metric",
               values_to = "rating",
               cols = c(ends_with("_unique"),
                        matches("self_[1-3]$"))) %>% 
  drop_na(rating) %>% 
  mutate(metric = str_replace_all(metric, 
                                  c("self_1" = "creativity",
                                    "self_2" = "originality",
                                    "self_3" = "usefulness"))) %>% 
  separate(
    col = metric,
    into = c("condition", "object", "question"),
    sep = "_",
    remove = TRUE# keep original column if you still want it
  ) %>% 
  mutate(
    object = str_replace_all(
      object,
      c("\\bfris\\b" = "frisbee",
        "\\bbubb\\b"  = "bubble"))) %>% 
  mutate(condition = case_when(condition == "pilot" ~ "High-Agency",
                               condition == "pass" ~ "Low-Agency",
                               condition == "control" ~ "Control")) %>% 
  pivot_wider(names_from = "question",
              values_from = "rating") %>% 
  mutate(across(c(creativity, originality, usefulness), as.numeric))
# Self-perceptions of creativity, originality, usefulness

self_perception_outcomes <- c("originality", "creativity", "usefulness")

self_perception_plots <- map(self_perception_outcomes, function(var) {
  summary_df <- iriss_self_perceptions %>%
    mutate(condition = fct_relevel(as.factor(condition), c("Low-Agency", "High-Agency", "Control"))) %>% 
    group_by(submitter_id, condition) %>%
    summarize(value = mean(.data[[var]]), .groups = "drop")
  
  ggplot(summary_df, aes(x = value, y = condition, fill = condition)) +
    geom_density_ridges(alpha = 0.6, quantile_lines = TRUE, quantile_fun = mean) +
    xlim(c(1,5)) +
    labs(x = paste0("Self-perceived ", var), y = "") +
    theme_apa() +
    theme(legend.position = "none")
})

# To display the plots:
self_originality <- self_perception_plots[[1]]  # originality
self_creativity<- self_perception_plots[[2]]  # creativity
self_usefulness <- self_perception_plots[[3]]  # usefulness

self_creativity
Picking joint bandwidth of 0.268
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_density_ridges()`).

self_originality
Picking joint bandwidth of 0.352
Warning: Removed 3 rows containing non-finite outside the scale range
(`stat_density_ridges()`).

self_usefulness
Picking joint bandwidth of 0.354
Warning: Removed 4 rows containing non-finite outside the scale range
(`stat_density_ridges()`).

# Combined plot
combined_self_plot <- self_creativity + self_originality + self_usefulness + plot_annotation(tag_levels = "a")
combined_self_plot
Picking joint bandwidth of 0.268
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_density_ridges()`).
Picking joint bandwidth of 0.352
Warning: Removed 3 rows containing non-finite outside the scale range
(`stat_density_ridges()`).
Picking joint bandwidth of 0.354
Warning: Removed 4 rows containing non-finite outside the scale range
(`stat_density_ridges()`).

ggsave(plot=combined_self_plot, filename="figures/self_perceptions.png")
Saving 7 x 5 in image
Picking joint bandwidth of 0.268
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_density_ridges()`).
Picking joint bandwidth of 0.352
Warning: Removed 3 rows containing non-finite outside the scale range
(`stat_density_ridges()`).
Picking joint bandwidth of 0.354
Warning: Removed 4 rows containing non-finite outside the scale range
(`stat_density_ridges()`).
# Mixed model
# Relationship between self-perception and third-party perception
iriss_self_perceptions %>% 
  rename(self_creativity = creativity,
         self_originality = originality,
         self_usefulness = usefulness) 
# A tibble: 445 × 154
   StartDate           EndDate   Status Progress Duration (in seconds…¹ Finished
   <chr>               <chr>     <chr>  <chr>    <chr>                  <chr>   
 1 2025-09-29 12:27:26 2025-09-… 0      100      895                    1       
 2 2025-09-29 12:27:26 2025-09-… 0      100      895                    1       
 3 2025-09-29 12:27:26 2025-09-… 0      100      895                    1       
 4 2025-09-29 13:21:18 2025-09-… 0      100      1346                   1       
 5 2025-09-29 13:21:18 2025-09-… 0      100      1346                   1       
 6 2025-09-29 13:21:18 2025-09-… 0      100      1346                   1       
 7 2025-09-29 13:38:49 2025-09-… 0      100      668                    1       
 8 2025-09-29 13:38:49 2025-09-… 0      100      668                    1       
 9 2025-09-29 13:38:49 2025-09-… 0      100      668                    1       
10 2025-09-29 13:42:17 2025-09-… 0      100      1311                   1       
# ℹ 435 more rows
# ℹ abbreviated name: ¹​`Duration (in seconds)`
# ℹ 148 more variables: RecordedDate <chr>, submitter_id <chr>,
#   DistributionChannel <chr>, UserLanguage <chr>, attn_check <chr>,
#   attn_check_4_TEXT <chr>, rep_id <chr>, `timer_First Click...14` <chr>,
#   `timer_Last Click...15` <chr>, `timer_Page Submit...16` <chr>,
#   `timer_Click Count...17` <chr>, `timer_First Click...18` <chr>, …
# Correlation?
# How unique did participants think their ideas were?
perceived_uniqueness <- iriss_self_perceptions %>% 
  drop_na(unique) %>% 
  group_by(condition) %>% 
  count(unique) %>% 
  mutate(prop=n/sum(n)) %>% 
  ggplot(aes(x=unique, y=prop, fill=condition)) +
  geom_col(position = "dodge") +
  scale_x_discrete(labels = c(
    "1" = "No one",
    "2" = "Some people",
    "3" = "Most people",
    "4" = "Everyone"
  )) +
  labs(x="How many people came up with this idea?",
       y="Proportion of responses") +
  theme_apa()

ggsave("figures/uniqueness.png", plot=perceived_uniqueness)
Saving 7 x 5 in image
# How does mean originality compare across conditions?

iriss_self_perceptions %>% 
  drop_na(originality) %>% 
  group_by(submitter_id, condition) %>% 
  summarize(originality = mean(originality)) %>% 
  ggplot(aes(x=condition, y=originality)) +
  geom_boxplot() +
  stat_summary(fun="mean") 
`summarise()` has grouped output by 'submitter_id'. You can override using the
`.groups` argument.
Warning: Removed 3 rows containing missing values or values outside the scale range
(`geom_segment()`).

  labs(y="Self-perceived originality")
$y
[1] "Self-perceived originality"

attr(,"class")
[1] "labels"

Self perceptions

# Make sure the reference group is the Control group!
iriss_self_originality_mod <- lmer(
  originality ~ condition + (1|object) + (1|submitter_id), data= iriss_self_perceptions)
summary(iriss_self_originality_mod)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: originality ~ condition + (1 | object) + (1 | submitter_id)
   Data: iriss_self_perceptions

REML criterion at convergence: 1407.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.97423 -0.66248  0.04491  0.68796  2.21822 

Random effects:
 Groups       Name        Variance Std.Dev.
 submitter_id (Intercept) 0.37877  0.6154  
 object       (Intercept) 0.01557  0.1248  
 Residual                 1.11078  1.0539  
Number of obs: 441, groups:  submitter_id, 149; object, 3

Fixed effects:
                      Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)            2.74988    0.14353  13.54053  19.158 3.36e-11 ***
conditionHigh-Agency   0.11784    0.17477 145.99395   0.674    0.501    
conditionLow-Agency    0.06594    0.17476 145.99142   0.377    0.706    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) cndH-A
cndtnHgh-Ag -0.614       
cndtnLw-Agn -0.614  0.505
iriss_self_creativity_mod <- lmer(
  creativity ~ condition + (1|object) + (1|submitter_id), data= iriss_self_perceptions)
boundary (singular) fit: see help('isSingular')
summary(iriss_self_creativity_mod)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: creativity ~ condition + (1 | object) + (1 | submitter_id)
   Data: iriss_self_perceptions

REML criterion at convergence: 1362.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.29436 -0.67723 -0.00183  0.74779  1.97638 

Random effects:
 Groups       Name        Variance Std.Dev.
 submitter_id (Intercept) 0.2333   0.4831  
 object       (Intercept) 0.0000   0.0000  
 Residual                 1.0603   1.0297  
Number of obs: 443, groups:  submitter_id, 149; object, 3

Fixed effects:
                     Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)            3.0031     0.1097 144.1223  27.378   <2e-16 ***
conditionHigh-Agency   0.1586     0.1543 144.1117   1.028    0.306    
conditionLow-Agency    0.1506     0.1545 144.6332   0.975    0.331    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) cndH-A
cndtnHgh-Ag -0.711       
cndtnLw-Agn -0.710  0.504
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
iriss_self_usefulness_mod <- lmer(
  usefulness ~ condition + (1|object) + (1|submitter_id), data= iriss_self_perceptions)
summary(iriss_self_usefulness_mod)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: usefulness ~ condition + (1 | object) + (1 | submitter_id)
   Data: iriss_self_perceptions

REML criterion at convergence: 1404.1

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.29841 -0.66050  0.09239  0.69619  1.92264 

Random effects:
 Groups       Name        Variance Std.Dev.
 submitter_id (Intercept) 0.38225  0.6183  
 object       (Intercept) 0.02338  0.1529  
 Residual                 1.11491  1.0559  
Number of obs: 439, groups:  submitter_id, 148; object, 3

Fixed effects:
                     Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)            3.2393     0.1527   9.8511  21.219  1.5e-09 ***
conditionHigh-Agency   0.3650     0.1753 145.4423   2.082   0.0391 *  
conditionLow-Agency    0.2035     0.1760 145.1780   1.157   0.2494    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) cndH-A
cndtnHgh-Ag -0.580       
cndtnLw-Agn -0.577  0.503

Participants in the high-agency condition perceived their ideas as more useful than the other participants.

# iriss_semantic_similarities %>% 
#   group_by(condition, submitter_id) %>% 
#   summarize(similarity = mean(similarity)) %>% 
#   ggplot(aes(x=condition, y=similarity)) +
#   geom_boxplot() +
#   stat_summary(fun="mean") 
# emmeans(iriss_semantic_similarity_mod, pairwise ~ condition, adjust="bonferroni")

Crowdsourced originality ratings

iriss_manual_ratings <- read_csv("data/iriss_manual_ratings.csv") %>% 
  filter(DistributionChannel=="anonymous") %>% 
  filter(attn_check == "4" & attn_check_4_TEXT == "8") %>% 
  drop_na(frisbee.0.condition)  %>%  # Drop rows with missing data (experimenter error) 
  mutate(age = as.numeric(age))
Rows: 160 Columns: 112
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (112): StartDate, EndDate, Status, Progress, Duration (in seconds), Fini...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# evaluator_demographics <- iriss_manual_ratings %>% 
#   mutate(age = if_else(age < 18 | age > 100, NA_real_, age)) %>% 
#   mutate(gender = case_when(gender == "1" ~ "Male",
#                             gender == "2" ~ "Female",
#                             gender == "3" ~ "Non-binary",
#                             gender == "4" ~ "Prefer to self-describe")) %>% 
#   mutate(race = case_when(race == "1" ~ "Black or African-American",
#                           race == "2" ~ "American Indian or Alaskan Native",
#                           race == "3" ~ "White or Caucasian",
#                           race == "4" ~ "Native Hawaiian",
#                           race == "5" ~ "Hispanic, Latino or Spanish Origin",
#                           race == "6" ~ "Asian-American",
#                           race == "7" ~ "Multiracial or self-described")) %>% 
#   mutate(race = fct_infreq(race)) %>% 
#   select(age, gender, race) %>% 
#   tbl_summary(include = c(age, gender, race),
#               statistic = list(
#                 all_continuous() ~ "{mean} ({sd})",
#                 all_categorical() ~ "{n} / {N} ({p}%)"),
#               digits = all_continuous() ~ 1,
#               label = list(age = "Age",
#                            gender = "Gender",
#                            race = "Race")) %>% 
#    modify_header(label ~ "**Demographic**")
# evaluator_demographics
# 
# combined_demographics <- tbl_merge(
#   tbls = list(submitter_demographics, evaluator_demographics),
#   tab_spanner = c("**Submitters**", "**Evaluators**")
# )
# 
# combined_demographics
iriss_manual_ratings_long <- iriss_manual_ratings %>% 
  rename(rater_id = ResponseId) %>% 
  pivot_longer(cols = frisbee.0.condition:bubblewrap.4.use,
               names_to = "stimulus_metadata",
               values_to = "value") %>% 
  mutate(metadata = str_extract(stimulus_metadata, "[^.]+$"),
         stimulus_metadata = str_remove(stimulus_metadata, "\\.[^.]+$")) %>% 
  separate_wider_delim(stimulus_metadata,
           ".",
           names = c("object", "trial_num")) %>% 
  pivot_wider(
              names_from = "metadata",
              values_from = "value") %>% 
  rename(submitter_id = ResponseId)

# Pivot to give each creativity rating its own column
iriss_manual_ratings_metrics <- iriss_manual_ratings_long %>% 
  pivot_longer(cols = c_frisbee_1:u_bubble_5,
               names_to = "trial",
               values_to = "rating") %>% 
  mutate(rating = as.numeric(rating)) %>% 
  mutate(metric = str_extract(trial, "^[A-Za-z]+(?=_)")) %>% 
  mutate(metric = case_when(metric == "c" ~ "creativity",
                            metric == "o" ~ "originality",
                            metric == "u" ~ "usefulness")) %>% 
  dplyr::select(rater_id, object, trial_num, submitter_id, condition, use, trial, metric, rating) %>% 
  mutate(trial = str_remove(trial, "^._")) %>% 
  mutate(trial_num = as.numeric(trial_num),
         trial_num = trial_num+1) %>% 
  unite(col="stimulus", c("object", "trial_num")) %>% 
  mutate(trial = str_replace_all(trial, "bubble", "bubblewrap")) %>% # Check names!
  filter(stimulus==trial) %>% 
  pivot_wider(names_from = "metric",
              values_from = "rating") %>% 
  mutate(object = str_extract(stimulus, "^[A-Za-z]+")) %>% 
  relocate(object, .after = "stimulus") %>% 
  mutate(use_word_count = str_count(use, "\\S+")) %>% 
  relocate(use_word_count, .after = use) %>% 
  mutate(condition=case_when(condition=="Pilot"~"High-Agency",
                             condition=="Passenger"~"Low-Agency",
                             condition=="Control"~"Control"))

# Select process variables from full dataset
process_vars <- iriss_trial_data %>% 
  dplyr::select(submitter_id, object, starts_with("conv_with_ai"), starts_with("consideration_aid"), starts_with("intrinsic_motivation")) %>% 
  distinct()



write_csv(iriss_manual_ratings_metrics, "data/iriss_manual_ratings_metrics.csv")

Descriptives

iriss_manual_ratings_metrics %>% 
  group_by(condition) %>% 
  get_summary_stats(use_word_count, type="mean_sd")
# A tibble: 3 × 5
  condition   variable           n  mean    sd
  <chr>       <fct>          <dbl> <dbl> <dbl>
1 Control     use_word_count   730  4.34  5.40
2 High-Agency use_word_count   718  4.37  4.79
3 Low-Agency  use_word_count   682  5.98  6.18
iriss_manual_ratings_metrics %>% 
  ggplot(aes(x=condition, y=use_word_count)) +
  geom_boxplot() +
  stat_summary(fun="mean")
Warning: Removed 3 rows containing missing values or values outside the scale range
(`geom_segment()`).

cor.test(iriss_manual_ratings_metrics$use_word_count, iriss_manual_ratings_metrics$originality)

    Pearson's product-moment correlation

data:  iriss_manual_ratings_metrics$use_word_count and iriss_manual_ratings_metrics$originality
t = 6.8587, df = 2128, p-value = 9.073e-12
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.1052508 0.1883610
sample estimates:
      cor 
0.1470655 
ggplot(iriss_manual_ratings_metrics, aes(x=use_word_count, y=originality, color=condition)) +
  geom_jitter() +
  geom_smooth(method="lm")
`geom_smooth()` using formula = 'y ~ x'

Higher word count is associated with greater originality.

Originality analysis

iriss_manual_ratings_metrics %>% 
  group_by(condition) %>%
  get_summary_stats(c(creativity, originality, usefulness), type="mean_sd") %>% 
  arrange(variable)
# A tibble: 9 × 5
  condition   variable        n  mean    sd
  <chr>       <fct>       <dbl> <dbl> <dbl>
1 Control     creativity    730  2.46  1.14
2 High-Agency creativity    718  2.37  1.18
3 Low-Agency  creativity    682  2.52  1.17
4 Control     originality   730  2.45  1.20
5 High-Agency originality   718  2.42  1.23
6 Low-Agency  originality   682  2.52  1.28
7 Control     usefulness    730  2.79  1.26
8 High-Agency usefulness    718  2.95  1.27
9 Low-Agency  usefulness    682  2.92  1.26
iriss_manual_ratings_metrics %>% 
  group_by(condition) %>% 
  count(originality) %>% 
  mutate(prop=n/sum(n)) %>% 
  ggplot(aes(x=originality, y=prop)) +
  geom_col(color="white") +
  facet_wrap(~condition)

# Descriptives table
summary_table <- iriss_manual_ratings_metrics %>%
  mutate(condition=fct_relevel(as.factor(condition), "low-agency", "high-agency", "Control")) %>% 
  group_by(condition) %>%
  get_summary_stats(c(creativity, originality, usefulness), type = "mean_sd") %>%
  mutate(mean_sd = sprintf("%.2f (%.2f)", mean, sd)) %>%
  select(condition, variable, mean_sd) %>%
  pivot_wider(names_from = variable, values_from = mean_sd)
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `condition = fct_relevel(...)`.
Caused by warning:
! 2 unknown levels in `f`: low-agency and high-agency
summary_table %>%
  kbl(
    col.names = c("Condition", "Creativity", "Originality", "Usefulness"),
    align = "lccc",
    booktabs = TRUE
  ) %>%
  kable_styling(
    html_font = "helvetica",
    full_width = FALSE,
    position = "center",
  ) 
Condition Creativity Originality Usefulness
Control 2.46 (1.14) 2.45 (1.21) 2.79 (1.26)
High-Agency 2.37 (1.18) 2.42 (1.23) 2.95 (1.27)
Low-Agency 2.52 (1.17) 2.52 (1.28) 2.92 (1.26)

Heatmap of creativity “sweet spots”

`summarise()` has grouped output by 'originality'. You can override using the
`.groups` argument.
Warning: The `legend.text.align` argument of `theme()` is deprecated as of ggplot2
3.5.0.
ℹ Please use theme(legend.text = element_text(hjust)) instead.

iriss_manual_ratings_metrics %>% 
  group_by(submitter_id, condition) %>% 
  summarize(originality=mean(originality)) %>%
  ggplot(aes(x=originality, y=condition)) +
  geom_density_ridges()
`summarise()` has grouped output by 'submitter_id'. You can override using the
`.groups` argument.
Picking joint bandwidth of 0.181

Plots for H1

outcomes <- c("originality", "creativity", "usefulness")

plots <- map(outcomes, function(var) {
  summary_df <- iriss_manual_ratings_metrics %>%
    group_by(submitter_id, condition) %>%
    summarize(value = mean(.data[[var]]), .groups = "drop")
  
  ggplot(summary_df, aes(x = value, y = condition, fill = condition)) +
    geom_density_ridges(alpha=.6, quantile_lines = TRUE, quantile_fun = mean) +
    xlim(c(1,5)) +
    labs(x = var, y = "") +
    theme_apa() +
    theme(legend.position = "none")
})

# To display the plots:
originality <- plots[[1]]  # originality
creativity <- plots[[2]]  # creativity
usefulness <- plots[[3]]  # usefulness

ggsave("figures/originality.png", plot = plots[[1]])
Saving 7 x 5 in image
Picking joint bandwidth of 0.181
ggsave("figures/creativity.png", plot = plots[[2]])
Saving 7 x 5 in image
Picking joint bandwidth of 0.157
ggsave("figures/usefulness.png", plot = plots[[3]])
Saving 7 x 5 in image
Picking joint bandwidth of 0.17
# Combined plot
combined_evaluation_plot <- creativity + originality + usefulness + plot_annotation(tag_levels = "a")
combined_evaluation_plot
Picking joint bandwidth of 0.157
Picking joint bandwidth of 0.181
Picking joint bandwidth of 0.17

ggsave(plot=combined_evaluation_plot, filename="figures/evaluations.png")
Saving 7 x 5 in image
Picking joint bandwidth of 0.157

Picking joint bandwidth of 0.181

Picking joint bandwidth of 0.17

H1: Individual-level creativity, originality, and usefulness

iriss_originality_mod <- lmer(
  originality ~ condition + (1|object) + (1|submitter_id), data= iriss_manual_ratings_metrics)
summary(iriss_originality_mod)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: originality ~ condition + (1 | object) + (1 | submitter_id)
   Data: iriss_manual_ratings_metrics

REML criterion at convergence: 6856.7

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.9786 -0.8738 -0.1548  0.7059  2.6612 

Random effects:
 Groups       Name        Variance Std.Dev.
 submitter_id (Intercept) 0.12356  0.3515  
 object       (Intercept) 0.04517  0.2125  
 Residual                 1.37521  1.1727  
Number of obs: 2130, groups:  submitter_id, 141; object, 3

Fixed effects:
                       Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)            2.436143   0.140133   3.113507  17.384 0.000333 ***
conditionHigh-Agency  -0.007137   0.096005 135.720987  -0.074 0.940847    
conditionLow-Agency    0.076654   0.096377 138.589138   0.795 0.427767    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) cndH-A
cndtnHgh-Ag -0.341       
cndtnLw-Agn -0.339  0.495
anova(iriss_originality_mod)
Type III Analysis of Variance Table with Satterthwaite's method
          Sum Sq Mean Sq NumDF  DenDF F value Pr(>F)
condition 1.2697 0.63485     2 137.95  0.4616 0.6312
coef(summary(iriss_originality_mod))
                         Estimate Std. Error         df     t value
(Intercept)           2.436143222 0.14013329   3.113507 17.38447162
conditionHigh-Agency -0.007137355 0.09600524 135.720987 -0.07434339
conditionLow-Agency   0.076653668 0.09637659 138.589138  0.79535568
                         Pr(>|t|)
(Intercept)          0.0003326791
conditionHigh-Agency 0.9408466110
conditionLow-Agency  0.4277667169
# anova(iriss_originality_mod)
# F_to_eta2(4.45, df=2, df_error=120)
# eta_squared(compliant_semantic_similarity_mod)


emmeans(iriss_originality_mod, pairwise ~ condition, adjust = "bonferroni")
$emmeans
 condition   emmean    SE   df lower.CL upper.CL
 Control       2.44 0.140 3.11     2.00     2.87
 High-Agency   2.43 0.140 3.13     1.99     2.87
 Low-Agency    2.51 0.141 3.15     2.08     2.95

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 

$contrasts
 contrast                     estimate     SE  df t.ratio p.value
 Control - (High-Agency)       0.00714 0.0960 134   0.074  1.0000
 Control - (Low-Agency)       -0.07665 0.0964 137  -0.795  1.0000
 (High-Agency) - (Low-Agency) -0.08379 0.0967 138  -0.867  1.0000

Degrees-of-freedom method: kenward-roger 
P value adjustment: bonferroni method for 3 tests 
# With demographic covariates
demographics <- iriss %>% 
  mutate(gender = case_when(gender == "1" ~ "Male",
                            gender == "2" ~ "Female",
                            gender == "3" ~ "Non-binary",
                            gender == "4" ~ "Prefer to self-describe")) %>% 
  mutate(race_recoded = if_else(race == "3", "White", "Non-White")) %>%
  select(submitter_id, age, race_recoded, gender, education)

iriss_manual_ratings_demographics <- left_join(iriss_manual_ratings_metrics, demographics, by="submitter_id")

#Adjust
iriss_originality_demographics <- lmer(
  originality ~ condition + scale(age) + as.factor(gender) + as.factor(race_recoded) + as.factor(education) + (1|object) + (1|submitter_id), data= iriss_manual_ratings_demographics)
summary(iriss_originality_demographics)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
originality ~ condition + scale(age) + as.factor(gender) + as.factor(race_recoded) +  
    as.factor(education) + (1 | object) + (1 | submitter_id)
   Data: iriss_manual_ratings_demographics

REML criterion at convergence: 6858.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.0137 -0.8539 -0.1529  0.6995  2.6045 

Random effects:
 Groups       Name        Variance Std.Dev.
 submitter_id (Intercept) 0.11461  0.3385  
 object       (Intercept) 0.04524  0.2127  
 Residual                 1.37504  1.1726  
Number of obs: 2130, groups:  submitter_id, 141; object, 3

Fixed effects:
                                          Estimate Std. Error        df t value
(Intercept)                                2.10234    0.53537 156.01024   3.927
conditionHigh-Agency                       0.03019    0.09664 126.14056   0.312
conditionLow-Agency                        0.12242    0.09818 127.53225   1.247
scale(age)                                -0.15733    0.05556 128.62248  -2.832
as.factor(gender)Male                      0.02992    0.09296 126.98209   0.322
as.factor(gender)Non-binary                0.13101    0.33778 118.32494   0.388
as.factor(gender)Prefer to self-describe   0.40296    0.27840 136.76321   1.447
as.factor(race_recoded)White               0.13477    0.10811 122.02206   1.247
as.factor(education)2                      0.11127    0.51970 183.95165   0.214
as.factor(education)3                      0.33033    0.52036 183.41917   0.635
as.factor(education)4                      0.53525    0.53476 180.30627   1.001
as.factor(education)5                      0.21359    0.53352 179.42722   0.400
as.factor(education)6                      0.56040    0.55724 173.28314   1.006
as.factor(education)8                      0.83104    0.72372 161.83923   1.148
                                         Pr(>|t|)    
(Intercept)                              0.000129 ***
conditionHigh-Agency                     0.755204    
conditionLow-Agency                      0.214732    
scale(age)                               0.005377 ** 
as.factor(gender)Male                    0.748119    
as.factor(gender)Non-binary              0.698821    
as.factor(gender)Prefer to self-describe 0.150059    
as.factor(race_recoded)White             0.214943    
as.factor(education)2                    0.830700    
as.factor(education)3                    0.526348    
as.factor(education)4                    0.318211    
as.factor(education)5                    0.689378    
as.factor(education)6                    0.315976    
as.factor(education)8                    0.252542    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation matrix not shown by default, as p = 14 > 12.
Use print(x, correlation=TRUE)  or
    vcov(x)        if you need it
coef(summary(iriss_originality_demographics))
                                            Estimate Std. Error       df
(Intercept)                               2.10234081 0.53537440 156.0102
conditionHigh-Agency                      0.03019497 0.09663527 126.1406
conditionLow-Agency                       0.12241946 0.09818176 127.5322
scale(age)                               -0.15732527 0.05556033 128.6225
as.factor(gender)Male                     0.02991720 0.09296294 126.9821
as.factor(gender)Non-binary               0.13100958 0.33778021 118.3249
as.factor(gender)Prefer to self-describe  0.40296416 0.27839569 136.7632
as.factor(race_recoded)White              0.13477112 0.10811354 122.0221
as.factor(education)2                     0.11127133 0.51969719 183.9516
as.factor(education)3                     0.33032725 0.52036169 183.4192
as.factor(education)4                     0.53524960 0.53476195 180.3063
as.factor(education)5                     0.21359197 0.53351852 179.4272
as.factor(education)6                     0.56039840 0.55723815 173.2831
as.factor(education)8                     0.83104224 0.72372014 161.8392
                                            t value     Pr(>|t|)
(Intercept)                               3.9268609 0.0001288698
conditionHigh-Agency                      0.3124632 0.7552044569
conditionLow-Agency                       1.2468656 0.2147319081
scale(age)                               -2.8316117 0.0053771730
as.factor(gender)Male                     0.3218186 0.7481192837
as.factor(gender)Non-binary               0.3878545 0.6988205538
as.factor(gender)Prefer to self-describe  1.4474511 0.1500589958
as.factor(race_recoded)White              1.2465703 0.2149428176
as.factor(education)2                     0.2141080 0.8306999807
as.factor(education)3                     0.6348032 0.5263478361
as.factor(education)4                     1.0009119 0.3182107571
as.factor(education)5                     0.4003459 0.6893779913
as.factor(education)6                     1.0056713 0.3159759706
as.factor(education)8                     1.1482923 0.2525423728
# Repeat models for creativity and usefulness
iriss_creativity_mod <- lmer(
  creativity ~ condition + (1|object) + (1|submitter_id), data= iriss_manual_ratings_metrics)
summary(iriss_creativity_mod)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: creativity ~ condition + (1 | object) + (1 | submitter_id)
   Data: iriss_manual_ratings_metrics

REML criterion at convergence: 6653

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.7394 -0.7711 -0.2364  0.5857  2.5446 

Random effects:
 Groups       Name        Variance Std.Dev.
 submitter_id (Intercept) 0.05771  0.2402  
 object       (Intercept) 0.01912  0.1383  
 Residual                 1.27768  1.1303  
Number of obs: 2130, groups:  submitter_id, 141; object, 3

Fixed effects:
                      Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)            2.45583    0.09693   3.60261  25.336 3.43e-05 ***
conditionHigh-Agency  -0.07780    0.07802 133.06007  -0.997    0.321    
conditionLow-Agency    0.05157    0.07851 137.74969   0.657    0.512    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) cndH-A
cndtnHgh-Ag -0.400       
cndtnLw-Agn -0.397  0.494
iriss_usefulness_mod <- lmer(
  usefulness ~ condition + (1|object) + (1|submitter_id), data= iriss_manual_ratings_metrics)
summary(iriss_usefulness_mod)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: usefulness ~ condition + (1 | object) + (1 | submitter_id)
   Data: iriss_manual_ratings_metrics

REML criterion at convergence: 7014.4

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.90596 -0.74105  0.01631  0.80993  2.22173 

Random effects:
 Groups       Name        Variance Std.Dev.
 submitter_id (Intercept) 0.085498 0.29240 
 object       (Intercept) 0.007747 0.08802 
 Residual                 1.506123 1.22724 
Number of obs: 2130, groups:  submitter_id, 141; object, 3

Fixed effects:
                      Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)            2.79276    0.08077   7.62367  34.576 1.17e-09 ***
conditionHigh-Agency   0.15416    0.08909 131.01083   1.730   0.0859 .  
conditionLow-Agency    0.13002    0.08956 134.96566   1.452   0.1489    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) cndH-A
cndtnHgh-Ag -0.548       
cndtnLw-Agn -0.545  0.494

Which conditions contributed the most creative ideas?

iriss_manual_ratings_metrics %>% 
  filter(originality > 3 & usefulness > 3) %>% 
  count(condition) 
# A tibble: 3 × 2
  condition       n
  <chr>       <int>
1 Control        51
2 High-Agency    66
3 Low-Agency     76

low-agency participants produced the most creative ideas, followed by high-agency participants, then participants in the control condition.

What were the highest-performing ideas?

iriss_manual_ratings_metrics %>% 
  arrange(desc(creativity)) %>% 
  dplyr::select(object, submitter_id, condition, use, creativity, originality, usefulness) %>% 
  print(n=100)
# A tibble: 2,130 × 7
    object     submitter_id    condition use   creativity originality usefulness
    <chr>      <chr>           <chr>     <chr>      <dbl>       <dbl>      <dbl>
  1 frisbee    R_6dB0PTZKZgdw… High-Age… "Use…          5           4          2
  2 bubblewrap R_1buglgqOJUG3… High-Age… "It …          5           5          1
  3 frisbee    R_7nZ3ULt8wE34… Low-Agen… "Hom…          5           5          4
  4 brick      R_1OVf3LC8jtPa… High-Age… "car…          5           4          2
  5 bubblewrap R_6qF5I5sWYuZp… High-Age… "usi…          5           5          3
  6 brick      R_1qOG1vnfLQDf… High-Age… "A b…          5           5          1
  7 bubblewrap R_6N4WtwIm1l9W… High-Age… "sou…          5           4          5
  8 brick      R_5AqkQSz5Rdex… High-Age… "Doo…          5           5          5
  9 brick      R_1jKUeVXkROyx… High-Age… "Doo…          5           4          5
 10 bubblewrap R_1fYHXsFEoxLP… Control   "Che…          5           5          2
 11 frisbee    R_1kBcWcfRZzzQ… Control   "use…          5           2          4
 12 brick      R_3FM3cYpKpcCt… Control   "pai…          5           4          5
 13 bubblewrap R_1M4k1BhVFmWL… Low-Agen… "ubb…          5           3          3
 14 bubblewrap R_6Yy6TnzRDvkQ… Control   "Gif…          5           2          5
 15 brick      R_7GTiZjIBaixJ… High-Age… "out…          5           3          5
 16 brick      R_5RVYAiVHRg64… Low-Agen… "Mak…          5           5          4
 17 brick      R_1KwLYXRsOg0v… Low-Agen… "Cre…          5           4          4
 18 frisbee    R_5fdNtVEeHR3N… High-Age… "Sun…          5           5          2
 19 brick      R_6wG13RvzYfew… High-Age… "doo…          5           3          5
 20 brick      R_51o6nvHjXplG… Low-Agen… "wei…          5           3          5
 21 bubblewrap R_6dB0PTZKZgdw… High-Age… "You…          5           5          2
 22 frisbee    R_3wNNP1xzDcTG… Low-Agen… "A p…          5           4          5
 23 brick      R_3udGH5aW9O9I… High-Age… "Ham…          5           3          3
 24 brick      R_3LbSAFKDD0oC… Low-Agen… "Wei…          5           4          4
 25 frisbee    R_5fIFgynExiTP… Low-Agen… "Fri…          5           3          1
 26 frisbee    R_3nldplsobICJ… High-Age… "hat"          5           3          1
 27 bubblewrap R_3D7oPyKecuUs… High-Age… "Ind…          5           4          5
 28 brick      R_7ALvm0Qlrvi1… Control   "gar…          5           1          5
 29 bubblewrap R_6EFE2Xz9sXGn… High-Age… "A t…          5           2          5
 30 frisbee    R_67k7B3JtR5NL… Control   "hat"          5           5          3
 31 frisbee    R_665Usq3Hx2qi… High-Age… "Com…          5           5          4
 32 brick      R_6exntUkrDrTz… High-Age… "exe…          5           3          5
 33 brick      R_5LtfsGkz5b77… Control   "Art"          5           3          3
 34 brick      R_6QNXVqas8W4l… Low-Agen… "Bri…          5           2          2
 35 bubblewrap R_1C4pbfUTsEaL… Low-Agen… "fre…          5           5          5
 36 frisbee    R_3attp5yTb9GP… High-Age… "DIY…          5           5          3
 37 brick      R_3LbSAFKDD0oC… Low-Agen… "Wei…          5           4          5
 38 frisbee    R_5fIFgynExiTP… Low-Agen… "Fri…          5           3          3
 39 bubblewrap R_131pl5s2J4Qx… Control   "as …          5           5          3
 40 frisbee    R_1mh0F87uTH5P… Low-Agen… "sle…          5           3          4
 41 brick      R_6N4WtwIm1l9W… High-Age… "cre…          5           1          3
 42 brick      R_1C4pbfUTsEaL… Low-Agen… "gri…          5           5          5
 43 bubblewrap R_3udGH5aW9O9I… High-Age… "Glo…          5           5          5
 44 bubblewrap R_3udGH5aW9O9I… High-Age… "Fid…          5           5          5
 45 bubblewrap R_6zARonxE8Sxl… Control   "Red…          5           5          5
 46 bubblewrap R_5ebqE9oR2LkK… Low-Agen… "Use…          5           1          5
 47 bubblewrap R_51o6nvHjXplG… Low-Agen… "sen…          5           5          5
 48 bubblewrap R_5PL4nsVNfVY0… Low-Agen… "cre…          5           5          4
 49 frisbee    R_3KE2yFsoF3r8… Control   "clo…          5           4          4
 50 bubblewrap R_194CpaJ4XvnB… Control   "raf…          5           5          4
 51 frisbee    R_6l5kCRxzaFyP… Low-Agen… "Fri…          5           5          5
 52 frisbee    R_7rp6J0JF3xsK… Control   "spi…          5           5          5
 53 frisbee    R_5oX6YimCu2Xp… High-Age… "By …          5           5          5
 54 frisbee    R_6auH7PBnqQIf… Control   "to …          5           5          5
 55 frisbee    R_3fCAdCKv9l7Z… Control   "Fet…          5           5          5
 56 brick      R_7FrNcsWW3S6K… Control   "sho…          5           1          1
 57 brick      R_1fYHXsFEoxLP… Control   "Tie…          5           3          5
 58 brick      R_6QLVlOVjs5ao… High-Age… "Bri…          5           5          5
 59 brick      R_5z92Djt5TP9C… Low-Agen… "sto…          5           3          3
 60 brick      R_6EhC0lA5azVe… Low-Agen… "Can…          5           5          4
 61 bubblewrap R_7GTiZjIBaixJ… High-Age… "ins…          5           3          5
 62 bubblewrap R_1KwLYXRsOg0v… Low-Agen… "Wra…          5           3          1
 63 bubblewrap R_31MiIWOm74O8… High-Age… "clo…          5           3          2
 64 bubblewrap R_3cyW0hfrZr3k… Control   "Kne…          5           5          5
 65 brick      R_1Gp1JNnmPu4s… High-Age… "Pla…          5           2          2
 66 frisbee    R_5Uf6lLymS1mm… High-Age… "Fan"          5           5          5
 67 frisbee    R_3Ytym4eWHyxE… Low-Agen… "to …          5           5          4
 68 frisbee    R_7d1ftMosPxb6… Control   "Pla…          5           5          4
 69 brick      R_5ebqE9oR2LkK… Low-Agen… "Use…          5           3          4
 70 brick      R_7ALvm0Qlrvi1… Control   "fen…          5           3          5
 71 brick      R_6auH7PBnqQIf… Control   "mak…          5           4          5
 72 bubblewrap R_6ydDuphsh56I… Control   "as …          5           5          3
 73 bubblewrap R_3wNNP1xzDcTG… Low-Agen… "Wal…          5           5          3
 74 brick      R_1NEsvnizxQQw… Low-Agen… "The…          5           5          1
 75 bubblewrap R_1qOG1vnfLQDf… High-Age… "edi…          5           5          1
 76 frisbee    R_3tnzcEGX1vgk… Control   "a p…          5           1          5
 77 frisbee    R_5z92Djt5TP9C… Low-Agen… "fur…          5           4          5
 78 bubblewrap R_6EEXVo2NyE8F… High-Age… "che…          5           4          3
 79 brick      R_5ebqE9oR2LkK… Low-Agen… "Mak…          5           5          4
 80 brick      R_1C4pbfUTsEaL… Low-Agen… "gri…          5           5          4
 81 frisbee    R_51o6nvHjXplG… Low-Agen… "Out…          5           5          1
 82 bubblewrap R_1oz2I9p4Wn3V… Control   "str…          5           3          2
 83 bubblewrap R_67k7B3JtR5NL… Control   "pil…          5           4          2
 84 frisbee    R_12EcOaXnt7QK… Control   "Mak…          5           5          4
 85 bubblewrap R_6zARonxE8Sxl… Control   "Flo…          5           5          3
 86 frisbee    R_4y8VBVKq6rhG… Control   "Use…          5           4          3
 87 frisbee    R_7rp6J0JF3xsK… Control   "bui…          5           5          1
 88 bubblewrap R_1fYHXsFEoxLP… Control   "Sci…          5           5          3
 89 bubblewrap R_665Usq3Hx2qi… High-Age… "Pho…          5           4          4
 90 brick      R_5RVYAiVHRg64… Low-Agen… "Boo…          5           4          5
 91 bubblewrap R_3li0RUPpKDlF… Control   "As …          5           5          4
 92 bubblewrap R_5zx7TN6W8h0f… High-Age… "Use…          5           5          4
 93 frisbee    R_3JEkszlEzO0c… Low-Agen… "bir…          5           5          4
 94 frisbee    R_3CxXpTsv9bhn… Control   "Pai…          5           5          5
 95 brick      R_5z92Djt5TP9C… Low-Agen… "exe…          5           5          5
 96 brick      R_6QFOPe8GIERr… Low-Agen… "wor…          5           5          5
 97 brick      R_7mEnERegkdhW… Low-Agen… "cou…          5           5          5
 98 bubblewrap R_1f3rlVkaVztA… Control   "it …          5           4          3
 99 bubblewrap R_52r0eiY9gmkT… High-Age… "tub…          5           5          2
100 frisbee    R_1NEsvnizxQQw… Low-Agen… "3. …          5           4          5
# ℹ 2,030 more rows

Most common uses for each object

iriss_manual_ratings_metrics %>% 
  mutate(use = str_to_lower(use)) %>% 
  group_by(object) %>% 
  count(use) %>% 
  arrange(desc(n)) %>% 
  print(n=30)
# A tibble: 1,159 × 3
# Groups:   object [3]
   object     use                                        n
   <chr>      <chr>                                  <int>
 1 frisbee    plate                                     26
 2 frisbee    hat                                       20
 3 bubblewrap stress relief                             17
 4 bubblewrap insulation                                14
 5 brick      doorstop                                  11
 6 frisbee    fan                                       10
 7 brick      bookend                                    8
 8 brick      hammer                                     8
 9 frisbee    decoration                                 8
10 brick      to build something                         7
11 brick      weights                                    7
12 bubblewrap costume                                    7
13 bubblewrap pillow                                     7
14 bubblewrap stress reliever                            7
15 brick      art                                        6
16 brick      as a chair                                 6
17 brick      decoration                                 6
18 brick      door stopper                               6
19 brick      wall                                       6
20 bubblewrap bubble wrap dance floor                    6
21 frisbee    as a plate                                 6
22 frisbee    bird feeder                                6
23 frisbee    shovel                                     6
24 frisbee    sunshade                                   6
25 brick      bookends                                   5
26 brick      brick to create furniture                  5
27 brick      candle holder                              5
28 brick      gluing them together to create a shelf     5
29 brick      outdoor fire pit                           5
30 brick      paperweight                                5
# ℹ 1,129 more rows

H2: Semantic Similarity

iriss_semantic_similarities <- read_csv("data/iriss_semantic_similarities.csv") %>% 
    mutate(condition=case_when(condition=="Passenger"~"Low-Agency",
                             condition=="Pilot"~"High-Agency",
                             condition=="Control"~"Control"))
Rows: 4500 Columns: 4
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): condition, object, submitter_id
dbl (1): similarity

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# iriss_semantic_similarity_mod <- lmer(similarity ~ condition + (1|object) + (1|submitter_id), data=iriss_semantic_similarities)
# summary(iriss_semantic_similarity_mod)
# 
# sjPlot::tab_model(iriss_semantic_similarity_mod)

According to the linear mixed model, participants in the low-agency condition produce significantly more similar ideas than the Control group. ### Noncompliance in AI Access conditions

This is JUST on the submission side. Here, we identify the trials on which high-agency participants and low-agency participants did not use AI, regardless of whether ideas from those conversations got evaluated.

# Some of these were not evaluated
no_chats_high_agency <- read_csv("data/pilot_no_first_message.csv")
Rows: 57 Columns: 36
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (6): participant_id, condition, object, aggregated_ideas, process_open-...
lgl (30): user_message_1, ai_response_1, user_message_2, ai_response_2, user...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
no_chats_low_agency <- read_csv("data/passenger_no_first_message.csv")
Rows: 36 Columns: 36
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (6): participant_id, condition, object, aggregated_ideas, process_open-...
lgl (30): user_message_1, ai_response_1, user_message_2, ai_response_2, user...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
noncompliant_conversations <- bind_rows(no_chats_high_agency, no_chats_low_agency) %>% 
  rename(submitter_id = participant_id) %>% 
  mutate(condition=case_when(condition=="Passenger"~"Low-Agency",
                             condition=="Pilot"~"High-Agency",
                             condition=="Control"~"Control"))


# Number of noncompliant conversations
noncompliant_conversations %>% 
  group_by(condition) %>% 
  summarize(n_distinct(submitter_id))
# A tibble: 2 × 2
  condition   `n_distinct(submitter_id)`
  <chr>                            <int>
1 High-Agency                         25
2 Low-Agency                          17

Subsetting of conversations and participants based on which conversations were evaluated and which were compliant

# Number of conversations that actually got rated


evaluated_conversations <- iriss_manual_ratings_metrics %>% 
  group_by(submitter_id, condition, object) %>% 
  summarize(aggregated_uses = paste(use, collapse = "; "))
`summarise()` has grouped output by 'submitter_id', 'condition'. You can
override using the `.groups` argument.
compliant_conversations <- evaluated_conversations %>% 
  anti_join(noncompliant_conversations, by=c("submitter_id", "object"))

# unevaluated_conversations <- iriss_conversations %>% 
#   anti_join(evaluated_conversations, by=c("submitter_id", "condition", "object"))

Evaluated, compliant conversations

After exclusions, there were 123 participants in total

# Exclude the noncompliant conversations from the analysis
compliant_conversations %>%
  group_by(condition) %>%
  summarize(unique_people = n_distinct(submitter_id))
# A tibble: 3 × 2
  condition   unique_people
  <chr>               <int>
1 Control                47
2 High-Agency            36
3 Low-Agency             40
write_csv(compliant_conversations, "data/evaluated_compliant_conversations.csv")

compliant_conversations_process <- compliant_conversations %>% 
  left_join(process_vars, by=c("submitter_id", "object")) %>%
  mutate(across(conv_with_ai_1:intrinsic_motivation_4, as.numeric),
         intrinsic_motivation_mean = rowMeans(
      across(intrinsic_motivation_1:intrinsic_motivation_4),
      na.rm = TRUE
    )) %>%
    mutate(
         conv_with_ai_mean = rowMeans(
      across(conv_with_ai_1:conv_with_ai_3),
      na.rm = TRUE
    )) %>%
    mutate(
         consideration_aid_mean = rowMeans(
      across(consideration_aid_1:consideration_aid_3),
      na.rm = TRUE
    ))

# compliant_trials %>% 
#   distinct(condition, submitter_id, object) %>% 
#   filter(condition != "Control") %>% 
#   count(condition)
# 
# compliant_trials %>% 
#   distinct(condition, submitter_id) %>% 
#   count(condition)

# Which participants were completely excluded because all of their trials were noncompliant?


# excluded_participants <- noncompliant %>%
#   count(submitter_id, condition) %>%
#   filter(n==3) %>% 
#   arrange(condition)
# 
# write_csv(excluded_participants, "data/fully_excluded_participants.csv")
# 
# excluded_participants %>% 
#   count(condition)
# 
# # Which participants had AT LEAST ONE noncompliant trial?
# flagged_participants <- noncompliant %>%
#   distinct(submitter_id, condition)


compliant_ideas <- compliant_conversations %>% 
  separate_rows(aggregated_uses, sep = "; ") %>% 
  rename(use="aggregated_uses") %>% 
  distinct(submitter_id, condition, object, use)

Evaluated compliant ideas

write_csv(compliant_ideas, "data/evaluated_compliant_ideas.csv")
compliant_ideas %>% 
  mutate(use = str_to_lower(use)) %>% 
  group_by(condition, object) %>% 
  count(use) %>% 
  arrange(desc(n)) 
# A tibble: 1,002 × 4
# Groups:   condition, object [9]
   condition   object     use               n
   <chr>       <chr>      <chr>         <int>
 1 Control     frisbee    plate            12
 2 Control     frisbee    hat               7
 3 High-Agency bubblewrap stress relief     7
 4 High-Agency frisbee    plate             5
 5 Control     brick      doorstop          4
 6 Control     bubblewrap insulation        4
 7 Control     frisbee    as a plate        4
 8 High-Agency brick      doorstop          4
 9 Control     brick      door stopper      3
10 Control     brick      wall              3
# ℹ 992 more rows
# from sentence_transformers import SentenceTransformer
# import pandas as pd
# import numpy as np
# import os
# os.environ["TOKENIZERS_PARALLELISM"] = "false"
# 
# 
# # Load the sentence transformer model
# model = SentenceTransformer('all-MiniLM-L6-v2')
# 
# # Load the data
# 
# data_path = '~/Git_Projects/idea_generation/data/evaluated_compliant_ideas.csv'  # Update with your actual data path
# df = pd.read_csv(data_path)
# 
# # For each combination of experimental condition, object, and submitter_id, compute the semantic similarity of ideas
# results = []
# for (condition, obj, submitter_id), group in df.groupby(['condition', 'object', 'submitter_id']):
#     ideas = group['use'].tolist()
# 
#     # Compute embeddings
#     embeddings = model.encode(ideas)
# 
#     # Compute cosine similarity matrix
#     norm_embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
#     similarity_matrix = np.dot(norm_embeddings, norm_embeddings.T)
# 
#     # Extract upper triangle of the similarity matrix, excluding the diagonal
#     upper_tri_indices = np.triu_indices_from(similarity_matrix, k=1)
#     similarities = similarity_matrix[upper_tri_indices]
# 
#     # Store results
#     for sim in similarities:
#         results.append({
#             'condition': condition,
#             'object': obj,
#             'submitter_id': submitter_id,
#             'similarity': sim
#         })
# 
# # Convert results to DataFrame
# compliant_similarity_df = pd.DataFrame(results)
# summary_output_path = os.path.expanduser('~/Git_Projects/idea_generation/data/compliant_semantic_similarity_summary.csv')

H1 and 2: Compliant analyses

compliant_similarities <- read_csv("data/compliant_semantic_similarities.csv")
Rows: 1435 Columns: 4
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): condition, object, submitter_id
dbl (1): similarity

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Age and gender covariates

# Add age and gender covariates
demographics_recoded <- submitter_demographics %>% 
  mutate(gender=if_else(gender=="Female",1, 0),
         race=if_else(race=="White or Caucasian",1,0))

# 
# similarity_demographics <- left_join(compliant_similarities, demographics_recoded, by="submitter_id") 

Top-performing ideas by condition

iriss_manual_ratings_metrics %>% 
  semi_join(compliant_ideas, by=c("submitter_id", "object")) %>% 
  group_by(condition) %>% 
  arrange(    
    desc(creativity),
    desc(originality),
    desc(usefulness)) %>% 
  dplyr::select(object, submitter_id, condition, use, creativity, originality, usefulness) %>% 
  print(n=100)
# A tibble: 1,710 × 7
# Groups:   condition [3]
    object     submitter_id    condition use   creativity originality usefulness
    <chr>      <chr>           <chr>     <chr>      <dbl>       <dbl>      <dbl>
  1 brick      R_5AqkQSz5Rdex… High-Age… "Doo…          5           5          5
  2 bubblewrap R_1C4pbfUTsEaL… Low-Agen… "fre…          5           5          5
  3 brick      R_1C4pbfUTsEaL… Low-Agen… "gri…          5           5          5
  4 bubblewrap R_6zARonxE8Sxl… Control   "Red…          5           5          5
  5 bubblewrap R_51o6nvHjXplG… Low-Agen… "sen…          5           5          5
  6 frisbee    R_6l5kCRxzaFyP… Low-Agen… "Fri…          5           5          5
  7 frisbee    R_7rp6J0JF3xsK… Control   "spi…          5           5          5
  8 frisbee    R_5oX6YimCu2Xp… High-Age… "By …          5           5          5
  9 frisbee    R_6auH7PBnqQIf… Control   "to …          5           5          5
 10 frisbee    R_3fCAdCKv9l7Z… Control   "Fet…          5           5          5
 11 brick      R_6QLVlOVjs5ao… High-Age… "Bri…          5           5          5
 12 bubblewrap R_3cyW0hfrZr3k… Control   "Kne…          5           5          5
 13 frisbee    R_5Uf6lLymS1mm… High-Age… "Fan"          5           5          5
 14 frisbee    R_3CxXpTsv9bhn… Control   "Pai…          5           5          5
 15 brick      R_5z92Djt5TP9C… Low-Agen… "exe…          5           5          5
 16 brick      R_6QFOPe8GIERr… Low-Agen… "wor…          5           5          5
 17 brick      R_7mEnERegkdhW… Low-Agen… "cou…          5           5          5
 18 bubblewrap R_62Dg4jMZHK4Z… High-Age… "bub…          5           5          5
 19 bubblewrap R_1C4pbfUTsEaL… Low-Agen… "mak…          5           5          5
 20 frisbee    R_7nZ3ULt8wE34… Low-Agen… "Hom…          5           5          4
 21 brick      R_5RVYAiVHRg64… Low-Agen… "Mak…          5           5          4
 22 bubblewrap R_5PL4nsVNfVY0… Low-Agen… "cre…          5           5          4
 23 bubblewrap R_194CpaJ4XvnB… Control   "raf…          5           5          4
 24 brick      R_6EhC0lA5azVe… Low-Agen… "Can…          5           5          4
 25 frisbee    R_3Ytym4eWHyxE… Low-Agen… "to …          5           5          4
 26 frisbee    R_7d1ftMosPxb6… Control   "Pla…          5           5          4
 27 brick      R_5ebqE9oR2LkK… Low-Agen… "Mak…          5           5          4
 28 brick      R_1C4pbfUTsEaL… Low-Agen… "gri…          5           5          4
 29 frisbee    R_12EcOaXnt7QK… Control   "Mak…          5           5          4
 30 bubblewrap R_3li0RUPpKDlF… Control   "As …          5           5          4
 31 bubblewrap R_5zx7TN6W8h0f… High-Age… "Use…          5           5          4
 32 frisbee    R_3JEkszlEzO0c… Low-Agen… "bir…          5           5          4
 33 bubblewrap R_6qF5I5sWYuZp… High-Age… "usi…          5           5          3
 34 frisbee    R_67k7B3JtR5NL… Control   "hat"          5           5          3
 35 frisbee    R_3attp5yTb9GP… High-Age… "DIY…          5           5          3
 36 bubblewrap R_131pl5s2J4Qx… Control   "as …          5           5          3
 37 bubblewrap R_6ydDuphsh56I… Control   "as …          5           5          3
 38 bubblewrap R_6zARonxE8Sxl… Control   "Flo…          5           5          3
 39 bubblewrap R_1fYHXsFEoxLP… Control   "Sci…          5           5          3
 40 bubblewrap R_1fYHXsFEoxLP… Control   "Che…          5           5          2
 41 frisbee    R_5fdNtVEeHR3N… High-Age… "Sun…          5           5          2
 42 bubblewrap R_6dB0PTZKZgdw… High-Age… "You…          5           5          2
 43 bubblewrap R_52r0eiY9gmkT… High-Age… "tub…          5           5          2
 44 brick      R_1fYHXsFEoxLP… Control   "Scr…          5           5          2
 45 bubblewrap R_1buglgqOJUG3… High-Age… "It …          5           5          1
 46 brick      R_1qOG1vnfLQDf… High-Age… "A b…          5           5          1
 47 bubblewrap R_1qOG1vnfLQDf… High-Age… "edi…          5           5          1
 48 frisbee    R_51o6nvHjXplG… Low-Agen… "Out…          5           5          1
 49 frisbee    R_7rp6J0JF3xsK… Control   "bui…          5           5          1
 50 brick      R_5LkuzRZVUbxp… Control   "as …          5           5          1
 51 bubblewrap R_6N4WtwIm1l9W… High-Age… "sou…          5           4          5
 52 brick      R_1jKUeVXkROyx… High-Age… "Doo…          5           4          5
 53 brick      R_3FM3cYpKpcCt… Control   "pai…          5           4          5
 54 bubblewrap R_3D7oPyKecuUs… High-Age… "Ind…          5           4          5
 55 brick      R_3LbSAFKDD0oC… Low-Agen… "Wei…          5           4          5
 56 brick      R_6auH7PBnqQIf… Control   "mak…          5           4          5
 57 frisbee    R_5z92Djt5TP9C… Low-Agen… "fur…          5           4          5
 58 brick      R_5RVYAiVHRg64… Low-Agen… "Boo…          5           4          5
 59 brick      R_1KwLYXRsOg0v… Low-Agen… "Cre…          5           4          4
 60 brick      R_3LbSAFKDD0oC… Low-Agen… "Wei…          5           4          4
 61 frisbee    R_3KE2yFsoF3r8… Control   "clo…          5           4          4
 62 bubblewrap R_665Usq3Hx2qi… High-Age… "Pho…          5           4          4
 63 bubblewrap R_6EEXVo2NyE8F… High-Age… "che…          5           4          3
 64 frisbee    R_4y8VBVKq6rhG… Control   "Use…          5           4          3
 65 bubblewrap R_1f3rlVkaVztA… Control   "it …          5           4          3
 66 frisbee    R_6dB0PTZKZgdw… High-Age… "Use…          5           4          2
 67 brick      R_1OVf3LC8jtPa… High-Age… "car…          5           4          2
 68 bubblewrap R_67k7B3JtR5NL… Control   "pil…          5           4          2
 69 brick      R_1fYHXsFEoxLP… Control   "Tie…          5           3          5
 70 brick      R_7ALvm0Qlrvi1… Control   "fen…          5           3          5
 71 brick      R_5ebqE9oR2LkK… Low-Agen… "Use…          5           3          4
 72 bubblewrap R_1M4k1BhVFmWL… Low-Agen… "ubb…          5           3          3
 73 brick      R_5LtfsGkz5b77… Control   "Art"          5           3          3
 74 frisbee    R_5fIFgynExiTP… Low-Agen… "Fri…          5           3          3
 75 brick      R_5z92Djt5TP9C… Low-Agen… "sto…          5           3          3
 76 bubblewrap R_1oz2I9p4Wn3V… Control   "str…          5           3          2
 77 frisbee    R_5fIFgynExiTP… Low-Agen… "Fri…          5           3          1
 78 frisbee    R_3nldplsobICJ… High-Age… "hat"          5           3          1
 79 bubblewrap R_1KwLYXRsOg0v… Low-Agen… "Wra…          5           3          1
 80 bubblewrap R_6Yy6TnzRDvkQ… Control   "Gif…          5           2          5
 81 bubblewrap R_6EFE2Xz9sXGn… High-Age… "A t…          5           2          5
 82 bubblewrap R_3nldplsobICJ… High-Age… "des…          5           2          5
 83 frisbee    R_1kBcWcfRZzzQ… Control   "use…          5           2          4
 84 brick      R_6QNXVqas8W4l… Low-Agen… "Bri…          5           2          2
 85 brick      R_7ALvm0Qlrvi1… Control   "gar…          5           1          5
 86 bubblewrap R_5ebqE9oR2LkK… Low-Agen… "Use…          5           1          5
 87 frisbee    R_3tnzcEGX1vgk… Control   "a p…          5           1          5
 88 bubblewrap R_3z5ZJlDeoI2U… Low-Agen… "pro…          5           1          5
 89 brick      R_6N4WtwIm1l9W… High-Age… "cre…          5           1          3
 90 brick      R_7FrNcsWW3S6K… Control   "sho…          5           1          1
 91 bubblewrap R_1sckIZMShaW5… Low-Agen… "put…          4           5          5
 92 bubblewrap R_1kBcWcfRZzzQ… Control   "as …          4           5          5
 93 bubblewrap R_1sckIZMShaW5… Low-Agen… "ins…          4           5          5
 94 frisbee    R_5z92Djt5TP9C… Low-Agen… "coa…          4           5          5
 95 bubblewrap R_32AcUgLmD4vh… Low-Agen… "cov…          4           5          5
 96 bubblewrap R_1M4k1BhVFmWL… Low-Agen… "Bub…          4           5          5
 97 frisbee    R_6X1gjO8GTNVI… High-Age… "Car…          4           5          5
 98 brick      R_6zARonxE8Sxl… Control   "Tra…          4           5          5
 99 brick      R_5z92Djt5TP9C… Low-Agen… "exe…          4           5          5
100 frisbee    R_3OFDGhl8nE7H… Low-Agen… "Dog…          4           5          4
# ℹ 1,610 more rows
evaluations <- iriss_manual_ratings_metrics %>% 
  semi_join(compliant_ideas, by=c("submitter_id", "object", "use"))

Linear mixed model with compliant data

compliant_trials_demographics <- left_join(evaluations, demographics_recoded, by="submitter_id")

compliant_trials_demographics %>% 
  group_by(condition) %>% 
  get_summary_stats(usefulness, type="mean_sd")
# A tibble: 3 × 5
  condition   variable       n  mean    sd
  <chr>       <fct>      <dbl> <dbl> <dbl>
1 Control     usefulness   730  2.79  1.26
2 High-Agency usefulness   426  2.91  1.25
3 Low-Agency  usefulness   553  2.96  1.25
compliant_originality_mod <- lmer(
  originality ~ condition + scale(age) + as.factor(gender) + as.factor(race) + (1|object) + (1|submitter_id), data=compliant_trials_demographics)

compliant_creativity_mod <- lmer(
  creativity ~ condition + scale(age) + as.factor(gender) + as.factor(race) + (1|object) + (1|submitter_id), data=compliant_trials_demographics)

compliant_usefulness_mod <- lmer(
  usefulness ~ condition + scale(age) + as.factor(gender) + as.factor(race) + (1|object) + (1|submitter_id), data= compliant_trials_demographics
)
summary(compliant_usefulness_mod)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
usefulness ~ condition + scale(age) + as.factor(gender) + as.factor(race) +  
    (1 | object) + (1 | submitter_id)
   Data: compliant_trials_demographics

REML criterion at convergence: 5229.2

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.93014 -0.75475  0.01528  0.81305  2.17754 

Random effects:
 Groups       Name        Variance Std.Dev.
 submitter_id (Intercept) 0.078527 0.28023 
 object       (Intercept) 0.004493 0.06703 
 Residual                 1.488457 1.22002 
Number of obs: 1592, groups:  submitter_id, 115; object, 3

Fixed effects:
                      Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)            2.77449    0.10018  25.78677  27.694   <2e-16 ***
conditionHigh-Agency   0.11861    0.10270 107.06727   1.155   0.2507    
conditionLow-Agency    0.19526    0.09721  98.40361   2.009   0.0473 *  
scale(age)             0.03217    0.04420 103.77170   0.728   0.4684    
as.factor(gender)1     0.03698    0.09220  97.05499   0.401   0.6892    
as.factor(race)1      -0.01184    0.10561 101.30660  -0.112   0.9109    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) cndH-A cndL-A scl(g) as.fctr(g)1
cndtnHgh-Ag -0.347                                 
cndtnLw-Agn -0.366  0.398                          
scale(age)   0.125  0.045 -0.078                   
as.fctr(g)1 -0.631 -0.097 -0.077 -0.097            
as.fctr(r)1 -0.300  0.130  0.052 -0.100  0.067     
anova(compliant_usefulness_mod, type=3)
Type III Analysis of Variance Table with Satterthwaite's method
                  Sum Sq Mean Sq NumDF   DenDF F value Pr(>F)
condition         6.2282 3.11410     2 104.584  2.0922 0.1286
scale(age)        0.7883 0.78825     1 103.772  0.5296 0.4684
as.factor(gender) 0.2394 0.23944     1  97.055  0.1609 0.6892
as.factor(race)   0.0187 0.01872     1 101.307  0.0126 0.9109
effectsize::eta_squared(compliant_usefulness_mod)
# Effect Size for ANOVA (Type III)

Parameter         | Eta2 (partial) |       95% CI
-------------------------------------------------
condition         |           0.04 | [0.00, 1.00]
scale(age)        |       5.08e-03 | [0.00, 1.00]
as.factor(gender) |       1.65e-03 | [0.00, 1.00]
as.factor(race)   |       1.24e-04 | [0.00, 1.00]

- One-sided CIs: upper bound fixed at [1.00].
emmeans(compliant_usefulness_mod, pairwise ~ condition)
$emmeans
 condition   emmean     SE   df lower.CL upper.CL
 Control       2.78 0.0792 11.9     2.61     2.96
 High-Agency   2.90 0.1010 27.7     2.70     3.11
 Low-Agency    2.98 0.0922 20.4     2.79     3.17

Results are averaged over the levels of: gender, race 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 

$contrasts
 contrast                     estimate     SE  df t.ratio p.value
 Control - (High-Agency)       -0.1186 0.1030 110  -1.153  0.4835
 Control - (Low-Agency)        -0.1953 0.0973 101  -2.007  0.1156
 (High-Agency) - (Low-Agency)  -0.0766 0.1100 113  -0.697  0.7655

Results are averaged over the levels of: gender, race 
Degrees-of-freedom method: kenward-roger 
P value adjustment: tukey method for comparing a family of 3 estimates 
sjPlot::tab_model(compliant_creativity_mod, compliant_originality_mod, compliant_usefulness_mod,
                  pred.labels = c("Intercept",
                                  "High-Agency Condition",
                                  "Low-Agency Condition",
                                  "Age",
                                  "Gender (Female)",
                                  "Race (White)"),
                  string.ci = "95% CI",
                  dv.labels = c("Creativity", "Originality", "Usefulness"))
  Creativity Originality Usefulness
Predictors Estimates 95% CI p Estimates 95% CI p Estimates 95% CI p
Intercept 2.50 2.28 – 2.72 <0.001 2.46 2.18 – 2.75 <0.001 2.77 2.58 – 2.97 <0.001
High-Agency Condition 0.08 -0.09 – 0.26 0.361 0.12 -0.10 – 0.34 0.275 0.12 -0.08 – 0.32 0.248
Low-Agency Condition 0.09 -0.07 – 0.26 0.278 0.08 -0.12 – 0.29 0.424 0.20 0.00 – 0.39 0.045
Age -0.03 -0.10 – 0.05 0.473 -0.03 -0.12 – 0.07 0.567 0.03 -0.05 – 0.12 0.467
Gender (Female) -0.06 -0.22 – 0.10 0.451 -0.04 -0.24 – 0.15 0.674 0.04 -0.14 – 0.22 0.688
Race (White) -0.02 -0.19 – 0.16 0.869 0.00 -0.22 – 0.22 0.990 -0.01 -0.22 – 0.20 0.911
Random Effects
σ2 1.29 1.39 1.49
τ00 0.05 submitter_id 0.11 submitter_id 0.08 submitter_id
0.02 object 0.03 object 0.00 object
ICC 0.05 0.10 0.05
N 3 object 3 object 3 object
115 submitter_id 115 submitter_id 115 submitter_id
Observations 1592 1592 1592
Marginal R2 / Conditional R2 0.002 / 0.051 0.002 / 0.097 0.006 / 0.058
summary(compliant_originality_mod)   
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
originality ~ condition + scale(age) + as.factor(gender) + as.factor(race) +  
    (1 | object) + (1 | submitter_id)
   Data: compliant_trials_demographics

REML criterion at convergence: 5149.9

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.9155 -0.8925 -0.1247  0.7020  2.4956 

Random effects:
 Groups       Name        Variance Std.Dev.
 submitter_id (Intercept) 0.11333  0.3366  
 object       (Intercept) 0.03352  0.1831  
 Residual                 1.39380  1.1806  
Number of obs: 1592, groups:  submitter_id, 115; object, 3

Fixed effects:
                       Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)            2.463921   0.145324   5.980222  16.955 2.77e-06 ***
conditionHigh-Agency   0.119949   0.109949 118.500983   1.091    0.278    
conditionLow-Agency    0.083544   0.104525 108.799112   0.799    0.426    
scale(age)            -0.027179   0.047431 113.883607  -0.573    0.568    
as.factor(gender)1    -0.041696   0.099188 108.195805  -0.420    0.675    
as.factor(race)1       0.001451   0.113407 112.405655   0.013    0.990    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) cndH-A cndL-A scl(g) as.fctr(g)1
cndtnHgh-Ag -0.261                                 
cndtnLw-Agn -0.273  0.402                          
scale(age)   0.092  0.044 -0.077                   
as.fctr(g)1 -0.470 -0.094 -0.077 -0.098            
as.fctr(r)1 -0.224  0.128  0.051 -0.099  0.072     
anova(compliant_originality_mod)
Type III Analysis of Variance Table with Satterthwaite's method
                   Sum Sq Mean Sq NumDF  DenDF F value Pr(>F)
condition         1.87505 0.93752     2 115.50  0.6726 0.5123
scale(age)        0.45764 0.45764     1 113.88  0.3283 0.5678
as.factor(gender) 0.24630 0.24630     1 108.20  0.1767 0.6750
as.factor(race)   0.00023 0.00023     1 112.41  0.0002 0.9898
homogeneity_demographics <- left_join(compliant_similarities, demographics_recoded, by="submitter_id")
compliant_semantic_similarity_mod <- lmer(similarity ~ condition + (1|object) + (1|submitter_id), data=homogeneity_demographics)
anova(compliant_semantic_similarity_mod)
Type III Analysis of Variance Table with Satterthwaite's method
           Sum Sq  Mean Sq NumDF  DenDF F value  Pr(>F)  
condition 0.14614 0.073069     2 119.72  4.7293 0.01055 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
effectsize::eta_squared(compliant_semantic_similarity_mod)
# Effect Size for ANOVA (Type III)

Parameter | Eta2 (partial) |       95% CI
-----------------------------------------
condition |           0.07 | [0.01, 1.00]

- One-sided CIs: upper bound fixed at [1.00].
compliant_similarity_demographics_mod <- lmer(
  similarity ~ condition + scale(age) + as.factor(gender) + as.factor(race) + (1|object) + (1|submitter_id), data=homogeneity_demographics)
boundary (singular) fit: see help('isSingular')
summary(compliant_similarity_demographics_mod)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
similarity ~ condition + scale(age) + as.factor(gender) + as.factor(race) +  
    (1 | object) + (1 | submitter_id)
   Data: homogeneity_demographics

REML criterion at convergence: -1501.3

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.6840 -0.6286 -0.0606  0.5321  4.0311 

Random effects:
 Groups       Name        Variance Std.Dev.
 submitter_id (Intercept) 0.0113   0.1063  
 object       (Intercept) 0.0000   0.0000  
 Residual                 0.0157   0.1253  
Number of obs: 1351, groups:  submitter_id, 115; object, 3

Fixed effects:
                       Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)            0.236971   0.024632 103.509805   9.621 4.95e-16 ***
conditionHigh-Agency   0.011263   0.026371 110.613122   0.427  0.67013    
conditionLow-Agency    0.074770   0.025617 106.061499   2.919  0.00429 ** 
scale(age)            -0.015042   0.011015 108.647802  -1.366  0.17490    
as.factor(gender)1    -0.008487   0.024231 105.923534  -0.350  0.72685    
as.factor(race)1      -0.048167   0.027470 107.538189  -1.753  0.08238 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) cndH-A cndL-A scl(g) as.fctr(g)1
cndtnHgh-Ag -0.406                                 
cndtnLw-Agn -0.401  0.420                          
scale(age)   0.129  0.037 -0.067                   
as.fctr(g)1 -0.684 -0.080 -0.080 -0.111            
as.fctr(r)1 -0.329  0.127  0.048 -0.085  0.084     
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
sjPlot::tab_model(compliant_similarity_demographics_mod,
                  pred.labels = c("Intercept",
                                  "High-Agency Condition",
                                  "Low-Agency Condition",
                                  "Age",
                                  "Gender (Female)",
                                  "Race (White)"),
                  string.ci = "95% CI",
                  dv.labels = "Homogeneity")
  Homogeneity
Predictors Estimates 95% CI p
Intercept 0.24 0.19 – 0.29 <0.001
High-Agency Condition 0.01 -0.04 – 0.06 0.669
Low-Agency Condition 0.07 0.02 – 0.13 0.004
Age -0.02 -0.04 – 0.01 0.172
Gender (Female) -0.01 -0.06 – 0.04 0.726
Race (White) -0.05 -0.10 – 0.01 0.080
Random Effects
σ2 0.02
τ00 submitter_id 0.01
τ00 object 0.00
N object 3
N submitter_id 115
Observations 1351
Marginal R2 / Conditional R2 0.093 / NA
similarity_stats <- aov(similarity ~ condition, data=compliant_similarities)
anova(compliant_semantic_similarity_mod)
Type III Analysis of Variance Table with Satterthwaite's method
           Sum Sq  Mean Sq NumDF  DenDF F value  Pr(>F)  
condition 0.14614 0.073069     2 119.72  4.7293 0.01055 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# F_to_eta2(4.45, df=2, df_error=120.12)
effectsize::eta_squared(compliant_semantic_similarity_mod)
# Effect Size for ANOVA (Type III)

Parameter | Eta2 (partial) |       95% CI
-----------------------------------------
condition |           0.07 | [0.01, 1.00]

- One-sided CIs: upper bound fixed at [1.00].
# Get estimated marginal means
emm <- emmeans(compliant_semantic_similarity_mod, ~ condition, lmerTest.limit=3380)

# With adjustment for multiple comparisons (Tukey is default)
pairs(emm, adjust = "tukey", reverse = TRUE)
 contrast                     estimate     SE  df t.ratio p.value
 (High-Agency) - Control         0.019 0.0257 121   0.739  0.7408
 (Low-Agency) - Control          0.074 0.0246 116   3.007  0.0090
 (Low-Agency) - (High-Agency)    0.055 0.0268 123   2.056  0.1035

Degrees-of-freedom method: kenward-roger 
P value adjustment: tukey method for comparing a family of 3 estimates 

Compliant Originality Plot

outcomes <- c("originality", "creativity", "usefulness")

plots <- map(outcomes, function(var) {
  summary_df <- compliant_trials_demographics %>%
    mutate(condition= fct_relevel(condition, c("Low-Agency", "High-Agency", "Control"))) %>% 
    group_by(submitter_id, condition) %>%
    summarize(value = mean(.data[[var]]), .groups = "drop")
  
  ggplot(summary_df, aes(x = value, y = condition, fill = condition)) +
    geom_density_ridges(alpha = 0.6, quantile_lines = TRUE, quantile_fun = mean) +
    xlim(c(1,5)) +
    labs(x = str_to_title(var), y = "") +
    theme_apa() +
    theme(legend.position = "none")
})

# To display the plots:
originality <- plots[[1]]  # originality
creativity <- plots[[2]]  # creativity
usefulness <- plots[[3]]  # usefulness

ggsave("figures/originality.png", plot = plots[[1]])
Saving 7 x 5 in image
Picking joint bandwidth of 0.185
ggsave("figures/creativity.png", plot = plots[[2]])
Saving 7 x 5 in image
Picking joint bandwidth of 0.145
ggsave("figures/usefulness.png", plot = plots[[3]])
Saving 7 x 5 in image
Picking joint bandwidth of 0.186
# Combined plot
compliant_evaluation_plot <- creativity + originality + usefulness + plot_annotation(tag_levels = "a")
compliant_evaluation_plot
Picking joint bandwidth of 0.145
Picking joint bandwidth of 0.185
Picking joint bandwidth of 0.186

ggsave(plot=compliant_evaluation_plot, filename="figures/compliant_evaluations.png")
Saving 7 x 5 in image
Picking joint bandwidth of 0.145

Picking joint bandwidth of 0.185

Picking joint bandwidth of 0.186

Compliant Similarity Plot

similarity_means_df <- compliant_similarities %>%
  group_by(condition) %>%
  summarise(mean_similarity = mean(similarity, na.rm = TRUE))

compliant_semantic_similarity_plot <- compliant_similarities %>% 
  mutate(condition= fct_relevel(condition, c("Low-Agency", "High-Agency", "Control"))) %>% 
  ggplot(aes(x=similarity, y=condition, fill=condition)) +
  geom_density_ridges(alpha=.6,
                      quantile_lines = TRUE,
                      quantile_fun = mean) +
  geom_text(
    data = similarity_means_df,
    aes(x = mean_similarity, y = condition, 
        label = round(mean_similarity, 2)),   # label = numeric mean
    nudge_x = -0.05,    # shift to the right of the mean line
    nudge_y = 0.8,  
    size = 3.5,
    color = "black"
  ) +
  scale_x_continuous(breaks=seq(-.5, 1, by=.1)) +
  labs(x="Homogeneity",
       y="") +
  theme_apa() +
  theme(legend.position = "none") 

compliant_semantic_similarity_plot
Picking joint bandwidth of 0.0433

ggsave("figures/compliant_semantic_similarity.png", plot=compliant_semantic_similarity_plot)
Saving 7 x 5 in image
Picking joint bandwidth of 0.0433

Around 24% of the low-agency conversations did not use AI and around 38% of the high-agency conversations did not use AI. Excluding the noncompliant trials did not change the results.

similarities_means_plot <- compliant_similarities %>% 
  mutate(condition= fct_relevel(condition, c("Low-Agency", "High-Agency", "Control"))) %>% 
  group_by(condition) %>% 
  get_summary_stats(similarity, type="mean_se") %>% 
  ggplot(aes(x=condition, y=mean, color=condition)) +
  geom_point() +
  geom_errorbar(aes(ymin = mean-se, ymax = mean+se))+
  ylim(c(0,1))

similarities_means_plot

# Table of descriptives
similarities_descriptives <- compliant_similarities %>% 
  group_by(condition) %>% 
  get_summary_stats(similarity, type="mean_sd") %>% 
  select(-c(variable, n)) %>% 
  gt() %>% 
  fmt_number(
    columns = c(mean, sd),
    decimals = 2
  ) %>% 
  cols_label(
    condition = "Condition",
    mean = "Mean",
    sd = "SD"
  ) %>% 
  tab_header(
    title = "Homogeneity by Condition"
  ) %>% 
  tab_options(
    table.border.top.color = "white",
    table.border.bottom.color = "white"
  ) %>% 
  opt_table_font(stack = "geometric-humanist") %>% 
  tab_style(
  style = list(
    cell_text(weight = "bold")
  ),
  locations = cells_column_labels()
) %>% 
  tab_source_note(
    source_note = "Note: Higher values indicate greater homogeneity."
  )

similarities_descriptives
Homogeneity by Condition
Condition Mean SD
Control 0.22 0.13
High-Agency 0.24 0.17
Low-Agency 0.30 0.20
Note: Higher values indicate greater homogeneity.

Who were the highest-performing participants?

evaluations %>% 
  group_by(submitter_id) %>% 
  summarize(mean_creativity = mean(creativity)) %>% 
  arrange(desc(mean_creativity))
# A tibble: 123 × 2
   submitter_id      mean_creativity
   <chr>                       <dbl>
 1 R_626PnHWHD8OebPr            3.67
 2 R_665Usq3Hx2qiS2y            3.6 
 3 R_1C4pbfUTsEaLjCV            3.47
 4 R_5ebqE9oR2LkKvO9            3.37
 5 R_7wcOe7LsUrh7hzH            3.24
 6 R_5z92Djt5TP9CJBb            3.16
 7 R_7nZ3ULt8wE34mW6            3.08
 8 R_6dB0PTZKZgdwBAS            3.07
 9 R_1fYHXsFEoxLPLcW            3   
10 R_3tnzcEGX1vgkQQk            3   
# ℹ 113 more rows
most_creative_participants <- iriss_manual_ratings_metrics %>% 
  filter(submitter_id %in% c("R_626PnHWHD8OebPr",
                             "R_665Usq3Hx2qiS2y",
                             "R_1C4pbfUTsEaLjCV",
                             "R_5ebqE9oR2LkKvO9",
                             "R_7wcOe7LsUrh7hzH",
                             "R_5z92Djt5TP9CJBb",
                             "R_7nZ3ULt8wE34mW6",
                             "R_6dB0PTZKZgdwBAS"))
# Factor analysis of intrinsic motivation items

intrinsic_motivation <- ' f =~ intrinsic_motivation_1 + intrinsic_motivation_2 + intrinsic_motivation_3 + intrinsic_motivation_4' 

onefac4items <- cfa(intrinsic_motivation, data=compliant_conversations_process)
summary(onefac4items, fit.measures=TRUE, standardized=TRUE)
lavaan 0.6-20 ended normally after 21 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         8

  Number of observations                           337

Model Test User Model:
                                                      
  Test statistic                                93.492
  Degrees of freedom                                 2
  P-value (Chi-square)                           0.000

Model Test Baseline Model:

  Test statistic                              1167.477
  Degrees of freedom                                 6
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.921
  Tucker-Lewis Index (TLI)                       0.764

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -1892.863
  Loglikelihood unrestricted model (H1)      -1846.117
                                                      
  Akaike (AIC)                                3801.727
  Bayesian (BIC)                              3832.288
  Sample-size adjusted Bayesian (SABIC)       3806.910

Root Mean Square Error of Approximation:

  RMSEA                                          0.368
  90 Percent confidence interval - lower         0.307
  90 Percent confidence interval - upper         0.434
  P-value H_0: RMSEA <= 0.050                    0.000
  P-value H_0: RMSEA >= 0.080                    1.000

Standardized Root Mean Square Residual:

  SRMR                                           0.035

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  f =~                                                                  
    intrnsc_mtvt_1    1.000                               1.178    0.868
    intrnsc_mtvt_2    1.164    0.048   24.339    0.000    1.372    0.928
    intrnsc_mtvt_3    1.259    0.055   22.911    0.000    1.483    0.897
    intrnsc_mtvt_4    0.929    0.052   17.904    0.000    1.094    0.782

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .intrnsc_mtvt_1    0.452    0.045   10.133    0.000    0.452    0.246
   .intrnsc_mtvt_2    0.301    0.042    7.178    0.000    0.301    0.138
   .intrnsc_mtvt_3    0.536    0.059    9.066    0.000    0.536    0.196
   .intrnsc_mtvt_4    0.761    0.066   11.573    0.000    0.761    0.388
    f                 1.388    0.140    9.899    0.000    1.000    1.000

CFI indicates good fit (.92)

Intrinsic motivation during task

# # make a composite intrinsic motivation item
# evaluations <- evaluations %>% 
#   group_by(submitter_id) %>% 
#   mutate(intrinsic_motivation, type="mean_sd")
# 
# # Plot intrinsic motivation
# intrinsic_motivation_p <- evaluations %>% 
#   distinct(submitter_id, condition, intrinsic_motivation_mean) %>%
#   ggplot(aes(x=intrinsic_motivation_mean, y=fct_relevel(condition, "low-agency", "high-agency", "Control"), fill=condition)) +
#   geom_density_ridges(
#     quantile_lines=TRUE, 
#     quantile_fun=mean) +
#   scale_fill_manual(
#     values = c(
#       "Control" = "#619CFF",
#       "low-agency" = "#F8766D",
#       "high-agency" = "#00BA38"
#     )) +
#   labs(x="Intrinsic motivation",
#        y="Condition") +
#   scale_x_continuous(breaks=seq(1, 7, by=1)) +
#   coord_cartesian(xlim = c(1,7)) +
#   theme_apa() +
#   theme(legend.position="none")
# 
# ggsave("figures/intrinsic_motivation.png", plot=intrinsic_motivation_p)
# # Statistical test
# 
# compliant_submitters <- evaluations %>% 
#   distinct(submitter_id, condition, conv_with_ai_mean, consideration_aid_mean, intrinsic_motivation_mean)
# 
# intrinsic_motivation_model <- stats::aov(intrinsic_motivation_mean ~ condition, data=compliant_submitters)
# summary(intrinsic_motivation_model)
# 
# emmeans(intrinsic_motivation_model, pairwise ~ condition, adjust = "bonferroni")
# # Comparing participants in the high- and low-agency conditions on intrinsic motivation, consideration AI, and conversation with AI
# 
# 
# 
# t.test(compliant_submitters$conv_with_ai_mean[compliant_submitters$condition == "high-agency"],
#        compliant_submitters$conv_with_ai_mean[compliant_submitters$condition == "low-agency"])
# 
# t.test(compliant_submitters$consideration_aid_mean[compliant_submitters$condition == "high-agency"],
#        compliant_submitters$consideration_aid_mean[compliant_submitters$condition == "low-agency"])
# Does the difference in self-perceived vs. other-perceived creativity vary by condition?

# perception_diffs <- self_other_perceptions %>% 
#   mutate(creativity_diff = self_creativity-creativity,
#          originality_diff = self_originality-originality,
#          usefulness_diff = self_usefulness-usefulness)
# 
# 
# perception_diffs %>% 
#   group_by(condition) %>% 
#   get_summary_stats(c(creativity_diff, originality_diff, usefulness_diff), type ="mean_sd") %>% 
#   ggplot(aes(x=condition,y=mean)) +
#   geom_point() +
#   facet_wrap(~variable)
# 
# perception_summary <- perception_diffs %>% 
#   group_by(condition) %>% 
#   get_summary_stats(c(creativity_diff, originality_diff, usefulness_diff), 
#                     type = "mean_sd") %>% 
#   mutate(sem = sd / sqrt(n))  # calculate SEM
# 
# perception_summary %>% 
#   ggplot(aes(x = condition, y = mean)) +
#   geom_point() +
#   geom_errorbar(aes(ymin = mean - sem, ymax = mean + sem), width = 0.1) +
#   facet_wrap(~variable) +
#   ylab("Mean ± SEM")
# 
# # Statistical analysis
# 
# perception_diff_participants <- perception_diffs %>% 
#   group_by(submitter_id, condition) %>% 
#   summarize(creativity_diff = mean(creativity_diff),
#             originality_diff = mean(originality_diff),
#             usefulness_diff = mean(usefulness_diff))
# 
# creativity_diff_anova <- aov(creativity_diff ~ condition,
#   data = perception_diff_participants
# )
# summary(creativity_diff_anova)
# 
# originality_diff_anova <- aov(originality_diff ~ condition, data=perception_diff_participants)
# summary(originality_diff_anova)
# 
# usefulness_diff_anova <- aov(usefulness_diff ~ condition, data=perception_diff_participants)
# summary(usefulness_diff_anova)

# # Predict difference score from condition, submitter ID, and object
# lmer(creativity_diff ~ condition + (1|submitter_id) + (1|object), data=perception_diffs) 

Timing

duration <- iriss %>% 
  select(submitter_id, condition, contains("Duration")) %>% 
  mutate(across(c(3:6), as.numeric)) %>% 
  rename(duration_sec = "Duration (in seconds)") %>% 
  mutate(duration_min = duration_sec/60) %>% 
  relocate(duration_min, .after=duration_sec)

duration %>% 
  get_summary_stats(duration_min, type = "five_number")
# A tibble: 1 × 7
  variable         n   min    max    q1 median    q3
  <fct>        <dbl> <dbl>  <dbl> <dbl>  <dbl> <dbl>
1 duration_min   150  5.73 10003.  16.9   21.9  31.3
# Timing variables refer to the amount of time it took the participant to read their condition's instructions
timing <- iriss %>% 
  select(submitter_id, condition, contains("Page Submit")) %>% 
  mutate(across(c(3:5), as.numeric)) %>% # Make all timer columns numeric
  pivot_longer(cols=starts_with("timer"), names_to = "trial", values_to = "duration") 
# Perceptions of uniqueness in the evaluated, compliant sample
uniqueness <- iriss_self_perceptions %>% 
  right_join(compliant_conversations, by=c("submitter_id", "object")) %>% 
  mutate(unique=as.numeric(unique))

unique_lm <- lm(unique~condition.x, data=uniqueness)
summary(unique_lm)

Call:
lm(formula = unique ~ condition.x, data = uniqueness)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.4861 -0.4861 -0.2963  0.6630  1.7037 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)             2.33696    0.08875  26.332   <2e-16 ***
condition.xHigh-Agency -0.04066    0.14593  -0.279    0.781    
condition.xLow-Agency   0.14915    0.13394   1.114    0.267    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.8513 on 215 degrees of freedom
  (119 observations deleted due to missingness)
Multiple R-squared:  0.00863,   Adjusted R-squared:  -0.0005921 
F-statistic: 0.9358 on 2 and 215 DF,  p-value: 0.3939
intent <- read_csv("data/iriss_intent_classification.csv") %>% 
  rename(submitter_id="participant_id") 
Rows: 207 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): participant_id, condition, last_user_message, reasoning
dbl (5): last_turn, asking, doing, expressing, conversation_length

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
intent %>% 
  group_by(condition) %>% 
  summarize(mean_doing = mean(doing),
            sd_doing = sd(doing))
# A tibble: 2 × 3
  condition mean_doing sd_doing
  <chr>          <dbl>    <dbl>
1 Passenger      0.412    0.494
2 Pilot          0.387    0.490

Direct Asks

direct_asks <- read_csv("data/direct_asks.csv") %>% 
  rename(submitter_id = "ResponseId") %>% 
  mutate(Condition=case_when(Condition=="passenger"~"Low-Agency",
                             Condition=="pilot"~"High-Agency")) %>% 
  rename(condition="Condition")
Rows: 76 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): ResponseId, Condition
dbl (6): total_count, direct_ask_count, constraining_count, backgrounding_co...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Direct ask plot
direct_asks_p <- direct_asks %>% 
  mutate(condition = case_when(
    condition == "High-Agency" ~ "High-Agency<br><i>N</i> = 36",
    condition == "Low-Agency"  ~ "Low-Agency<br><i>N</i> = 40",
    TRUE ~ condition
  )) %>% 
  group_by(condition) %>% 
  get_summary_stats(direct_ask_proportion, type="mean_se") %>% 
  ggplot(aes(x=condition, y=mean, color=condition)) +
  geom_point(size=4) +
  geom_errorbar(aes(ymin=mean-se, ymax=mean+se), width=0) +
  scale_color_manual(
    values = c(
      "Low-Agency<br><i>N</i> = 40" = "#F8766D",
      "High-Agency<br><i>N</i> = 36" = "#00BA38"
    )) +
  ggsignif::geom_signif(
    comparisons = list(c("High-Agency<br><i>N</i> = 36", "Low-Agency<br><i>N</i> = 40")), 
    annotations = "***",
    y_position = 0.95,
    tip_length = 0.01,
    textsize = 6,
    color="black"
  ) +
  ylim(c(0,1)) +
  labs(x="",
       y="Proportion of direct asks") +
  theme_apa() +
  theme(legend.position="none",
        axis.text.x = element_markdown())

direct_asks_p

ggsave("figures/
       direct_asks.png", plot=direct_asks_p)
Saving 7 x 5 in image
t.test(data=direct_asks, direct_ask_proportion~condition, var.equal = TRUE)

    Two Sample t-test

data:  direct_ask_proportion by condition
t = -3.6733, df = 74, p-value = 0.00045
alternative hypothesis: true difference in means between group High-Agency and group Low-Agency is not equal to 0
95 percent confidence interval:
 -0.5319985 -0.1578163
sample estimates:
mean in group High-Agency  mean in group Low-Agency 
                0.3842593                 0.7291667 
# Does the proportion of direct asks correlate with person-level homogeneity?

participant_compliant_similarity <- compliant_similarities %>%
  filter(condition %in% c("Low-Agency", "High-Agency")) %>% 
  group_by(submitter_id, condition) %>% 
  summarize(mean_similarity = mean(similarity))
`summarise()` has grouped output by 'submitter_id'. You can override using the
`.groups` argument.
direct_ask_similarity <- full_join(direct_asks, participant_compliant_similarity, by=c("submitter_id", "condition")) %>% 
    mutate(direct_ask_binary = if_else(direct_ask_proportion==0, 0, 1)) # If no direct asks, they get coded as Guided Ideation

ggplot(direct_ask_similarity, aes(x=direct_ask_proportion,y=mean_similarity)) +
  geom_point() +
  stat_smooth()
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

cor.test(direct_ask_similarity$direct_ask_proportion, direct_ask_similarity$mean_similarity, method="spearman")
Warning in cor.test.default(direct_ask_similarity$direct_ask_proportion, :
Cannot compute exact p-value with ties

    Spearman's rank correlation rho

data:  direct_ask_similarity$direct_ask_proportion and direct_ask_similarity$mean_similarity
S = 68626, p-value = 0.5956
alternative hypothesis: true rho is not equal to 0
sample estimates:
       rho 
0.06184404 
direct_ask_sim_model <- lm(mean_similarity ~ direct_ask_proportion*condition, data = direct_ask_similarity)
summary(direct_ask_sim_model)

Call:
lm(formula = mean_similarity ~ direct_ask_proportion * condition, 
    data = direct_ask_similarity)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.19326 -0.10037 -0.02794  0.06258  0.45113 

Coefficients:
                                          Estimate Std. Error t value Pr(>|t|)
(Intercept)                                0.25063    0.03026   8.283 4.54e-12
direct_ask_proportion                     -0.01202    0.05051  -0.238    0.813
conditionLow-Agency                        0.02967    0.05972   0.497    0.621
direct_ask_proportion:conditionLow-Agency  0.03012    0.08140   0.370    0.712
                                             
(Intercept)                               ***
direct_ask_proportion                        
conditionLow-Agency                          
direct_ask_proportion:conditionLow-Agency    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1393 on 72 degrees of freedom
Multiple R-squared:  0.03148,   Adjusted R-squared:  -0.008876 
F-statistic: 0.7801 on 3 and 72 DF,  p-value: 0.5089
ggplot(direct_ask_similarity, aes(x=direct_ask_binary, y=mean_similarity)) +
  geom_point() 

# Logistic regression predicting direct ask from condition
direct_ask.log <- glm(direct_ask_binary ~ condition, family = "binomial", data=direct_ask_similarity)
summary(direct_ask.log)

Call:
glm(formula = direct_ask_binary ~ condition, family = "binomial", 
    data = direct_ask_similarity)

Coefficients:
                    Estimate Std. Error z value Pr(>|z|)    
(Intercept)          -0.2231     0.3354  -0.665 0.505868    
conditionLow-Agency   2.4204     0.6247   3.875 0.000107 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 94.796  on 75  degrees of freedom
Residual deviance: 75.468  on 74  degrees of freedom
AIC: 79.468

Number of Fisher Scoring iterations: 4
exp(coef(direct_ask.log))
        (Intercept) conditionLow-Agency 
               0.80               11.25 
# Does direct ask binary predict mean similarity?

direct_ask.lm <- lm(mean_similarity ~ direct_ask_binary*condition, data=direct_ask_similarity)
summary(direct_ask.lm)

Call:
lm(formula = mean_similarity ~ direct_ask_binary * condition, 
    data = direct_ask_similarity)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.19925 -0.10360 -0.02861  0.06717  0.43308 

Coefficients:
                                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)                            0.25679    0.03063   8.383 2.95e-12 ***
direct_ask_binary                     -0.02426    0.04595  -0.528    0.599    
conditionLow-Agency                   -0.06132    0.07503  -0.817    0.416    
direct_ask_binary:conditionLow-Agency  0.13318    0.08558   1.556    0.124    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.137 on 72 degrees of freedom
Multiple R-squared:  0.06288,   Adjusted R-squared:  0.02384 
F-statistic:  1.61 on 3 and 72 DF,  p-value: 0.1944
biserial.cor(direct_ask_similarity$mean_similarity, direct_ask_similarity$direct_ask_binary)
[1] -0.1205152
cor.test(direct_ask_similarity$mean_similarity, direct_ask_similarity$direct_ask_binary)

    Pearson's product-moment correlation

data:  direct_ask_similarity$mean_similarity and direct_ask_similarity$direct_ask_binary
t = 1.0443, df = 74, p-value = 0.2997
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.1078712  0.3368191
sample estimates:
      cor 
0.1205152 
weightedCorr(direct_ask_similarity$mean_similarity, direct_ask_similarity$direct_ask_binary, method="Pearson", weights=direct_ask_similarity$total_count)
[1] 0.1761798

Post-hoc power analysis