Main Experiment

Load Processed Data

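# packages used throughout: dplyr (pipes and verbs), stringr (str_* functions), tidyr (pivot_wider)
library(dplyr)
library(stringr)
library(tidyr)

# find the most recent file (by modification time) in `directory` whose name matches `pattern`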
find_most_recent <- function(directory, pattern){
  files <-  list.files(directory, pattern = pattern, full.names = TRUE) 
  filename <- files[which.max(file.info(files)$mtime)]
  return(filename)
}

df_post_file <- find_most_recent("processed_survey_data", "df_post_output")
df_pre_file <- find_most_recent("processed_survey_data", "df_pre_output")

valid_user_file <- find_most_recent("processed_survey_data", "^valid_user")
invalid_user_file <- find_most_recent("processed_survey_data", "^invalid_user")
trace_file <- find_most_recent("processed_survey_data", "trace")

user <- find_most_recent("processed_survey_data", "^user")
dialog <- find_most_recent("processed_survey_data", "^dialog")
comment <- find_most_recent("processed_survey_data", "^comment_")
api <- find_most_recent("processed_survey_data", "^api")
chat <- find_most_recent("processed_survey_data", "^chat")
video <- find_most_recent("processed_survey_data", "^video")
commentchange <- find_most_recent("processed_survey_data", "^commentchange")
action <- find_most_recent("processed_survey_data", "^action")

# print the file names
print(paste0("Loaded ", c("Pre", "Post", "Valid User", "Invalid User", "Trace"), 
             " File: ", 
             c(df_pre_file, df_post_file, valid_user_file, invalid_user_file, trace_file)))

## [1] "Loaded Pre File: processed_survey_data/df_pre_output_2024-11-20.csv"        
## [2] "Loaded Post File: processed_survey_data/df_post_output_2024-11-20.csv"      
## [3] "Loaded Valid User File: processed_survey_data/valid_user_2024-11-20.rds"    
## [4] "Loaded Invalid User File: processed_survey_data/invalid_user_2024-11-20.rds"
## [5] "Loaded Trace File: processed_survey_data/trace_2024-11-20.rds"
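If no file matches the pattern, the helper above returns a zero-length character vector, and the problem only surfaces later as a read error. Below is a minimal sketch of a stricter variant that stops early instead. The name `find_most_recent_strict` and the throwaway demo directory are hypothetical and not part of the original script.

```r
# Hypothetical stricter variant of find_most_recent(): fail early when nothing matches
find_most_recent_strict <- function(directory, pattern) {
  files <- list.files(directory, pattern = pattern, full.names = TRUE)
  if (length(files) == 0) {
    stop("No file matching '", pattern, "' found in '", directory, "'")
  }
  # file.info()$mtime is the last-modification time; which.max() picks the newest file
  files[which.max(file.info(files)$mtime)]
}

# Usage on a throwaway directory (made-up file names)
demo_dir <- file.path(tempdir(), "find_most_recent_demo")
dir.create(demo_dir, showWarnings = FALSE)
old_file <- file.path(demo_dir, "trace_2024-11-19.rds")
new_file <- file.path(demo_dir, "trace_2024-11-20.rds")
saveRDS(1, old_file)
saveRDS(2, new_file)
# Set modification times explicitly so the example does not depend on write order
Sys.setFileTime(old_file, as.POSIXct("2024-11-19 12:00:00", tz = "UTC"))
Sys.setFileTime(new_file, as.POSIXct("2024-11-20 12:00:00", tz = "UTC"))

newest <- find_most_recent_strict(demo_dir, "^trace")
print(basename(newest))
```

Selection is by modification time, not by the date in the file name, so a re-saved older export would win over a newer-dated one.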

df_post_output <- read.csv(df_post_file)
df_pre_output <- read.csv(df_pre_file)

valid_user <- readRDS(valid_user_file)
invalid_user <- readRDS(invalid_user_file)
trace <- readRDS(trace_file)

user <- read.csv(user)
dialog <- read.csv(dialog)
comment <- read.csv(comment)
api <- read.csv(api)
chat <- read.csv(chat)
video <- read.csv(video)
commentchange <- read.csv(commentchange)
action <- read.csv(action)

df_post_output$username <- tolower(df_post_output$username)
df_post_output$username <- str_trim(df_post_output$username)

Additional Data Cleaning

pre_cov <- c("social_media", "social_media_use", "website_use", "gender", "age", "race", "edu", "polparty", "libcons", "income")
post_cov <- c("social_media_reply", "review_freq", "review_exp", "AI_Review", "AIreview_exp", "AIreview_difficulty", "AIreview_like", "AIreview_dislike", "AIreview_concern", "system_like", "system_dislike", "review_sys", "review_sys_detail", "review_sys_rec", "willingness_to_pay", "mech_speed", "mech_wording", "mech_formulate", "mech_popup", "mech_AIaversion", "mech_difficulty", "mech_trueop")

Step 1: Figure out why the number of post-survey responses is less than the number of valid users

Currently we have 1982 post-survey responses, but 1985 valid users.

valid_user[duplicated(valid_user$prolific_id),]

One user completed the post-survey twice and was therefore counted four times during the matching process. We remove the duplicate records and keep the post-survey that was completed first (the record with the smallest time_diff). [Question: do we want to remove this user?]

# keep the record with smallest time_diff
valid_user <- valid_user %>% unique() %>% arrange(time_diff) %>% distinct(prolific_id, .keep_all = TRUE) %>% arrange(User.Id)
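To make the de-duplication rule concrete, here is a small sketch on made-up data (the `toy_` objects are illustrative, not the real `valid_user` table). Sorting by `time_diff` first means that `distinct(..., .keep_all = TRUE)` keeps the row with the smallest `time_diff` for each `prolific_id`.

```r
library(dplyr)

# Toy data: user "b" was matched to two post-survey records
toy_valid_user <- data.frame(
  prolific_id = c("a", "b", "b", "c"),
  User.Id     = c(1, 2, 2, 3),
  time_diff   = c(5, 40, 12, 7)
)

toy_dedup <- toy_valid_user %>%
  arrange(time_diff) %>%                       # smallest time_diff first
  distinct(prolific_id, .keep_all = TRUE) %>%  # keeps the first row per prolific_id
  arrange(User.Id)

print(toy_dedup)
```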

Step 2: Figure out why the number of pre-survey responses is larger than the number of valid users

Currently we have 2049 pre-survey responses, but only 1982 valid users.

The pre-survey ends with a link to the post-survey, so not everyone clicked the Next button to submit the pre-survey. For these incomplete submissions, the treatment assignment was not reliably recorded. This is a problem for users who submitted more than one pre-survey, as they could have been assigned to different treatments.

non_dup_user <- df_pre_output %>% group_by(prolific_id) %>% filter(n() == 1) %>% pull(prolific_id)
dup_user <- df_pre_output %>% group_by(prolific_id) %>% filter(n() > 1) %>% pull(prolific_id)

# A duplicate can also arise when the same response is recorded in both the "InProgress" and the "Complete" sections of the data collection. We count such users as having submitted only one pre-survey (only 1 such case after checking).
df_pre_output <- df_pre_output %>% filter(ResponseId != "FS_3HJxJARkcByR1fj")
non_dup_user <- c(non_dup_user, "6639bc448033a441fd301e89")
dup_user <- setdiff(dup_user, "6639bc448033a441fd301e89")

print(paste0("Number of users who submitted only one pre-survey: ", length(non_dup_user)))

## [1] "Number of users who submitted only one pre-survey: 1918"

print(paste0("Number of users who submitted more than one pre-survey: ", length(dup_user)))

## [1] "Number of users who submitted more than one pre-survey: 64"
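The `group_by() %>% filter() %>% pull()` pattern returns one ID per row, so an ID with several submissions appears several times until `setdiff()` (which returns unique values) collapses it. As a cross-check, the split can also be computed from a per-user tally. This is a sketch on made-up data, and the `toy_` names are illustrative.

```r
library(dplyr)

toy_pre <- data.frame(
  prolific_id = c("a", "b", "b", "c", "d", "d", "d"),
  ResponseId  = paste0("R_", 1:7)
)

# One row per user with the number of pre-survey submissions
submission_counts <- toy_pre %>% count(prolific_id, name = "n_submissions")

toy_non_dup <- submission_counts %>% filter(n_submissions == 1) %>% pull(prolific_id)
toy_dup     <- submission_counts %>% filter(n_submissions > 1) %>% pull(prolific_id)

print(toy_non_dup)
print(toy_dup)
```

Because the tally has one row per user, `length()` of either vector counts users rather than responses.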

–

Find duplicated users who gave the same response for the demographic questions

# find the duplicated user who gave the same response for the demographic questions
dup_same_user <- df_pre_output %>% 
  filter(prolific_id %in% dup_user) %>% 
  dplyr::select(prolific_id, all_of(pre_cov)) %>% 
  distinct() %>% 
  group_by(prolific_id) %>% 
  filter(n() == 1) %>% 
  pull(prolific_id)

dup_same_user <- c(dup_same_user, "66fb023e270e01ac27ce2ce3", "636da58a0d76bbb9167dbf3e")
dup_nonsame_user <- setdiff(dup_user, dup_same_user)
print(paste0("Number of duplicated users who gave the same response for the demographic questions: ", length(dup_same_user)))

## [1] "Number of duplicated users who gave the same response for the demographic questions: 35"

print(paste0("Number of duplicated users who gave different response for the demographic questions: ", length(dup_nonsame_user)))

## [1] "Number of duplicated users who gave different response for the demographic questions: 29"
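The same-response check works by collapsing each user's submissions to distinct covariate rows: a user left with exactly one distinct row answered every covariate identically each time. A sketch on made-up data follows (two covariates only; `toy_cov` stands in for `pre_cov`).

```r
library(dplyr)

toy_pre <- data.frame(
  prolific_id = c("b", "b", "d", "d"),
  gender      = c("Female", "Female", "Male", "Male"),
  age         = c(30, 30, 41, 42)
)
toy_cov <- c("gender", "age")

toy_same <- toy_pre %>%
  dplyr::select(prolific_id, all_of(toy_cov)) %>%
  distinct() %>%               # identical submissions collapse to one row
  group_by(prolific_id) %>%
  filter(n() == 1) %>%         # one row left means all answers agreed
  pull(prolific_id)

print(toy_same)
```

Here user "d" reported two different ages and is therefore not in `toy_same`.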

For now, we take the conservative approach and use only the 1918 users who submitted exactly one pre-survey.

df_pre_final <- df_pre_output %>% filter(prolific_id %in% non_dup_user) 
df_post_final <- df_post_output %>% filter(prolific_id %in% non_dup_user)

Step 3: Merge Pre- and Post- surveys

df_pre_final %>% dplyr::select(prolific_id, all_of(pre_cov), mode) -> df_pre_final
df_post_final %>% dplyr::select(prolific_id, username, all_of(post_cov), mode) -> df_post_final

df_final <- merge(df_pre_final, df_post_final, by = "prolific_id")

# Robustness check if the treatment groups are the same
df_final$mode.x <- ifelse(is.na(df_final$mode.x), df_final$mode.y, df_final$mode.x)
print(paste0("Number of responses that have inconsistent treatment groups in Pre- and Post- survey: ", sum(df_final$mode.x != df_final$mode.y)))

## [1] "Number of responses that have inconsistent treatment groups in Pre- and Post- survey: 0"

df_final$mode <- df_final$mode.x
df_final <- df_final %>% dplyr::select(-mode.x, -mode.y)

df_final <- merge(df_final, user %>% dplyr::select(User.Id, Mode, Username), by.x = "username", by.y = "Username", all.x = T)
df_final$Mode <- str_replace(df_final$Mode, "Mode ", "") %>% as.numeric() # extract integer
print(paste0("Number of responses that have inconsistent treatment groups in Pre-survey and Video Website data: ", sum(df_final$mode != df_final$Mode)))

## [1] "Number of responses that have inconsistent treatment groups in Pre-survey and Video Website data: 9"

df_final <- df_final %>% filter(mode == Mode) %>% dplyr::select(-Mode) %>% rename(Treatment = mode)
print(paste0("Final number of valid responses: ", nrow(df_final)))

## [1] "Final number of valid responses: 1909"
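Both consistency checks in this step compare columns with `!=`, so a single missing value on either side would turn the whole `sum()` into `NA`. A minimal sketch on made-up vectors (the names are illustrative) counts mismatches and uncomparable rows separately:

```r
mode_pre  <- c(1, 2, NA, 4)
mode_post <- c(1, 3, 3, NA)

# Fill a missing pre-survey assignment from the post-survey, as in the robustness check
mode_filled <- ifelse(is.na(mode_pre), mode_post, mode_pre)

comparison <- mode_filled != mode_post        # FALSE, TRUE, FALSE, NA
n_inconsistent <- sum(comparison, na.rm = TRUE)
n_uncomparable <- sum(is.na(comparison))

print(c(inconsistent = n_inconsistent, uncomparable = n_uncomparable))
```

In the real data the printed counts were not `NA`, which suggests there were no such missing values; the sketch only shows how to guard against them.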

Step 4: Clean Video Website Data

A few users did not pass the cutoff time (with a 15-second tolerance) for at least 3 videos. We remove these users from the analysis.

action_final <- action %>% filter(User.Id %in% df_final$User.Id)
action_final <- merge(action_final, video %>% dplyr::select(Video.Id, cutoff_time), by = "Video.Id", all.x = T)
action_final$pass_cutoff <- ifelse(action_final$duration >= action_final$cutoff_time - 15, 1, 0)
not_cutoff_users <- action_final %>% group_by(User.Id) %>% summarise(num_videos = sum(pass_cutoff)) %>% filter(num_videos < 3) %>% pull(User.Id)
cutoff_users <- setdiff(df_final$User.Id, not_cutoff_users)
print(paste0("Number of users who passed the cutoff time for at least 3 videos: ", length(cutoff_users)))

## [1] "Number of users who passed the cutoff time for at least 3 videos: 1893"

print(paste0("Number of users who did not pass the cutoff time for at least 3 videos: ", length(not_cutoff_users)))

## [1] "Number of users who did not pass the cutoff time for at least 3 videos: 16"
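The cutoff rule can be illustrated on made-up data: a view passes if its duration is within 15 seconds of the video's cutoff time, and a user is kept only with at least three passing views. The `toy_` objects and the `tolerance` variable are illustrative names.

```r
library(dplyr)

toy_action <- data.frame(
  User.Id     = c(1, 1, 1, 2, 2, 2),
  Video.Id    = c(11, 12, 13, 11, 12, 13),
  duration    = c(100, 50, 200, 100, 30, 200),
  cutoff_time = c(110, 60, 180, 110, 60, 180)
)
tolerance <- 15  # seconds, mirroring the offset used above

toy_action <- toy_action %>%
  mutate(pass_cutoff = as.integer(duration >= cutoff_time - tolerance))

toy_not_cutoff <- toy_action %>%
  group_by(User.Id) %>%
  summarise(num_videos = sum(pass_cutoff)) %>%
  filter(num_videos < 3) %>%
  pull(User.Id)

print(toy_not_cutoff)
```

One thing the sketch makes visible: a user with no rows at all in the action table never appears in the grouped summary and so would not be flagged. Whether that can occur in the real data is not stated here.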

A few user/video combinations appear more than once, i.e. the user entered a video more than once and passed the cutoff time both times. We keep the first record.

# action_final_cutoff_user <- action_final %>% 
#   filter(User.Id %in% cutoff_users) %>% 
#   arrange(User.Id, Enter.Time) %>% 
#   filter(pass_cutoff == 1) %>% 
#   dplyr::select(-pass_cutoff)
# 
# action_final_noncutoff_user <- action_final %>% 
#   filter(User.Id %in% not_cutoff_users) %>% 
#   arrange(User.Id, Enter.Time) 

action_final <- action_final %>% 
  filter(User.Id %in% cutoff_users) %>% 
  arrange(User.Id, Enter.Time) %>% 
  filter(pass_cutoff == 1) %>% 
  dplyr::select(-pass_cutoff)

initial_num_row_action <- nrow(action_final)
# if User.Id and Video.Id combination exists more than once, keep the first one
action_final <- action_final %>% group_by(User.Id, Video.Id) %>% filter(row_number() == 1)

# order by Entry time and assign order
action_final <- action_final %>% arrange(User.Id, Enter.Time)
action_final <- action_final %>% group_by(User.Id) %>% mutate(order = row_number())

final_num_row_action <- nrow(action_final)
print(paste0("Number of user/video combination that has more than one record: ", initial_num_row_action - final_num_row_action))

## [1] "Number of user/video combination that has more than one record: 30"

action_final3 <- action_final %>% group_by(User.Id) %>% filter(row_number() <= 3)


print(paste0("Number of user/video combinations that passed the cutoff time (after filtering): ", nrow(action_final)))

## [1] "Number of user/video combinations that passed the cutoff time (after filtering): 5921"

print(paste0("Restricting to first three videos started and passed the cutoff time: ", nrow(action_final3), ", which should equal to 3 * ", length(cutoff_users)))

## [1] "Restricting to first three videos started and passed the cutoff time: 5679, which should equal to 3 * 1893"
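The two restrictions above (first record per user/video pair, then the first three videos per user) can be sketched on made-up data as follows. The `toy_` names are illustrative, and `Enter.Time` is a plain number here rather than a timestamp.

```r
library(dplyr)

toy_views <- data.frame(
  User.Id    = c(1, 1, 1, 1, 1),
  Video.Id   = c(11, 11, 12, 13, 14),
  Enter.Time = c(10, 20, 30, 40, 50)
)

toy_first <- toy_views %>%
  arrange(User.Id, Enter.Time) %>%
  group_by(User.Id, Video.Id) %>%
  slice_head(n = 1) %>%            # first record per user/video pair
  ungroup() %>%
  arrange(User.Id, Enter.Time) %>%
  group_by(User.Id) %>%
  mutate(order = row_number()) %>% # viewing order within user
  ungroup()

toy_first3 <- toy_first %>% filter(order <= 3)

print(toy_first3)
```

A quick sanity check in the same spirit as the printed message is that every user contributes exactly three rows to the restricted table.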

# action_final %>% 
#   filter(User.Id %in% not_cutoff_users) %>% 
#   arrange(User.Id, desc(pass_cutoff), desc(duration))

Some user/video combinations have more than one comment, i.e. the user commented more than once on the same video. For now, we concatenate the comments into one.

comment_final <- comment %>% filter(User.Id %in% cutoff_users)
more_than_one_comment <- comment_final %>% group_by(User.Id, Video.Id) %>% filter(n() > 1)
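# count distinct user/video pairs, which holds even if a pair has more than two comments
print(paste0("Number of user/video combinations that have more than one comment: ", more_than_one_comment %>% distinct(User.Id, Video.Id) %>% nrow()))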

## [1] "Number of user/video combinations that have more than one comment: 19"

# if multiple comments for the same user and video, concat
comment_final <- comment_final %>% group_by(User.Id, Video.Id) %>% summarise(Content = paste(Content, collapse = " || "))

## `summarise()` has grouped output by 'User.Id'. You can override using the
## `.groups` argument.
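A small sketch of the comment handling on made-up data (the `toy_` names are illustrative). It counts affected combinations from a per-combination tally, which holds up however many comments a user left on one video, and then concatenates them with the same `" || "` separator.

```r
library(dplyr)

toy_comment <- data.frame(
  User.Id  = c(1, 1, 1, 2),
  Video.Id = c(11, 11, 11, 12),
  Content  = c("first", "second", "third", "only")
)

# Number of user/video pairs with more than one comment
n_multi <- toy_comment %>%
  count(User.Id, Video.Id) %>%
  filter(n > 1) %>%
  nrow()

# One row per user/video pair; .groups = "drop" returns an ungrouped result
toy_concat <- toy_comment %>%
  group_by(User.Id, Video.Id) %>%
  summarise(Content = paste(Content, collapse = " || "), .groups = "drop")

print(n_multi)
print(toy_concat)
```

Setting `.groups` explicitly also silences the `summarise()` message shown above.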

videoweb_final <- merge(action_final, comment_final, by = c("User.Id", "Video.Id"), all.x = T)
videoweb_final3 <- merge(action_final3, comment_final, by = c("User.Id", "Video.Id"), all.x = T)

videoweb_final$hasComment <- ifelse(!is.na(videoweb_final$Content), 1, 0)
videoweb_final3$hasComment <- ifelse(!is.na(videoweb_final3$Content), 1, 0)

api_interact <- api %>% filter(Action == "SendAPI") %>% group_by(User.Id, Video.Id) %>% summarise(interactionAPI = paste(Content, collapse = "||"))

## `summarise()` has grouped output by 'User.Id'. You can override using the
## `.groups` argument.

chat_interact <- chat %>% filter(Action == "SendChat") %>% group_by(User.Id, Video.Id) %>% summarise(interactionChat = paste(Content, collapse = "||"))

## `summarise()` has grouped output by 'User.Id'. You can override using the
## `.groups` argument.

api_interact$interactionAPI <- api_interact$interactionAPI %>% str_replace_all("&quot;", "'")
chat_interact$interactionChat <- chat_interact$interactionChat %>% str_replace_all("&quot;", "'")
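# unescape backslash-escaped double quotes (\") into plain double quotes; fixed() matches the pattern literally
api_interact$interactionAPI <- api_interact$interactionAPI %>% str_replace_all(fixed('\\"'), '"')
chat_interact$interactionChat <- chat_interact$interactionChat %>% str_replace_all(fixed('\\"'), '"')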

videoweb_final <- merge(videoweb_final, api_interact, by = c("User.Id", "Video.Id"), all.x = T)
videoweb_final <- merge(videoweb_final, chat_interact, by = c("User.Id", "Video.Id"), all.x = T)
videoweb_final$interactionAPI <- ifelse(is.na(videoweb_final$interactionAPI), "", videoweb_final$interactionAPI)
videoweb_final$interactionChat <- ifelse(is.na(videoweb_final$interactionChat), "", videoweb_final$interactionChat)
# merge column into AI Interaction
videoweb_final$AIinteraction <- paste(videoweb_final$interactionAPI, videoweb_final$interactionChat, sep = "")
videoweb_final$AIinteraction <- ifelse(videoweb_final$AIinteraction == "", NA, videoweb_final$AIinteraction)
videoweb_final$AIinteraction <- videoweb_final$AIinteraction %>% str_replace_all("\\|\\|", " ")
videoweb_final <- videoweb_final %>% dplyr::select(-interactionAPI, -interactionChat)
# character count
videoweb_final$AIinteractionLength <- nchar(videoweb_final$AIinteraction)

videoweb_final3 <- merge(videoweb_final3, api_interact, by = c("User.Id", "Video.Id"), all.x = T)
videoweb_final3 <- merge(videoweb_final3, chat_interact, by = c("User.Id", "Video.Id"), all.x = T)
videoweb_final3$interactionAPI <- ifelse(is.na(videoweb_final3$interactionAPI), "", videoweb_final3$interactionAPI)
videoweb_final3$interactionChat <- ifelse(is.na(videoweb_final3$interactionChat), "", videoweb_final3$interactionChat)
# merge column into AI Interaction
videoweb_final3$AIinteraction <- paste(videoweb_final3$interactionAPI, videoweb_final3$interactionChat, sep = "")
videoweb_final3$AIinteraction <- ifelse(videoweb_final3$AIinteraction == "", NA, videoweb_final3$AIinteraction)
videoweb_final3$AIinteraction <- videoweb_final3$AIinteraction %>% str_replace_all("\\|\\|", " ")
videoweb_final3 <- videoweb_final3 %>% dplyr::select(-interactionAPI, -interactionChat)
# character count
videoweb_final3$AIinteractionLength <- nchar(videoweb_final3$AIinteraction)
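The two blocks above apply identical steps to `videoweb_final` and `videoweb_final3`. One way to keep them in sync is a small helper, sketched below in base R. The function name `attach_ai_interaction` and the toy inputs are hypothetical, and the logic is meant to mirror the steps above rather than replace them.

```r
# Hypothetical helper: merge API and chat interactions and derive AIinteraction columns
attach_ai_interaction <- function(videoweb, api_interact, chat_interact) {
  keys <- c("User.Id", "Video.Id")
  out <- merge(videoweb, api_interact, by = keys, all.x = TRUE)
  out <- merge(out, chat_interact, by = keys, all.x = TRUE)

  api_text  <- ifelse(is.na(out$interactionAPI), "", out$interactionAPI)
  chat_text <- ifelse(is.na(out$interactionChat), "", out$interactionChat)

  combined <- paste0(api_text, chat_text)
  # NA_character_ keeps the column character even if every row is empty
  combined <- ifelse(combined == "", NA_character_, combined)

  out$AIinteraction <- gsub("||", " ", combined, fixed = TRUE)
  out$AIinteractionLength <- nchar(out$AIinteraction)  # character count, NA when no interaction
  out$interactionAPI <- NULL
  out$interactionChat <- NULL
  out
}

# Toy usage
toy_videoweb <- data.frame(User.Id = c(1, 2), Video.Id = c(11, 11))
toy_api  <- data.frame(User.Id = 1, Video.Id = 11, interactionAPI = "hi||there")
toy_chat <- data.frame(User.Id = 3, Video.Id = 11, interactionChat = "unmatched")

toy_out <- attach_ai_interaction(toy_videoweb, toy_api, toy_chat)
print(toy_out)
```

With such a helper, both tables would be produced by one call each, e.g. `attach_ai_interaction(videoweb_final3, api_interact, chat_interact)`.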

Data Pre-processing

df_final$review_exp <- case_when(
  df_final$review_exp == "Extremely dislike" ~ 1,
  df_final$review_exp == "Dislike" ~ 2,
  df_final$review_exp == "Somewhat dislike" ~ 3,
  df_final$review_exp == "Neither like nor dislike" ~ 4,
  df_final$review_exp == "Somewhat like" ~ 5,
  df_final$review_exp == "Like" ~ 6,
  df_final$review_exp == "Extremely like" ~ 7
)
df_final$social_media_use <- factor(df_final$social_media_use, levels = c("1 hour or less", "1-3 hours", "3-5 hours", "5+ hours"))
df_final$website_use <- factor(df_final$website_use, levels = c("1 hour or less", "1-3 hours", "3-5 hours", "5+ hours"))
df_final$edu <- factor(df_final$edu, levels = c("Did not graduate from high school", "High school graduate (high school diploma or equivalent including GED)", "Some college, but no degree", "2-year college degree", "4-year college degree", "Postgraduate degree (MA, MBA, JD, PhD, etc.)"))
df_final$polparty <- factor(df_final$polparty, levels = c("Democrat", "Republican", "Independent", "Other Party"))
df_final$libcons <- factor(df_final$libcons, levels = c("Strong Conservative", "Moderate Conservative", "Moderate", "Moderate Liberal", "Strong Liberal"))
df_final$income <- factor(df_final$income, levels = c("Prefer not to say", "Less than $10,000", "$10,000-$49,999", "$50,000-$99,999", "$100,000-$149,999", "$150,000 or more"))
df_final$social_media_reply <- factor(df_final$social_media_reply, levels = c("Never", "Few times a year", "Few times a month", "Few times a week", "1-2 times per day", "More than 4 times per day"))
df_final$review_freq <- factor(df_final$review_freq, levels = c("Never", "Rarely (1 - 20% of the time)", "Occasionally (21 - 40% of the time)", "Sometimes (41 - 60% of the time)", "Often (61 - 80% of the time)", "Very often (81 - 100% of the time)"))

# social media
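# fixed() matches the parentheses literally; as a regex they form a group and the option text never matches
df_final$social_media_X <- ifelse(str_detect(df_final$social_media, fixed("X (formerly Twitter)")), 1, 0)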
df_final$social_media_FB <- ifelse(str_detect(df_final$social_media, "Facebook"), 1, 0)
df_final$social_media_IG <- ifelse(str_detect(df_final$social_media, "Instagram"), 1, 0)
df_final$social_media_LI <- ifelse(str_detect(df_final$social_media, "LinkedIn"), 1, 0)
df_final$social_media_SN <- ifelse(str_detect(df_final$social_media, "Snapchat"), 1, 0)
df_final$social_media_TK <- ifelse(str_detect(df_final$social_media, "TikTok"), 1, 0)
df_final$social_media_YT <- ifelse(str_detect(df_final$social_media, "YouTube"), 1, 0)
df_final$social_media_nonUser <- ifelse(str_detect(df_final$social_media, "not active"), 1, 0)
df_final$social_media_user <- ifelse((df_final$social_media_X == 1) | (df_final$social_media_FB == 1) | (df_final$social_media_IG == 1) | (df_final$social_media_LI == 1) | (df_final$social_media_SN == 1) | (df_final$social_media_TK == 1) , 1, 0)

# social media use time
df_final$social_media_use_numeric <- case_when(df_final$social_media_use == "1 hour or less" ~ 1,
                                               df_final$social_media_use == "1-3 hours" ~ 2,
                                               df_final$social_media_use == "3-5 hours" ~ 3,
                                               df_final$social_media_use == "5+ hours" ~ 4)
df_final$social_media_use_1 <- ifelse(df_final$social_media_use == "1 hour or less", 1, 0)
df_final$social_media_use_13 <- ifelse(df_final$social_media_use == "1-3 hours", 1, 0)
df_final$social_media_use_35 <- ifelse(df_final$social_media_use == "3-5 hours", 1, 0)
df_final$social_media_use_5 <- ifelse(df_final$social_media_use == "5+ hours", 1, 0)

# website use
df_final$website_use_numeric <- case_when(df_final$website_use == "1 hour or less" ~ 1,
                                          df_final$website_use == "1-3 hours" ~ 2,
                                          df_final$website_use == "3-5 hours" ~ 3,
                                          df_final$website_use == "5+ hours" ~ 4)
df_final$website_use_1 <- ifelse(df_final$website_use == "1 hour or less", 1, 0)
df_final$website_use_13 <- ifelse(df_final$website_use == "1-3 hours", 1, 0)
df_final$website_use_35 <- ifelse(df_final$website_use == "3-5 hours", 1, 0)
df_final$website_use_5 <- ifelse(df_final$website_use == "5+ hours", 1, 0)

# gender
df_final$genderFemale <- ifelse(df_final$gender == "Female", 1, 0)

# race
df_final$raceAsian <- ifelse(df_final$race == "Asian/Pacific Islander", 1, 0)
df_final$raceBlack <- ifelse(df_final$race == "Black or African American", 1, 0)
df_final$raceHispanic <- ifelse(df_final$race == "Latino or Hispanic", 1, 0)
df_final$raceWhite <- ifelse(df_final$race == "Caucasian/White", 1, 0)
df_final$raceOther <- ifelse(df_final$race %in% c("Asian/Pacific Islander", "Black or African American", "Latino or Hispanic", "Caucasian/White"), 0, 1)

# education
df_final$eduHighSchoolOrLess <- ifelse(df_final$edu %in% c("Did not graduate from high school", "High school graduate (high school diploma or equivalent including GED)"), 1, 0)
df_final$eduSomeCollege <- ifelse(df_final$edu == "Some college, but no degree", 1, 0)
df_final$eduBachelor <- ifelse(df_final$edu %in% c("2-year college degree", "4-year college degree"), 1, 0)
df_final$eduPostGrad <- ifelse(df_final$edu == "Postgraduate degree (MA, MBA, JD, PhD, etc.)", 1, 0)

# political party
df_final$polpartyDem <- ifelse(df_final$polparty == "Democrat", 1, 0)
df_final$polpartyRep <- ifelse(df_final$polparty == "Republican", 1, 0)
df_final$polpartyInd <- ifelse(df_final$polparty == "Independent", 1, 0)
df_final$polpartyOther <- ifelse(df_final$polparty %in% c("Democrat", "Republican"), 0, 1)

# political ideology
df_final$libcons_numeric <- case_when(df_final$libcons == "Strong Liberal" ~ 5,
          df_final$libcons == "Moderate Liberal" ~ 4,
          df_final$libcons == "Moderate" ~ 3,
          df_final$libcons == "Moderate Conservative" ~ 2,
          df_final$libcons == "Strong Conservative" ~ 1) 

# income
df_final$income_numeric <- case_when(df_final$income == "Less than $10,000" ~ 1,
                                     df_final$income == "$10,000-$49,999" ~ 2,
                                     df_final$income == "$50,000-$99,999" ~ 3,
                                     df_final$income == "$100,000-$149,999" ~ 4,
                                     df_final$income == "$150,000 or more" ~ 5,
                                     TRUE ~ 0)
df_final$income_flag <- ifelse(df_final$income_numeric == 0, 1, 0)

# social media reply
df_final$social_media_reply_numeric <- case_when(df_final$social_media_reply == "Never" ~ 1,
                                                 df_final$social_media_reply == "Few times a year" ~ 2,
                                                 df_final$social_media_reply == "Few times a month" ~ 3,
                                                 df_final$social_media_reply == "Few times a week" ~ 4,
                                                 df_final$social_media_reply == "1-2 times per day" ~ 5,
                                                 df_final$social_media_reply == "More than 4 times per day" ~ 6) 
df_final$social_media_reply_never <- ifelse(df_final$social_media_reply == "Never", 1, 0)
df_final$social_media_reply_fewayear <- ifelse(df_final$social_media_reply == "Few times a year", 1, 0)
df_final$social_media_reply_fewamonth <- ifelse(df_final$social_media_reply == "Few times a month", 1, 0)
df_final$social_media_reply_fewaweek <- ifelse(df_final$social_media_reply == "Few times a week", 1, 0)
df_final$social_media_reply_12times <- ifelse(df_final$social_media_reply == "1-2 times per day", 1, 0)
df_final$social_media_reply_4times <- ifelse(df_final$social_media_reply == "More than 4 times per day", 1, 0)

# review frequency                              
df_final$review_freq_numeric <- case_when(df_final$review_freq == "Never" ~ 1,
                                          df_final$review_freq == "Rarely (1 - 20% of the time)" ~ 2,
                                          df_final$review_freq == "Occasionally (21 - 40% of the time)" ~ 3,
                                          df_final$review_freq == "Sometimes (41 - 60% of the time)" ~ 4,
                                          df_final$review_freq == "Often (61 - 80% of the time)" ~ 5,
                                          df_final$review_freq == "Very often (81 - 100% of the time)" ~ 6)
df_final$review_freq_never <- ifelse(df_final$review_freq == "Never", 1, 0)
df_final$review_freq_rarely <- ifelse(df_final$review_freq == "Rarely (1 - 20% of the time)", 1, 0)
df_final$review_freq_occasionally <- ifelse(df_final$review_freq == "Occasionally (21 - 40% of the time)", 1, 0)
df_final$review_freq_sometimes <- ifelse(df_final$review_freq == "Sometimes (41 - 60% of the time)", 1, 0)
df_final$review_freq_often <- ifelse(df_final$review_freq == "Often (61 - 80% of the time)", 1, 0)
df_final$review_freq_veryoften <- ifelse(df_final$review_freq == "Very often (81 - 100% of the time)", 1, 0)


covariates_all <- c(pre_cov[!pre_cov %in% c("social_media", "race")], "social_media_reply", "review_freq", "social_media_X", "social_media_FB", "social_media_IG", "social_media_LI", "social_media_SN", "social_media_TK", "social_media_YT", "social_media_nonUser", "social_media_use_numeric", "social_media_use_1", "social_media_use_13", "social_media_use_35", "social_media_use_5", "website_use_numeric", "website_use_1", "website_use_13", "website_use_35", "website_use_5", "genderFemale", "raceAsian", "raceBlack", "raceHispanic", "raceWhite", "raceOther", "eduHighSchoolOrLess", "eduSomeCollege", "eduBachelor", "eduPostGrad", "polpartyDem", "polpartyRep", "polpartyInd", "polpartyOther", "libcons_numeric", "income_numeric", "income_flag", "social_media_reply_numeric", "social_media_reply_never", "social_media_reply_fewayear", "social_media_reply_fewamonth", "social_media_reply_fewaweek", "social_media_reply_12times", "social_media_reply_4times", "review_freq_numeric", "review_freq_never", "review_freq_rarely", "review_freq_occasionally", "review_freq_sometimes", "review_freq_often", "review_freq_veryoften")

covariates_simple <- c("age", "social_media_YT", "social_media_nonUser", "social_media_user", "social_media_use_numeric", "website_use_numeric", "genderFemale", "raceAsian", "raceBlack", "raceHispanic", "raceWhite", "raceOther", "eduHighSchoolOrLess", "eduSomeCollege", "eduBachelor", "eduPostGrad", "polpartyDem", "polpartyRep", "polpartyOther", "libcons_numeric", "income_numeric",  "social_media_reply_numeric",  "review_freq_numeric")

covariates_simple_fancy <- c("Age", "YouTube User", "Social Media: Non-User", "Social Media: User", "Social Media Usage (1 - 4 Scale)", "Online Usage (1 - 4 Scale)", "Female", "Race: Asian", "Race: Black", "Race: Hispanic", "Race: White", "Race: Other", "Education: High School or Less", "Education: Some College", "Education: Bachelor", "Education: Postgraduate", "Political Party: Democrat", "Political Party: Republican", "Political Party: Other", "Political Ideology (1 - 5 Scale; 5 Strong Liberal)", "Income (1 - 5 Scale)", "Social Media Reply Frequency (1 - 6 Scale)", "Review Frequency (1 - 6 Scale)")

# makeCodebook(df_final[, covariates_all], replace = TRUE, 
#              reportTitle = 'Covariates Summary', # change this with final version
#              file = 'processed_final_data/codebook_covariates.Rmd') # change this with final version
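Several of the numeric recodes above follow the level order of the corresponding factor. In those cases `as.integer()` on the factor gives the same 1-based codes, as this sketch on made-up responses shows (the `toy_` names are illustrative).

```r
use_levels <- c("1 hour or less", "1-3 hours", "3-5 hours", "5+ hours")

toy_use <- factor(c("3-5 hours", "1 hour or less", "5+ hours", NA), levels = use_levels)

# Integer codes follow the order of `levels`; missing responses stay NA
toy_use_numeric <- as.integer(toy_use)
print(toy_use_numeric)
```

This shortcut only applies when the intended scale matches the level order. It would not reproduce `income_numeric`, where "Prefer not to say" is the first level but is coded 0, nor `libcons_numeric` if the levels were ordered differently from the scale.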

Final Dataframes Creation

# note that users 235, 1856, 1905, and 2405 would not have made the cutoff without the 15-second offset
manually_validated_users <- c(249, 262, 269, 270, 271, 286, 288, 292, 303, 327, 330, 336, 378, 445, 466, 473, 979, 983, 991, 1109, 1178, 1223, 1629, 1682, 1788, 1836, 2023, 2230)
print(paste0("Number of manually validated users: ", length(manually_validated_users)))

## [1] "Number of manually validated users: 28"

Create panel dataframe (df_panel):
- a binary indicator for whether the video was among the first three videos that passed the cutoff
- a binary indicator for whether the watching time was significantly longer than the cutoff time
- a binary indicator for whether the user was manually validated
- currently only includes videos that pass the cutoff threshold

df_panel <- merge(videoweb_final, df_final, by = "User.Id", all.x = T)
# if the Video.Id and User.Id are in videoweb_final3, then it is the first three videos that passed the cutoff time
df_panel <- df_panel %>% mutate(firstThree = if_else(paste(User.Id, Video.Id) %in% paste(videoweb_final3$User.Id, videoweb_final3$Video.Id), 1, 0))
df_panel$outlier_cutoff <- ifelse(df_panel$duration >= 1200, 1, 0)
df_panel$manual_validated <- ifelse(df_panel$User.Id %in% manually_validated_users, 1, 0)
df_panel$Treatment <- case_when(df_panel$Treatment == 1 ~ "Pure Control",
                                df_panel$Treatment == 2 ~ "Hint Control",
                                df_panel$Treatment == 3 ~ "One-Click Generate",
                                df_panel$Treatment == 4 ~ "Chat Generate")
df_panel$Treatment <- factor(df_panel$Treatment, levels = c("Pure Control", "Hint Control", "One-Click Generate", "Chat Generate"))

print(paste0("Currently setting the outlier watching time to be 1200 seconds (i.e. 20 minutes). Number of rows that are considered outliers: ", sum(df_panel$outlier_cutoff)))

## [1] "Currently setting the outlier watching time to be 1200 seconds (i.e. 20 minutes). Number of rows that are considered outliers: 435"

Create user-level dataframes (df_wide_all and df_wide):
- two versions: one with all videos that passed the cutoff, one with the first three videos that passed the cutoff
- a binary indicator for whether the user was manually validated
- dummy variables for the videos that are included in the number-of-comments estimate

# generate dummy variable for Video.Id after aggregate on user.id level
df_wide_all <- videoweb_final %>%
  # Sum `hasComment` for each user
  group_by(User.Id) %>%
  summarise(
    num_comment = sum(hasComment, na.rm = T),
    .groups = "drop"
  ) %>%
  # Add dummy variables for `video_id`
  left_join(
    videoweb_final %>%
      mutate(dummy = 1) %>%
      pivot_wider(
        id_cols = User.Id,
        names_from = Video.Id,
        values_from = dummy,
        values_fill = 0
      ),
    by = "User.Id"
  ) 

# check rowSum
df_wide_all$numVideosWatched <- rowSums(df_wide_all[, 3:ncol(df_wide_all)], na.rm = T)
print("Distribution of how many videos watched pass cutoff")

## [1] "Distribution of how many videos watched pass cutoff"

table(df_wide_all$numVideosWatched)

## 
##    3    4    5    6    7    8    9   10   11   12 
## 1768   92   15    1    4    1    2    2    2    6
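As an independent cross-check of the video dummies, a user-by-video contingency table gives the same 0/1 matrix, provided each user/video pair appears at most once (which the de-duplication above is meant to guarantee). A base R sketch on made-up data, with illustrative `toy_` names:

```r
toy_videoweb <- data.frame(
  User.Id  = c(1, 1, 1, 2, 2, 2, 2),
  Video.Id = c(11, 12, 13, 11, 12, 13, 14)
)

# Rows are users, columns are videos; cells count occurrences of each pair
toy_dummy <- as.data.frame.matrix(table(toy_videoweb$User.Id, toy_videoweb$Video.Id))
names(toy_dummy) <- paste0("video", names(toy_dummy))

# Number of videos per user, computed before the new column is attached
toy_dummy$numVideosWatched <- rowSums(toy_dummy)

print(toy_dummy)
```

If any cell were larger than 1, that would indicate a duplicated user/video pair rather than a valid dummy.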

df_wide_all <- merge(df_wide_all, df_final, by = "User.Id")
df_wide_all$manual_validated <- ifelse(df_wide_all$User.Id %in% manually_validated_users, 1, 0)
df_wide_all <- df_wide_all %>% 
  rename(video14 = `14`, video11 = `11`, video17 = `17`, 
         video23 = `23`, video18 = `18`, video19 = `19`, 
         video13 = `13`, video16 = `16`, video21 = `21`, 
         video15 = `15`, video22 = `22`, video20 = `20`)
df_wide_all$Treatment <- case_when(df_wide_all$Treatment == 1 ~ "Pure Control",
                                   df_wide_all$Treatment == 2 ~ "Hint Control",
                                   df_wide_all$Treatment == 3 ~ "One-Click Generate",
                                   df_wide_all$Treatment == 4 ~ "Chat Generate")
df_wide_all$Treatment <- factor(df_wide_all$Treatment, levels = c("Pure Control", "Hint Control", "One-Click Generate", "Chat Generate"))
head(df_wide_all)

df_wide <- videoweb_final3 %>%
  # Sum `hasComment` for each user
  group_by(User.Id) %>%
  summarise(
    num_comment = sum(hasComment, na.rm = T),
    .groups = "drop"
  ) %>%
  # Add dummy variables for `video_id`
  left_join(
    videoweb_final3 %>%
      mutate(dummy = 1) %>%
      pivot_wider(
        id_cols = User.Id,
        names_from = Video.Id,
        values_from = dummy,
        values_fill = 0
      ),
    by = "User.Id"
  )

df_wide <- merge(df_wide, df_final, by = "User.Id")
df_wide$manual_validated <- ifelse(df_wide$User.Id %in% manually_validated_users, 1, 0)
df_wide <- df_wide %>% 
  rename(video14 = `14`, video11 = `11`, video17 = `17`, 
         video23 = `23`, video18 = `18`, video19 = `19`, 
         video13 = `13`, video16 = `16`, video21 = `21`, 
         video15 = `15`, video22 = `22`, video20 = `20`)
df_wide$Treatment <- case_when(df_wide$Treatment == 1 ~ "Pure Control",
                                   df_wide$Treatment == 2 ~ "Hint Control",
                                   df_wide$Treatment == 3 ~ "One-Click Generate",
                                   df_wide$Treatment == 4 ~ "Chat Generate")
df_wide$Treatment <- factor(df_wide$Treatment, levels = c("Pure Control", "Hint Control", "One-Click Generate", "Chat Generate"))

saveRDS(df_panel, "processed_final_data/df_panel.RDS")
saveRDS(df_wide_all, "processed_final_data/df_wide_all.RDS")
saveRDS(df_wide, "processed_final_data/df_wide.RDS")

Followup Dataframe Generation

# find the videos that the valid users have watched
followup_panel_df <-  df_panel %>% dplyr::select(User.Id, Video.Id, hasComment, prolific_id, Treatment)

# if more than three videos, keep the commented videos first; if more than three commented videos, randomly select three
set.seed(2024)
followup_panel_df_final <- c()
for (u in unique(followup_panel_df$User.Id)){
  temp_df <- followup_panel_df %>% filter(User.Id == u)
  if (nrow(temp_df) < 3){
    # warn when a user has watched fewer than three videos
    print(paste0("User ", u, " has watched fewer than 3 videos"))
  }
  if (nrow(temp_df) > 3){
    temp_df <- temp_df %>% arrange(desc(hasComment))
    num_commented <- sum(temp_df$hasComment) 
    if (num_commented >= 3){
      # print(paste0("User ", u, " has ", num_commented, " commented videos"))
      sampled_video <- sample(1:sum(temp_df$hasComment == 1), 3)
      # print(paste0("Sampled Commented Order: ", sampled_video))
      temp_df <- temp_df %>% filter(hasComment == 1) %>% slice(sampled_video)
      followup_panel_df_final <- rbind(followup_panel_df_final, temp_df)
    }
    else{
      num_remain <- 3 - num_commented
      # randomly select uncommented videos
      followup_panel_df_final <- rbind(followup_panel_df_final, 
                                       temp_df %>% filter(hasComment == 1), 
                                       temp_df %>% filter(hasComment == 0) %>% 
                                         slice(sample(1:sum(hasComment == 0), num_remain)))
    }
  } else {
    temp_df <- temp_df %>% arrange(desc(hasComment))
    followup_panel_df_final <- rbind(followup_panel_df_final, temp_df)
  }
}

# double check correctness
check_panel_df <- merge(followup_panel_df_final %>% group_by(User.Id) %>% summarise(hasComment_fu = sum(hasComment)), df_panel %>% group_by(User.Id) %>% summarise(hasComment = sum(hasComment)))
check_panel_df[check_panel_df$hasComment_fu != check_panel_df$hasComment,]

# attach video name
followup_panel_df_final <- merge(followup_panel_df_final, video[, c("Video.Id", "Title")], by = "Video.Id")
followup_panel_df_final

# find which video is missing from the video panel dataframe
video_missing <- video$Title[!video$Title %in% followup_panel_df_final$Title]
print(paste0("These videos are not watched by any valid user: ", paste0(video_missing, collapse = ", ")))

## [1] "These videos are not watched by any valid user: "
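The selection rule used in the loop above (keep at most three videos per user, commented videos first, random choice among ties) can also be written without a loop. The following is only a sketch on hypothetical toy data. It follows the same rule but will not reproduce the same random draws as the loop, and the names `toy_panel`, `toy_selected`, and `tie_break` are illustrative assumptions.

```r
library(dplyr)

set.seed(2024)

# Hypothetical toy panel: user 1 watched 5 videos (2 commented),
# user 2 watched 4 videos (all commented), user 3 watched 2 videos
toy_panel <- data.frame(
  User.Id    = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3),
  Video.Id   = c(11, 13, 14, 15, 16, 11, 13, 14, 15, 17, 18),
  hasComment = c(1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1)
)

toy_selected <- toy_panel %>%
  group_by(User.Id) %>%
  # a random key breaks ties among videos with the same comment status
  mutate(tie_break = runif(n())) %>%
  # commented videos first, then random order within each status
  arrange(desc(hasComment), tie_break, .by_group = TRUE) %>%
  # keep the first three rows per user (fewer if the user watched fewer)
  slice_head(n = 3) %>%
  ungroup() %>%
  dplyr::select(-tie_break)

print(toy_selected)
```

Sorting on a random key and taking the first three rows covers all three branches of the loop in one pass: users with three or more commented videos get a random three of them, users with fewer get all commented videos plus a random fill of uncommented ones, and users with three or fewer videos keep everything.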

# remove any space in title
followup_panel_df_final$Title <- str_replace_all(followup_panel_df_final$Title, " ", "")
video_missing <- str_replace_all(video_missing, " ", "")

followup_output_df <- reshape2::dcast(followup_panel_df_final, prolific_id + Treatment ~ Title, fun.aggregate = function(x) {
  ifelse(length(na.omit(x)) > 0, 1, 0)
})

## Using Title as value column: use value.var to override.
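The message above appears because `dcast()` guesses the value column when `value.var` is not supplied. A minimal sketch of the same long-to-wide indicator reshape with the value column named explicitly is shown below. The toy data frame `toy_long` is hypothetical.

```r
library(reshape2)

# Hypothetical long-format selection: one row per (participant, video)
toy_long <- data.frame(
  prolific_id = c("p1", "p1", "p2"),
  Treatment   = c("Pure Control", "Pure Control", "Hint Control"),
  Title       = c("Boom", "Skipped", "Boom"),
  stringsAsFactors = FALSE
)

# One row per participant, one 0/1 column per video title.
# Naming value.var avoids the "Using Title as value column" message.
toy_wide <- dcast(
  toy_long,
  prolific_id + Treatment ~ Title,
  value.var = "Title",
  fun.aggregate = function(x) as.integer(length(x) > 0)
)

print(toy_wide)
```

The aggregation function returns 1 when a participant has at least one row for a title and 0 otherwise, which is the same indicator logic as in the chunk above.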

followup_output_df$Treatment <- ifelse(followup_output_df$Treatment == "Pure Control", 1, 
                                       ifelse(followup_output_df$Treatment == "Hint Control", 2, 
                                              ifelse(followup_output_df$Treatment == "One-Click Generate", 3, 4)))


for (v in video_missing){
  followup_output_df[[v]] <- 0
}

# rename
followup_output_df <- followup_output_df %>% 
  rename("Boom" = "BOOM",
         "Crook" = "CROOK$",
         "TimeMachine" = "One-MinuteTimeMachine", 
         "Different" = "DIFFERENT",
         "ProlificID" = "prolific_id")

# add some test users
followup_output_df <- rbind(followup_output_df, 
                            data.frame(ProlificID = "test1", Treatment = 1, 
                                       "AlternativeMath" = 1, "Boom" = 1, "CoinOperated" = 1,
                                       "Crook" = 0, "Different" = 0, "ForeverSleep" = 0, 
                                       "FrenchRoast" = 0, "TimeMachine" = 0, "RadicalHonesty"= 0, 
                                       "Skipped" = 0, "SoftRain" = 0, "TheCook" = 0),
                            data.frame(ProlificID = "test2", Treatment = 2,
                                       "AlternativeMath" = 0, "Boom" = 0, "CoinOperated" = 0,
                                       "Crook" = 1, "Different" = 1, "ForeverSleep" = 1, 
                                       "FrenchRoast" = 0, "TimeMachine" = 0, "RadicalHonesty"= 0, 
                                       "Skipped" = 0, "SoftRain" = 0, "TheCook" = 0),
                            data.frame(ProlificID = "test3", Treatment = 3,
                                       "AlternativeMath" = 0, "Boom" = 0, "CoinOperated" = 0,
                                       "Crook" = 0, "Different" = 0, "ForeverSleep" = 0, 
                                       "FrenchRoast" = 1, "TimeMachine" = 1, "RadicalHonesty"= 1, 
                                       "Skipped" = 0, "SoftRain" = 0, "TheCook" = 0),
                            data.frame(ProlificID = "test4", Treatment = 4,
                                       "AlternativeMath" = 0, "Boom" = 0, "CoinOperated" = 0,
                                       "Crook" = 0, "Different" = 0, "ForeverSleep" = 0, 
                                       "FrenchRoast" = 0, "TimeMachine" = 0, "RadicalHonesty"= 0, 
                                       "Skipped" = 1, "SoftRain" = 1, "TheCook" = 1))

write.csv(followup_output_df, "followup_dfs/main/followup_output_df_final.csv", row.names = FALSE, fileEncoding = "UTF-8")

# randomly sample some for robustness manual check
set.seed(2024)
random_ids <- df_panel %>% group_by(User.Id) %>% filter(n() > 3) %>% pull(prolific_id) %>% sample(10)

df_panel %>% filter(prolific_id %in% random_ids) %>% dplyr::select(prolific_id, Treatment, Video.Id, hasComment) %>% arrange(prolific_id)

followup_panel_df_final %>% filter(prolific_id %in% random_ids) %>% dplyr::select(User.Id, prolific_id, Treatment, Video.Id, hasComment) %>% arrange(prolific_id)

merge(followup_output_df %>% dplyr::select(ProlificID, Treatment), df_post_output %>% dplyr::select(prolific_id, mode), by.x = "ProlificID", by.y = "prolific_id") %>% filter(Treatment != mode)

Pull prolific ids for different dates

followup_dates <- merge(followup_output_df, 
      df_post_output %>% 
        filter(prolific_id %in% non_dup_user) %>% 
        dplyr::select(prolific_id, StartDate), 
      by.x = "ProlificID", by.y = "prolific_id") %>% 
  mutate(FollowupStartDate = as.Date(StartDate) + 21) %>% arrange(FollowupStartDate) %>% dplyr::select(ProlificID, Treatment, StartDate, FollowupStartDate)

followup_dates_unique <- followup_dates$FollowupStartDate %>% unique()

for (d in as.character(followup_dates_unique)){
  # a for loop drops the Date class, so iterate over strings and convert back
  d <- as.Date(d)
  print(paste0("Date: ", d))
  print(nrow(followup_dates %>% filter(FollowupStartDate == d)))
  write.csv(followup_dates %>% 
              filter(FollowupStartDate == d), 
            paste0("followup_dfs/main/followup_deploy_", d, ".csv"), row.names = FALSE, fileEncoding = "UTF-8")
}

## [1] "Date: 2024-12-02"
## [1] 49
## [1] "Date: 2024-12-03"
## [1] 304
## [1] "Date: 2024-12-04"
## [1] 347
## [1] "Date: 2024-12-05"
## [1] 345
## [1] "Date: 2024-12-06"
## [1] 359
## [1] "Date: 2024-12-07"
## [1] 351
## [1] "Date: 2024-12-10"
## [1] 129
## [1] "Date: 2024-12-11"
## [1] 9
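The follow-up date is the survey start date plus 21 days. The sketch below uses hypothetical start timestamps (not values from the data) to show the date arithmetic and why the loop variable has to be converted back to a `Date` before it is used in a file name.

```r
# Hypothetical survey start timestamps (not taken from the data)
start_dates <- c("2024-11-11 09:15:00", "2024-11-12 18:40:00")

# as.Date() on a character timestamp keeps only the calendar date;
# adding an integer to a Date shifts it by that many days
followup_start <- as.Date(start_dates) + 21
print(followup_start)

# Iterating over a Date vector with for() yields bare numbers,
# so the loop variable loses its Date class
for (d in followup_start) {
  print(class(d))
}

# Looping over the formatted dates keeps readable strings for file names
for (d in format(followup_start)) {
  print(paste0("followup_deploy_", d, ".csv"))
}
```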

Analysis

Note: Demographics also include social media usage and commenting behavior.

Randomization check

This check tests whether demographics and the pop-up feature differ significantly across treatment conditions.

Note: Social media usage is divided into social media users (e.g. Facebook, Instagram, etc.), YouTube users, and non-users (e.g. not active on social media). A participant can belong to multiple categories.
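As a rough illustration of what a standardized difference measures, the sketch below computes the difference in group means divided by a pooled standard deviation. The helper `std_diff()` and the toy values are hypothetical, and the exact pooling and weighting used by `xBalance()` may differ from this simplified version.

```r
# Standardized mean difference between a treated and a control group,
# using the pooled standard deviation of the two groups as the scale
std_diff <- function(x, treated) {
  x_t <- x[treated]
  x_c <- x[!treated]
  pooled_sd <- sqrt((var(x_t, na.rm = TRUE) + var(x_c, na.rm = TRUE)) / 2)
  (mean(x_t, na.rm = TRUE) - mean(x_c, na.rm = TRUE)) / pooled_sd
}

# Hypothetical covariate values for six participants
age     <- c(30, 40, 50, 32, 42, 52)
treated <- c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)

d <- std_diff(age, treated)
print(round(d, 3))
```

Expressing the gap in standard-deviation units makes covariates on different scales comparable, which is why a single set of reference lines (for example at 0.1, 0.25, and 0.5) can be drawn across all rows of a balance plot.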

cov_form <- c("review_exp", covariates_simple,  "willingness_to_pay")
cov_name_fancy <- c("Review Experience", covariates_simple_fancy,  "Willingness to Pay")

# Create formula #  
balance_fmla_cov_main = formula(paste("Treatment != 1 ~",paste(cov_form,collapse="+")))
# Pre-allocate dataframe #
balance_plot_treatments_main = data.frame(matrix(NA,0,3))
# Compute standardized differences for each of the treatment groups and fill dataframe #
for (t in c(2, 3, 4)){
        balance_temp = xBalance(balance_fmla_cov_main,
                                data=df_final[df_final$Treatment %in% c(1,t),],
                                report="std.diffs",na.rm=TRUE)
        balance_temp = data.frame(balance_temp)
        balance_temp = cbind(cov_name_fancy, balance_temp[,1],rep(t,length(cov_name_fancy)))
    
        balance_plot_treatments_main = rbind(balance_plot_treatments_main,balance_temp)
        }
# Fill dataframe to input into the plot function with the remaining columns #
colnames(balance_plot_treatments_main) = c("covariates","diff","grouping")
balance_plot_treatments_main[,2] = as.numeric(balance_plot_treatments_main[,2])

balance_plot_treatments_main$grouping <- ifelse(balance_plot_treatments_main$grouping == 2, "Hint Control", 
                                                ifelse(balance_plot_treatments_main$grouping == 3, "One-Click Generate", "Chat Generate"))
balance_plot_treatments_main$grouping <- factor(balance_plot_treatments_main$grouping, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))

# Plots standardized differences by treatment #
match_plot = function(data,title){
    pic = ggplot(data=data,aes(x=diff,y=factor(covariates,levels = unique(covariates)),group=grouping)) + 
        theme_bw()+
        theme(axis.line.y = element_line(colour="black"),axis.line.x = element_line(colour="black"),
              panel.border = element_blank(),
              panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        legend.title = element_text(face='bold', size=20, hjust=0.5, vjust=0.5),
        legend.position = c(0.97,.80),legend.justification = c("right", "bottom"),
        legend.key = element_rect(colour = "transparent"),
        legend.box.just = "right", legend.text = element_text(size=20), legend.margin = margin(0, 6, 6, 6),
        legend.box.background = element_rect( fill="transparent", size=1),legend.background = element_blank()) +
        geom_vline(aes(xintercept=-0.1),color="black", linetype="longdash", size=0.75) +
        geom_vline(aes(xintercept=0.1),color="black", linetype="longdash", size=0.75) +
        geom_vline(aes(xintercept=0.25),color="black", linetype="dashed", size=0.5) +
        geom_vline(aes(xintercept=0), color="black", linetype="solid", size=0.5) +
        geom_vline(aes(xintercept=-0.25),color="black", linetype="dashed", size=0.5) +
        geom_vline(aes(xintercept=-0.5),color="black", linetype="dotted", size=0.5) +
        geom_vline(aes(xintercept=0.5),color="black", linetype="dotted", size=0.5) +
        geom_point(aes(color=grouping,fill=grouping,shape=grouping),size=5) +
    
        scale_color_manual(name = "vs Pure Control",
                           values=c("Hint Control" = "black","One-Click Generate"="black","Chat Generate" = "black"))+
        scale_fill_manual(name = "vs Pure Control",
                          values=c("Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2"))+
        scale_shape_manual(name = "vs Pure Control",
                           values=c("Hint Control" = 23, "One-Click Generate"=24,"Chat Generate" = 25))+
    
        labs(y="Variable",x="Standardized Difference")+
        #xlim(-0.5,0.5) +
        scale_x_continuous(breaks = c(-0.5,-0.25,-0.1,0,0.1,0.25,0.5), labels = c("-0.50","-0.25","-0.10","0","0.10","0.25","0.50")) +
        scale_y_discrete(limits = rev) +
        theme(axis.text.x = element_text(color = "black", size = 20, angle = 0, hjust = .5, vjust = 0, face = "plain"),
        axis.text.y = element_text(color = "black", size = 20, angle = 0, hjust = 1, vjust = .5, face = "plain",
                                   margin=unit(rep(0.5,4),"cm")),
        axis.title.x = element_text(color = "black", size = 25, angle = 0, hjust = .5, vjust = 0, face = "bold"),
        axis.title.y = element_text(color = "black", size = 30, angle = 90, hjust = .5, vjust = .5, face = "bold"),
        axis.ticks.length.y = unit(-0.25,"cm"), axis.ticks.x=element_blank()) +
        ggtitle(title) +
        theme(plot.title = element_text(face='bold', size=30, hjust=0.5, vjust=0.5)) +
        # 
        # facet_grid(rows = vars(cov_group),
        #      scales = "free_y", # Let the x axis vary across facets.
        #      space = "free_y",  # Let the width of facets vary and force all bars to have the same width.
        #      switch = "y")+
        theme(strip.placement = "outside",    # Place facet labels outside x axis labels.
         strip.background = element_blank(),  # Make facet label background white.
         strip.text.y.left = element_text(size = 21,face = "bold",angle = 0, hjust=0.5),
         axis.title = element_blank(),
              panel.border = element_rect(color = "grey70", fill = NA, size = 2))

pic
}

match_plot(balance_plot_treatments_main, "Standardized Differences by Treatment Group")

ggsave("tables_and_figures/std_diff_by_samples.png", width = 18, height = 20)

df_final %>% group_by(Treatment) %>% summarise(mech_popup = mean(mech_popup, na.rm = TRUE))

# plot mech_popup by Treatment distribution
ggplot(df_final, aes(x = Treatment, y = mech_popup, fill = Treatment, group = Treatment)) +
  geom_boxplot() +
  theme_minimal() +
  theme(legend.position = "none") +
  labs(x = "Treatment", y = "Pop-up Feature") +
  ggtitle("Pop-up Feature by Treatment Group") +
  ylim(1, 7)

User Level Regression

Number of Reviews ~ treatment + video (+ demographics)

TODO (11/24): + order effect

video_columns <- c("video11", "video13", "video14", "video15", "video16", "video17", "video18", "video19", "video20", "video21", "video22", "video23")

All Videos

cov_form_lm <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "))
user_lm_all <- lm(cov_form_lm, data = df_wide_all)

cov_form_lm_with_cov <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple, collapse = " + "), 
                               "+ willingness_to_pay")
user_lm_with_cov_all <- lm(cov_form_lm_with_cov, data = df_wide_all)

stargazer(user_lm_all, user_lm_with_cov_all, 
          star.cutoffs = c(0.05, 0.01, 0.001), 
          dep.var.labels = "Number of Comments", 
          covariate.labels = c("Treatment: Hint Control", "Treatment: One-Click Generate", "Treatment: Chat Generate",
                               "Video 11", "Video 13", "Video 14", "Video 15", "Video 16", "Video 17", 
                               "Video 18", "Video 19", "Video 20", "Video 21", "Video 22", "Video 23",
                               covariates_simple_fancy, "Willingness to Pay"),
          column.labels = c("Without Covariates", "With Covariates"))

## 
## % Table created by stargazer v.5.2.3 by Marek Hlavac, Social Policy Institute. E-mail: marek.hlavac at gmail.com
## % Date and time: Fri, Mar 28, 2025 - 11:01:21
## \begin{table}[!htbp] \centering 
##   \caption{} 
##   \label{} 
## \begin{tabular}{@{\extracolsep{5pt}}lcc} 
## \\[-1.8ex]\hline 
## \hline \\[-1.8ex] 
##  & \multicolumn{2}{c}{\textit{Dependent variable:}} \\ 
## \cline{2-3} 
## \\[-1.8ex] & \multicolumn{2}{c}{Number of Comments} \\ 
##  & Without Covariates & With Covariates \\ 
## \\[-1.8ex] & (1) & (2)\\ 
## \hline \\[-1.8ex] 
##  Treatment: Hint Control & 0.230$^{*}$ & 0.247$^{**}$ \\ 
##   & (0.094) & (0.092) \\ 
##   & & \\ 
##  Treatment: One-Click Generate & 0.195$^{*}$ & 0.241$^{**}$ \\ 
##   & (0.093) & (0.092) \\ 
##   & & \\ 
##  Treatment: Chat Generate & $-$0.294$^{**}$ & $-$0.291$^{**}$ \\ 
##   & (0.092) & (0.091) \\ 
##   & & \\ 
##  Video 11 & 0.163$^{*}$ & 0.115 \\ 
##   & (0.080) & (0.079) \\ 
##   & & \\ 
##  Video 13 & 0.135 & 0.111 \\ 
##   & (0.081) & (0.080) \\ 
##   & & \\ 
##  Video 14 & 0.298$^{***}$ & 0.314$^{***}$ \\ 
##   & (0.088) & (0.087) \\ 
##   & & \\ 
##  Video 15 & 0.306$^{**}$ & 0.305$^{***}$ \\ 
##   & (0.093) & (0.092) \\ 
##   & & \\ 
##  Video 16 & 0.252$^{**}$ & 0.263$^{**}$ \\ 
##   & (0.082) & (0.083) \\ 
##   & & \\ 
##  Video 17 & 0.149 & 0.147 \\ 
##   & (0.083) & (0.081) \\ 
##   & & \\ 
##  Video 18 & 0.106 & 0.125 \\ 
##   & (0.086) & (0.085) \\ 
##   & & \\ 
##  Video 19 & 0.418$^{***}$ & 0.474$^{***}$ \\ 
##   & (0.092) & (0.091) \\ 
##   & & \\ 
##  Video 20 & 0.243$^{**}$ & 0.190$^{*}$ \\ 
##   & (0.090) & (0.090) \\ 
##   & & \\ 
##  Video 21 & 0.258$^{**}$ & 0.286$^{**}$ \\ 
##   & (0.094) & (0.094) \\ 
##   & & \\ 
##  Video 22 & 0.225$^{*}$ & 0.204$^{*}$ \\ 
##   & (0.093) & (0.092) \\ 
##   & & \\ 
##  Video 23 & 0.373$^{***}$ & 0.364$^{***}$ \\ 
##   & (0.083) & (0.082) \\ 
##   & & \\ 
##  Age &  & 0.003 \\ 
##   &  & (0.002) \\ 
##   & & \\ 
##  YouTube User &  & 0.211$^{*}$ \\ 
##   &  & (0.097) \\ 
##   & & \\ 
##  Social Media: Non-User &  & 0.320 \\ 
##   &  & (0.376) \\ 
##   & & \\ 
##  Social Media: User &  & $-$0.131 \\ 
##   &  & (0.118) \\ 
##   & & \\ 
##  Social Media Usage (1 - 4 Scale) &  & $-$0.080 \\ 
##   &  & (0.042) \\ 
##   & & \\ 
##  Online Usage (1 - 4 Scale) &  & 0.011 \\ 
##   &  & (0.034) \\ 
##   & & \\ 
##  Female &  & $-$0.190$^{**}$ \\ 
##   &  & (0.070) \\ 
##   & & \\ 
##  Race: Asian &  & 0.236 \\ 
##   &  & (0.161) \\ 
##   & & \\ 
##  Race: Black &  & $-$0.088 \\ 
##   &  & (0.130) \\ 
##   & & \\ 
##  Race: Hispanic &  & $-$0.074 \\ 
##   &  & (0.172) \\ 
##   & & \\ 
##  Race: White &  & $-$0.081 \\ 
##   &  & (0.112) \\ 
##   & & \\ 
##  Race: Other &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Education: High School or Less &  & 0.057 \\ 
##   &  & (0.122) \\ 
##   & & \\ 
##  Education: Some College &  & 0.025 \\ 
##   &  & (0.110) \\ 
##   & & \\ 
##  Education: Bachelor &  & 0.040 \\ 
##   &  & (0.092) \\ 
##   & & \\ 
##  Education: Postgraduate &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Political Party: Democrat &  & $-$0.097 \\ 
##   &  & (0.086) \\ 
##   & & \\ 
##  Political Party: Republican &  & $-$0.138 \\ 
##   &  & (0.098) \\ 
##   & & \\ 
##  Political Party: Other &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Political Ideology (1 - 5 Scale; 5 Strong Liberal) &  & $-$0.029 \\ 
##   &  & (0.038) \\ 
##   & & \\ 
##  Income (1 - 5 Scale) &  & $-$0.040 \\ 
##   &  & (0.033) \\ 
##   & & \\ 
##  Social Media Reply Frequency (1 - 6 Scale) &  & 0.121$^{***}$ \\ 
##   &  & (0.026) \\ 
##   & & \\ 
##  Review Frequency (1 - 6 Scale) &  & 0.139$^{***}$ \\ 
##   &  & (0.031) \\ 
##   & & \\ 
##  Willingness to Pay &  & 0.019 \\ 
##   &  & (0.027) \\ 
##   & & \\ 
##  Constant & 0.808$^{***}$ & 0.360 \\ 
##   & (0.158) & (0.329) \\ 
##   & & \\ 
## \hline \\[-1.8ex] 
## Observations & 1,893 & 1,893 \\ 
## R$^{2}$ & 0.045 & 0.094 \\ 
## Adjusted R$^{2}$ & 0.037 & 0.077 \\ 
## Residual Std. Error & 1.429 (df = 1877) & 1.399 (df = 1856) \\ 
## F Statistic & 5.913$^{***}$ (df = 15; 1877) & 5.368$^{***}$ (df = 36; 1856) \\ 
## \hline 
## \hline \\[-1.8ex] 
## \textit{Note:}  & \multicolumn{2}{r}{$^{*}$p$<$0.05; $^{**}$p$<$0.01; $^{***}$p$<$0.001} \\ 
## \end{tabular} 
## \end{table}

user_lm_basic <- lm(num_comment ~ Treatment, data = df_wide_all)
coef_table <- summary(user_lm_basic)$coefficients
# row names contain Treatment or Intercept
coef_table <- coef_table[row.names(coef_table) %in% c("TreatmentHint Control", "TreatmentOne-Click Generate", "TreatmentChat Generate", "(Intercept)"),]

# ggplot
coef_df <- coef_table %>% as.data.frame()
coef_df$ci_lower <- coef_df$Estimate - coef_df$`Std. Error` * 1.96
coef_df$ci_upper <- coef_df$Estimate + coef_df$`Std. Error` * 1.96

coef_df$Treatment <- rownames(coef_df)
coef_df$Treatment <- str_replace(coef_df$Treatment, "Treatment", "")
coef_df$Treatment <- factor(coef_df$Treatment, levels = c( "(Intercept)", "Chat Generate", "One-Click Generate", "Hint Control"))

ggplot(coef_df %>% filter(Treatment != "(Intercept)"), aes(x = Estimate, y = Treatment)) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_point(aes(color = ifelse(ci_lower > 0 | ci_upper < 0, "Significant", "Not Significant")), size = 3) +
  geom_errorbarh(aes(xmin = ci_lower, xmax = ci_upper), height = 0.1) +
  theme_minimal() +
  theme(legend.position = "none") +
  # make fonts bigger
  theme(axis.text.x = element_text(face = "bold", size = 12.5),
        axis.text.y = element_text(face = "bold", size = 12.5),
        axis.title.x = element_text(face = "bold",size = 15),
        axis.title.y = element_text(face = "bold",size = 15)) +
  # make title bigger and bold
  theme(plot.title = element_text(face='bold', size=15),
        plot.subtitle = element_text(size = 12.5)) +
  labs(x = "Coefficient Estimate", y = "Treatment", title = "Number of Comments ATE: User Level Regression", subtitle = paste0("vs Pure Control (Baseline): ", round(coef_df$Estimate[1], 2)))

ggsave("tables_and_figures/num_comment_ate.png", width = 8, height = 6)
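The intervals in the coefficient plot are built as estimate plus or minus 1.96 standard errors. The sketch below, on simulated data (all values hypothetical), compares that normal approximation with the t-based intervals returned by `confint()`.

```r
set.seed(1)

# Hypothetical user-level data: a count outcome and a four-arm treatment
n <- 400
arm_levels <- c("Pure Control", "Hint Control",
                "One-Click Generate", "Chat Generate")
arm <- factor(sample(arm_levels, n, replace = TRUE), levels = arm_levels)
num_comment <- rpois(n, lambda = 1.5)

fit <- lm(num_comment ~ arm)
coefs <- as.data.frame(summary(fit)$coefficients)

# Normal-approximation interval: estimate +/- 1.96 standard errors
coefs$ci_lower <- coefs$Estimate - 1.96 * coefs$`Std. Error`
coefs$ci_upper <- coefs$Estimate + 1.96 * coefs$`Std. Error`

# t-based interval from confint(); slightly wider, nearly identical at this n
ci_t <- confint(fit, level = 0.95)
print(round(cbind(coefs[, c("ci_lower", "ci_upper")], ci_t), 3))
```

With a few hundred observations or more, the two intervals agree to about the third decimal place, so the 1.96 shortcut is a reasonable choice for plotting.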

First Three Videos

cov_form_lm <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "))
user_lm <- lm(cov_form_lm, data = df_wide)

cov_form_lm_with_cov <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple, collapse = " + "))
user_lm_with_cov <- lm(cov_form_lm_with_cov, data = df_wide)

stargazer(user_lm, user_lm_with_cov, 
          star.cutoffs = c(0.05, 0.01, 0.001), 
          dep.var.labels = "Number of Comments (Upper Bound: 3)", 
          covariate.labels = c("Treatment: Hint Control", "Treatment: One-Click Generate", "Treatment: Chat Generate",
                               "Video 11", "Video 13", "Video 14", "Video 15", "Video 16", "Video 17", 
                               "Video 18", "Video 19", "Video 20", "Video 21", "Video 22", "Video 23",
                               covariates_simple_fancy),
          column.labels = c("Without Covariates", "With Covariates"))

## 
## % Table created by stargazer v.5.2.3 by Marek Hlavac, Social Policy Institute. E-mail: marek.hlavac at gmail.com
## % Date and time: Fri, Mar 28, 2025 - 11:01:21
## \begin{table}[!htbp] \centering 
##   \caption{} 
##   \label{} 
## \begin{tabular}{@{\extracolsep{5pt}}lcc} 
## \\[-1.8ex]\hline 
## \hline \\[-1.8ex] 
##  & \multicolumn{2}{c}{\textit{Dependent variable:}} \\ 
## \cline{2-3} 
## \\[-1.8ex] & \multicolumn{2}{c}{Number of Comments (Upper Bound: 3)} \\ 
##  & Without Covariates & With Covariates \\ 
## \\[-1.8ex] & (1) & (2)\\ 
## \hline \\[-1.8ex] 
##  Treatment: Hint Control & 0.209$^{*}$ & 0.230$^{*}$ \\ 
##   & (0.091) & (0.089) \\ 
##   & & \\ 
##  Treatment: One-Click Generate & 0.137 & 0.181$^{*}$ \\ 
##   & (0.090) & (0.089) \\ 
##   & & \\ 
##  Treatment: Chat Generate & $-$0.283$^{**}$ & $-$0.278$^{**}$ \\ 
##   & (0.089) & (0.088) \\ 
##   & & \\ 
##  Video 11 & $-$0.191$^{*}$ & $-$0.227$^{*}$ \\ 
##   & (0.095) & (0.095) \\ 
##   & & \\ 
##  Video 13 & $-$0.229$^{*}$ & $-$0.239$^{**}$ \\ 
##   & (0.090) & (0.090) \\ 
##   & & \\ 
##  Video 14 & $-$0.094 & $-$0.072 \\ 
##   & (0.102) & (0.101) \\ 
##   & & \\ 
##  Video 15 & $-$0.057 & $-$0.049 \\ 
##   & (0.107) & (0.107) \\ 
##   & & \\ 
##  Video 16 & $-$0.154 & $-$0.132 \\ 
##   & (0.094) & (0.095) \\ 
##   & & \\ 
##  Video 17 & $-$0.202$^{*}$ & $-$0.192$^{*}$ \\ 
##   & (0.098) & (0.097) \\ 
##   & & \\ 
##  Video 18 & $-$0.254$^{*}$ & $-$0.223$^{*}$ \\ 
##   & (0.100) & (0.100) \\ 
##   & & \\ 
##  Video 19 & 0.006 & 0.077 \\ 
##   & (0.111) & (0.109) \\ 
##   & & \\ 
##  Video 20 & $-$0.167 & $-$0.201 \\ 
##   & (0.106) & (0.106) \\ 
##   & & \\ 
##  Video 21 & $-$0.119 & $-$0.088 \\ 
##   & (0.111) & (0.112) \\ 
##   & & \\ 
##  Video 22 & $-$0.182 & $-$0.192 \\ 
##   & (0.109) & (0.108) \\ 
##   & & \\ 
##  Video 23 &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Age &  & 0.003 \\ 
##   &  & (0.002) \\ 
##   & & \\ 
##  YouTube User &  & 0.194$^{*}$ \\ 
##   &  & (0.094) \\ 
##   & & \\ 
##  Social Media: Non-User &  & 0.304 \\ 
##   &  & (0.364) \\ 
##   & & \\ 
##  Social Media: User &  & $-$0.141 \\ 
##   &  & (0.114) \\ 
##   & & \\ 
##  Social Media Usage (1 - 4 Scale) &  & $-$0.084$^{*}$ \\ 
##   &  & (0.041) \\ 
##   & & \\ 
##  Online Usage (1 - 4 Scale) &  & 0.013 \\ 
##   &  & (0.033) \\ 
##   & & \\ 
##  Female &  & $-$0.175$^{*}$ \\ 
##   &  & (0.068) \\ 
##   & & \\ 
##  Race: Asian &  & 0.185 \\ 
##   &  & (0.155) \\ 
##   & & \\ 
##  Race: Black &  & $-$0.035 \\ 
##   &  & (0.125) \\ 
##   & & \\ 
##  Race: Hispanic &  & $-$0.036 \\ 
##   &  & (0.167) \\ 
##   & & \\ 
##  Race: White &  & $-$0.053 \\ 
##   &  & (0.108) \\ 
##   & & \\ 
##  Race: Other &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Education: High School or Less &  & 0.006 \\ 
##   &  & (0.118) \\ 
##   & & \\ 
##  Education: Some College &  & 0.031 \\ 
##   &  & (0.106) \\ 
##   & & \\ 
##  Education: Bachelor &  & 0.037 \\ 
##   &  & (0.089) \\ 
##   & & \\ 
##  Education: Postgraduate &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Political Party: Democrat &  & $-$0.093 \\ 
##   &  & (0.083) \\ 
##   & & \\ 
##  Political Party: Republican &  & $-$0.123 \\ 
##   &  & (0.094) \\ 
##   & & \\ 
##  Political Party: Other &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Political Ideology (1 - 5 Scale; 5 Strong Liberal) &  & $-$0.032 \\ 
##   &  & (0.036) \\ 
##   & & \\ 
##  Income (1 - 5 Scale) &  & $-$0.038 \\ 
##   &  & (0.032) \\ 
##   & & \\ 
##  Social Media Reply Frequency (1 - 6 Scale) &  & 0.108$^{***}$ \\ 
##   &  & (0.025) \\ 
##   & & \\ 
##  Review Frequency (1 - 6 Scale) &  & 0.144$^{***}$ \\ 
##   &  & (0.030) \\ 
##   & & \\ 
##  Constant & 1.955$^{***}$ & 1.528$^{***}$ \\ 
##   & (0.204) & (0.350) \\ 
##   & & \\ 
## \hline \\[-1.8ex] 
## Observations & 1,893 & 1,893 \\ 
## R$^{2}$ & 0.027 & 0.073 \\ 
## Adjusted R$^{2}$ & 0.020 & 0.056 \\ 
## Residual Std. Error & 1.381 (df = 1878) & 1.354 (df = 1858) \\ 
## F Statistic & 3.690$^{***}$ (df = 14; 1878) & 4.331$^{***}$ (df = 34; 1858) \\ 
## \hline 
## \hline \\[-1.8ex] 
## \textit{Note:}  & \multicolumn{2}{r}{$^{*}$p$<$0.05; $^{**}$p$<$0.01; $^{***}$p$<$0.001} \\ 
## \end{tabular} 
## \end{table}
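The empty Video 23 row in the table above is consistent with perfect collinearity: if every user in the sample contributes exactly three videos, the twelve video dummies sum to 3 in every row and are therefore a linear combination of the intercept. The simulation below illustrates this reading. The data are entirely hypothetical and this is a plausible explanation, not a confirmed diagnosis of the fitted model.

```r
set.seed(2024)

video_ids <- c(11, 13:23)   # twelve hypothetical video ids
n_users   <- 300

# Each simulated user watches exactly three distinct videos
dummies <- t(replicate(n_users,
                       as.integer(video_ids %in% sample(video_ids, 3))))
colnames(dummies) <- paste0("video", video_ids)

sim <- data.frame(num_comment = rbinom(n_users, size = 3, prob = 0.5),
                  dummies)

fit <- lm(num_comment ~ ., data = sim)

# Every row of dummies sums to 3, so the dummies are collinear with the intercept
print(unique(rowSums(dummies)))

# lm() reports NA for the coefficient it cannot identify
print(names(coef(fit))[is.na(coef(fit))])
```

In this setting the remaining video coefficients are interpreted relative to the dropped video, which also explains why their signs and the constant differ from the all-videos specification.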

Panel Regression

Outcome: {Review or Not, Time Spent, Input Length, Informativeness}

Outcome ~ treatment + video + user (+ demographics) [adding time fixed effect to detect any time-dependent effect]

Note: We focus on the Review or Not outcome first.

First look at comment rate for different orders

df_panel %>% group_by(order) %>% summarise(comment_prob = mean(hasComment, na.rm = TRUE), comment_prob_se = sd(hasComment, na.rm = TRUE) / sqrt(n()))

All Videos

panel_lm <- feols(hasComment ~ Treatment + order | Video.Id, data = df_panel, cluster = ~User.Id)

# panel_lm_with_cov <- feols(hasComment ~ Treatment + order | Video.Id + social_media_nonUser + social_media_user + social_media_YT + social_media_use + website_use + gender + age + raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + edu + polparty + libcons + income + social_media_reply + review_freq, data = df_panel)
panel_lm_with_cov <- feols(hasComment ~ Treatment + order | Video.Id + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, data = df_panel, cluster = ~User.Id)

model_list <- list(panel_lm, panel_lm_with_cov)
texreg::texreg(model_list, 
               stars = c(0.05, 0.01, 0.001), 
               caption = "Panel Regression Results", 
               label = "tab:panel_regression", 
               digits = 4,
               custom.note = "Standard errors are clustered at the user level.",
               custom.model.names = c("Without Covariates", "With Covariates"), 
               custom.coef.names = c("Hint Control", "One-Click Generate", "Chat Generate", "Order"))

## 
## \begin{table}
## \begin{center}
## \begin{tabular}{l c c}
## \hline
##  & Without Covariates & With Covariates \\
## \hline
## Hint Control                               & $0.0757^{**}$   & $0.0699^{*}$    \\
##                                            & $(0.0288)$      & $(0.0276)$      \\
## One-Click Generate                         & $0.0608^{*}$    & $0.0679^{*}$    \\
##                                            & $(0.0296)$      & $(0.0279)$      \\
## Chat Generate                              & $-0.0938^{**}$  & $-0.0960^{**}$  \\
##                                            & $(0.0292)$      & $(0.0294)$      \\
## Order                                      & $-0.0378^{***}$ & $-0.0387^{***}$ \\
##                                            & $(0.0081)$      & $(0.0057)$      \\
## \hline
## Num. obs.                                  & $5921$          & $5921$          \\
## Num. groups: Video.Id                      & $12$            & $12$            \\
## R$^2$ (full model)                         & $0.0280$        & $0.1083$        \\
## R$^2$ (proj model)                         & $0.0254$        & $0.0271$        \\
## Adj. R$^2$ (full model)                    & $0.0256$        & $0.0907$        \\
## Adj. R$^2$ (proj model)                    & $0.0247$        & $0.0264$        \\
## Num. groups: age                           & $$              & $60$            \\
## Num. groups: social\_media\_YT             & $$              & $2$             \\
## Num. groups: social\_media\_nonUser        & $$              & $2$             \\
## Num. groups: social\_media\_user           & $$              & $2$             \\
## Num. groups: social\_media\_use\_numeric   & $$              & $4$             \\
## Num. groups: website\_use\_numeric         & $$              & $4$             \\
## Num. groups: genderFemale                  & $$              & $2$             \\
## Num. groups: raceAsian                     & $$              & $2$             \\
## Num. groups: raceBlack                     & $$              & $2$             \\
## Num. groups: raceHispanic                  & $$              & $2$             \\
## Num. groups: raceWhite                     & $$              & $2$             \\
## Num. groups: raceOther                     & $$              & $2$             \\
## Num. groups: eduHighSchoolOrLess           & $$              & $2$             \\
## Num. groups: eduSomeCollege                & $$              & $2$             \\
## Num. groups: eduBachelor                   & $$              & $2$             \\
## Num. groups: eduPostGrad                   & $$              & $2$             \\
## Num. groups: polpartyDem                   & $$              & $2$             \\
## Num. groups: polpartyRep                   & $$              & $2$             \\
## Num. groups: polpartyOther                 & $$              & $2$             \\
## Num. groups: libcons\_numeric              & $$              & $5$             \\
## Num. groups: income\_numeric               & $$              & $6$             \\
## Num. groups: social\_media\_reply\_numeric & $$              & $6$             \\
## Num. groups: review\_freq\_numeric         & $$              & $6$             \\
## \hline
## \multicolumn{3}{l}{\scriptsize{Standard errors are clustered at the user level.}}
## \end{tabular}
## \caption{Panel Regression Results}
## \label{tab:panel_regression}
## \end{center}
## \end{table}
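In a `feols()` formula, variables placed after `|` are absorbed as fixed effects (one intercept per distinct value) rather than estimated as slopes. This is why the table above lists rows such as "Num. groups: age" instead of covariate coefficients. The sketch below, on simulated data with hypothetical variable values, contrasts the two placements. Which one is intended is a modelling decision for the analyst.

```r
library(fixest)

set.seed(2024)

# Hypothetical panel: 200 users, 3 videos each, binary outcome
n_users <- 200
sim <- data.frame(
  User.Id  = rep(seq_len(n_users), each = 3),
  Video.Id = sample(c(11, 13:23), n_users * 3, replace = TRUE),
  order    = rep(1:3, times = n_users),
  age      = rep(sample(18:77, n_users, replace = TRUE), each = 3)
)
sim$treated    <- as.integer(sim$User.Id %% 2 == 0)
sim$hasComment <- rbinom(nrow(sim), 1, 0.4 + 0.1 * sim$treated)

# age before `|`: a slope on age is estimated and reported
m_slope <- feols(hasComment ~ treated + order + age | Video.Id,
                 data = sim, cluster = ~User.Id)

# age after `|`: age is absorbed as a fixed effect, one intercept per distinct age
m_fe <- feols(hasComment ~ treated + order | Video.Id + age,
              data = sim, cluster = ~User.Id)

print(names(coef(m_slope)))
print(names(coef(m_fe)))
```

Absorbing a covariate as a fixed effect is more flexible than a linear slope, but it uses one degree of freedom per distinct value and removes the covariate from the coefficient table.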

# plotting using panel_lm_with_cov
coef_df <- summary(panel_lm_with_cov)$coeftable %>% as.data.frame()
coef_df$ci_lower <- coef_df$Estimate - coef_df$`Std. Error` * 1.96
coef_df$ci_upper <- coef_df$Estimate + coef_df$`Std. Error` * 1.96
coef_df$variable <- rownames(coef_df)
coef_df <- coef_df %>% filter(variable != "order")
coef_df <- coef_df %>% mutate(Treatment = str_replace(variable, "Treatment", ""))
coef_df$Treatment <- factor(coef_df$Treatment, levels = c( "order", "Hint Control", "One-Click Generate", "Chat Generate"))


# ggplot
ggplot(coef_df, aes(x = Estimate, y = Treatment)) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_point(aes(color = ifelse(ci_lower > 0 | ci_upper < 0, "Significant", "Not Significant")), size = 3) +
  geom_errorbarh(aes(xmin = ci_lower, xmax = ci_upper), height = 0.15) +
  theme_minimal() +
  theme(legend.position = "none") +
  # make fonts bigger
  theme(axis.text.x = element_text(face = "bold", size = 12.5),
        axis.text.y = element_text(face = "bold", size = 12.5),
        axis.title.x = element_text(face = "bold",size = 15),
        axis.title.y = element_text(face = "bold",size = 15)) +
  # make title bigger and bold
  theme(plot.title = element_text(face='bold', size=15),
        plot.subtitle = element_text(size = 12.5)) +
  labs(x = "Coefficient Estimate", y = "Treatment") +
  ggtitle("Has Comment (0/1 Binary): Panel Regression", subtitle = "vs Pure Control")
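
The intervals in the plot above are normal-approximation (Wald) intervals, estimate ± 1.96 standard errors, and a point is flagged as significant when its interval excludes zero. A minimal sketch of that calculation follows; the table `coef_demo` and its numbers are made up for illustration and are not taken from the fitted model.

```r
# Hypothetical coefficient table; names and values are illustrative only.
coef_demo <- data.frame(
  variable  = c("TreatmentA", "TreatmentB"),
  Estimate  = c(0.10, -0.02),
  std_error = c(0.03, 0.04)
)

# Critical value for a two-sided 95% interval (about 1.96).
z_crit <- qnorm(0.975)

coef_demo$ci_lower <- coef_demo$Estimate - z_crit * coef_demo$std_error
coef_demo$ci_upper <- coef_demo$Estimate + z_crit * coef_demo$std_error

# The interval excludes zero exactly when both bounds share a sign.
coef_demo$significant <- coef_demo$ci_lower > 0 | coef_demo$ci_upper < 0

print(coef_demo)
```

Using `qnorm(0.975)` instead of the literal 1.96 makes the confidence level explicit and easy to change.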

First Three Videos

panel_lm <- feols(hasComment ~ Treatment + order | Video.Id, data = df_panel %>% filter(firstThree == 1), cluster = ~User.Id)

panel_lm_with_cov <- feols(hasComment ~ Treatment  + order | Video.Id + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, data = df_panel %>% filter(firstThree == 1), cluster = ~User.Id)

model_list <- list(panel_lm, panel_lm_with_cov)
texreg::texreg(model_list, 
               stars = c(0.05, 0.01, 0.001), 
               caption = "Panel Regression Results (First Three Videos)", 
               label = "tab:panel_regression_first_three", 
               digits = 4,
               custom.note = "Standard errors are clustered at the user level.",
               custom.model.names = c("Without Covariates", "With Covariates"), 
               custom.coef.names = c("Hint Control", "One-Click Generate", "Chat Generate", "Order"))

## 
## \begin{table}
## \begin{center}
## \begin{tabular}{l c c}
## \hline
##  & Without Covariates & With Covariates \\
## \hline
## Hint Control                               & $0.0695^{*}$    & $0.0657^{*}$    \\
##                                            & $(0.0291)$      & $(0.0280)$      \\
## One-Click Generate                         & $0.0456$        & $0.0569^{*}$    \\
##                                            & $(0.0297)$      & $(0.0285)$      \\
## Chat Generate                              & $-0.0958^{**}$  & $-0.0962^{**}$  \\
##                                            & $(0.0299)$      & $(0.0299)$      \\
## Order                                      & $-0.0218^{***}$ & $-0.0228^{***}$ \\
##                                            & $(0.0043)$      & $(0.0043)$      \\
## \hline
## Num. obs.                                  & $5679$          & $5679$          \\
## Num. groups: Video.Id                      & $12$            & $12$            \\
## R$^2$ (full model)                         & $0.0201$        & $0.0998$        \\
## R$^2$ (proj model)                         & $0.0174$        & $0.0188$        \\
## Adj. R$^2$ (full model)                    & $0.0175$        & $0.0812$        \\
## Adj. R$^2$ (proj model)                    & $0.0167$        & $0.0181$        \\
## Num. groups: age                           & $$              & $60$            \\
## Num. groups: social\_media\_YT             & $$              & $2$             \\
## Num. groups: social\_media\_nonUser        & $$              & $2$             \\
## Num. groups: social\_media\_user           & $$              & $2$             \\
## Num. groups: social\_media\_use\_numeric   & $$              & $4$             \\
## Num. groups: website\_use\_numeric         & $$              & $4$             \\
## Num. groups: genderFemale                  & $$              & $2$             \\
## Num. groups: raceAsian                     & $$              & $2$             \\
## Num. groups: raceBlack                     & $$              & $2$             \\
## Num. groups: raceHispanic                  & $$              & $2$             \\
## Num. groups: raceWhite                     & $$              & $2$             \\
## Num. groups: raceOther                     & $$              & $2$             \\
## Num. groups: eduHighSchoolOrLess           & $$              & $2$             \\
## Num. groups: eduSomeCollege                & $$              & $2$             \\
## Num. groups: eduBachelor                   & $$              & $2$             \\
## Num. groups: eduPostGrad                   & $$              & $2$             \\
## Num. groups: polpartyDem                   & $$              & $2$             \\
## Num. groups: polpartyRep                   & $$              & $2$             \\
## Num. groups: polpartyOther                 & $$              & $2$             \\
## Num. groups: libcons\_numeric              & $$              & $5$             \\
## Num. groups: income\_numeric               & $$              & $6$             \\
## Num. groups: social\_media\_reply\_numeric & $$              & $6$             \\
## Num. groups: review\_freq\_numeric         & $$              & $6$             \\
## \hline
## \multicolumn{3}{l}{\scriptsize{Standard errors are clustered at the user level.}}
## \end{tabular}
## \caption{Panel Regression Results (First Three Videos)}
## \label{tab:panel_regression_first_three}
## \end{center}
## \end{table}

Mediation Analysis (User Level)

First, conduct an overall correlation analysis of the mediators.

mediator_columns <- c("mech_popup", "mech_speed", "mech_wording", "mech_formulate","mech_difficulty",  "mech_AIaversion", "mech_trueop")
mech_fancy_names <- c("Pop-up (+)", "Speed (+)", "Help Wording (+)", "Help Formulate (+)", "Difficult to Use (-)", "AI Aversion (-)",  "True Opinion (-)")
mech_mapping <- c("Speed (+)" = "mech_speed", "Help Wording (+)" = "mech_wording", "Difficult to Use (-)" = "mech_difficulty", "Help Formulate (+)" = "mech_formulate", "AI Aversion (-)" = "mech_AIaversion", "Pop-up (+)" = "mech_popup", "True Opinion (-)" = "mech_trueop")

covariates_simple_without_baseline <- covariates_simple[!covariates_simple %in% c("raceOther", "eduHighSchoolOrLess", "polpartyOther")]
covariates_simple_without_baseline_fancy <- covariates_simple_fancy[!covariates_simple_fancy %in% c("Race: Other", "Political Party: Other", "Education: High School or Less")]

# visual correlation plot
mediator_corr <- cor(df_wide_all[, c(mediator_columns, "willingness_to_pay", "review_exp")], use = "pairwise.complete.obs")
# change with fancy names
rownames(mediator_corr) <- c(mech_fancy_names, "Willingness to Pay", "Review Experience")
colnames(mediator_corr) <- c(mech_fancy_names, "Willingness to Pay", "Review Experience")

# visualize
corrplot(mediator_corr,method = 'number')
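
The correlation matrix above is computed with `use = "pairwise.complete.obs"`, so each cell uses every row where both of its two variables are observed, and different cells can rest on different subsets of respondents. A small sketch of the difference from listwise deletion, using a made-up data frame `demo` (not the survey data):

```r
# Hypothetical data with scattered missing values.
demo <- data.frame(
  a = c(1, 2, 3, 4, NA),
  b = c(2, 4, 6, NA, 10),
  c = c(5, 3, NA, 1, 0)
)

# Pairwise deletion: each correlation uses all rows complete for that pair.
cor_pairwise <- cor(demo, use = "pairwise.complete.obs")

# Listwise deletion: only rows complete on every column are used.
cor_listwise <- cor(demo, use = "complete.obs")

# Number of rows contributing to each pairwise correlation.
observed <- !is.na(as.matrix(demo))
n_pairs <- crossprod(observed)

print(round(cor_pairwise, 3))
print(n_pairs)
```

Checking `n_pairs` alongside the correlations is a quick way to see whether some cells are based on far fewer observations than others.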

Step 1: Mediator Treatment Effect

Mediators: {faster, does not reflect true opinion, right word, difficulty of usage, thought formulation, AI aversion, pop-up feature}

Model: Mediator ~ treatment + video (+ demographics)

mediator_coef_df <- data.frame()
for (m in c("mech_speed", "mech_wording", "mech_difficulty", "mech_formulate", "mech_AIaversion", "mech_popup", "mech_trueop")){
  cov_form_lm <- paste0(m, " ~ Treatment + ", paste(video_columns, collapse = " + "))
  mediator_lm <- lm(cov_form_lm, data = df_wide_all)
  mediator_lm_coef <- summary(mediator_lm)$coefficients[2:4, c("Estimate", "Std. Error")]
  # look up the display name by matching values of mech_mapping; mech_fancy_names is ordered differently
  mediator_coef_df <- rbind(mediator_coef_df, cbind(mediator_lm_coef, names(mech_mapping)[mech_mapping == m]))
}
colnames(mediator_coef_df) <- c("Estimate", "Std. Error", "Mediator")
mediator_coef_df$Mediator <- factor(mediator_coef_df$Mediator, levels = c("Pop-up (+)", "Speed (+)", "Help Wording (+)", "Help Formulate (+)", "Difficult to Use (-)","AI Aversion (-)", "True Opinion (-)"))
mediator_coef_df$Estimate <- as.numeric(as.character(mediator_coef_df$Estimate))
mediator_coef_df$`Std. Error` <- as.numeric(as.character(mediator_coef_df$`Std. Error`))
mediator_coef_df$Treatment <- rownames(mediator_coef_df)
mediator_coef_df$Treatment <- ifelse(str_detect(mediator_coef_df$Treatment, "Hint"), "Hint Control", 
                                     ifelse(str_detect(mediator_coef_df$Treatment, "One"), "One-Click Generate", "Chat Generate"))
mediator_coef_df$Treatment <- factor(mediator_coef_df$Treatment, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))
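
The loop above picks the treatment coefficients by row position (`2:4`), which assumes the three treatment dummies come directly after the intercept. A hedged alternative sketch selects them by name instead. It runs on simulated data: the data frame `sim`, the outcome `mech_demo`, and the single `video` factor are all hypothetical stand-ins, not the study's variables.

```r
set.seed(1)
n <- 200
arms <- c("Pure Control", "Hint Control", "One-Click Generate", "Chat Generate")

# Simulated stand-in for the user-level data (illustrative only).
sim <- data.frame(
  Treatment = factor(sample(arms, n, replace = TRUE), levels = arms),
  video     = factor(sample(c("v1", "v2", "v3"), n, replace = TRUE))
)
sim$mech_demo <- rnorm(n) + 0.5 * (sim$Treatment == "Chat Generate")

fit <- lm(mech_demo ~ Treatment + video, data = sim)
coefs <- summary(fit)$coefficients

# Select rows whose name starts with "Treatment" rather than relying on position.
treat_rows <- grepl("^Treatment", rownames(coefs))
treat_coefs <- data.frame(
  Treatment = sub("^Treatment", "", rownames(coefs)[treat_rows]),
  Estimate  = coefs[treat_rows, "Estimate"],
  std_error = coefs[treat_rows, "Std. Error"],
  row.names = NULL
)

print(treat_coefs)
```

Building the data frame directly also keeps `Estimate` and the standard error numeric, so no character-to-numeric conversion is needed afterwards.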

This plot is based on estimating {Mediator ~ treatment + video}. We use all videos rather than only the first three.

#plot bar plot
ggplot(mediator_coef_df, aes(x = Mediator, y = Estimate, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = Estimate - 1.96*`Std. Error`, ymax = Estimate + 1.96*`Std. Error`), width = 0.2, position = position_dodge(0.9)) +
  labs(title = "Mediator Analysis", x = "Mediator", y = "Estimate") +
  scale_fill_manual(name = "vs Pure Control",
                    values=c("Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2")) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        # bold all titles (plot, axis, legend)
        title = element_text(face = "bold"))

mediator_coef_with_cov_df <- data.frame()
for (m in c("mech_speed", "mech_wording", "mech_difficulty", "mech_formulate", "mech_AIaversion", "mech_popup", "mech_trueop")){
  cov_form_lm_with_cov <- paste0(m, " ~ Treatment + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple_without_baseline, collapse = " + "))
  mediator_lm_with_cov <- lm(cov_form_lm_with_cov, data = df_wide_all)
  # store the fitted model under a per-mediator name for the mediation step below
  assign(paste0("mediator_lm_with_cov_", m), mediator_lm_with_cov)
  mediator_lm_coef <- summary(mediator_lm_with_cov)$coefficients[2:4, c("Estimate", "Std. Error")]
  # look up the display name by matching values of mech_mapping; mech_fancy_names is ordered differently
  mediator_coef_with_cov_df <- rbind(mediator_coef_with_cov_df, cbind(mediator_lm_coef, names(mech_mapping)[mech_mapping == m]))
}
colnames(mediator_coef_with_cov_df) <- c("Estimate", "Std. Error", "Mediator")
mediator_coef_with_cov_df$Mediator <- factor(mediator_coef_with_cov_df$Mediator, levels = c("Pop-up (+)", "Speed (+)", "Help Wording (+)", "Help Formulate (+)", "Difficult to Use (-)","AI Aversion (-)", "True Opinion (-)"))
mediator_coef_with_cov_df$Estimate <- as.numeric(as.character(mediator_coef_with_cov_df$Estimate))
mediator_coef_with_cov_df$`Std. Error` <- as.numeric(as.character(mediator_coef_with_cov_df$`Std. Error`))
mediator_coef_with_cov_df$Treatment <- rownames(mediator_coef_with_cov_df)
mediator_coef_with_cov_df$Treatment <- ifelse(str_detect(mediator_coef_with_cov_df$Treatment, "Hint"), "Hint Control", 
                                     ifelse(str_detect(mediator_coef_with_cov_df$Treatment, "One"), "One-Click Generate", "Chat Generate"))
mediator_coef_with_cov_df$Treatment <- factor(mediator_coef_with_cov_df$Treatment, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))

#plot bar plot
ggplot(mediator_coef_with_cov_df, aes(x = Mediator, y = Estimate, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = Estimate - 1.96*`Std. Error`, ymax = Estimate + 1.96*`Std. Error`), width = 0.2, position = position_dodge(0.9)) +
  labs(title = "Mediator Analysis (With Covariates)", x = "Mediator", y = "Estimate") +
  scale_fill_manual(name = "vs Pure Control",
                    values=c("Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2")) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        # bold all titles (plot, axis, legend)
        title = element_text(face = "bold"))

Step 2: Mediation Effect

Number of Reviews ~ treatment + mediator + video (+ demographics)

outcome_coef_with_cov_df <- data.frame()
for (m in mediator_columns){
  cov_form_lm_with_cov <- paste0("num_comment ~ Treatment + ", m, " + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple_without_baseline, collapse = " + "))
  assign(paste0("outcome_lm_with_cov_", m), lm(cov_form_lm_with_cov, data = df_wide_all))
}

# include png
include_graphics("tables_and_figures/mediation_table.png")

Individual Mediator Effect

ACME (Average Causal Mediation Effect): The indirect effect (IE) of the treatment through the mediator.
ADE (Average Direct Effect): The direct effect (DE) of the treatment on the outcome.
Proportion Mediated: The proportion of the total effect explained by the mediator.

Note that the y-axis range differs across the mediator plots.
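
For intuition, when both the mediator model and the outcome model are linear with no treatment-mediator interaction, these three quantities reduce to the classic product-of-coefficients decomposition. The sketch below illustrates this on simulated data with a binary treatment. It is not the simulation-based procedure that `mediate()` runs, and the variables `treat`, `mediator`, and `outcome` are hypothetical.

```r
set.seed(42)
n <- 1000

# Simulated data: treatment shifts the mediator, and both affect the outcome.
treat    <- rbinom(n, 1, 0.5)
mediator <- 0.5 * treat + rnorm(n)
outcome  <- 0.3 * treat + 0.8 * mediator + rnorm(n)

model_m     <- lm(mediator ~ treat)            # Step 1: mediator model
model_y     <- lm(outcome ~ treat + mediator)  # Step 2: outcome model
model_total <- lm(outcome ~ treat)             # total effect

a <- coef(model_m)["treat"]      # treatment -> mediator
b <- coef(model_y)["mediator"]   # mediator -> outcome, holding treatment fixed

acme  <- unname(a * b)                        # indirect effect
ade   <- unname(coef(model_y)["treat"])       # direct effect
total <- unname(coef(model_total)["treat"])   # equals acme + ade under OLS
prop_mediated <- acme / total

print(round(c(ACME = acme, ADE = ade, Total = total, Prop = prop_mediated), 3))
```

Because the proportion mediated is a ratio, it becomes unstable when the total effect is close to zero, which is one reason its interval can be much wider than those of the ACME and ADE.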

conduct_mediation_analysis <- function(mediator_lm_with_cov, outcome_lm_with_cov, mediator, sims = 500){
  # pass the sims argument through instead of hard-coding 500
  mediation_hintcontrol <- mediate(mediator_lm_with_cov, outcome_lm_with_cov, treat = "Treatment", control.value = "Pure Control", treat.value = "Hint Control", mediator = mediator, sims = sims, robustSE = TRUE)
  
  mediation_oneclick <- mediate(mediator_lm_with_cov, outcome_lm_with_cov, treat = "Treatment", control.value = "Pure Control", treat.value = "One-Click Generate", mediator = mediator, sims = sims, robustSE = TRUE)
  
  mediation_chat <- mediate(mediator_lm_with_cov, outcome_lm_with_cov, treat = "Treatment", control.value = "Pure Control", treat.value = "Chat Generate", mediator = mediator, sims = sims, robustSE = TRUE)
  
  hintcontrol_mediate_df <- rbind(c(mediation_hintcontrol$d.avg, mediation_hintcontrol$d.avg.ci), 
        c(mediation_hintcontrol$z.avg, mediation_hintcontrol$z.avg.ci), 
        c(mediation_hintcontrol$n.avg, mediation_hintcontrol$n.avg.ci)) %>% as.data.frame()
  hintcontrol_mediate_df$type <- c("ACME", "ADE", "Prop. Mediated")
  hintcontrol_mediate_df$treatment <- "Hint Control"
  
  oneclick_mediate_df <- rbind(c(mediation_oneclick$d.avg, mediation_oneclick$d.avg.ci), 
        c(mediation_oneclick$z.avg, mediation_oneclick$z.avg.ci), 
        c(mediation_oneclick$n.avg, mediation_oneclick$n.avg.ci)) %>% as.data.frame()
  
  oneclick_mediate_df$type <- c("ACME", "ADE", "Prop. Mediated")
  oneclick_mediate_df$treatment <- "One-Click Generate"
  
  chat_mediate_df <- rbind(c(mediation_chat$d.avg, mediation_chat$d.avg.ci),
        c(mediation_chat$z.avg, mediation_chat$z.avg.ci), 
        c(mediation_chat$n.avg, mediation_chat$n.avg.ci)) %>% as.data.frame()
  chat_mediate_df$type <- c("ACME", "ADE", "Prop. Mediated")
  chat_mediate_df$treatment <- "Chat Generate"
  
  mediate_df_popup <- rbind(hintcontrol_mediate_df, oneclick_mediate_df, chat_mediate_df)
  colnames(mediate_df_popup) <- c("Estimate", "2.5% CI", "97.5% CI", "Estimate Type", "Treatment")
  mediate_df_popup$Treatment <- factor(mediate_df_popup$Treatment, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))
  
  # plot
  output_plot <- ggplot(mediate_df_popup, aes(x = `Treatment`, y = Estimate, fill = `Estimate Type`)) +
    geom_bar(stat = "identity", position = "dodge") +
    geom_errorbar(aes(ymin = `2.5% CI`, ymax = `97.5% CI`), width = 0.2, position = position_dodge(0.9)) +
    labs(title = paste0("Mediation Analysis: ", mech_mapping[which(mech_mapping == mediator)] %>% names()), x = "Treatment", y = "Estimate") +
    scale_fill_manual(name = "Estimate Type",
                      values=c("ACME" = "turquoise4","ADE"="pink4","Prop. Mediated" = "orange4")) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1),
          # bold all titles (plot, axis, legend)
          title = element_text(face = "bold"))
  return(list(mediate_df_popup, output_plot))
}

Pop-up

mediate_analysis_popup <- conduct_mediation_analysis(mediator_lm_with_cov_mech_popup, outcome_lm_with_cov_mech_popup, "mech_popup")
mediate_analysis_popup[[2]]

Speed

mediate_analysis_speed <- conduct_mediation_analysis(mediator_lm_with_cov_mech_speed, outcome_lm_with_cov_mech_speed, "mech_speed")
mediate_analysis_speed[[2]]

Help Wording

mediate_analysis_wording <- conduct_mediation_analysis(mediator_lm_with_cov_mech_wording, outcome_lm_with_cov_mech_wording, "mech_wording")
mediate_analysis_wording[[2]]

Help Formulate

mediate_analysis_formulate <- conduct_mediation_analysis(mediator_lm_with_cov_mech_formulate, outcome_lm_with_cov_mech_formulate, "mech_formulate")
mediate_analysis_formulate[[2]]

Difficult to Use

mediate_analysis_difficulty <- conduct_mediation_analysis(mediator_lm_with_cov_mech_difficulty, outcome_lm_with_cov_mech_difficulty, "mech_difficulty")
mediate_analysis_difficulty[[2]]

AI Aversion

mediate_analysis_AIaversion <- conduct_mediation_analysis(mediator_lm_with_cov_mech_AIaversion, outcome_lm_with_cov_mech_AIaversion, "mech_AIaversion")
mediate_analysis_AIaversion[[2]]

Not Reflect True Opinion

mediate_analysis_trueop <- conduct_mediation_analysis(mediator_lm_with_cov_mech_trueop, outcome_lm_with_cov_mech_trueop, "mech_trueop")
mediate_analysis_trueop[[2]]

Subgroup Analysis

The outcome variable here is the number of videos that have comments.

Subgroups are defined by a median split on social media usage, comment frequency, demographics, and the pop-up mediator. Social media platform type is not currently included.
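
A median split with a strict `>` comparison sends every value tied with the median to the lower group, so on coarse ordinal scales the two groups can be quite unequal in size. A tiny illustration on a made-up vector `x` (not the survey data):

```r
# Hypothetical ordinal responses with ties at the median and one missing value.
x <- c(1, 2, 3, 3, 3, 4, 5, NA)

med <- median(x, na.rm = TRUE)

# Strictly above the median -> "Above Median"; ties and lower -> "Below Median".
# Missing inputs stay NA because the comparison itself is NA.
grp <- ifelse(x > med, "Above Median", "Below Median")
grp <- factor(grp, levels = c("Below Median", "Above Median"))

print(med)
print(table(grp, useNA = "ifany"))
```

Tabulating the resulting groups, as in the last line, is a cheap check on how balanced each split turns out to be.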

median_split <- function(df, var){
  median_val <- median(df[[var]], na.rm = TRUE)
  df[[paste0(var, "_median")]] <- ifelse(df[[var]] > median_val, "Above Median", "Below Median")
  df[[paste0(var, "_median")]] <- factor(df[[paste0(var, "_median")]], levels = c("Below Median", "Above Median"))
  return(df)
}

df_wide_all <- df_wide_all %>% 
  median_split("social_media_use_numeric") %>%
  median_split("website_use_numeric") %>%
  median_split("social_media_reply_numeric") %>% 
  median_split("review_freq_numeric") %>% 
  median_split("age") %>% 
  median_split("income_numeric") %>% 
  median_split("libcons_numeric") %>%
  median_split("mech_popup")

df_wide_all$edu_combined <- case_when(
  df_wide_all$edu == "Did not graduate from high school" ~ "High School or Less",
  df_wide_all$edu == "High school graduate (high school diploma or equivalent including GED)" ~ "High School or Less",
  df_wide_all$edu == "Some college, but no degree" ~ "Some College",
  df_wide_all$edu == "2-year college degree" ~ "Bachelor's Degree",
  df_wide_all$edu == "4-year college degree" ~ "Bachelor's Degree",
  df_wide_all$edu == "Postgraduate degree (MA, MBA, JD, PhD, etc.)" ~ "Graduate Degree"
)

df_wide_all$edu_combined <- factor(df_wide_all$edu_combined, levels = c("High School or Less", "Some College", "Bachelor's Degree", "Graduate Degree"))

df_wide_all$race_combined <- case_when(
  df_wide_all$race == "Asian/Pacific Islander" ~ "Asian",
  df_wide_all$race == "Black or African American" ~ "Black",
  df_wide_all$race == "Latino or Hispanic" ~ "Hispanic",
  df_wide_all$race == "Caucasian/White" ~ "White",
  TRUE ~ "Other"
)
df_wide_all$race_combined <- factor(df_wide_all$race_combined, levels = c("White", "Black", "Hispanic", "Asian", "Other"))

df_wide_all$polparty_combined <- case_when(
  df_wide_all$polparty == "Democrat" ~ "Democrat",
  df_wide_all$polparty == "Republican" ~ "Republican",
  TRUE ~ "Other"
)
df_wide_all$polparty_combined <- factor(df_wide_all$polparty_combined, levels = c("Democrat", "Republican", "Other"))

value_split <- function(df, var){
  # split by 1 2, 3 4, 5 6; the final ifelse branch is a catch-all, so any other value (including NA) is labeled "5-6"
  df[[paste0(var, "_valuesplit")]] <- ifelse(df[[var]] %in% c(1, 2), "1-2", 
                                         ifelse(df[[var]] %in% c(3, 4), "3-4", "5-6"))
  df[[paste0(var, "_valuesplit")]] <- factor(df[[paste0(var, "_valuesplit")]], levels = c("5-6", "3-4", "1-2"))
  return (df)
}

df_wide_all <- df_wide_all %>% 
  value_split("social_media_reply_numeric") %>% 
  value_split("review_freq_numeric")

subgroups_columns <- c("social_media_use_numeric_median", "website_use_numeric_median", "social_media_reply_numeric_median", "review_freq_numeric_median", "age_median", "income_numeric_median", "libcons_numeric_median", "mech_popup_median", "edu_combined", "race_combined", "polparty_combined")

for (subgroup in subgroups_columns){
  if (grepl("median", subgroup)){
    df_wide_all[[subgroup]] <- factor(df_wide_all[[subgroup]], levels = c("Above Median", "Below Median"))
  }
  subgroup_lm <- lm(paste0("num_comment ~ Treatment * ", subgroup, " + ", paste(video_columns, collapse = " + ")), data = df_wide_all)
  assign(paste0("subgroup_lm_", subgroup), subgroup_lm)
}

for (subgroup in subgroups_columns){
  subgroup_lm <- get(paste0("subgroup_lm_", subgroup))
  coef_table <- summary(subgroup_lm)$coefficients
  cat("#### ", subgroup, "\n")
  print(paste0("Values for this variable: ", paste(unique(df_wide_all[[subgroup]]), collapse = ", ")))
  rows_to_extract <- rownames(coef_table)[!str_detect(rownames(coef_table), "video") & (rownames(coef_table) != "(Intercept)")]
  subgroup_coef <- coef_table[rows_to_extract, ]
  print(kable(subgroup_coef, format = "markdown"))
  cat("\n")
}

social_media_use_numeric_median

[1] “Values for this variable: Below Median, Above Median”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1520628	0.1490672	1.0200956	0.3078149
TreatmentOne-Click Generate	0.2064884	0.1490213	1.3856297	0.1660250
TreatmentChat Generate	-0.5545985	0.1489900	-3.7223872	0.0002032
social_media_use_numeric_medianBelow Median	-0.1894189	0.1355886	-1.3970120	0.1625754
TreatmentHint Control:social_media_use_numeric_medianBelow Median	0.1201511	0.1914924	0.6274456	0.5304437
TreatmentOne-Click Generate:social_media_use_numeric_medianBelow Median	-0.0264369	0.1909572	-0.1384443	0.8899042
TreatmentChat Generate:social_media_use_numeric_medianBelow Median	0.4233885	0.1897219	2.2316262	0.0257576

website_use_numeric_median

[1] “Values for this variable: Above Median, Below Median”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.3278598	0.1529453	2.1436405	0.0321901
TreatmentOne-Click Generate	0.0556265	0.1512530	0.3677715	0.7130851
TreatmentChat Generate	-0.3120024	0.1496007	-2.0855686	0.0371528
website_use_numeric_medianBelow Median	-0.0985376	0.1365535	-0.7216042	0.4706279
TreatmentHint Control:website_use_numeric_medianBelow Median	-0.1631604	0.1935151	-0.8431407	0.3992574
TreatmentOne-Click Generate:website_use_numeric_medianBelow Median	0.2270564	0.1921939	1.1813924	0.2375969
TreatmentChat Generate:website_use_numeric_medianBelow Median	0.0253999	0.1904716	0.1333526	0.8939289

social_media_reply_numeric_median

[1] “Values for this variable: Below Median, Above Median”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1758918	0.1751497	1.0042370	0.3153941
TreatmentOne-Click Generate	0.3996551	0.1787583	2.2357292	0.0254869
TreatmentChat Generate	-0.8249604	0.1705226	-4.8378372	0.0000014
social_media_reply_numeric_medianBelow Median	-0.4494243	0.1437485	-3.1264613	0.0017964
TreatmentHint Control:social_media_reply_numeric_medianBelow Median	0.0820542	0.2063779	0.3975918	0.6909764
TreatmentOne-Click Generate:social_media_reply_numeric_medianBelow Median	-0.2458927	0.2087038	-1.1781900	0.2388705
TreatmentChat Generate:social_media_reply_numeric_medianBelow Median	0.7456835	0.2020940	3.6897853	0.0002309

review_freq_numeric_median

[1] “Values for this variable: Below Median, Above Median”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1341986	0.1680516	0.7985562	0.4246490
TreatmentOne-Click Generate	0.1775520	0.1695819	1.0469984	0.2952354
TreatmentChat Generate	-0.9273383	0.1647473	-5.6288518	0.0000000
review_freq_numeric_medianBelow Median	-0.6424686	0.1402798	-4.5799069	0.0000050
TreatmentHint Control:review_freq_numeric_medianBelow Median	0.1531519	0.2012190	0.7611207	0.4466808
TreatmentOne-Click Generate:review_freq_numeric_medianBelow Median	0.0553152	0.2016659	0.2742912	0.7838911
TreatmentChat Generate:review_freq_numeric_medianBelow Median	0.9169359	0.1976928	4.6381852	0.0000038

age_median

[1] “Values for this variable: Above Median, Below Median”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2901877	0.1333980	2.1753532	0.0297284
TreatmentOne-Click Generate	0.1375793	0.1323183	1.0397605	0.2985854
TreatmentChat Generate	-0.4263501	0.1335906	-3.1914681	0.0014390
age_medianBelow Median	-0.0692916	0.1320071	-0.5249084	0.5997090
TreatmentHint Control:age_medianBelow Median	-0.1265224	0.1880281	-0.6728910	0.5010997
TreatmentOne-Click Generate:age_medianBelow Median	0.1129823	0.1864932	0.6058253	0.5447042
TreatmentChat Generate:age_medianBelow Median	0.2517580	0.1850340	1.3606040	0.1738026

income_numeric_median

[1] “Values for this variable: Below Median, Above Median”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2261115	0.1778550	1.2713250	0.2037709
TreatmentOne-Click Generate	0.1148626	0.1689765	0.6797546	0.4967439
TreatmentChat Generate	-0.2727882	0.1749008	-1.5596736	0.1190060
income_numeric_medianBelow Median	0.0793380	0.1440695	0.5506929	0.5819099
TreatmentHint Control:income_numeric_medianBelow Median	0.0021682	0.2091697	0.0103655	0.9917307
TreatmentOne-Click Generate:income_numeric_medianBelow Median	0.1186734	0.2026350	0.5856510	0.5581806
TreatmentChat Generate:income_numeric_medianBelow Median	-0.0316052	0.2059150	-0.1534866	0.8780311

libcons_numeric_median

[1] “Values for this variable: Below Median, Above Median”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2257952	0.1412443	1.5986146	0.1100749
TreatmentOne-Click Generate	0.3444022	0.1394530	2.4696661	0.0136125
TreatmentChat Generate	-0.1567416	0.1370036	-1.1440688	0.2527413
libcons_numeric_medianBelow Median	0.2388302	0.1327239	1.7994504	0.0721084
TreatmentHint Control:libcons_numeric_medianBelow Median	0.0016486	0.1893856	0.0087051	0.9930554
TreatmentOne-Click Generate:libcons_numeric_medianBelow Median	-0.2699784	0.1881897	-1.4346074	0.1515659
TreatmentChat Generate:libcons_numeric_medianBelow Median	-0.2504395	0.1861856	-1.3451066	0.1787537

mech_popup_median

[1] “Values for this variable: Below Median, Above Median”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.3692243	0.1368563	2.697897	0.0070405
TreatmentOne-Click Generate	0.4900238	0.1403991	3.490221	0.0004938
TreatmentChat Generate	-0.3043213	0.1332715	-2.283469	0.0225143
mech_popup_medianBelow Median	-0.4756642	0.1288238	-3.692363	0.0002285
TreatmentHint Control:mech_popup_medianBelow Median	-0.2061091	0.1842171	-1.118837	0.2633529
TreatmentOne-Click Generate:mech_popup_medianBelow Median	-0.4064217	0.1849701	-2.197229	0.0281262
TreatmentChat Generate:mech_popup_medianBelow Median	0.0457362	0.1810017	0.252684	0.8005402

edu_combined

[1] “Values for this variable: Bachelor’s Degree, High School or Less, Graduate Degree, Some College”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1916700	0.2483283	0.7718411	0.4403063
TreatmentOne-Click Generate	0.2781296	0.2665922	1.0432771	0.2969552
TreatmentChat Generate	-0.1530604	0.2506167	-0.6107350	0.5414494
edu_combinedSome College	0.1754877	0.2351195	0.7463769	0.4555338
edu_combinedBachelor’s Degree	0.0042786	0.2124339	0.0201408	0.9839332
edu_combinedGraduate Degree	-0.0497918	0.2379079	-0.2092900	0.8342447
TreatmentHint Control:edu_combinedSome College	-0.2500921	0.3200204	-0.7814881	0.4346147
TreatmentOne-Click Generate:edu_combinedSome College	-0.5344004	0.3337272	-1.6013090	0.1094779
TreatmentChat Generate:edu_combinedSome College	-0.1165625	0.3235608	-0.3602493	0.7187015
TreatmentHint Control:edu_combinedBachelor’s Degree	0.0691463	0.2859646	0.2418002	0.8089616
TreatmentOne-Click Generate:edu_combinedBachelor’s Degree	0.0374767	0.2999348	0.1249495	0.9005770
TreatmentChat Generate:edu_combinedBachelor’s Degree	-0.1357479	0.2852142	-0.4759509	0.6341651
TreatmentHint Control:edu_combinedGraduate Degree	0.3292070	0.3257771	1.0105283	0.3123733
TreatmentOne-Click Generate:edu_combinedGraduate Degree	0.0668557	0.3376210	0.1980201	0.8430509
TreatmentChat Generate:edu_combinedGraduate Degree	-0.2820058	0.3263543	-0.8641092	0.3876390

race_combined

[1] “Values for this variable: White, Hispanic, Black, Other, Asian”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2406438	0.1206836	1.9940059	0.0462971
TreatmentOne-Click Generate	0.1585521	0.1215779	1.3041192	0.1923542
TreatmentChat Generate	-0.1922975	0.1205240	-1.5955126	0.1107674
race_combinedBlack	0.2493282	0.1764127	1.4133233	0.1577279
race_combinedHispanic	-0.2569530	0.3031778	-0.8475324	0.3968074
race_combinedAsian	-0.1424023	0.2555765	-0.5571809	0.5774708
race_combinedOther	0.5677767	0.2380523	2.3850922	0.0171744
TreatmentHint Control:race_combinedBlack	0.0655543	0.2509028	0.2612737	0.7939103
TreatmentOne-Click Generate:race_combinedBlack	-0.0811209	0.2463661	-0.3292696	0.7419890
TreatmentChat Generate:race_combinedBlack	-0.5928841	0.2433210	-2.4366335	0.0149179
TreatmentHint Control:race_combinedHispanic	0.0092450	0.4375765	0.0211276	0.9831461
TreatmentOne-Click Generate:race_combinedHispanic	0.4703768	0.4256433	1.1050962	0.2692608
TreatmentChat Generate:race_combinedHispanic	0.4608780	0.4014581	1.1480101	0.2511119
TreatmentHint Control:race_combinedAsian	0.3430677	0.3582146	0.9577157	0.3383305
TreatmentOne-Click Generate:race_combinedAsian	0.8817918	0.3757343	2.3468491	0.0190373
TreatmentChat Generate:race_combinedAsian	0.5777475	0.3622872	1.5947224	0.1109441
TreatmentHint Control:race_combinedOther	-0.5929527	0.3334806	-1.7780727	0.0755552
TreatmentOne-Click Generate:race_combinedOther	-0.4498574	0.3186683	-1.4116790	0.1582117
TreatmentChat Generate:race_combinedOther	-0.6462898	0.3239693	-1.9949106	0.0461982

polparty_combined

[1] “Values for this variable: Other, Republican, Democrat”

	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2810308	0.1511192	1.8596624	0.0630903
TreatmentOne-Click Generate	0.3269587	0.1524336	2.1449254	0.0320874
TreatmentChat Generate	-0.1337202	0.1478253	-0.9045829	0.3658030
polparty_combinedRepublican	0.1839453	0.1622802	1.1335039	0.2571481
polparty_combinedOther	0.2683236	0.1572355	1.7065081	0.0880796
TreatmentHint Control:polparty_combinedRepublican	-0.0237570	0.2301377	-0.1032293	0.9177920
TreatmentOne-Click Generate:polparty_combinedRepublican	-0.1549203	0.2311001	-0.6703599	0.5027112
TreatmentChat Generate:polparty_combinedRepublican	-0.2235712	0.2279546	-0.9807708	0.3268327
TreatmentHint Control:polparty_combinedOther	-0.1210572	0.2242238	-0.5398944	0.5893342
TreatmentOne-Click Generate:polparty_combinedOther	-0.2562977	0.2216947	-1.1560841	0.2477945
TreatmentChat Generate:polparty_combinedOther	-0.2869505	0.2198479	-1.3052231	0.1919776
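
When reading the tables above, note that each main treatment row is the effect within the reference subgroup (for the median splits, "Above Median", the first factor level set in the loop), and the effect within the other subgroup is the main effect plus the interaction term. A worked example using the Chat Generate estimates from the social_media_use_numeric_median table; the variable names are only for illustration:

```r
# Estimates copied from the social_media_use_numeric_median table.
chat_main    <- -0.5545985  # TreatmentChat Generate (reference: Above Median)
chat_x_below <-  0.4233885  # TreatmentChat Generate:...Below Median

# Effect of Chat Generate vs Pure Control within each subgroup.
effect_above <- chat_main
effect_below <- chat_main + chat_x_below

print(round(c(above_median = effect_above, below_median = effect_below), 4))
```

The standard error of the summed effect cannot be read off the table, since it also depends on the covariance between the two coefficients; one option would be to take it from `vcov()` of the fitted model.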

We look more closely at social media reply frequency and review frequency, splitting each into three groups (1-2, 3-4, 5-6).
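
As a point of comparison, a three-way split like this can also be written with `cut()`. This is a sketch on a made-up vector `x`, not a replacement for the `value_split()` helper defined earlier. One behavioral difference is that `cut()` leaves missing inputs as `NA`, whereas a nested `ifelse()` with a catch-all final branch assigns them to the last group.

```r
# Hypothetical 1-6 scale responses, including one missing value.
x <- c(1, 2, 3, 4, 5, 6, NA)

# Intervals are (0,2], (2,4], (4,6] because cut() is right-closed by default.
grp <- cut(x, breaks = c(0, 2, 4, 6), labels = c("1-2", "3-4", "5-6"))

# Put "5-6" first so it serves as the reference level in a regression.
grp <- factor(grp, levels = c("5-6", "3-4", "1-2"))

print(table(grp, useNA = "ifany"))
```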

subgroup_lm_valuesplit_smr <- lm(paste0("num_comment ~ Treatment * social_media_reply_numeric_valuesplit + ", paste(video_columns, collapse = " + ")), data = df_wide_all)


summary(subgroup_lm_valuesplit_smr)

## 
## Call:
## lm(formula = paste0("num_comment ~ Treatment * social_media_reply_numeric_valuesplit + ", 
##     paste(video_columns, collapse = " + ")), data = df_wide_all)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.8805 -1.2838 -0.0019  1.2920  7.5485 
## 
## Coefficients:
##                                                                      Estimate
## (Intercept)                                                           1.13360
## TreatmentHint Control                                                 0.17686
## TreatmentOne-Click Generate                                           0.40010
## TreatmentChat Generate                                               -0.82480
## social_media_reply_numeric_valuesplit3-4                             -0.24685
## social_media_reply_numeric_valuesplit1-2                             -0.83726
## video11                                                               0.13087
## video13                                                               0.14833
## video14                                                               0.31961
## video15                                                               0.31741
## video16                                                               0.24329
## video17                                                               0.13486
## video18                                                               0.10783
## video19                                                               0.40949
## video20                                                               0.22798
## video21                                                               0.24207
## video22                                                               0.24203
## video23                                                               0.39403
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4        0.05799
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4 -0.32414
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4       0.69537
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2        0.14064
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2 -0.06515
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2       0.84625
##                                                                      Std. Error
## (Intercept)                                                             0.18547
## TreatmentHint Control                                                   0.17369
## TreatmentOne-Click Generate                                             0.17727
## TreatmentChat Generate                                                  0.16910
## social_media_reply_numeric_valuesplit3-4                                0.15283
## social_media_reply_numeric_valuesplit1-2                                0.17796
## video11                                                                 0.07895
## video13                                                                 0.07950
## video14                                                                 0.08691
## video15                                                                 0.09196
## video16                                                                 0.08073
## video17                                                                 0.08152
## video18                                                                 0.08428
## video19                                                                 0.09067
## video20                                                                 0.08882
## video21                                                                 0.09271
## video22                                                                 0.09139
## video23                                                                 0.08151
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4          0.21939
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4    0.22221
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4         0.21489
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2          0.25317
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2    0.25170
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2         0.24957
##                                                                      t value
## (Intercept)                                                            6.112
## TreatmentHint Control                                                  1.018
## TreatmentOne-Click Generate                                            2.257
## TreatmentChat Generate                                                -4.878
## social_media_reply_numeric_valuesplit3-4                              -1.615
## social_media_reply_numeric_valuesplit1-2                              -4.705
## video11                                                                1.658
## video13                                                                1.866
## video14                                                                3.677
## video15                                                                3.452
## video16                                                                3.014
## video17                                                                1.654
## video18                                                                1.279
## video19                                                                4.516
## video20                                                                2.567
## video21                                                                2.611
## video22                                                                2.648
## video23                                                                4.834
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4         0.264
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4  -1.459
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4        3.236
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2         0.555
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2  -0.259
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2        3.391
##                                                                          Pr(>|t|)
## (Intercept)                                                          0.0000000012
## TreatmentHint Control                                                    0.308683
## TreatmentOne-Click Generate                                              0.024121
## TreatmentChat Generate                                               0.0000011647
## social_media_reply_numeric_valuesplit3-4                                 0.106433
## social_media_reply_numeric_valuesplit1-2                             0.0000027282
## video11                                                                  0.097581
## video13                                                                  0.062217
## video14                                                                  0.000242
## video15                                                                  0.000570
## video16                                                                  0.002616
## video17                                                                  0.098262
## video18                                                                  0.200886
## video19                                                              0.0000066803
## video20                                                                  0.010344
## video21                                                                  0.009103
## video22                                                                  0.008159
## video23                                                              0.0000014473
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4           0.791545
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4     0.144805
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4          0.001233
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2           0.578619
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2     0.795779
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2          0.000711
##                                                                         
## (Intercept)                                                          ***
## TreatmentHint Control                                                   
## TreatmentOne-Click Generate                                          *  
## TreatmentChat Generate                                               ***
## social_media_reply_numeric_valuesplit3-4                                
## social_media_reply_numeric_valuesplit1-2                             ***
## video11                                                              .  
## video13                                                              .  
## video14                                                              ***
## video15                                                              ***
## video16                                                              ** 
## video17                                                              .  
## video18                                                                 
## video19                                                              ***
## video20                                                              *  
## video21                                                              ** 
## video22                                                              ** 
## video23                                                              ***
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4          
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4    
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4      ** 
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2          
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2    
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2      ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.403 on 1869 degrees of freedom
## Multiple R-squared:  0.08317,    Adjusted R-squared:  0.07188 
## F-statistic: 7.371 on 23 and 1869 DF,  p-value: < 0.00000000000000022

subgroup_lm_valuesplit_rf <- lm(paste0("num_comment ~ Treatment * review_freq_numeric_valuesplit + ", paste(video_columns, collapse = " + ")), data = df_wide_all)

summary(subgroup_lm_valuesplit_rf)

## 
## Call:
## lm(formula = paste0("num_comment ~ Treatment * review_freq_numeric_valuesplit + ", 
##     paste(video_columns, collapse = " + ")), data = df_wide_all)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.7038 -1.2722 -0.0907  1.3311  7.7037 
## 
## Coefficients:
##                                                               Estimate
## (Intercept)                                                    1.23610
## TreatmentHint Control                                          0.15480
## TreatmentOne-Click Generate                                    0.41396
## TreatmentChat Generate                                        -1.02198
## review_freq_numeric_valuesplit3-4                             -0.24003
## review_freq_numeric_valuesplit1-2                             -0.71319
## video11                                                        0.12304
## video13                                                        0.14364
## video14                                                        0.31730
## video15                                                        0.30611
## video16                                                        0.22458
## video17                                                        0.15480
## video18                                                        0.09082
## video19                                                        0.45899
## video20                                                        0.24744
## video21                                                        0.26271
## video22                                                        0.21896
## video23                                                        0.41107
## TreatmentHint Control:review_freq_numeric_valuesplit3-4        0.14940
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit3-4 -0.07320
## TreatmentChat Generate:review_freq_numeric_valuesplit3-4       0.51181
## TreatmentHint Control:review_freq_numeric_valuesplit1-2        0.06659
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit1-2 -0.30713
## TreatmentChat Generate:review_freq_numeric_valuesplit1-2       1.11682
##                                                               Std. Error
## (Intercept)                                                      0.22849
## TreatmentHint Control                                            0.25894
## TreatmentOne-Click Generate                                      0.26585
## TreatmentChat Generate                                           0.25852
## review_freq_numeric_valuesplit3-4                                0.20328
## review_freq_numeric_valuesplit1-2                                0.19972
## video11                                                          0.07905
## video13                                                          0.07965
## video14                                                          0.08693
## video15                                                          0.09181
## video16                                                          0.08094
## video17                                                          0.08130
## video18                                                          0.08446
## video19                                                          0.09068
## video20                                                          0.08881
## video21                                                          0.09268
## video22                                                          0.09123
## video23                                                          0.08166
## TreatmentHint Control:review_freq_numeric_valuesplit3-4          0.29702
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit3-4    0.30387
## TreatmentChat Generate:review_freq_numeric_valuesplit3-4         0.29387
## TreatmentHint Control:review_freq_numeric_valuesplit1-2          0.29093
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit1-2    0.29607
## TreatmentChat Generate:review_freq_numeric_valuesplit1-2         0.29130
##                                                               t value
## (Intercept)                                                     5.410
## TreatmentHint Control                                           0.598
## TreatmentOne-Click Generate                                     1.557
## TreatmentChat Generate                                         -3.953
## review_freq_numeric_valuesplit3-4                              -1.181
## review_freq_numeric_valuesplit1-2                              -3.571
## video11                                                         1.557
## video13                                                         1.803
## video14                                                         3.650
## video15                                                         3.334
## video16                                                         2.775
## video17                                                         1.904
## video18                                                         1.075
## video19                                                         5.061
## video20                                                         2.786
## video21                                                         2.834
## video22                                                         2.400
## video23                                                         5.034
## TreatmentHint Control:review_freq_numeric_valuesplit3-4         0.503
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit3-4  -0.241
## TreatmentChat Generate:review_freq_numeric_valuesplit3-4        1.742
## TreatmentHint Control:review_freq_numeric_valuesplit1-2         0.229
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit1-2  -1.037
## TreatmentChat Generate:review_freq_numeric_valuesplit1-2        3.834
##                                                                   Pr(>|t|)    
## (Intercept)                                                   0.0000000712 ***
## TreatmentHint Control                                             0.550036    
## TreatmentOne-Click Generate                                       0.119618    
## TreatmentChat Generate                                        0.0000799832 ***
## review_freq_numeric_valuesplit3-4                                 0.237851    
## review_freq_numeric_valuesplit1-2                                 0.000365 ***
## video11                                                           0.119753    
## video13                                                           0.071506 .  
## video14                                                           0.000270 ***
## video15                                                           0.000873 ***
## video16                                                           0.005580 ** 
## video17                                                           0.057035 .  
## video18                                                           0.282398    
## video19                                                       0.0000004569 ***
## video20                                                           0.005389 ** 
## video21                                                           0.004640 ** 
## video22                                                           0.016488 *  
## video23                                                       0.0000005266 ***
## TreatmentHint Control:review_freq_numeric_valuesplit3-4           0.615023    
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit3-4     0.809675    
## TreatmentChat Generate:review_freq_numeric_valuesplit3-4          0.081739 .  
## TreatmentHint Control:review_freq_numeric_valuesplit1-2           0.818989    
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit1-2     0.299701    
## TreatmentChat Generate:review_freq_numeric_valuesplit1-2          0.000130 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.403 on 1869 degrees of freedom
## Multiple R-squared:  0.08364,    Adjusted R-squared:  0.07236 
## F-statistic: 7.417 on 23 and 1869 DF,  p-value: < 0.00000000000000022
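Because both models include Treatment-by-subgroup interactions, the Chat Generate coefficient on its own describes only the reference (omitted) subgroup. The implied effect in the other subgroups is the main effect plus the matching interaction term. The sketch below does that arithmetic with the estimates copied from the two tables above; the object names are illustrative, and the sums come without standard errors, which would require the coefficient covariance matrix (for example via `vcov()`).

```r
# Estimates copied from the two summary() tables above
# (smr = social media reply model, rf = review frequency model).
chat_main <- c(smr = -0.82480, rf = -1.02198)  # Chat Generate, reference group
chat_x_34 <- c(smr =  0.69537, rf =  0.51181)  # interaction with the 3-4 group
chat_x_12 <- c(smr =  0.84625, rf =  1.11682)  # interaction with the 1-2 group

# Implied Chat Generate effect within each subgroup
chat_effects <- rbind(
  reference = chat_main,
  `3-4`     = chat_main + chat_x_34,
  `1-2`     = chat_main + chat_x_12
)
round(chat_effects, 3)
```

In both models the implied Chat Generate effect is clearly negative in the reference group and close to zero in the 1-2 group.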

Mode 3 and 4 Give-up Analysis (Commented Out For Now)

Let’s first check how many AI interactions each user/video combination had in Mode 3.

api_data <- api %>% filter(Action == "SendAPI") %>% filter(User.Id %in% cutoff_users)
api_data %>% group_by(User.Id, Video.Id) %>% summarise(n = n()) %>% pull(n) %>% table()

## `summarise()` has grouped output by 'User.Id'. You can override using the
## `.groups` argument.

## .
##   1   2 
## 606  20

api_data <- merge(api_data, api_data %>% group_by(User.Id, Video.Id) %>% summarise(numInteraction = n()), by = c("User.Id", "Video.Id"), all.x = T)

## `summarise()` has grouped output by 'User.Id'. You can override using the
## `.groups` argument.
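The merge above attaches the per-pair interaction count to every row of the log. As a sketch of an alternative, `dplyr::add_count()` can add the same column in one step, without a separate summarise and merge. The toy data frame below is made up for illustration. Unlike `merge()`, `add_count()` keeps the original row order.

```r
library(dplyr)

# Toy log, not the study data: one row per API call
toy_api <- data.frame(
  User.Id  = c("u1", "u1", "u1", "u2"),
  Video.Id = c("v1", "v1", "v2", "v1")
)

# Add the number of rows per user/video pair as a new column
toy_api <- toy_api %>% add_count(User.Id, Video.Id, name = "numInteraction")
toy_api
```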

Let’s also check how many AI interactions each user/video combination had in Mode 4.

chat_data <- chat %>% filter(Action == "SendChat") %>% filter(User.Id %in% cutoff_users)
chat_data %>% group_by(User.Id, Video.Id) %>% summarise(n = n()) %>% pull(n) %>% table()

## `summarise()` has grouped output by 'User.Id'. You can override using the
## `.groups` argument.

## .
##    1    2    3    4    5    6    7    8 
## 1015  137   22    9    3    2    3    2

chat_data <- merge(chat_data, chat_data %>% group_by(User.Id, Video.Id) %>% summarise(numInteraction = n()), by = c("User.Id", "Video.Id"), all.x = T)

## `summarise()` has grouped output by 'User.Id'. You can override using the
## `.groups` argument.

Mode 3

User-Video Pair

For the user-video pairs that had an interaction with the AI, how many resulted in a comment?

# keep the df_panel rows whose user/video combination appears in the API data
# (the match must be on the combination, not on user or video alone)
api_sub_data <- merge(api_data, df_panel %>% filter(Treatment == "One-Click Generate"), by = c("User.Id", "Video.Id"), all.x = T) %>% 
  dplyr::select(User.Id, Video.Id, hasComment, numInteraction) %>%
  distinct()

# drop the user/video combinations that did not make the cutoff
api_sub_data <- api_sub_data %>% filter(!is.na(hasComment))

api_sub_data %>% summarise(hasCommentCombination = sum(hasComment), totalCombination = n(), commentRate = hasCommentCombination/totalCombination)

User-Video Pair

For the user-video pairs that had an interaction with the AI, how many resulted in a comment, broken down by number of interactions?

api_sub_data %>% group_by(numInteraction) %>% summarise(hasCommentCombination = sum(hasComment), totalCombination = n(), commentRate = hasCommentCombination/totalCombination)

User Level

How many users interacted with the AI at least once?

num_interact_user3 <- api_sub_data$User.Id %>% unique() %>% length()
total_user3 <- df_panel %>% filter(Treatment == "One-Click Generate") %>% pull(User.Id) %>% unique() %>% length()
print(paste0("Number of users who interacted with AI at least once in Mode 3: ", num_interact_user3))

## [1] "Number of users who interacted with AI at least once in Mode 3: 243"

print(paste0("Total number of users in Mode 3: ", total_user3))

## [1] "Total number of users in Mode 3: 470"

print(paste0("Interaction Rate: ", num_interact_user3/total_user3))

## [1] "Interaction Rate: 0.517021276595745"

Mode 4

User-Video Pair

For the user-video pairs that had an interaction with the AI, how many resulted in a comment?

chat_sub_data <- merge(chat_data, df_panel, by = c("User.Id", "Video.Id"), all.x = T) %>% 
  dplyr::select(User.Id, Video.Id, hasComment, numInteraction) %>%
  distinct()

# drop the user/video combinations that did not make the cutoff
chat_sub_data <- chat_sub_data %>% filter(!is.na(hasComment))

chat_sub_data %>% summarise(hasCommentCombination = sum(hasComment), totalCombination = n(), commentRate = hasCommentCombination/totalCombination)

User-Video Pair

For the user-video pairs that had an interaction with the AI, how many resulted in a comment, broken down by number of interactions?

chat_sub_data %>% group_by(numInteraction) %>% summarise(hasCommentCombination = sum(hasComment), totalCombination = n(), commentRate = hasCommentCombination/totalCombination)

User-Level

num_interact_user4 <- chat_sub_data$User.Id %>% unique() %>% length()
total_user4 <- df_panel %>% filter(Treatment == "Chat Generate") %>% pull(User.Id) %>% unique() %>% length()
print(paste0("Number of users who interacted with AI at least once in Mode 4: ", num_interact_user4))

## [1] "Number of users who interacted with AI at least once in Mode 4: 429"

print(paste0("Total number of users in Mode 4: ", total_user4))

## [1] "Total number of users in Mode 4: 489"

print(paste0("Interaction Rate: ", num_interact_user4/total_user4))

## [1] "Interaction Rate: 0.877300613496933"

User-Video Pair

For the user-video pairs that had an interaction with the AI, how many clicked the “Edit Comment” box?

saw_commentbox <- commentchange %>% dplyr::select(Video.Id, User.Id) %>% distinct()
saw_commentbox$sawCommentBox <- 1

chat_sub_data <- merge(chat_sub_data, saw_commentbox, by = c("User.Id", "Video.Id"), all.x = T) 
chat_sub_data$sawCommentBox <- ifelse(is.na(chat_sub_data$sawCommentBox), 0, chat_sub_data$sawCommentBox)

chat_sub_data %>% summarise(sawCommentBox = sum(sawCommentBox), totalCombination = n(), sawCommentBoxRate = sawCommentBox/totalCombination)

User-Video Pair

For the user-video pairs that had an interaction, how likely were they to click the “Edit Comment” box, broken down by number of interactions?

chat_sub_data %>% group_by(numInteraction) %>% summarise(sawCommentBox = sum(sawCommentBox), totalCombination = n(), sawCommentBoxRate = sawCommentBox/totalCombination)

User-Video Pair

For the user-video pairs that clicked the “Edit Comment” box, how likely were they to make a comment?

chat_sub_data %>% filter(sawCommentBox == 1) %>% group_by(numInteraction) %>% summarise(hasCommentCombination = sum(hasComment), totalCombination = n(), commentRate = hasCommentCombination/totalCombination)

chat_sub_data %>% filter(sawCommentBox == 1) %>% summarise(hasCommentCombination = sum(hasComment), totalCombination = n(), commentRate = hasCommentCombination/totalCombination)

User-Level

sawCommentBoxUser <- chat_sub_data %>% filter(sawCommentBox == 1) %>% pull(User.Id) %>% unique()
num_saw_commentbox <- length(sawCommentBoxUser)
total_user4 <- df_panel %>% filter(Treatment == "Chat Generate") %>% pull(User.Id) %>% unique() %>% length()
print(paste0("Number of users who clicked the comment box at least once in Mode 4: ", num_saw_commentbox))

## [1] "Number of users who clicked the comment box at least once in Mode 4: 210"

print(paste0("Number of users who interacted with AI at least once in Mode 4: ", num_interact_user4))

## [1] "Number of users who interacted with AI at least once in Mode 4: 429"

print(paste0("Click Comment Box Rate (User Level): ", num_saw_commentbox/num_interact_user4))

## [1] "Click Comment Box Rate (User Level): 0.48951048951049"

print(paste0("Agree with the statement AI does NOT reflect my true opinion (Clicked Edit): ",df_wide_all %>% filter(User.Id %in% sawCommentBoxUser) %>% summarise(mech_trueop = mean(mech_trueop)) %>% pull(mech_trueop)))

## [1] "Agree with the statement AI does NOT reflect my true opinion (Clicked Edit): 3.12857142857143"

print(paste0("Agree with the statement AI does NOT reflect my true opinion (Did NOT Click Edit): ", df_wide_all %>% filter(Treatment == "Chat Generate" & !User.Id %in% sawCommentBoxUser) %>% summarise(mech_trueop = mean(mech_trueop)) %>% pull(mech_trueop)))

## [1] "Agree with the statement AI does NOT reflect my true opinion (Did NOT Click Edit): 3.45878136200717"

Follow-up Experiment

Load Data

df_followup_file <- find_most_recent("qualtrics_data/main", "Followup")
df_followup <- read.csv(df_followup_file)

df_followup <- df_followup %>% filter(Status == "IP Address") %>%
  filter(Finished == "True") %>%
  filter(Consent == "YES")

##### Get User Id
df_followup <- merge(df_followup, df_wide_all %>% dplyr::select(User.Id, prolific_id), by = "prolific_id", all.x = T)

##### Extract Watching Order
df_followup %>% mutate(order1 = str_split(FL_36_DO, "\\|", simplify = T)[,1],
                       order2 = str_split(FL_36_DO, "\\|", simplify = T)[,2],
                       order3 = str_split(FL_36_DO, "\\|", simplify = T)[,3]) -> df_followup
# strip white spaces
df_followup$order1 <- str_trim(df_followup$order1)
df_followup$order2 <- str_trim(df_followup$order2)
df_followup$order3 <- str_trim(df_followup$order3)
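To make the split-and-trim step concrete, here is a small sketch on made-up display-order strings. The example values are assumptions about how `FL_36_DO` might look, since the raw Qualtrics field is not shown in this section. The sketch splits once and trims the whole matrix in place rather than calling `str_split()` separately for each position.

```r
library(stringr)

# Hypothetical display-order strings; the real FL_36_DO values may differ
example_do <- c("FL_37|FL_40|FL_45", "FL_48 | FL_38 | FL_41")

# One split yields a matrix with one column per viewing position
parts <- str_split(example_do, "\\|", simplify = TRUE)

# Trim every cell while keeping the matrix shape
parts[] <- str_trim(parts)

parts[, 1]  # first video shown to each respondent
```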

# map Qualtrics flow IDs to video names (one lookup shared by all three positions)
flow_to_video <- c(
  FL_37 = "CoinOperated",
  FL_38 = "Crook",
  FL_39 = "ForeverSleep",
  FL_40 = "SoftRain",
  FL_41 = "TimeMachine",
  FL_42 = "RadicalHonesty",
  FL_43 = "AlternativeMath",
  FL_44 = "FrenchRoast",
  FL_45 = "Different",
  FL_46 = "TheCook",
  FL_47 = "Skipped",
  FL_48 = "Boom"
)

# unmatched or missing flow IDs become NA, as they did with case_when()
df_followup$order1 <- unname(flow_to_video[df_followup$order1])
df_followup$order2 <- unname(flow_to_video[df_followup$order2])
df_followup$order3 <- unname(flow_to_video[df_followup$order3])

df_followup$comment1 <- ifelse(df_followup$order1 == "CoinOperated", df_followup$comment_coinoperated, 
ifelse(df_followup$order1 == "Crook", df_followup$comment_CROOK., 
ifelse(df_followup$order1 == "ForeverSleep", df_followup$comment_foreversleep, 
ifelse(df_followup$order1 == "SoftRain", df_followup$comment_softrain, 
ifelse(df_followup$order1 == "TimeMachine", df_followup$comment_1mtmachine, 
ifelse(df_followup$order1 == "RadicalHonesty", df_followup$comment_radhonesty, 
ifelse(df_followup$order1 == "AlternativeMath", df_followup$comment_altmath, 
ifelse(df_followup$order1 == "FrenchRoast", df_followup$comment_frenchroast, 
ifelse(df_followup$order1 == "Different", df_followup$comment_different, 
ifelse(df_followup$order1 == "TheCook", df_followup$comment_thecook, 
ifelse(df_followup$order1 == "Skipped", df_followup$comment_skipped, 
ifelse(df_followup$order1 == "Boom", df_followup$comment_boom, NA))))))))))))

df_followup$comment2 <- ifelse(df_followup$order2 == "CoinOperated", df_followup$comment_coinoperated, 
ifelse(df_followup$order2 == "Crook", df_followup$comment_CROOK., 
ifelse(df_followup$order2 == "ForeverSleep", df_followup$comment_foreversleep, 
ifelse(df_followup$order2 == "SoftRain", df_followup$comment_softrain, 
ifelse(df_followup$order2 == "TimeMachine", df_followup$comment_1mtmachine, 
ifelse(df_followup$order2 == "RadicalHonesty", df_followup$comment_radhonesty, 
ifelse(df_followup$order2 == "AlternativeMath", df_followup$comment_altmath, 
ifelse(df_followup$order2 == "FrenchRoast", df_followup$comment_frenchroast, 
ifelse(df_followup$order2 == "Different", df_followup$comment_different, 
ifelse(df_followup$order2 == "TheCook", df_followup$comment_thecook, 
ifelse(df_followup$order2 == "Skipped", df_followup$comment_skipped, 
ifelse(df_followup$order2 == "Boom", df_followup$comment_boom, NA))))))))))))

df_followup$comment3 <- ifelse(df_followup$order3 == "CoinOperated", df_followup$comment_coinoperated, 
ifelse(df_followup$order3 == "Crook", df_followup$comment_CROOK., 
ifelse(df_followup$order3 == "ForeverSleep", df_followup$comment_foreversleep, 
ifelse(df_followup$order3 == "SoftRain", df_followup$comment_softrain, 
ifelse(df_followup$order3 == "TimeMachine", df_followup$comment_1mtmachine, 
ifelse(df_followup$order3 == "RadicalHonesty", df_followup$comment_radhonesty, 
ifelse(df_followup$order3 == "AlternativeMath", df_followup$comment_altmath, 
ifelse(df_followup$order3 == "FrenchRoast", df_followup$comment_frenchroast, 
ifelse(df_followup$order3 == "Different", df_followup$comment_different, 
ifelse(df_followup$order3 == "TheCook", df_followup$comment_thecook, 
ifelse(df_followup$order3 == "Skipped", df_followup$comment_skipped, 
ifelse(df_followup$order3 == "Boom", df_followup$comment_boom, NA))))))))))))
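The nested `ifelse()` chains above pick, for each row, the comment column that belongs to the video shown in that position. A more compact way to express the same row-wise lookup is sketched below. The helper `pick_by_order` and the toy column names are hypothetical and not part of the analysis code; the real mapping would need the exact column names used above (for example `comment_CROOK.`).

```r
# Hypothetical helper: for each row, return the value from the column that
# col_map assigns to that row's video name (NA when there is no match).
pick_by_order <- function(df, order_col, col_map) {
  cols <- col_map[df[[order_col]]]
  vapply(seq_len(nrow(df)), function(i) {
    if (is.na(cols[i])) NA_character_ else as.character(df[[cols[i]]][i])
  }, character(1))
}

# Toy data, not the study data
toy <- data.frame(
  order1        = c("Crook", "Boom", NA),
  comment_crook = c("a", "b", "c"),
  comment_boom  = c("x", "y", "z"),
  stringsAsFactors = FALSE
)
col_map <- c(Crook = "comment_crook", Boom = "comment_boom")

pick_by_order(toy, "order1", col_map)
```

The same helper could serve the duration columns by passing a second map from video names to timer columns, though it returns character values that would then need `as.numeric()`.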

df_followup$order1_duration <- ifelse(df_followup$order1 == "CoinOperated", df_followup$timer_coinoperated_Page.Submit, 
ifelse(df_followup$order1 == "Crook", df_followup$timer_CROOK._Page.Submit, 
ifelse(df_followup$order1 == "ForeverSleep", df_followup$timer_foreversleep_Page.Submit, 
ifelse(df_followup$order1 == "SoftRain", df_followup$timer_softrain_Page.Submit, 
ifelse(df_followup$order1 == "TimeMachine", df_followup$timer_1mtmachine_Page.Submit, 
ifelse(df_followup$order1 == "RadicalHonesty", df_followup$timer_radhonesty_Page.Submit, 
ifelse(df_followup$order1 == "AlternativeMath", df_followup$timer_altmath_Page.Submit, 
ifelse(df_followup$order1 == "FrenchRoast", df_followup$timer_frenchroast_Page.Submit, 
ifelse(df_followup$order1 == "Different", df_followup$timer_different_Page.Submit, 
ifelse(df_followup$order1 == "TheCook", df_followup$timer_thecook_Page.Submit, 
ifelse(df_followup$order1 == "Skipped", df_followup$timer_skipped_Page.Submit, 
ifelse(df_followup$order1 == "Boom", df_followup$timer_boom_Page.Submit, NA))))))))))))

df_followup$order2_duration <- ifelse(df_followup$order2 == "CoinOperated", df_followup$timer_coinoperated_Page.Submit,
ifelse(df_followup$order2 == "Crook", df_followup$timer_CROOK._Page.Submit,
ifelse(df_followup$order2 == "ForeverSleep", df_followup$timer_foreversleep_Page.Submit,
ifelse(df_followup$order2 == "SoftRain", df_followup$timer_softrain_Page.Submit,
ifelse(df_followup$order2 == "TimeMachine", df_followup$timer_1mtmachine_Page.Submit,
ifelse(df_followup$order2 == "RadicalHonesty", df_followup$timer_radhonesty_Page.Submit,
ifelse(df_followup$order2 == "AlternativeMath", df_followup$timer_altmath_Page.Submit,
ifelse(df_followup$order2 == "FrenchRoast", df_followup$timer_frenchroast_Page.Submit,
ifelse(df_followup$order2 == "Different", df_followup$timer_different_Page.Submit,
ifelse(df_followup$order2 == "TheCook", df_followup$timer_thecook_Page.Submit,
ifelse(df_followup$order2 == "Skipped", df_followup$timer_skipped_Page.Submit,
ifelse(df_followup$order2 == "Boom", df_followup$timer_boom_Page.Submit, NA))))))))))))

df_followup$order3_duration <- ifelse(df_followup$order3 == "CoinOperated", df_followup$timer_coinoperated_Page.Submit,
ifelse(df_followup$order3 == "Crook", df_followup$timer_CROOK._Page.Submit,
ifelse(df_followup$order3 == "ForeverSleep", df_followup$timer_foreversleep_Page.Submit,
ifelse(df_followup$order3 == "SoftRain", df_followup$timer_softrain_Page.Submit,
ifelse(df_followup$order3 == "TimeMachine", df_followup$timer_1mtmachine_Page.Submit,
ifelse(df_followup$order3 == "RadicalHonesty", df_followup$timer_radhonesty_Page.Submit,
ifelse(df_followup$order3 == "AlternativeMath", df_followup$timer_altmath_Page.Submit,
ifelse(df_followup$order3 == "FrenchRoast", df_followup$timer_frenchroast_Page.Submit,
ifelse(df_followup$order3 == "Different", df_followup$timer_different_Page.Submit,
ifelse(df_followup$order3 == "TheCook", df_followup$timer_thecook_Page.Submit,
ifelse(df_followup$order3 == "Skipped", df_followup$timer_skipped_Page.Submit,
ifelse(df_followup$order3 == "Boom", df_followup$timer_boom_Page.Submit, NA))))))))))))
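
The nested ifelse() chains above all perform the same per-row lookup: the video name in an order column selects which survey column to read. A minimal sketch of that idea with a named lookup vector and matrix indexing follows. The helper `pick_by_key`, the `timer_lookup` vector, and the toy data frame are hypothetical, and only three of the twelve videos are shown.

```r
# Hypothetical helper (not part of the original script): for each row,
# read the value from the column that `lookup` assigns to that row's key.
pick_by_key <- function(df, key, lookup) {
  cols <- unname(lookup[as.character(key)])  # column name per row; NA for unknown or missing keys
  m <- as.matrix(df[unname(lookup)])         # candidate columns only, so the type stays numeric
  out <- rep(NA, nrow(df))
  ok <- !is.na(cols)
  # matrix indexing: one (row, column) pair per matched row
  out[ok] <- m[cbind(which(ok), match(cols[ok], colnames(m)))]
  out
}

# Toy data that mimics the survey column naming (values are made up)
toy <- data.frame(
  order1 = c("Boom", "Crook", "Skipped", NA),
  timer_boom_Page.Submit = c(12.5, 3, 4, 5),
  timer_CROOK._Page.Submit = c(1, 20.25, 6, 7),
  timer_skipped_Page.Submit = c(2, 8, 31, 9)
)

timer_lookup <- c(
  Boom = "timer_boom_Page.Submit",
  Crook = "timer_CROOK._Page.Submit",
  Skipped = "timer_skipped_Page.Submit"
)

toy$order1_duration <- pick_by_key(toy, toy$order1, timer_lookup)
toy$order1_duration
# 12.50 20.25 31.00 NA
```

With one lookup vector per column family (comments, timers), the same helper could serve order1, order2, and order3 without repeating the twelve branches.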

## Additional Video Watched

df_followup$additional_video <- str_trim(df_followup$intro_videowatching)
df_followup$additional_video_duration <- ifelse(df_followup$additional_video == "Second Team", df_followup$timer_secondteam_Page.Submit,
ifelse(df_followup$additional_video == "A Social Life", df_followup$timer_asociallife_Page.Submit,
ifelse(df_followup$additional_video == "Freelancer", df_followup$timer_freelancer_Page.Submit, 
ifelse(df_followup$additional_video == "Kayak", df_followup$timer_kayak_Page.Submit, NA))))
df_followup$additional_comment <- ifelse(df_followup$additional_video == "Second Team", df_followup$review_secondteam,
ifelse(df_followup$additional_video == "A Social Life", df_followup$review_asociallife,
ifelse(df_followup$additional_video == "Freelancer", df_followup$review_freelancer,
ifelse(df_followup$additional_video == "Kayak", df_followup$review_kayak, NA))))


df_followup$additional_comment_exists <- ifelse(df_followup$additional_comment == "", 0, 1)

## Additional Video Watched Pass Cutoff
cutoff_asociallife <- 430
cutoff_kayak <- 330
cutoff_freelancer <- 480
cutoff_secondteam <- 550


df_followup$additional_video_duration <- as.numeric(df_followup$additional_video_duration)

# Pass (1) if the watch time is within 15 seconds of the video's cutoff.
# Note: ifelse() returns NA when its test is NA, so a row with a listed video but a
# missing duration evaluates to NA; the final is.na() branch is reached only when
# the video matches none of the four titles.
df_followup$additional_video_pass_cutoff <- ifelse(df_followup$additional_video == "Second Team" & df_followup$additional_video_duration >= cutoff_secondteam - 15, 1, 
ifelse(df_followup$additional_video == "A Social Life" & df_followup$additional_video_duration >= cutoff_asociallife - 15, 1,
ifelse(df_followup$additional_video == "Freelancer" & df_followup$additional_video_duration >= cutoff_freelancer - 15, 1,
ifelse(df_followup$additional_video == "Kayak" & df_followup$additional_video_duration >= cutoff_kayak - 15, 1, 
ifelse(is.na(df_followup$additional_video_duration), 1, 0)))))

df_followup$additional_video_cutoff <- ifelse(df_followup$additional_video == "Second Team", cutoff_secondteam,
ifelse(df_followup$additional_video == "A Social Life", cutoff_asociallife,
ifelse(df_followup$additional_video == "Freelancer", cutoff_freelancer,
ifelse(df_followup$additional_video == "Kayak", cutoff_kayak, NA))))
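
The cutoff check above can also be written with a named vector of cutoffs. The sketch below is one possible formulation, not the original code: `pass_cutoff` is a hypothetical helper, and it assumes that a missing duration is meant to count as a pass (which is how the final is.na() branch of the chain reads) and that an unlisted video with a recorded duration counts as a fail. The cutoff values and the 15-second grace period are the ones defined above.

```r
# Cutoffs in seconds, as defined in the script above
cutoffs <- c("Second Team" = 550, "A Social Life" = 430,
             "Freelancer" = 480, "Kayak" = 330)

# Hypothetical helper: 1 = passed, 0 = failed.
# Assumption: a missing duration is treated as a pass.
pass_cutoff <- function(video, duration, cutoffs, grace = 15) {
  threshold <- unname(cutoffs[as.character(video)]) - grace  # NA for unlisted videos
  passed <- !is.na(threshold) & duration >= threshold
  ifelse(is.na(duration), 1, as.numeric(passed))
}

pass_cutoff(
  video    = c("Kayak", "Kayak", "Second Team", "Freelancer", "Kayak"),
  duration = c(315, 314, 600, NA, 330),
  cutoffs  = cutoffs
)
# 1 0 1 1 1
```

Checking is.na(duration) first makes the handling of missing values explicit instead of relying on how NA propagates through nested ifelse() calls.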

##### Invalid Prolific IDs
# 66d8790772c9e0e4ee96259c (cannot use)

invalid_prolific_ids <- df_followup %>% filter(additional_video_pass_cutoff == 0) %>% dplyr::select(prolific_id, additional_video, additional_video_duration, additional_video_cutoff) %>% pull(prolific_id)
invalid_prolific_ids <- c(invalid_prolific_ids, "66d8790772c9e0e4ee96259c")


### award partial payment to those who almost finished watching (within 60 seconds of the cutoff)
partial_payment <- merge(df_followup %>% filter(additional_video_pass_cutoff == 0) %>% dplyr::select(prolific_id, additional_video, additional_video_duration, additional_video_cutoff) %>% arrange(prolific_id), df_post_output %>% dplyr::select(prolific_id, StartDate)) %>% mutate(FollowupStartDate = as.Date(StartDate) + 21) %>% arrange(FollowupStartDate) %>% filter(additional_video_duration >= additional_video_cutoff - 60) %>% mutate(Message = "Did not finish watching the additional video all the way to the end as instructed. Will award partial payment for the effort. Please return your submission as of now.") 

### reject those who were not close to finishing (more than 60 seconds short of the cutoff)
no_partial_payment <- merge(df_followup %>% filter(additional_video_pass_cutoff == 0) %>% dplyr::select(prolific_id, additional_video, additional_video_duration, additional_video_cutoff) %>% arrange(prolific_id), df_post_output %>% dplyr::select(prolific_id, StartDate)) %>% mutate(FollowupStartDate = as.Date(StartDate) + 21) %>% arrange(FollowupStartDate) %>% filter(additional_video_duration < additional_video_cutoff - 60) %>% mutate(Message = paste0("Did not finish watching the additional video all the way to the end as instructed. Exited ", additional_video_cutoff - additional_video_duration, " seconds before the cutoff. Please return your submission as soon as possible. Otherwise, we will have to reject your submission."))

write.csv(partial_payment, "prolific_data/followup/partial_payment_followup.csv")
write.csv(no_partial_payment, "prolific_data/followup/no_partial_payment_followup.csv")

valid_ids <- merge(df_followup %>% filter(additional_video_pass_cutoff == 1) %>% dplyr::select(prolific_id, additional_video, additional_video_duration, additional_video_cutoff) %>% arrange(prolific_id), df_post_output %>% dplyr::select(prolific_id, StartDate)) %>% mutate(FollowupStartDate = as.Date(StartDate) + 21) %>% arrange(FollowupStartDate)

write.csv(valid_ids, "prolific_data/followup/valid_ids_followup.csv")
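
The three output files above correspond to three tiers defined by how far short of the cutoff a participant stopped. A minimal sketch of that classification follows; `payment_tier` and its tier labels are hypothetical names, while the 15-second and 60-second windows are the ones used above.

```r
# Hypothetical helper: classify a participant by shortfall (cutoff minus watch time)
payment_tier <- function(duration, cutoff, grace = 15, partial_window = 60) {
  shortfall <- cutoff - duration
  ifelse(shortfall <= grace, "valid",
         ifelse(shortfall <= partial_window, "partial", "return_or_reject"))
}

# Example with the "Second Team" cutoff of 550 seconds
payment_tier(duration = c(540, 535, 500, 490, 400), cutoff = 550)
# "valid" "valid" "partial" "partial" "return_or_reject"
```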

##### Social Desirability Bias
# FFFFTFTFTTFFT
df_followup$sdb_dummy_1 <- ifelse(df_followup$sdb_1 == "FALSE", 1, 0)
df_followup$sdb_dummy_2 <- ifelse(df_followup$sdb_2 == "FALSE", 1, 0)
df_followup$sdb_dummy_3 <- ifelse(df_followup$sdb_3 == "FALSE", 1, 0)
df_followup$sdb_dummy_4 <- ifelse(df_followup$sdb_4 == "FALSE", 1, 0)
df_followup$sdb_dummy_5 <- ifelse(df_followup$sdb_5 == "TRUE", 1, 0)
df_followup$sdb_dummy_6 <- ifelse(df_followup$sdb_6 == "FALSE", 1, 0)
df_followup$sdb_dummy_7 <- ifelse(df_followup$sdb_7 == "TRUE", 1, 0)
df_followup$sdb_dummy_8 <- ifelse(df_followup$sdb_8 == "FALSE", 1, 0)
df_followup$sdb_dummy_9 <- ifelse(df_followup$sdb_9 == "TRUE", 1, 0)
df_followup$sdb_dummy_10 <- ifelse(df_followup$sdb_10 == "TRUE", 1, 0)
df_followup$sdb_dummy_11 <- ifelse(df_followup$sdb_11 == "FALSE", 1, 0)
df_followup$sdb_dummy_12 <- ifelse(df_followup$sdb_12 == "FALSE", 1, 0)
df_followup$sdb_dummy_13 <- ifelse(df_followup$sdb_13 == "TRUE", 1, 0)
df_followup$sdb <- rowSums(df_followup %>% dplyr::select(starts_with("sdb_dummy")))
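
The thirteen dummy assignments above score each answer against the key FFFFTFTFTTFFT. As a sketch, the same score can be computed by comparing a response matrix against the key in one step. The helper `score_sdb` and the toy responses are hypothetical; the key string is the one in the comment above.

```r
# Answer key from the comment above: TRUE where the keyed answer is "TRUE"
sdb_key <- strsplit("FFFFTFTFTTFFT", "")[[1]] == "T"

# Hypothetical helper: responses holds "TRUE"/"FALSE" strings, one column per item
score_sdb <- function(responses, key) {
  answers <- as.matrix(responses) == "TRUE"
  # compare every column with its keyed answer, then count matches per row
  rowSums(sweep(answers, 2, key, FUN = "=="))
}

# Toy respondents: one matches the key exactly, one answers "FALSE" throughout
key_answers <- ifelse(sdb_key, "TRUE", "FALSE")
toy <- matrix(c(key_answers, rep("FALSE", 13)), nrow = 2, byrow = TRUE)
score_sdb(toy, sdb_key)
# 13 8
```

As in the original code, a missing answer makes the whole row's score NA because rowSums() is called without na.rm.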

##### Memory Test
df_followup$notinset <- ifelse(str_detect(df_followup$videos_watched, "Wireless|The Flying Sailor|Score|Kayak|Tunnel|The Wait|Freelancer|SHE|The Crush|Between Days|2 AM Coffee"), 1, 0)

# find videos entered during main experiment
videos_entered <- action %>% filter(User.Id %in% df_followup$User.Id)
videos_entered  <- merge(videos_entered , video %>% dplyr::select(Video.Id, cutoff_time, Title), by = "Video.Id", all.x = T)
videos_entered <- videos_entered %>% arrange(User.Id, desc(duration))
# remove nondistinct Video.Id User.Id pair
videos_entered <- videos_entered %>% distinct(User.Id, Video.Id, .keep_all = T)
videos_entered <- reshape2::dcast(videos_entered, User.Id ~ Video.Id, value.var = "Title")
videos_entered$videos_entered_main <- apply(videos_entered %>% dplyr::select(-starts_with("U")), 1, function(x) paste(na.omit(x), collapse = ","))
df_followup <- merge(df_followup, videos_entered %>% dplyr::select(User.Id, videos_entered_main), by = "User.Id", all.x = T)

# find videos passed cutoff during main experiment
videos_passed_cutoff <- df_panel %>% filter(prolific_id %in% df_followup$prolific_id) %>% dplyr::select(User.Id, Video.Id)
videos_passed_cutoff <- merge(videos_passed_cutoff, video %>% dplyr::select(Video.Id, Title), by = "Video.Id", all.x = T)
videos_passed_cutoff <- reshape2::dcast(videos_passed_cutoff, User.Id ~ Video.Id, value.var = "Title")
videos_passed_cutoff$videos_passed_cutoff <- apply(videos_passed_cutoff %>% dplyr::select(-starts_with("U")), 1, function(x) paste(na.omit(x), collapse = ","))
df_followup <- merge(df_followup, videos_passed_cutoff %>% dplyr::select(User.Id, videos_passed_cutoff), by = "User.Id", all.x = T)


df_followup$videos_watched <- gsub("\\$CROOK", "CROOK\\$", df_followup$videos_watched)
df_followup$videos_watched <- gsub("Different", "DIFFERENT", df_followup$videos_watched)


detect_morethanset <- function(watched, entered){
  # watched is which video the user claimed to watch
  # entered is which video the user entered
  if(is.na(watched) || is.na(entered)){
    return(NA)
  }
  watched <- str_split(watched, ",", simplify = T) %>% as.vector()
  entered <- str_split(entered, ",", simplify = T) %>% as.vector()
  # remove additional videos from watched
  watched <- watched[!watched %in% c("Wireless", "The Flying Sailor", "Score", "Kayak", "Tunnel", "The Wait", "Freelancer", "SHE", "The Crush", "Between Days", "2 AM Coffee")]
  morethanset <- ifelse(setdiff(watched, entered) %>% length() >=1 , 1, 0)
  return(morethanset)
}


df_followup$morethanset <- mapply(detect_morethanset, df_followup$videos_watched, df_followup$videos_entered_main)

detect_memory_perc <- function(watched, entered){
  # watched is which video the user claimed to watch
  # entered is which video the user entered
  if(is.na(watched) || is.na(entered)){
    return(NA)
  }
  watched <- str_split(watched, ",", simplify = T) %>% as.vector()
  entered <- str_split(entered, ",", simplify = T) %>% as.vector()
  percentageremember <- mean(watched %in% entered, na.rm = T)
  return(percentageremember)
}
  

df_followup$memory_perc_all <- mapply(detect_memory_perc, df_followup$videos_watched, df_followup$videos_entered_main)
df_followup$memory_perc <- mapply(detect_memory_perc, df_followup$videos_watched, df_followup$videos_passed_cutoff)
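
To make the two memory measures concrete, here is a small worked example in base R. It re-expresses the logic of detect_memory_perc and detect_morethanset on made-up inputs; `split_titles` and the example strings are hypothetical, and the list of additional titles is the one used above.

```r
# Hypothetical helper: split a comma-separated title string into a vector
split_titles <- function(x) strsplit(x, ",", fixed = TRUE)[[1]]

additional_titles <- c("Wireless", "The Flying Sailor", "Score", "Kayak", "Tunnel",
                       "The Wait", "Freelancer", "SHE", "The Crush",
                       "Between Days", "2 AM Coffee")

watched <- split_titles("Boom,Skipped,Kayak")  # titles the participant claims to have watched
entered <- split_titles("Boom,CROOK$")         # titles actually entered in the main experiment

# Share of claimed titles that were actually entered
memory_perc <- mean(watched %in% entered)
memory_perc
# 0.3333333

# Flag a claim of a main-experiment title that was never entered
claimed_main <- watched[!watched %in% additional_titles]
morethanset <- as.numeric(length(setdiff(claimed_main, entered)) >= 1)
morethanset
# 1
```

Here only "Boom" is both claimed and entered, so the share is 1/3, and "Skipped" triggers the flag because it was claimed but never entered.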

df_followup <- df_followup %>% 
  mutate(memory_numeric = case_when(
    memory == "Hardly anything" ~ 1,
    memory == "Very little" ~ 2,
    memory == "Some of it" ~ 3,
    memory == "Most of it" ~ 4,
    memory == "Almost everything" ~ 5,
    TRUE ~ NA_real_
  ))

df_followup$interacted <- ifelse(df_followup$User.Id %in% chat_sub_data$User.Id, 1, ifelse(df_followup$User.Id %in% api_sub_data$User.Id, 1, 0))
df_followup$sawCommentBox <- ifelse(df_followup$User.Id %in% sawCommentBoxUser, 1, 0)

Generate processed dataset

df_followup_columns <- c(
  "prolific_id", "User.Id", 
  df_followup %>% dplyr::select(starts_with("comment_")) %>% colnames(), 
  df_followup %>% dplyr::select(starts_with("review_")) %>% colnames(), 
  df_followup %>% dplyr::select(starts_with("mode4_")) %>% colnames(), 
  "AlternativeMath", "Boom", "CoinOperated", "Crook", "Different", "ForeverSleep", 
  "FrenchRoast", "RadicalHonesty", "SoftRain", "TheCook", "TimeMachine", "Skipped",
  "FL_36_DO", "order1", "order2", "order3", "comment1", "comment2", "comment3",
  "order1_duration", "order2_duration", "order3_duration", 
  "videos_watched", "memory", "notinset", "morethanset",  
  df_followup %>% dplyr::select(starts_with("sdb")) %>% colnames(), 
  "attentioncheck", "interact", "sawCommentBox", "additional_video", 
  "additional_comment", "additional_comment_exists", "additional_video_pass_cutoff",
  "videos_entered_main", "videos_passed_cutoff", "memory_perc_all", "memory_perc", "memory_numeric")

followup_participants <- df_followup %>% pull(prolific_id) %>% unique()

# merge with covariates
df_followup_final <- merge(df_followup %>% dplyr::select(all_of(df_followup_columns)), df_wide_all, by = c("prolific_id", "User.Id"))

saveRDS(df_followup_final, "processed_final_data/df_followup_final.RDS")

followup_comment_df <- df_followup %>% dplyr::select(starts_with("comment_"), "User.Id", "prolific_id") %>% 
  reshape2::melt(id.vars = c("User.Id", "prolific_id")) %>% 
  rename(VideoName = variable, ContentFollowup = value) %>%
  # remove "comment_" prefix from VideoName
  mutate(VideoName = str_remove(VideoName, "comment_")) %>%
  mutate(VideoName = case_when(
    VideoName == "coinoperated" ~ "CoinOperated",
    VideoName == "CROOK." ~ "Crook",
    VideoName == "foreversleep" ~ "ForeverSleep",
    VideoName == "softrain" ~ "SoftRain",
    VideoName == "1mtmachine" ~ "TimeMachine",
    VideoName == "radhonesty" ~ "RadicalHonesty",
    VideoName == "altmath" ~ "AlternativeMath",
    VideoName == "frenchroast" ~ "FrenchRoast",
    VideoName == "different" ~ "Different",
    VideoName == "thecook" ~ "TheCook",
    VideoName == "skipped" ~ "Skipped",
    VideoName == "boom" ~ "Boom"
  )) %>%
  filter(ContentFollowup != "") %>%
  arrange(User.Id)

followup_comment_order_df <- df_followup %>% dplyr::select(starts_with("order"), "User.Id", "prolific_id") %>% 
  reshape2::melt(id.vars = c("User.Id", "prolific_id")) 


followup_comment_order_duration_df <- followup_comment_order_df %>% filter(variable %in% c("order1_duration", "order2_duration", "order3_duration")) %>% 
  mutate(orderFollowup = case_when(
    variable == "order1_duration" ~ 1,
    variable == "order2_duration" ~ 2,
    variable == "order3_duration" ~ 3,
  )) %>% 
  dplyr::select(-variable) %>% 
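  rename(Duration = value) %>% 
  # melt() stacks video names and durations in one column, which can coerce durations to character
  mutate(Duration = as.numeric(Duration)) %>% 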
  arrange(User.Id)


followup_comment_order_df <- followup_comment_order_df %>% filter(variable %in% c("order1", "order2", "order3")) %>% 
  mutate(orderFollowup = case_when(
    variable == "order1" ~ 1,
    variable == "order2" ~ 2,
    variable == "order3" ~ 3,
  )) %>% 
  dplyr::select(-variable) %>% 
  rename(VideoName = value) %>% 
  arrange(User.Id)


followup_comment_order_df <- merge(followup_comment_order_df, followup_comment_order_duration_df, by = c("User.Id", "prolific_id", "orderFollowup")) %>% 
  arrange(User.Id, orderFollowup)

followup_comment_df <- merge(followup_comment_df, followup_comment_order_df, by = c("User.Id", "prolific_id", "VideoName")) 

followup_comment_df <- followup_comment_df %>%
  mutate(Video.Id = case_when(
    VideoName == "CoinOperated" ~ 17,
    VideoName == "Crook" ~ 22,
    VideoName == "ForeverSleep" ~ 20,
    VideoName == "SoftRain" ~ 16,
    VideoName == "TimeMachine" ~ 23,
    VideoName == "RadicalHonesty" ~ 11,
    VideoName == "AlternativeMath" ~ 14,
    VideoName == "FrenchRoast" ~ 15,
    VideoName == "Different" ~ 21,
    VideoName == "TheCook" ~ 18,
    VideoName == "Skipped" ~ 19,
    VideoName == "Boom" ~ 13))

df_followup_panel_final <- merge(merge(df_panel, 
            followup_comment_df, by = c("User.Id", "prolific_id", "Video.Id")), 
      df_followup %>% 
        dplyr::select(all_of(df_followup_columns)) %>% 
        dplyr::select(-starts_with("order"), -starts_with("comment"), 
                      -AlternativeMath, -Boom, -CoinOperated, -Crook, 
                      -Different, -ForeverSleep, -FrenchRoast, -RadicalHonesty, 
                      -SoftRain, -TheCook, -TimeMachine, -Skipped), 
      by = c("User.Id", "prolific_id")) 

saveRDS(df_followup_panel_final, "processed_final_data/df_followup_panel_final.RDS")

# subset to dataframe with non-empty entries in both Content and ContentFollowup
df_followup_panel_final_nonempty <- df_followup_panel_final %>% filter(!is.na(Content) & !is.na(ContentFollowup))
# replace "&quot;" with '
df_followup_panel_final_nonempty$Content <- str_replace_all(df_followup_panel_final_nonempty$Content, "&quot;", "'")
df_followup_panel_final_nonempty$ContentFollowup <- str_replace_all(df_followup_panel_final_nonempty$ContentFollowup, "&quot;", "'")

write.csv(df_followup_panel_final_nonempty, "processed_final_data/df_followup_panel_final_nonempty.csv")

# NEED TO RUN SIMILARITY_ANALYSIS.PY FIRST
df_followup_panel_final_nonempty_with_sim_score <- read.csv("processed_final_data/df_followup_panel_final_nonempty_with_sim_score_sentiment.csv")

Analysis

User Level Regression

Review Dummy ~ treatment + video (+ demographics)

# user-level regression: treatment plus social desirability score (sdb), with additional-video fixed effects
feols(additional_comment_exists ~ Treatment + sdb | additional_video, 
      data = df_followup_final %>% 
        filter(attentioncheck == "Disagree") %>% 
        filter(!is.na(additional_comment)) %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% 
  summary()

## OLS estimation, Dep. Var.: additional_comment_exists
## Observations: 1,568
## Fixed-effects: additional_video: 4
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value Pr(>|t|)    
## TreatmentHint Control       -0.002923   0.029046 -0.100640  0.91985    
## TreatmentOne-Click Generate  0.026545   0.027867  0.952541  0.34097    
## TreatmentChat Generate      -0.006812   0.028818 -0.236389  0.81316    
## sdb                         -0.005310   0.002922 -1.817381  0.06935 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.395441     Adj. R2: 0.002425
##                  Within R2: 0.003312

# same specification, with demographic covariates entered as fixed effects
feols(additional_comment_exists ~ Treatment + sdb | additional_video + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric,
      data = df_followup_final %>% 
        filter(attentioncheck == "Disagree") %>% 
        filter(!is.na(additional_comment)) %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% 
  summary()

## OLS estimation, Dep. Var.: additional_comment_exists
## Observations: 1,568
## Fixed-effects: additional_video: 4,  age: 59,  social_media_YT: 2,  social_media_nonUser: 2,  social_media_user: 2,  social_media_use_numeric: 4,  website_use_numeric: 4,  genderFemale: 2,  raceAsian: 2,  raceBlack: 2,  raceHispanic: 2,  raceWhite: 2,  raceOther: 2,  eduHighSchoolOrLess: 2,  eduSomeCollege: 2,  eduBachelor: 2,  eduPostGrad: 2,  polpartyDem: 2,  polpartyRep: 2,  polpartyOther: 2,  libcons_numeric: 5,  income_numeric: 6,  social_media_reply_numeric: 6,  review_freq_numeric: 6
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value Pr(>|t|) 
## TreatmentHint Control        0.000705   0.029425  0.023946  0.98090 
## TreatmentOne-Click Generate  0.042247   0.028141  1.501234  0.13350 
## TreatmentChat Generate       0.002168   0.029054  0.074618  0.94053 
## sdb                         -0.003128   0.003180 -0.983610  0.32546 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.381312     Adj. R2: 0.009581
##                  Within R2: 0.00286

Panel Regression

Review Similarities

Review Similarities Outcome ~ treatment + video + memory question (+ demographics)

df_followup_panel_final_nonempty_with_sim_score$Treatment <- factor(df_followup_panel_final_nonempty_with_sim_score$Treatment, levels = c("Pure Control", "Hint Control", "One-Click Generate", "Chat Generate"))

SBERT EMBEDDING

sbert_main <- feols(similarity ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()
sbert_main

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value   Pr(>|t|)    
## TreatmentHint Control       0.001576   0.013316 0.118332 0.90583010    
## TreatmentOne-Click Generate 0.048806   0.014047 3.474550 0.00053518 ***
## TreatmentChat Generate      0.048922   0.014317 3.417065 0.00066026 ***
## memory_perc                 0.061768   0.020332 3.038027 0.00244741 ** 
## memory_numeric              0.009579   0.005845 1.638856 0.10157938    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184405     Adj. R2: 0.043869
##                  Within R2: 0.032398

sbert_main_with_cov <- feols(similarity ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

GPT EMBEDDING

gpt_main <- feols(similarity_gpt ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()
gpt_main

## OLS estimation, Dep. Var.: similarity_gpt
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error  t value    Pr(>|t|)    
## TreatmentHint Control       -0.011713   0.010805 -1.08403 0.278632470    
## TreatmentOne-Click Generate  0.028630   0.011515  2.48636 0.013079028 *  
## TreatmentChat Generate       0.035560   0.011454  3.10462 0.001962538 ** 
## memory_perc                  0.065437   0.016074  4.07097 0.000050773 ***
## memory_numeric               0.006593   0.004630  1.42408 0.154755469    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.151238     Adj. R2: 0.046425
##                  Within R2: 0.037947

gpt_main_with_cov <- feols(similarity_gpt ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

model_list <- list(sbert_main,  gpt_main, sbert_main_with_cov, gpt_main_with_cov)
texreg::texreg(model_list, 
               stars = c(0.05, 0.01, 0.001), 
               caption = "Panel Regression Results", 
               label = "tab:panel_regression", 
               digits = 4,
               custom.note = "Standard errors are clustered at the user level.",
               custom.model.names = c("Without Covariates (SBERT)", "Without Covariates (GPT)", "With Covariates (SBERT)", "With Covariates (GPT)"), 
               custom.coef.names = c("Hint Control", "One-Click Generate", "Chat Generate", "Memory (Measured)", "Memory (Self-Reported)"))

## 
## \begin{table}
## \begin{center}
## \begin{tabular}{l c c c c}
## \hline
##  & Without Covariates (SBERT) & Without Covariates (GPT) & With Covariates (SBERT) & With Covariates (GPT) \\
## \hline
## Hint Control                               & $0.0016$       & $-0.0117$      & $0.0017$      & $-0.0115$     \\
##                                            & $(0.0133)$     & $(0.0108)$     & $(0.0134)$    & $(0.0108)$    \\
## One-Click Generate                         & $0.0488^{***}$ & $0.0286^{*}$   & $0.0446^{**}$ & $0.0251^{*}$  \\
##                                            & $(0.0140)$     & $(0.0115)$     & $(0.0141)$    & $(0.0116)$    \\
## Chat Generate                              & $0.0489^{***}$ & $0.0356^{**}$  & $0.0401^{**}$ & $0.0292^{*}$  \\
##                                            & $(0.0143)$     & $(0.0115)$     & $(0.0147)$    & $(0.0117)$    \\
## Memory (Measured)                          & $0.0618^{**}$  & $0.0654^{***}$ & $0.0433^{*}$  & $0.0517^{**}$ \\
##                                            & $(0.0203)$     & $(0.0161)$     & $(0.0197)$    & $(0.0162)$    \\
## Memory (Self-Reported)                     & $0.0096$       & $0.0066$       & $0.0092$      & $0.0077$      \\
##                                            & $(0.0058)$     & $(0.0046)$     & $(0.0058)$    & $(0.0046)$    \\
## \hline
## Num. obs.                                  & $2450$         & $2450$         & $2450$        & $2450$        \\
## Num. groups: Video.Id                      & $12$           & $12$           & $12$          & $12$          \\
## Num. groups: order                         & $9$            & $9$            & $9$           & $9$           \\
## Num. groups: orderFollowup                 & $3$            & $3$            & $3$           & $3$           \\
## R$^2$ (full model)                         & $0.0540$       & $0.0565$       & $0.1384$      & $0.1363$      \\
## R$^2$ (proj model)                         & $0.0324$       & $0.0379$       & $0.0204$      & $0.0252$      \\
## Adj. R$^2$ (full model)                    & $0.0439$       & $0.0464$       & $0.0925$      & $0.0902$      \\
## Adj. R$^2$ (proj model)                    & $0.0304$       & $0.0360$       & $0.0183$      & $0.0231$      \\
## Num. groups: age                           & $$             & $$             & $58$          & $58$          \\
## Num. groups: social\_media\_YT             & $$             & $$             & $2$           & $2$           \\
## Num. groups: social\_media\_nonUser        & $$             & $$             & $2$           & $2$           \\
## Num. groups: social\_media\_user           & $$             & $$             & $2$           & $2$           \\
## Num. groups: social\_media\_use\_numeric   & $$             & $$             & $4$           & $4$           \\
## Num. groups: website\_use\_numeric         & $$             & $$             & $4$           & $4$           \\
## Num. groups: genderFemale                  & $$             & $$             & $2$           & $2$           \\
## Num. groups: raceAsian                     & $$             & $$             & $2$           & $2$           \\
## Num. groups: raceBlack                     & $$             & $$             & $2$           & $2$           \\
## Num. groups: raceHispanic                  & $$             & $$             & $2$           & $2$           \\
## Num. groups: raceWhite                     & $$             & $$             & $2$           & $2$           \\
## Num. groups: raceOther                     & $$             & $$             & $2$           & $2$           \\
## Num. groups: eduHighSchoolOrLess           & $$             & $$             & $2$           & $2$           \\
## Num. groups: eduSomeCollege                & $$             & $$             & $2$           & $2$           \\
## Num. groups: eduBachelor                   & $$             & $$             & $2$           & $2$           \\
## Num. groups: eduPostGrad                   & $$             & $$             & $2$           & $2$           \\
## Num. groups: polpartyDem                   & $$             & $$             & $2$           & $2$           \\
## Num. groups: polpartyRep                   & $$             & $$             & $2$           & $2$           \\
## Num. groups: polpartyOther                 & $$             & $$             & $2$           & $2$           \\
## Num. groups: libcons\_numeric              & $$             & $$             & $5$           & $5$           \\
## Num. groups: income\_numeric               & $$             & $$             & $6$           & $6$           \\
## Num. groups: social\_media\_reply\_numeric & $$             & $$             & $6$           & $6$           \\
## Num. groups: review\_freq\_numeric         & $$             & $$             & $6$           & $6$           \\
## \hline
## \multicolumn{5}{l}{\scriptsize{Standard errors are clustered at the user level.}}
## \end{tabular}
## \caption{Panel Regression Results}
## \label{tab:panel_regression}
## \end{center}
## \end{table}

Sentiment Similarities

Sentiment Similarities Outcome ~ treatment + video + memory question (+ demographics)

GPT Embedding Label Only

We ask GPT to label each comment as positive, negative, or neutral. We then compare whether the comment from the main experiment and the comment from the follow-up experiment carry the same sentiment label: the outcome is 1 if the labels match and 0 otherwise. Around 10 comments (out of roughly 5,000) were labeled as mixed. In those cases, we code the pair as 0 only if the other comment is labeled neutral.
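
The coding rule described above can be sketched as follows. This is one reading of the rule, not the code that produced sentiment_similarity_gpt (that is computed outside this notebook): the function name `sentiment_match`, the label string "mixed", and the treatment of a pair in which both comments are mixed (coded 1 here) are assumptions.

```r
# Hypothetical sketch of the label comparison.
# Assumption: labels are "positive", "negative", "neutral", or "mixed".
sentiment_match <- function(main, followup) {
  any_mixed   <- main == "mixed" | followup == "mixed"
  any_neutral <- main == "neutral" | followup == "neutral"
  ifelse(any_mixed,
         as.numeric(!any_neutral),        # mixed pair: 0 only if the other label is neutral
         as.numeric(main == followup))    # otherwise: 1 if the labels match
}

sentiment_match(
  main     = c("positive", "positive", "mixed",   "mixed",    "neutral"),
  followup = c("positive", "negative", "neutral", "negative", "neutral")
)
# 1 0 0 1 1
```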

feols(sentiment_similarity_gpt ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_gpt
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value  Pr(>|t|)    
## TreatmentHint Control       -0.008728   0.028689 -0.304229 0.7610210    
## TreatmentOne-Click Generate  0.007104   0.029076  0.244321 0.8070360    
## TreatmentChat Generate       0.030178   0.030008  1.005661 0.3148387    
## memory_perc                  0.027205   0.041369  0.657621 0.5109435    
## memory_numeric               0.037526   0.012080  3.106414 0.0019508 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.46302     Adj. R2: 0.0415  
##                 Within R2: 0.007439

feols(sentiment_similarity_gpt ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_gpt
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3,  age: 58,  social_media_YT: 2,  social_media_nonUser: 2,  social_media_user: 2,  social_media_use_numeric: 4,  website_use_numeric: 4,  genderFemale: 2,  raceAsian: 2,  raceBlack: 2,  raceHispanic: 2,  raceWhite: 2,  raceOther: 2,  eduHighSchoolOrLess: 2,  eduSomeCollege: 2,  eduBachelor: 2,  eduPostGrad: 2,  polpartyDem: 2,  polpartyRep: 2,  polpartyOther: 2,  libcons_numeric: 5,  income_numeric: 6,  social_media_reply_numeric: 6,  review_freq_numeric: 6
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value  Pr(>|t|)    
## TreatmentHint Control       -0.010922   0.029413 -0.371331 0.7104752    
## TreatmentOne-Click Generate  0.003227   0.029927  0.107839 0.9141462    
## TreatmentChat Generate       0.035493   0.031862  1.113953 0.2655852    
## memory_perc                  0.075068   0.044307  1.694264 0.0905477 .  
## memory_numeric               0.035125   0.012334  2.847855 0.0044979 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.452473     Adj. R2: 0.046087
##                  Within R2: 0.008563

Sentiment Label

# Label a single sentiment score (-1 to 1) by its sign; "neutral" (score == 0) is the reference level.
label_sentiment <- function(score) {
  labels <- ifelse(score == 0, "neutral", ifelse(score > 0, "positive", "negative"))
  relevel(as.factor(labels), ref = "neutral")
}

df_followup_panel_final_nonempty_with_sim_score$sentiment_label <- label_sentiment(df_followup_panel_final_nonempty_with_sim_score$sentiment_score)
df_followup_panel_final_nonempty_with_sim_score$sentiment_label_followup <- label_sentiment(df_followup_panel_final_nonempty_with_sim_score$sentiment_score_followup)

# 1 if the main and followup reviews share the same sentiment label, 0 otherwise
df_followup_panel_final_nonempty_with_sim_score$sentiment_label_similarity <- ifelse(df_followup_panel_final_nonempty_with_sim_score$sentiment_label == df_followup_panel_final_nonempty_with_sim_score$sentiment_label_followup, 1, 0)

feols(sentiment_label_similarity ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_label_similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value Pr(>|t|)    
## TreatmentHint Control       0.004606   0.028498 0.161613 0.871645    
## TreatmentOne-Click Generate 0.012879   0.028973 0.444518 0.656771    
## TreatmentChat Generate      0.010169   0.030316 0.335424 0.737381    
## memory_perc                 0.044655   0.040703 1.097077 0.272890    
## memory_numeric              0.022571   0.011954 1.888084 0.059324 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.459342     Adj. R2: 0.046857
##                  Within R2: 0.003737

feols(sentiment_label_similarity ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_label_similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3,  age: 58,  social_media_YT: 2,  social_media_nonUser: 2,  social_media_user: 2,  social_media_use_numeric: 4,  website_use_numeric: 4,  genderFemale: 2,  raceAsian: 2,  raceBlack: 2,  raceHispanic: 2,  raceWhite: 2,  raceOther: 2,  eduHighSchoolOrLess: 2,  eduSomeCollege: 2,  eduBachelor: 2,  eduPostGrad: 2,  polpartyDem: 2,  polpartyRep: 2,  polpartyOther: 2,  libcons_numeric: 5,  income_numeric: 6,  social_media_reply_numeric: 6,  review_freq_numeric: 6
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value Pr(>|t|)    
## TreatmentHint Control       0.000414   0.028343 0.014593 0.988360    
## TreatmentOne-Click Generate 0.006966   0.029786 0.233857 0.815147    
## TreatmentChat Generate      0.018888   0.031329 0.602902 0.546720    
## memory_perc                 0.044246   0.044293 0.998935 0.318084    
## memory_numeric              0.022626   0.012379 1.827780 0.067901 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.447578     Adj. R2: 0.056907
##                  Within R2: 0.003241

Triple Sentiment Score / Sentiment Segments (Pos, Neu, Neg)

We use the twitter-roberta-base-{sentiment}-latest model to generate sentiment scores for each sentence in the review. The model produces three scores (positive, neutral, and negative), each ranging between 0 and 1 (roughly the proportion of each sentiment). We then take the absolute difference between the main review's score and the followup review's score, separately for positive, neutral, and negative sentiment. The sentiment similarity is 1 minus the average of these three absolute differences.
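The calculation described above can be sketched as follows. This is a minimal illustration, not the pipeline that produced the data: the input columns (pos_main, pos_followup, and so on) and their values are hypothetical, while the sentiment_diff_* and sentiment_similarity names mirror the columns used in the models below.

```r
# Hypothetical positive/neutral/negative scores for two review pairs (not real data)
scores <- data.frame(
  pos_main     = c(0.70, 0.10), neu_main     = c(0.20, 0.30), neg_main     = c(0.10, 0.60),
  pos_followup = c(0.50, 0.10), neu_followup = c(0.30, 0.30), neg_followup = c(0.20, 0.60)
)

# Absolute difference between followup and main review, one per sentiment
scores$sentiment_diff_pos <- abs(scores$pos_followup - scores$pos_main)
scores$sentiment_diff_neu <- abs(scores$neu_followup - scores$neu_main)
scores$sentiment_diff_neg <- abs(scores$neg_followup - scores$neg_main)

# Similarity = 1 - average of the three absolute differences
diff_cols <- c("sentiment_diff_pos", "sentiment_diff_neu", "sentiment_diff_neg")
scores$sentiment_similarity <- 1 - rowMeans(scores[, diff_cols])

print(scores[, c(diff_cols, "sentiment_similarity")])
```

In this toy input the second pair has identical scores in both reviews, so its similarity is 1; the first pair differs on all three scores and scores lower.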

feols(sentiment_similarity ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value Pr(>|t|)    
## TreatmentHint Control       0.007151   0.012778 0.559642 0.575858    
## TreatmentOne-Click Generate 0.002212   0.013255 0.166844 0.867529    
## TreatmentChat Generate      0.016534   0.013840 1.194584 0.232552    
## memory_perc                 0.015360   0.018162 0.845743 0.397912    
## memory_numeric              0.012421   0.005439 2.283702 0.022612 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.209759     Adj. R2: 0.043362
##                  Within R2: 0.004833

feols(sentiment_similarity ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3,  age: 58,  social_media_YT: 2,  social_media_nonUser: 2,  social_media_user: 2,  social_media_use_numeric: 4,  website_use_numeric: 4,  genderFemale: 2,  raceAsian: 2,  raceBlack: 2,  raceHispanic: 2,  raceWhite: 2,  raceOther: 2,  eduHighSchoolOrLess: 2,  eduSomeCollege: 2,  eduBachelor: 2,  eduPostGrad: 2,  polpartyDem: 2,  polpartyRep: 2,  polpartyOther: 2,  libcons_numeric: 5,  income_numeric: 6,  social_media_reply_numeric: 6,  review_freq_numeric: 6
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value Pr(>|t|)    
## TreatmentHint Control       0.003889   0.012948 0.300350 0.763977    
## TreatmentOne-Click Generate 0.000821   0.013675 0.060032 0.952143    
## TreatmentChat Generate      0.016609   0.014600 1.137618 0.255571    
## memory_perc                 0.014230   0.019411 0.733065 0.463702    
## memory_numeric              0.011101   0.005660 1.961161 0.050156 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.204548     Adj. R2: 0.051956
##                  Within R2: 0.003528

Split by Sentiment (Positive, Neutral, Negative): here we examine similarity separately for each of the three sentiment scores.

Positive

df_followup_panel_final_nonempty_with_sim_score$sentiment_similarity_pos <- 1 - df_followup_panel_final_nonempty_with_sim_score$sentiment_diff_pos

feols(sentiment_similarity_pos ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_pos
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value Pr(>|t|) 
## TreatmentHint Control        0.005909   0.018683  0.316301  0.75184 
## TreatmentOne-Click Generate -0.012968   0.019567 -0.662743  0.50766 
## TreatmentChat Generate       0.000971   0.020334  0.047743  0.96193 
## memory_perc                  0.026811   0.026591  1.008267  0.31359 
## memory_numeric               0.012317   0.007845  1.569994  0.11675 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.314806     Adj. R2: 0.022674
##                  Within R2: 0.002801

feols(sentiment_similarity_pos ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_pos
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3,  age: 58,  social_media_YT: 2,  social_media_nonUser: 2,  social_media_user: 2,  social_media_use_numeric: 4,  website_use_numeric: 4,  genderFemale: 2,  raceAsian: 2,  raceBlack: 2,  raceHispanic: 2,  raceWhite: 2,  raceOther: 2,  eduHighSchoolOrLess: 2,  eduSomeCollege: 2,  eduBachelor: 2,  eduPostGrad: 2,  polpartyDem: 2,  polpartyRep: 2,  polpartyOther: 2,  libcons_numeric: 5,  income_numeric: 6,  social_media_reply_numeric: 6,  review_freq_numeric: 6
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value Pr(>|t|) 
## TreatmentHint Control        0.002473   0.019019  0.130053  0.89655 
## TreatmentOne-Click Generate -0.013535   0.020144 -0.671950  0.50178 
## TreatmentChat Generate       0.001542   0.021086  0.073126  0.94172 
## memory_perc                  0.018954   0.028637  0.661875  0.50821 
## memory_numeric               0.011374   0.008166  1.392854  0.16399 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.306933     Adj. R2: 0.031785
##                  Within R2: 0.00183

Neutral

df_followup_panel_final_nonempty_with_sim_score$sentiment_similarity_neu <- 1 - df_followup_panel_final_nonempty_with_sim_score$sentiment_diff_neu

feols(sentiment_similarity_neu ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_neu
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value Pr(>|t|)    
## TreatmentHint Control       0.000435   0.011639 0.037355 0.970210    
## TreatmentOne-Click Generate 0.008279   0.011658 0.710115 0.477810    
## TreatmentChat Generate      0.018365   0.012336 1.488667 0.136912    
## memory_perc                 0.019138   0.017159 1.115319 0.265000    
## memory_numeric              0.012637   0.004942 2.557238 0.010707 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.180165     Adj. R2: 0.028224
##                  Within R2: 0.00793

feols(sentiment_similarity_neu ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_neu
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3,  age: 58,  social_media_YT: 2,  social_media_nonUser: 2,  social_media_user: 2,  social_media_use_numeric: 4,  website_use_numeric: 4,  genderFemale: 2,  raceAsian: 2,  raceBlack: 2,  raceHispanic: 2,  raceWhite: 2,  raceOther: 2,  eduHighSchoolOrLess: 2,  eduSomeCollege: 2,  eduBachelor: 2,  eduPostGrad: 2,  polpartyDem: 2,  polpartyRep: 2,  polpartyOther: 2,  libcons_numeric: 5,  income_numeric: 6,  social_media_reply_numeric: 6,  review_freq_numeric: 6
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value Pr(>|t|)    
## TreatmentHint Control       -0.000026   0.011573 -0.002262 0.998195    
## TreatmentOne-Click Generate  0.006822   0.011948  0.570980 0.568150    
## TreatmentChat Generate       0.018267   0.013324  1.371046 0.170689    
## memory_perc                  0.009744   0.017499  0.556865 0.577753    
## memory_numeric               0.012549   0.005325  2.356718 0.018642 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.176366     Adj. R2: 0.029518
##                  Within R2: 0.005501

Negative

df_followup_panel_final_nonempty_with_sim_score$sentiment_similarity_neg <- 1 - df_followup_panel_final_nonempty_with_sim_score$sentiment_diff_neg

feols(sentiment_similarity_neg ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_neg
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value Pr(>|t|)    
## TreatmentHint Control       0.015109   0.014594 1.035241 0.300824    
## TreatmentOne-Click Generate 0.011324   0.015271 0.741515 0.458567    
## TreatmentChat Generate      0.030265   0.015997 1.891913 0.058811 .  
## memory_perc                 0.000132   0.019779 0.006683 0.994669    
## memory_numeric              0.012308   0.006115 2.012861 0.044415 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.251499     Adj. R2: 0.054397
##                  Within R2: 0.003549

feols(sentiment_similarity_neg ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_neg
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3,  age: 58,  social_media_YT: 2,  social_media_nonUser: 2,  social_media_user: 2,  social_media_use_numeric: 4,  website_use_numeric: 4,  genderFemale: 2,  raceAsian: 2,  raceBlack: 2,  raceHispanic: 2,  raceWhite: 2,  raceOther: 2,  eduHighSchoolOrLess: 2,  eduSomeCollege: 2,  eduBachelor: 2,  eduPostGrad: 2,  polpartyDem: 2,  polpartyRep: 2,  polpartyOther: 2,  libcons_numeric: 5,  income_numeric: 6,  social_media_reply_numeric: 6,  review_freq_numeric: 6
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value Pr(>|t|)    
## TreatmentHint Control       0.009219   0.014788 0.623463 0.533132    
## TreatmentOne-Click Generate 0.009177   0.016020 0.572828 0.566899    
## TreatmentChat Generate      0.030018   0.016618 1.806344 0.071186 .  
## memory_perc                 0.013991   0.022017 0.635438 0.525298    
## memory_numeric              0.009379   0.006160 1.522642 0.128186    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.245172     Adj. R2: 0.063499
##                  Within R2: 0.002962
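As a consistency check on the definition of sentiment_similarity: since 1 minus the average of the three differences equals the average of the three component similarities, and the regressions use the same sample and regressors, the coefficients of the overall model should equal the average of the positive, neutral, and negative coefficients. A short sketch of that arithmetic, using only the point estimates reported above for the models with Video.Id, order, and orderFollowup fixed effects (rounding to six decimals is the only expected source of discrepancy):

```r
# Point estimates copied from the reported outputs (models without demographic fixed effects)
coef_names <- c("Hint Control", "One-Click Generate", "Chat Generate",
                "memory_perc", "memory_numeric")
overall <- c(0.007151,  0.002212, 0.016534, 0.015360, 0.012421)  # sentiment_similarity
pos     <- c(0.005909, -0.012968, 0.000971, 0.026811, 0.012317)  # sentiment_similarity_pos
neu     <- c(0.000435,  0.008279, 0.018365, 0.019138, 0.012637)  # sentiment_similarity_neu
neg     <- c(0.015109,  0.011324, 0.030265, 0.000132, 0.012308)  # sentiment_similarity_neg

# Average of the three component coefficients, compared with the overall model
component_mean <- (pos + neu + neg) / 3
check <- data.frame(coefficient = coef_names,
                    overall = overall,
                    component_mean = component_mean,
                    gap = abs(overall - component_mean))
print(check)
```

The gaps are at the level of rounding error, which is what the definition implies.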

Single Sentiment Score (-1 to 1)

We assign a sentiment score based on the confidence level of the predicted label. If the label is positive, we use the confidence score directly; if the label is negative, we use the negated confidence score; if the label is neutral, we use 0. This yields a sentiment score between -1 and 1.
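A minimal sketch of this mapping, assuming the classifier returns one label and one confidence value per review. The label and confidence vectors below are hypothetical inputs for illustration, not the actual model output:

```r
# Hypothetical classifier output: predicted label and its confidence (0 to 1)
label      <- c("positive", "negative", "neutral")
confidence <- c(0.92, 0.75, 0.60)

# Positive -> +confidence, negative -> -confidence, neutral -> 0
sentiment_score <- ifelse(label == "positive", confidence,
                          ifelse(label == "negative", -confidence, 0))

print(data.frame(label, confidence, sentiment_score))
```

Under this mapping the confidence of a neutral prediction is discarded, so every neutral review receives the same score of 0.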

We measure change in terms of two differences (instead of a similarity):

Absolute Difference
Followup - Main Difference

df_followup_panel_final_nonempty_with_sim_score$sentiment_score_diff_abs <- abs(df_followup_panel_final_nonempty_with_sim_score$sentiment_score_followup - df_followup_panel_final_nonempty_with_sim_score$sentiment_score)
df_followup_panel_final_nonempty_with_sim_score$sentiment_score_diff <- df_followup_panel_final_nonempty_with_sim_score$sentiment_score_followup - df_followup_panel_final_nonempty_with_sim_score$sentiment_score
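
The two measures answer different questions: the absolute difference captures how far sentiment moved regardless of direction, while the signed difference captures whether it moved up or down. A toy illustration with made-up scores (hypothetical values, not taken from the data) shows how opposite shifts cancel in the signed difference but not in the absolute one:

```r
# Hypothetical single sentiment scores (-1 to 1) for three reviews; not real data
sentiment_score          <- c(0.8, -0.5, 0.3)  # main review
sentiment_score_followup <- c(0.2,  0.1, 0.3)  # followup review

# Signed difference (followup - main) and its absolute value
sentiment_score_diff     <- sentiment_score_followup - sentiment_score
sentiment_score_diff_abs <- abs(sentiment_score_diff)

mean(sentiment_score_diff)      # opposite shifts cancel: approximately 0
mean(sentiment_score_diff_abs)  # average magnitude of change: approximately 0.4
```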

feols(sentiment_score_diff_abs ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_score_diff_abs
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value Pr(>|t|)    
## TreatmentHint Control       -0.021849   0.031451 -0.694703 0.487414    
## TreatmentOne-Click Generate -0.003921   0.033579 -0.116779 0.907060    
## TreatmentChat Generate      -0.024408   0.034590 -0.705640 0.480587    
## memory_perc                 -0.029454   0.043619 -0.675254 0.499681    
## memory_numeric              -0.024362   0.013143 -1.853532 0.064121 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.539498     Adj. R2: 0.03625
##                  Within R2: 0.00268

feols(sentiment_score_diff_abs ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_score_diff_abs
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3,  age: 58,  social_media_YT: 2,  social_media_nonUser: 2,  social_media_user: 2,  social_media_use_numeric: 4,  website_use_numeric: 4,  genderFemale: 2,  raceAsian: 2,  raceBlack: 2,  raceHispanic: 2,  raceWhite: 2,  raceOther: 2,  eduHighSchoolOrLess: 2,  eduSomeCollege: 2,  eduBachelor: 2,  eduPostGrad: 2,  polpartyDem: 2,  polpartyRep: 2,  polpartyOther: 2,  libcons_numeric: 5,  income_numeric: 6,  social_media_reply_numeric: 6,  review_freq_numeric: 6
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value Pr(>|t|) 
## TreatmentHint Control       -0.014233   0.031783 -0.447811  0.65439 
## TreatmentOne-Click Generate -0.001685   0.034794 -0.048433  0.96138 
## TreatmentChat Generate      -0.027518   0.035517 -0.774796  0.43866 
## memory_perc                 -0.042414   0.047788 -0.887542  0.37502 
## memory_numeric              -0.020811   0.013441 -1.548398  0.12186 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.525298     Adj. R2: 0.047803
##                  Within R2: 0.002204

feols(sentiment_score_diff ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup,
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_score_diff
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value    Pr(>|t|)    
## TreatmentHint Control       -0.017154   0.040116 -0.427622 0.669024586    
## TreatmentOne-Click Generate -0.157286   0.042448 -3.705353 0.000223473 ***
## TreatmentChat Generate      -0.188945   0.043685 -4.325117 0.000016881 ***
## memory_perc                  0.003025   0.057903  0.052250 0.958340925    
## memory_numeric              -0.012901   0.017540 -0.735556 0.462185404    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.679325     Adj. R2: 0.027227
##                  Within R2: 0.014558

feols(sentiment_score_diff ~ Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_score_diff
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3,  age: 58,  social_media_YT: 2,  social_media_nonUser: 2,  social_media_user: 2,  social_media_use_numeric: 4,  website_use_numeric: 4,  genderFemale: 2,  raceAsian: 2,  raceBlack: 2,  raceHispanic: 2,  raceWhite: 2,  raceOther: 2,  eduHighSchoolOrLess: 2,  eduSomeCollege: 2,  eduBachelor: 2,  eduPostGrad: 2,  polpartyDem: 2,  polpartyRep: 2,  polpartyOther: 2,  libcons_numeric: 5,  income_numeric: 6,  social_media_reply_numeric: 6,  review_freq_numeric: 6
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value     Pr(>|t|)    
## TreatmentHint Control       -0.021114   0.041161 -0.512956 0.6081027989    
## TreatmentOne-Click Generate -0.183598   0.044763 -4.101526 0.0000446143 ***
## TreatmentChat Generate      -0.193737   0.043243 -4.480210 0.0000083785 ***
## memory_perc                  0.023534   0.062379  0.377268 0.7060599692    
## memory_numeric              -0.027374   0.018049 -1.516660 0.1296901171    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.66234     Adj. R2: 0.036285
##                 Within R2: 0.016373

Similarity Mechanism Understanding - Sentiment

For each score we investigate (sentiment label, sentiment segments, single sentiment score), the analysis has the following structure:

Followup
Main
Similarity

For the single sentiment score, we use the following differences instead of similarity:

Absolute Difference
Difference (Followup - Main)

Within each of these, we also run a subgroup analysis by treatment.

Sentiment Label

Followup Reviews

feols(similarity ~ sentiment_label_followup + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), 
      cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                   Estimate Std. Error   t value   Pr(>|t|)    
## sentiment_label_followupnegative -0.000431   0.013709 -0.031440 0.97492520    
## sentiment_label_followuppositive  0.045243   0.012224  3.701168 0.00022714 ***
## memory_perc                       0.070595   0.020087  3.514482 0.00046172 ***
## memory_numeric                    0.008143   0.005888  1.382991 0.16699739    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184763     Adj. R2: 0.04054 
##                  Within R2: 0.028628

Subgroup Analysis

feols(similarity ~ Treatment * sentiment_label_followup + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                               Estimate
## TreatmentHint Control                                        -0.013451
## TreatmentOne-Click Generate                                   0.093021
## TreatmentChat Generate                                        0.109102
## sentiment_label_followupnegative                              0.033982
## sentiment_label_followuppositive                              0.069918
## memory_perc                                                   0.061002
## memory_numeric                                                0.008915
## TreatmentHint Control:sentiment_label_followupnegative        0.002757
## TreatmentOne-Click Generate:sentiment_label_followupnegative -0.050873
## TreatmentChat Generate:sentiment_label_followupnegative      -0.090534
## TreatmentHint Control:sentiment_label_followuppositive        0.021026
## TreatmentOne-Click Generate:sentiment_label_followuppositive -0.051497
## TreatmentChat Generate:sentiment_label_followuppositive      -0.064166
##                                                              Std. Error
## TreatmentHint Control                                          0.031233
## TreatmentOne-Click Generate                                    0.031243
## TreatmentChat Generate                                         0.031380
## sentiment_label_followupnegative                               0.027725
## sentiment_label_followuppositive                               0.025232
## memory_perc                                                    0.020046
## memory_numeric                                                 0.005728
## TreatmentHint Control:sentiment_label_followupnegative         0.037833
## TreatmentOne-Click Generate:sentiment_label_followupnegative   0.036351
## TreatmentChat Generate:sentiment_label_followupnegative        0.039033
## TreatmentHint Control:sentiment_label_followuppositive         0.034032
## TreatmentOne-Click Generate:sentiment_label_followuppositive   0.033135
## TreatmentChat Generate:sentiment_label_followuppositive        0.034084
##                                                                t value
## TreatmentHint Control                                        -0.430669
## TreatmentOne-Click Generate                                   2.977346
## TreatmentChat Generate                                        3.476842
## sentiment_label_followupnegative                              1.225666
## sentiment_label_followuppositive                              2.770964
## memory_perc                                                   3.043163
## memory_numeric                                                1.556499
## TreatmentHint Control:sentiment_label_followupnegative        0.072867
## TreatmentOne-Click Generate:sentiment_label_followupnegative -1.399507
## TreatmentChat Generate:sentiment_label_followupnegative      -2.319449
## TreatmentHint Control:sentiment_label_followuppositive        0.617833
## TreatmentOne-Click Generate:sentiment_label_followuppositive -1.554163
## TreatmentChat Generate:sentiment_label_followuppositive      -1.882601
##                                                                Pr(>|t|)    
## TreatmentHint Control                                        0.66680841    
## TreatmentOne-Click Generate                                  0.00298246 ** 
## TreatmentChat Generate                                       0.00053069 ***
## sentiment_label_followupnegative                             0.22063266    
## sentiment_label_followuppositive                             0.00569981 ** 
## memory_perc                                                  0.00240643 ** 
## memory_numeric                                               0.11992744    
## TreatmentHint Control:sentiment_label_followupnegative       0.94192754    
## TreatmentOne-Click Generate:sentiment_label_followupnegative 0.16199219    
## TreatmentChat Generate:sentiment_label_followupnegative      0.02058516 *  
## TreatmentHint Control:sentiment_label_followuppositive       0.53683593    
## TreatmentOne-Click Generate:sentiment_label_followuppositive 0.12048362    
## TreatmentChat Generate:sentiment_label_followuppositive      0.06006442 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.182694     Adj. R2: 0.05842
##                  Within R2: 0.05027
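
In the interaction model above, each interaction coefficient is a difference relative to the treatment effect for neutral followup reviews (the reference label). The implied treatment effect within a sentiment label is therefore the main treatment coefficient plus the matching interaction term. A minimal sketch of that arithmetic, using only the point estimates reported above; standard errors for these sums would require the coefficient covariance matrix and are not computed here:

```r
# Point estimates copied from the Treatment * sentiment_label_followup output above
treatments     <- c("Hint Control", "One-Click Generate", "Chat Generate")
main_effect    <- c(-0.013451,  0.093021,  0.109102)  # effect when the followup label is neutral
inter_negative <- c( 0.002757, -0.050873, -0.090534)  # Treatment x negative
inter_positive <- c( 0.021026, -0.051497, -0.064166)  # Treatment x positive

# Implied treatment effect on similarity within each followup sentiment label
implied_effects <- data.frame(
  treatment = treatments,
  neutral   = main_effect,
  negative  = main_effect + inter_negative,
  positive  = main_effect + inter_positive
)
print(implied_effects)
```

For example, the implied Chat Generate effect among positive followup reviews is 0.109102 - 0.064166 = 0.044936, noticeably smaller than the 0.109102 estimated for neutral followup reviews.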

Main Reviews

feols(similarity ~ sentiment_label + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), 
      cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                         Estimate Std. Error t value   Pr(>|t|)    
## sentiment_labelnegative 0.023276   0.017171 1.35557 0.17556222    
## sentiment_labelpositive 0.045651   0.014759 3.09308 0.00203964 ** 
## memory_perc             0.069504   0.020305 3.42293 0.00064635 ***
## memory_numeric          0.008627   0.005913 1.45907 0.14488098    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.185391     Adj. R2: 0.034011
##                  Within R2: 0.022018

Subgroup Analysis

feols(similarity ~ Treatment * sentiment_label + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                      Estimate Std. Error
## TreatmentHint Control                               -0.027127   0.035619
## TreatmentOne-Click Generate                          0.072855   0.039701
## TreatmentChat Generate                               0.144352   0.043914
## sentiment_labelnegative                              0.043163   0.032798
## sentiment_labelpositive                              0.046967   0.028915
## memory_perc                                          0.059732   0.020130
## memory_numeric                                       0.008821   0.005761
## TreatmentHint Control:sentiment_labelnegative        0.011891   0.042570
## TreatmentOne-Click Generate:sentiment_labelnegative -0.036405   0.048524
## TreatmentChat Generate:sentiment_labelnegative      -0.119854   0.051355
## TreatmentHint Control:sentiment_labelpositive        0.036023   0.037493
## TreatmentOne-Click Generate:sentiment_labelpositive -0.027260   0.040265
## TreatmentChat Generate:sentiment_labelpositive      -0.103142   0.044996
##                                                       t value  Pr(>|t|)    
## TreatmentHint Control                               -0.761587 0.4464984    
## TreatmentOne-Click Generate                          1.835090 0.0668096 .  
## TreatmentChat Generate                               3.287182 0.0010496 ** 
## sentiment_labelnegative                              1.316022 0.1884888    
## sentiment_labelpositive                              1.624308 0.1046469    
## memory_perc                                          2.967335 0.0030804 ** 
## memory_numeric                                       1.531064 0.1260916    
## TreatmentHint Control:sentiment_labelnegative        0.279325 0.7800571    
## TreatmentOne-Click Generate:sentiment_labelnegative -0.750248 0.4532937    
## TreatmentChat Generate:sentiment_labelnegative      -2.333856 0.0198141 *  
## TreatmentHint Control:sentiment_labelpositive        0.960788 0.3369070    
## TreatmentOne-Click Generate:sentiment_labelpositive -0.677013 0.4985650    
## TreatmentChat Generate:sentiment_labelpositive      -2.292248 0.0221125 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.183555     Adj. R2: 0.049518
##                  Within R2: 0.041291

Label Similarity

feols(similarity ~ sentiment_label_similarity + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), 
      cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                            Estimate Std. Error t value             Pr(>|t|)    
## sentiment_label_similarity 0.064773   0.008570 7.55794 0.000000000000097453 ***
## memory_perc                0.067688   0.020101 3.36746 0.000789565473174114 ***
## memory_numeric             0.007588   0.005849 1.29739 0.194817318904131870    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.183541     Adj. R2: 0.053589
##                  Within R2: 0.041444

Subgroup Analysis

feols(similarity ~ Treatment * sentiment_label_similarity + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                         Estimate Std. Error
## TreatmentHint Control                                   0.005797   0.020807
## TreatmentOne-Click Generate                             0.076900   0.020727
## TreatmentChat Generate                                  0.089177   0.020901
## sentiment_label_similarity                              0.091071   0.017589
## memory_perc                                             0.059284   0.019852
## memory_numeric                                          0.007636   0.005731
## TreatmentHint Control:sentiment_label_similarity       -0.007508   0.023862
## TreatmentOne-Click Generate:sentiment_label_similarity -0.043971   0.022945
## TreatmentChat Generate:sentiment_label_similarity      -0.062129   0.024084
##                                                          t value      Pr(>|t|)
## TreatmentHint Control                                   0.278583 0.78062621794
## TreatmentOne-Click Generate                             3.710208 0.00021929217
## TreatmentChat Generate                                  4.266529 0.00002187218
## sentiment_label_similarity                              5.177852 0.00000027478
## memory_perc                                             2.986247 0.00289780671
## memory_numeric                                          1.332454 0.18303546970
## TreatmentHint Control:sentiment_label_similarity       -0.314651 0.75309672506
## TreatmentOne-Click Generate:sentiment_label_similarity -1.916368 0.05562255708
## TreatmentChat Generate:sentiment_label_similarity      -2.579641 0.01004181998
##                                                           
## TreatmentHint Control                                     
## TreatmentOne-Click Generate                            ***
## TreatmentChat Generate                                 ***
## sentiment_label_similarity                             ***
## memory_perc                                            ** 
## memory_numeric                                            
## TreatmentHint Control:sentiment_label_similarity          
## TreatmentOne-Click Generate:sentiment_label_similarity .  
## TreatmentChat Generate:sentiment_label_similarity      *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.181648     Adj. R2: 0.070703
##                  Within R2: 0.061106
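
As a reading aid for the interaction table above: because the model includes `Treatment * sentiment_label_similarity`, the slope of `sentiment_label_similarity` within a treatment arm is the main-effect coefficient plus that arm's interaction coefficient. The sketch below is a minimal illustration using only the point estimates printed above. The vector names are illustrative, and the omitted reference level of `Treatment` is not named in the output, so it is labeled generically. Standard errors for these sums require the coefficient covariance matrix, which is not shown, so only point estimates are computed.

```r
# Point estimates copied from the printed summary above.
base_slope <- 0.091071  # sentiment_label_similarity (slope in the reference arm)

# Interaction terms of the form Treatment<arm>:sentiment_label_similarity
interaction <- c(
  "Hint Control"       = -0.007508,
  "One-Click Generate" = -0.043971,
  "Chat Generate"      = -0.062129
)

# Implied slope per arm = base slope + that arm's interaction coefficient.
implied_slope <- c("(reference arm)" = base_slope, base_slope + interaction)
print(round(implied_slope, 6))
```

Under these estimates the implied slope falls from about 0.091 in the reference arm to about 0.029 under Chat Generate.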

Sentiment Segments

We use the sentiment segments of the follow-up responses because every follow-up response is written by the user. This applies to all modes.

Followup Reviews

feols(similarity ~ text_sent_pos_followup + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), 
      cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                        Estimate Std. Error t value        Pr(>|t|)    
## text_sent_pos_followup 0.063036   0.010761 5.85769 0.0000000064945 ***
## memory_perc            0.070658   0.020037 3.52632 0.0004418242087 ***
## memory_numeric         0.007875   0.005891 1.33670 0.1816445949750    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184396     Adj. R2: 0.044748
##                  Within R2: 0.03249

feols(similarity ~ text_sent_neu_followup + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), 
      cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                         Estimate Std. Error  t value    Pr(>|t|)    
## text_sent_neu_followup -0.091177   0.020742 -4.39586 0.000012297 ***
## memory_perc             0.068806   0.020218  3.40317 0.000694330 ***
## memory_numeric          0.008103   0.005904  1.37251 0.170233881    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184993     Adj. R2: 0.038551
##                  Within R2: 0.026213

feols(similarity ~ text_sent_neg_followup + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), 
      cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                         Estimate Std. Error  t value     Pr(>|t|)    
## text_sent_neg_followup -0.060595   0.013402 -4.52121 0.0000069368 ***
## memory_perc             0.071905   0.020216  3.55687 0.0003941091 ***
## memory_numeric          0.008547   0.005934  1.44045 0.1500757532    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.185085     Adj. R2: 0.037591
##                  Within R2: 0.025241
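
The three models above differ only in the sentiment-share predictor, so the formulas could be generated rather than retyped. The sketch below assumes the same variable names as the calls above. `build_similarity_formula` is a hypothetical helper, not part of `fixest`, and the `feols` call is left as a comment because it needs the analysis data frame (called `analysis_data` here purely for illustration).

```r
# Hypothetical helper: builds the fixed-effects formula used in this section
# for a given predictor, optionally interacted with Treatment.
build_similarity_formula <- function(predictor, interact_treatment = FALSE) {
  rhs <- if (interact_treatment) paste0("Treatment * ", predictor) else predictor
  as.formula(paste0(
    "similarity ~ ", rhs,
    " + memory_perc + memory_numeric | Video.Id + order + orderFollowup"
  ))
}

predictors <- c("text_sent_pos_followup",
                "text_sent_neu_followup",
                "text_sent_neg_followup")

# One formula per predictor, named by predictor for easy lookup.
formulas <- lapply(predictors, build_similarity_formula)
names(formulas) <- predictors

# Each formula can then be fitted as in the calls above, for example:
# feols(formulas[["text_sent_pos_followup"]],
#       data = analysis_data, cluster = ~User.Id)
```

Passing `interact_treatment = TRUE` yields the corresponding subgroup specification with the `Treatment` interaction.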

Subgroup Analysis

feols(similarity ~ Treatment * text_sent_pos_followup + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                     Estimate Std. Error
## TreatmentHint Control                              -0.014715   0.023115
## TreatmentOne-Click Generate                         0.067322   0.022481
## TreatmentChat Generate                              0.053449   0.022920
## text_sent_pos_followup                              0.067778   0.021948
## memory_perc                                         0.061929   0.020033
## memory_numeric                                      0.008654   0.005756
## TreatmentHint Control:text_sent_pos_followup        0.023280   0.029839
## TreatmentOne-Click Generate:text_sent_pos_followup -0.030506   0.029011
## TreatmentChat Generate:text_sent_pos_followup      -0.007967   0.029502
##                                                      t value  Pr(>|t|)    
## TreatmentHint Control                              -0.636607 0.5245367    
## TreatmentOne-Click Generate                         2.994637 0.0028200 ** 
## TreatmentChat Generate                              2.331917 0.0199164 *  
## text_sent_pos_followup                              3.088113 0.0020737 ** 
## memory_perc                                         3.091403 0.0020511 ** 
## memory_numeric                                      1.503483 0.1330518    
## TreatmentHint Control:text_sent_pos_followup        0.780195 0.4354733    
## TreatmentOne-Click Generate:text_sent_pos_followup -1.051518 0.2932923    
## TreatmentChat Generate:text_sent_pos_followup      -0.270037 0.7871914    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.182617     Adj. R2: 0.060764
##                  Within R2: 0.051065

feols(similarity ~ Treatment * text_sent_neu_followup + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                     Estimate Std. Error
## TreatmentHint Control                               0.012521   0.016465
## TreatmentOne-Click Generate                         0.026554   0.017079
## TreatmentChat Generate                              0.026834   0.017946
## text_sent_neu_followup                             -0.139115   0.038965
## memory_perc                                         0.059137   0.020183
## memory_numeric                                      0.008792   0.005739
## TreatmentHint Control:text_sent_neu_followup       -0.060286   0.054886
## TreatmentOne-Click Generate:text_sent_neu_followup  0.122249   0.054134
## TreatmentChat Generate:text_sent_neu_followup       0.123178   0.055331
##                                                      t value   Pr(>|t|)    
## TreatmentHint Control                               0.760459 0.44717170    
## TreatmentOne-Click Generate                         1.554762 0.12034085    
## TreatmentChat Generate                              1.495308 0.13517105    
## text_sent_neu_followup                             -3.570258 0.00037477 ***
## memory_perc                                         2.930082 0.00347105 ** 
## memory_numeric                                      1.531998 0.12586087    
## TreatmentHint Control:text_sent_neu_followup       -1.098389 0.27231700    
## TreatmentOne-Click Generate:text_sent_neu_followup  2.258275 0.02415809 *  
## TreatmentChat Generate:text_sent_neu_followup       2.226187 0.02623907 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.182584     Adj. R2: 0.061107
##                  Within R2: 0.051412

feols(similarity ~ Treatment * text_sent_neg_followup + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                     Estimate Std. Error
## TreatmentHint Control                               0.002408   0.014922
## TreatmentOne-Click Generate                         0.048840   0.015841
## TreatmentChat Generate                              0.056687   0.015922
## text_sent_neg_followup                             -0.048111   0.027624
## memory_perc                                         0.064058   0.020187
## memory_numeric                                      0.009146   0.005811
## TreatmentHint Control:text_sent_neg_followup       -0.010078   0.038062
## TreatmentOne-Click Generate:text_sent_neg_followup -0.002419   0.035719
## TreatmentChat Generate:text_sent_neg_followup      -0.043698   0.037593
##                                                      t value   Pr(>|t|)    
## TreatmentHint Control                               0.161391 0.87182001    
## TreatmentOne-Click Generate                         3.083111 0.00210852 ** 
## TreatmentChat Generate                              3.560274 0.00038911 ***
## text_sent_neg_followup                             -1.741644 0.08189922 .  
## memory_perc                                         3.173208 0.00155681 ** 
## memory_numeric                                      1.573823 0.11586631    
## TreatmentHint Control:text_sent_neg_followup       -0.264773 0.79124257    
## TreatmentOne-Click Generate:text_sent_neg_followup -0.067711 0.94603030    
## TreatmentChat Generate:text_sent_neg_followup      -1.162402 0.24536862    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.18348     Adj. R2: 0.051871
##                 Within R2: 0.042081

Main Reviews

feols(similarity ~ text_sent_pos + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                Estimate Std. Error t value     Pr(>|t|)    
## text_sent_pos  0.057949   0.012281 4.71854 0.0000027376 ***
## memory_perc    0.069751   0.020235 3.44709 0.0005918783 ***
## memory_numeric 0.007862   0.005867 1.34003 0.1805599221    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184906     Adj. R2: 0.039457
##                  Within R2: 0.027131

feols(similarity ~ text_sent_neu + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                 Estimate Std. Error  t value       Pr(>|t|)    
## text_sent_neu  -0.131969   0.024498 -5.38692 0.000000090703 ***
## memory_perc     0.067133   0.020247  3.31566 0.000949403667 ***
## memory_numeric  0.008493   0.005860  1.44945 0.147547229318    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184489     Adj. R2: 0.043784
##                  Within R2: 0.031513

feols(similarity ~ text_sent_neg + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                 Estimate Std. Error  t value   Pr(>|t|)    
## text_sent_neg  -0.038185   0.015444 -2.47244 0.01359624 *  
## memory_perc     0.071096   0.020365  3.49105 0.00050359 ***
## memory_numeric  0.008426   0.005937  1.41921 0.15616952    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.185676     Adj. R2: 0.031434
##                  Within R2: 0.019005

Subgroup Analysis

feols(similarity ~ Treatment * text_sent_pos + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                            Estimate Std. Error   t value
## TreatmentHint Control                     -0.030980   0.024712 -1.253634
## TreatmentOne-Click Generate                0.043217   0.024852  1.738961
## TreatmentChat Generate                     0.062034   0.029209  2.123792
## text_sent_pos                              0.039015   0.022462  1.736961
## memory_perc                                0.060278   0.020204  2.983490
## memory_numeric                             0.008255   0.005770  1.430571
## TreatmentHint Control:text_sent_pos        0.045008   0.030512  1.475101
## TreatmentOne-Click Generate:text_sent_pos  0.002285   0.029797  0.076675
## TreatmentChat Generate:text_sent_pos      -0.023922   0.034849 -0.686440
##                                            Pr(>|t|)    
## TreatmentHint Control                     0.2102879    
## TreatmentOne-Click Generate               0.0823703 .  
## TreatmentChat Generate                    0.0339493 *  
## text_sent_pos                             0.0827230 .  
## memory_perc                               0.0029238 ** 
## memory_numeric                            0.1528869    
## TreatmentHint Control:text_sent_pos       0.1405217    
## TreatmentOne-Click Generate:text_sent_pos 0.9388983    
## TreatmentChat Generate:text_sent_pos      0.4926055    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.183472     Adj. R2: 0.05195
##                  Within R2: 0.04216

feols(similarity ~ Treatment * text_sent_neu + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                            Estimate Std. Error   t value
## TreatmentHint Control                      0.015432   0.016162  0.954803
## TreatmentOne-Click Generate                0.039685   0.016735  2.371415
## TreatmentChat Generate                     0.021877   0.017658  1.238922
## text_sent_neu                             -0.120458   0.050497 -2.385473
## memory_perc                                0.055697   0.020127  2.767215
## memory_numeric                             0.008337   0.005715  1.458779
## TreatmentHint Control:text_sent_neu       -0.094137   0.062614 -1.503448
## TreatmentOne-Click Generate:text_sent_neu  0.036544   0.065756  0.555752
## TreatmentChat Generate:text_sent_neu       0.170036   0.077378  2.197459
##                                            Pr(>|t|)    
## TreatmentHint Control                     0.3399240    
## TreatmentOne-Click Generate               0.0179217 *  
## TreatmentChat Generate                    0.2156852    
## text_sent_neu                             0.0172552 *  
## memory_perc                               0.0057652 ** 
## memory_numeric                            0.1449616    
## TreatmentHint Control:text_sent_neu       0.1330609    
## TreatmentOne-Click Generate:text_sent_neu 0.5785134    
## TreatmentChat Generate:text_sent_neu      0.0282321 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.182597     Adj. R2: 0.06097 
##                  Within R2: 0.051273

feols(similarity ~ Treatment * text_sent_neg + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                            Estimate Std. Error   t value
## TreatmentHint Control                      0.005537   0.014857  0.372716
## TreatmentOne-Click Generate                0.050001   0.015602  3.204763
## TreatmentChat Generate                     0.050773   0.015747  3.224221
## text_sent_neg                             -0.011759   0.029386 -0.400143
## memory_perc                                0.062536   0.020330  3.076048
## memory_numeric                             0.009105   0.005828  1.562200
## TreatmentHint Control:text_sent_neg       -0.028031   0.039170 -0.715618
## TreatmentOne-Click Generate:text_sent_neg -0.016625   0.038994 -0.426343
## TreatmentChat Generate:text_sent_neg      -0.025556   0.044536 -0.573839
##                                            Pr(>|t|)    
## TreatmentHint Control                     0.7094446    
## TreatmentOne-Click Generate               0.0013975 ** 
## TreatmentChat Generate                    0.0013069 ** 
## text_sent_neg                             0.6891425    
## memory_perc                               0.0021586 ** 
## memory_numeric                            0.1185789    
## TreatmentHint Control:text_sent_neg       0.4744058    
## TreatmentOne-Click Generate:text_sent_neg 0.6699560    
## TreatmentChat Generate:text_sent_neg      0.5662148    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184237     Adj. R2: 0.044029
##                  Within R2: 0.034157

Sentiment Similarity

feols(similarity ~ sentiment_similarity_pos + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                          Estimate Std. Error t value   Pr(>|t|)    
## sentiment_similarity_pos 0.106286   0.011781 9.02188  < 2.2e-16 ***
## memory_perc              0.067913   0.020056 3.38613 0.00073836 ***
## memory_numeric           0.007770   0.005842 1.33009 0.18381252    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.1829     Adj. R2: 0.060185
##                Within R2: 0.048125

feols(similarity ~ sentiment_similarity_neu + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                          Estimate Std. Error t value             Pr(>|t|)    
## sentiment_similarity_neu 0.182686   0.022799 8.01295 0.000000000000003318 ***
## memory_perc              0.066629   0.020155 3.30589 0.000982731446671672 ***
## memory_numeric           0.006813   0.005824 1.16985 0.242360596084592567    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.182996     Adj. R2: 0.059198
##                  Within R2: 0.047125

feols(similarity ~ sentiment_similarity_neg + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                          Estimate Std. Error t value             Pr(>|t|)    
## sentiment_similarity_neg 0.113231   0.014456 7.83263 0.000000000000012922 ***
## memory_perc              0.070245   0.020024 3.50808 0.000472816277513144 ***
## memory_numeric           0.007736   0.005851 1.32213 0.186448814647865696    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.183739     Adj. R2: 0.051537
##                  Within R2: 0.039366

Subgroup Analysis

feols(similarity ~ Treatment * sentiment_similarity_pos + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                       Estimate Std. Error
## TreatmentHint Control                                -0.003193   0.027965
## TreatmentOne-Click Generate                           0.102000   0.027047
## TreatmentChat Generate                                0.105390   0.028635
## sentiment_similarity_pos                              0.142795   0.025488
## memory_perc                                           0.060048   0.019978
## memory_numeric                                        0.007649   0.005741
## TreatmentHint Control:sentiment_similarity_pos        0.005109   0.033656
## TreatmentOne-Click Generate:sentiment_similarity_pos -0.071928   0.032530
## TreatmentChat Generate:sentiment_similarity_pos      -0.077804   0.035253
##                                                        t value       Pr(>|t|)
## TreatmentHint Control                                -0.114191 0.909110511915
## TreatmentOne-Click Generate                           3.771239 0.000172631705
## TreatmentChat Generate                                3.680449 0.000246125253
## sentiment_similarity_pos                              5.602329 0.000000027806
## memory_perc                                           3.005686 0.002720516068
## memory_numeric                                        1.332479 0.183027059438
## TreatmentHint Control:sentiment_similarity_pos        0.151794 0.879381967216
## TreatmentOne-Click Generate:sentiment_similarity_pos -2.211137 0.027267442101
## TreatmentChat Generate:sentiment_similarity_pos      -2.206997 0.027556306829
##                                                         
## TreatmentHint Control                                   
## TreatmentOne-Click Generate                          ***
## TreatmentChat Generate                               ***
## sentiment_similarity_pos                             ***
## memory_perc                                          ** 
## memory_numeric                                          
## TreatmentHint Control:sentiment_similarity_pos          
## TreatmentOne-Click Generate:sentiment_similarity_pos *  
## TreatmentChat Generate:sentiment_similarity_pos      *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.180854     Adj. R2: 0.078809
##                  Within R2: 0.069296
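
In the wrapped output above, the significance stars are printed in a separate block from the p-values. The sketch below re-derives them from the p-values using the cutpoints in the `Signif. codes` legend, which can help when matching the two blocks row by row. `p_to_stars` is a hypothetical helper written with base R `cut`, not the routine the summary printer itself uses.

```r
# Hypothetical helper: map p-values to the symbols in the "Signif. codes" legend.
p_to_stars <- function(p) {
  as.character(cut(
    p,
    breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
    labels = c("***", "**", "*", ".", " "),
    include.lowest = TRUE
  ))
}

# p-values copied from the sentiment_similarity_pos interaction model above.
p_values <- c(
  "TreatmentHint Control"                                = 0.909110511915,
  "TreatmentOne-Click Generate"                          = 0.000172631705,
  "TreatmentChat Generate"                               = 0.000246125253,
  "sentiment_similarity_pos"                             = 0.000000027806,
  "memory_perc"                                          = 0.002720516068,
  "memory_numeric"                                       = 0.183027059438,
  "TreatmentHint Control:sentiment_similarity_pos"       = 0.879381967216,
  "TreatmentOne-Click Generate:sentiment_similarity_pos" = 0.027267442101,
  "TreatmentChat Generate:sentiment_similarity_pos"      = 0.027556306829
)

# One row per coefficient with its p-value and derived symbol.
stars <- p_to_stars(p_values)
print(data.frame(p = unname(p_values), signif = stars, row.names = names(p_values)))
```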

feols(similarity ~ Treatment * sentiment_similarity_neu + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                       Estimate Std. Error
## TreatmentHint Control                                -0.023995   0.051647
## TreatmentOne-Click Generate                           0.183699   0.055584
## TreatmentChat Generate                                0.232981   0.056435
## sentiment_similarity_neu                              0.257052   0.046895
## memory_perc                                           0.057711   0.019908
## memory_numeric                                        0.006495   0.005671
## TreatmentHint Control:sentiment_similarity_neu        0.030567   0.060103
## TreatmentOne-Click Generate:sentiment_similarity_neu -0.164181   0.063262
## TreatmentChat Generate:sentiment_similarity_neu      -0.223422   0.064656
##                                                        t value       Pr(>|t|)
## TreatmentHint Control                                -0.464593 0.642330689067
## TreatmentOne-Click Generate                           3.304885 0.000986231902
## TreatmentChat Generate                                4.128309 0.000039805883
## sentiment_similarity_neu                              5.481454 0.000000054257
## memory_perc                                           2.898844 0.003832885233
## memory_numeric                                        1.145272 0.252389327276
## TreatmentHint Control:sentiment_similarity_neu        0.508580 0.611166374840
## TreatmentOne-Click Generate:sentiment_similarity_neu -2.595272 0.009599553876
## TreatmentChat Generate:sentiment_similarity_neu      -3.455549 0.000573843711
##                                                         
## TreatmentHint Control                                   
## TreatmentOne-Click Generate                          ***
## TreatmentChat Generate                               ***
## sentiment_similarity_neu                             ***
## memory_perc                                          ** 
## memory_numeric                                          
## TreatmentHint Control:sentiment_similarity_neu          
## TreatmentOne-Click Generate:sentiment_similarity_neu ** 
## TreatmentChat Generate:sentiment_similarity_neu      ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.180559     Adj. R2: 0.081817
##                  Within R2: 0.072336

feols(similarity ~ Treatment * sentiment_similarity_neg + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                       Estimate Std. Error
## TreatmentHint Control                                -0.014575   0.036618
## TreatmentOne-Click Generate                           0.065192   0.033459
## TreatmentChat Generate                                0.061257   0.036292
## sentiment_similarity_neg                              0.116596   0.030875
## memory_perc                                           0.061867   0.019946
## memory_numeric                                        0.008080   0.005741
## TreatmentHint Control:sentiment_similarity_neg        0.017337   0.041579
## TreatmentOne-Click Generate:sentiment_similarity_neg -0.021574   0.038589
## TreatmentChat Generate:sentiment_similarity_neg      -0.018909   0.042048
##                                                        t value   Pr(>|t|)    
## TreatmentHint Control                                -0.398015 0.69071005    
## TreatmentOne-Click Generate                           1.948434 0.05166164 .  
## TreatmentChat Generate                                1.687912 0.09176133 .  
## sentiment_similarity_neg                              3.776406 0.00016914 ***
## memory_perc                                           3.101667 0.00198201 ** 
## memory_numeric                                        1.407480 0.15961717    
## TreatmentHint Control:sentiment_similarity_neg        0.416950 0.67681065    
## TreatmentOne-Click Generate:sentiment_similarity_neg -0.559090 0.57623403    
## TreatmentChat Generate:sentiment_similarity_neg      -0.449705 0.65302733    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.182225     Adj. R2: 0.064793
##                  Within R2: 0.055135

Sentiment Similarity (Together)

feols(similarity ~ sentiment_similarity_pos +sentiment_similarity_neu + sentiment_similarity_neg + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                          Estimate Std. Error  t value     Pr(>|t|)    
## sentiment_similarity_pos 0.020375   0.022238 0.916248 0.3597725927    
## sentiment_similarity_neu 0.137095   0.029259 4.685632 0.0000032045 ***
## sentiment_similarity_neg 0.070617   0.023223 3.040825 0.0024250039 ** 
## memory_perc              0.066837   0.019918 3.355557 0.0008239000 ***
## memory_numeric           0.006308   0.005786 1.090243 0.2758868796    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.181761     Adj. R2: 0.07109 
##                  Within R2: 0.059945

Subgroup Analysis

feols(similarity ~ Treatment * (sentiment_similarity_pos + sentiment_similarity_neu + sentiment_similarity_neg) + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                       Estimate Std. Error
## TreatmentHint Control                                -0.041744   0.062871
## TreatmentOne-Click Generate                           0.140727   0.067586
## TreatmentChat Generate                                0.220928   0.075825
## sentiment_similarity_pos                              0.077442   0.043218
## sentiment_similarity_neu                              0.182373   0.054446
## sentiment_similarity_neg                              0.018121   0.045838
## memory_perc                                           0.058428   0.019847
## memory_numeric                                        0.006082   0.005661
## TreatmentHint Control:sentiment_similarity_pos       -0.044311   0.056336
## TreatmentOne-Click Generate:sentiment_similarity_pos -0.083568   0.060227
## TreatmentChat Generate:sentiment_similarity_pos      -0.033468   0.072300
## TreatmentHint Control:sentiment_similarity_neu        0.050764   0.072651
## TreatmentOne-Click Generate:sentiment_similarity_neu -0.111447   0.079230
## TreatmentChat Generate:sentiment_similarity_neu      -0.224529   0.086981
## TreatmentHint Control:sentiment_similarity_neg        0.039891   0.059083
## TreatmentOne-Click Generate:sentiment_similarity_neg  0.073242   0.060624
## TreatmentChat Generate:sentiment_similarity_neg       0.045395   0.071428
##                                                        t value   Pr(>|t|)    
## TreatmentHint Control                                -0.663958 0.50688075    
## TreatmentOne-Click Generate                           2.082177 0.03759747 *  
## TreatmentChat Generate                                2.913643 0.00365739 ** 
## sentiment_similarity_pos                              1.791893 0.07347300 .  
## sentiment_similarity_neu                              3.349614 0.00084156 ***
## sentiment_similarity_neg                              0.395335 0.69268587    
## memory_perc                                           2.943979 0.00332030 ** 
## memory_numeric                                        1.074289 0.28296997    
## TreatmentHint Control:sentiment_similarity_pos       -0.786537 0.43175197    
## TreatmentOne-Click Generate:sentiment_similarity_pos -1.387547 0.16560522    
## TreatmentChat Generate:sentiment_similarity_pos      -0.462910 0.64353675    
## TreatmentHint Control:sentiment_similarity_neu        0.698738 0.48488930    
## TreatmentOne-Click Generate:sentiment_similarity_neu -1.406628 0.15986967    
## TreatmentChat Generate:sentiment_similarity_neu      -2.581358 0.00999237 ** 
## TreatmentHint Control:sentiment_similarity_neg        0.675166 0.49973726    
## TreatmentOne-Click Generate:sentiment_similarity_neg  1.208135 0.22730052    
## TreatmentChat Generate:sentiment_similarity_neg       0.635539 0.52523214    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.179159     Adj. R2: 0.093004
##                  Within R2: 0.086668

Single Sentiment Score (-1 to 1)

Followup Reviews

feols(similarity ~ sentiment_score_followup + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                          Estimate Std. Error t value      Pr(>|t|)    
## sentiment_score_followup 0.031611   0.006124 5.16212 0.00000029822 ***
## memory_perc              0.071537   0.020143 3.55140 0.00040228099 ***
## memory_numeric           0.008208   0.005916 1.38725 0.16569445951    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184783     Adj. R2: 0.040731
##                  Within R2: 0.028421

Subgroup Analysis

feols(similarity ~ sentiment_score_followup * Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                       Estimate Std. Error
## sentiment_score_followup                              0.029878   0.012591
## TreatmentHint Control                                -0.005927   0.015332
## TreatmentOne-Click Generate                           0.052192   0.015435
## TreatmentChat Generate                                0.045910   0.015895
## memory_perc                                           0.063040   0.020120
## memory_numeric                                        0.008895   0.005791
## sentiment_score_followup:TreatmentHint Control        0.013022   0.017165
## sentiment_score_followup:TreatmentOne-Click Generate -0.009248   0.016419
## sentiment_score_followup:TreatmentChat Generate       0.005472   0.016985
##                                                        t value  Pr(>|t|)    
## sentiment_score_followup                              2.372938 0.0178484 *  
## TreatmentHint Control                                -0.386600 0.6991400    
## TreatmentOne-Click Generate                           3.381407 0.0007510 ***
## TreatmentChat Generate                                2.888337 0.0039621 ** 
## memory_perc                                           3.133136 0.0017833 ** 
## memory_numeric                                        1.536072 0.1248586    
## sentiment_score_followup:TreatmentHint Control        0.758623 0.4482688    
## sentiment_score_followup:TreatmentOne-Click Generate -0.563243 0.5734044    
## sentiment_score_followup:TreatmentChat Generate       0.322181 0.7473876    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.183124     Adj. R2: 0.055538
##                  Within R2: 0.045785

Main Reviews

feols(similarity ~ sentiment_score + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                 Estimate Std. Error t value   Pr(>|t|)    
## sentiment_score 0.024883   0.007006 3.55143 0.00040225 ***
## memory_perc     0.070621   0.020322 3.47521 0.00053389 ***
## memory_numeric  0.008133   0.005905 1.37728 0.16875517    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.185367     Adj. R2: 0.03466 
##                  Within R2: 0.022272

Subgroup Analysis

feols(similarity ~ sentiment_score * Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                              Estimate Std. Error   t value
## sentiment_score                              0.014068   0.012915  1.089300
## TreatmentHint Control                       -0.011496   0.016232 -0.708252
## TreatmentOne-Click Generate                  0.044948   0.016656  2.698629
## TreatmentChat Generate                       0.050276   0.018748  2.681616
## memory_perc                                  0.061503   0.020280  3.032652
## memory_numeric                               0.008651   0.005803  1.490697
## sentiment_score:TreatmentHint Control        0.022507   0.017628  1.276794
## sentiment_score:TreatmentOne-Click Generate  0.002644   0.017102  0.154602
## sentiment_score:TreatmentChat Generate      -0.006438   0.019972 -0.322355
##                                              Pr(>|t|)    
## sentiment_score                             0.2763018    
## TreatmentHint Control                       0.4789654    
## TreatmentOne-Click Generate                 0.0070880 ** 
## TreatmentChat Generate                      0.0074557 ** 
## memory_perc                                 0.0024910 ** 
## memory_numeric                              0.1363779    
## sentiment_score:TreatmentHint Control       0.2019915    
## sentiment_score:TreatmentOne-Click Generate 0.8771687    
## sentiment_score:TreatmentChat Generate      0.7472560    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.183914     Adj. R2: 0.047372
##                  Within R2: 0.037534

Absolute Difference

feols(similarity ~ sentiment_score_diff_abs + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                           Estimate Std. Error  t value               Pr(>|t|)
## sentiment_score_diff_abs -0.056600   0.006840 -8.27510 0.00000000000000043892
## memory_perc               0.068901   0.020166  3.41673 0.00066106157629721829
## memory_numeric            0.007708   0.005869  1.31326 0.18941865258401630046
##                             
## sentiment_score_diff_abs ***
## memory_perc              ***
## memory_numeric              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.183411     Adj. R2: 0.054922
##                  Within R2: 0.042794

Subgroup Analysis

feols(similarity ~ sentiment_score_diff_abs * Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                       Estimate Std. Error
## sentiment_score_diff_abs                             -0.071819   0.014747
## TreatmentHint Control                                 0.001571   0.015529
## TreatmentOne-Click Generate                           0.034134   0.016409
## TreatmentChat Generate                                0.032009   0.017203
## memory_perc                                           0.061065   0.020035
## memory_numeric                                        0.007669   0.005770
## sentiment_score_diff_abs:TreatmentHint Control       -0.004092   0.019857
## sentiment_score_diff_abs:TreatmentOne-Click Generate  0.031525   0.018452
## sentiment_score_diff_abs:TreatmentChat Generate       0.034639   0.020192
##                                                        t value     Pr(>|t|)    
## sentiment_score_diff_abs                             -4.870211 0.0000013083 ***
## TreatmentHint Control                                 0.101177 0.9194313467    
## TreatmentOne-Click Generate                           2.080143 0.0377839839 *  
## TreatmentChat Generate                                1.860674 0.0631036876 .  
## memory_perc                                           3.047980 0.0023685655 ** 
## memory_numeric                                        1.329139 0.1841261592    
## sentiment_score_diff_abs:TreatmentHint Control       -0.206086 0.8367685628    
## sentiment_score_diff_abs:TreatmentOne-Click Generate  1.708481 0.0878785970 .  
## sentiment_score_diff_abs:TreatmentChat Generate       1.715459 0.0865920324 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.181607     Adj. R2: 0.071124
##                  Within R2: 0.061532

Difference (Followup - Main)

feols(similarity ~ sentiment_score_diff + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                      Estimate Std. Error t value   Pr(>|t|)    
## sentiment_score_diff 0.010739   0.005762 1.86365 0.06268428 .  
## memory_perc          0.070999   0.020386 3.48279 0.00051919 ***
## memory_numeric       0.009156   0.005985 1.52984 0.12639367    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.185792     Adj. R2: 0.030233
##                  Within R2: 0.017788

Subgroup Analysis

feols(similarity ~ sentiment_score_diff * Treatment + memory_perc + memory_numeric | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                   Estimate Std. Error   t value
## sentiment_score_diff                              0.016050   0.011219  1.430608
## TreatmentHint Control                             0.001214   0.013372  0.090750
## TreatmentOne-Click Generate                       0.049288   0.014260  3.456462
## TreatmentChat Generate                            0.055760   0.014696  3.794302
## memory_perc                                       0.062594   0.020288  3.085325
## memory_numeric                                    0.009753   0.005862  1.663613
## sentiment_score_diff:TreatmentHint Control       -0.006642   0.016235 -0.409132
## sentiment_score_diff:TreatmentOne-Click Generate -0.008639   0.015384 -0.561525
## sentiment_score_diff:TreatmentChat Generate       0.015729   0.016325  0.963497
##                                                    Pr(>|t|)    
## sentiment_score_diff                             0.15287636    
## TreatmentHint Control                            0.92771103    
## TreatmentOne-Click Generate                      0.00057193 ***
## TreatmentChat Generate                           0.00015757 ***
## memory_perc                                      0.00209304 ** 
## memory_numeric                                   0.09652463 .  
## sentiment_score_diff:TreatmentHint Control       0.68253672    
## sentiment_score_diff:TreatmentOne-Click Generate 0.57457431    
## sentiment_score_diff:TreatmentChat Generate      0.33554716    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184021     Adj. R2: 0.046266
##                  Within R2: 0.036417

Similarity Mechanism Understanding - Review Length

df_followup_panel_final_nonempty_with_sim_score$ContentLength <- nchar(df_followup_panel_final_nonempty_with_sim_score$Content)
df_followup_panel_final_nonempty_with_sim_score$ContentFollowupLength <- nchar(df_followup_panel_final_nonempty_with_sim_score$ContentFollowup)
df_followup_panel_final_nonempty_with_sim_score$ContentLengthDifference <- abs(df_followup_panel_final_nonempty_with_sim_score$ContentFollowupLength - df_followup_panel_final_nonempty_with_sim_score$ContentLength)

Followup

feols(similarity ~ ContentFollowupLength | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                       Estimate Std. Error t value  Pr(>|t|)    
## ContentFollowupLength 0.000553   0.000055 9.98241 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.179692     Adj. R2: 0.093607
##                  Within R2: 0.081218

Subgroup Analysis

feols(similarity ~ Treatment * ContentFollowupLength | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                    Estimate Std. Error  t value
## TreatmentHint Control                             -0.033660   0.020725 -1.62409
## TreatmentOne-Click Generate                       -0.027784   0.020577 -1.35023
## TreatmentChat Generate                            -0.028011   0.020172 -1.38857
## ContentFollowupLength                              0.000205   0.000088  2.33610
## TreatmentHint Control:ContentFollowupLength        0.000279   0.000134  2.07555
## TreatmentOne-Click Generate:ContentFollowupLength  0.000569   0.000131  4.34074
## TreatmentChat Generate:ContentFollowupLength       0.000612   0.000125  4.88696
##                                                       Pr(>|t|)    
## TreatmentHint Control                             0.1046933051    
## TreatmentOne-Click Generate                       0.1772672263    
## TreatmentChat Generate                            0.1652936842    
## ContentFollowupLength                             0.0196965201 *  
## TreatmentHint Control:ContentFollowupLength       0.0382083107 *  
## TreatmentOne-Click Generate:ContentFollowupLength 0.0000157467 ***
## TreatmentChat Generate:ContentFollowupLength      0.0000012044 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.176484     Adj. R2: 0.123518
##                  Within R2: 0.113735
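One way to read the interaction terms above is to add each one to the baseline `ContentFollowupLength` coefficient, which gives the implied length slope within each treatment. The sketch below only redoes that arithmetic with the point estimates copied from the output; it ignores standard errors, the variable names are ours, and the baseline is whatever Treatment level `feols` omitted as the reference.

```r
# Point estimates copied from the summary above (Treatment * ContentFollowupLength).
base_slope <- 0.000205  # ContentFollowupLength, omitted reference treatment

# Interaction coefficients, named by treatment level (illustrative names).
interaction_coef <- c(
  "Hint Control"       = 0.000279,
  "One-Click Generate" = 0.000569,
  "Chat Generate"      = 0.000612
)

# Implied slope within each treatment = baseline slope + interaction term.
implied_slope <- base_slope + interaction_coef

# Predicted change in similarity per 100 additional follow-up characters.
per_100_chars <- round(100 * implied_slope, 4)
print(per_100_chars)
```

On these point estimates, 100 extra follow-up characters go with roughly 0.02 higher similarity in the reference group and roughly 0.08 in the two Generate treatments.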

Main

feols(similarity ~ ContentLength | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##               Estimate Std. Error t value  Pr(>|t|)    
## ContentLength 0.000371   0.000038 9.74238 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.181379     Adj. R2: 0.076509
##                  Within R2: 0.063887

Subgroup Analysis

feols(similarity ~ Treatment * ContentLength | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                            Estimate Std. Error   t value
## TreatmentHint Control                      0.004508   0.017328  0.260186
## TreatmentOne-Click Generate                0.050181   0.024457  2.051804
## TreatmentChat Generate                     0.123144   0.055676  2.211799
## ContentLength                              0.001074   0.000143  7.492220
## TreatmentHint Control:ContentLength       -0.000241   0.000199 -1.213994
## TreatmentOne-Click Generate:ContentLength -0.000782   0.000166 -4.713932
## TreatmentChat Generate:ContentLength      -0.001089   0.000235 -4.623475
##                                                      Pr(>|t|)    
## TreatmentHint Control                     0.79477751251092710    
## TreatmentOne-Click Generate               0.04046646455797675 *  
## TreatmentChat Generate                    0.02722148156038800 *  
## ContentLength                             0.00000000000015659 ***
## TreatmentHint Control:ContentLength       0.22505606489890448    
## TreatmentOne-Click Generate:ContentLength 0.00000279878582053 ***
## TreatmentChat Generate:ContentLength      0.00000430320622397 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.178119     Adj. R2: 0.107201
##                  Within R2: 0.097236

Absolute Difference

We measure the length difference between the main and followup reviews as the absolute difference in character counts.

feols(similarity ~ ContentLengthDifference | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                          Estimate Std. Error  t value    Pr(>|t|)    
## ContentLengthDifference -0.000191   0.000046 -4.16961 0.000033345 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.186633     Adj. R2: 0.022231
##                  Within R2: 0.008867

Subgroup Analysis

feols(similarity ~ Treatment * ContentLengthDifference | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                      Estimate Std. Error
## TreatmentHint Control                               -0.000239   0.017742
## TreatmentOne-Click Generate                          0.114199   0.023649
## TreatmentChat Generate                               0.181953   0.025908
## ContentLengthDifference                             -0.000154   0.000098
## TreatmentHint Control:ContentLengthDifference        0.000016   0.000136
## TreatmentOne-Click Generate:ContentLengthDifference -0.000362   0.000139
## TreatmentChat Generate:ContentLengthDifference      -0.000658   0.000147
##                                                       t value
## TreatmentHint Control                               -0.013446
## TreatmentOne-Click Generate                          4.828885
## TreatmentChat Generate                               7.023034
## ContentLengthDifference                             -1.566678
## TreatmentHint Control:ContentLengthDifference        0.119354
## TreatmentOne-Click Generate:ContentLengthDifference -2.598110
## TreatmentChat Generate:ContentLengthDifference      -4.459880
##                                                               Pr(>|t|)    
## TreatmentHint Control                               0.9892745810230138    
## TreatmentOne-Click Generate                         0.0000016031352365 ***
## TreatmentChat Generate                              0.0000000000041723 ***
## ContentLengthDifference                             0.1175278674247023    
## TreatmentHint Control:ContentLengthDifference       0.9050208985456247    
## TreatmentOne-Click Generate:ContentLengthDifference 0.0095211508241840 ** 
## TreatmentChat Generate:ContentLengthDifference      0.0000091955911197 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.181644     Adj. R2: 0.071514
##                  Within R2: 0.06115

Sentiment Distribution

Sentiment Label Distribution

We first look at the distribution of sentiment labels.

sent_distribution_df_gpt <- merge(df_followup_panel_final_nonempty_with_sim_score %>% 
  dplyr::select(text_sentiment_gpt, Treatment) %>%
  filter(text_sentiment_gpt != "mixed") %>% 
  group_by(Treatment, text_sentiment_gpt) %>%
  summarise(count = n(), .groups = "drop"), 
  df_followup_panel_final_nonempty_with_sim_score %>% 
  dplyr::select(text_sentiment_gpt, Treatment) %>%
  filter(text_sentiment_gpt != "mixed") %>% 
  group_by(Treatment) %>%
  summarise(sum = n(), .groups = "drop")) %>%
  mutate(percentage = count / sum)

ggplot(sent_distribution_df_gpt, aes(x = Treatment, y = percentage, fill = text_sentiment_gpt)) +
  geom_bar(stat = "identity", position = "dodge") + 
  scale_y_continuous(labels = scales::percent_format(scale = 100), limits = c(0, 1)) +
  geom_text(aes(label = paste0(round(percentage * 100, 2), "%")),
            position = position_dodge(width = 0.9), vjust = -1, size = 3) +
  labs(title = "Sentiment Label Distribution by Treatment", y = "Percentage") +
  theme_minimal()
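The share computation above (count per Treatment and label, divided by the Treatment total after dropping "mixed") can be sanity-checked on toy data with base R. This is a minimal sketch: the data frame `toy` and its values are made up for illustration and are not from the experiment.

```r
# Toy data standing in for (Treatment, text_sentiment_gpt); values are invented.
toy <- data.frame(
  Treatment = c("A", "A", "A", "B", "B", "B", "B"),
  label     = c("positive", "negative", "mixed",
                "positive", "positive", "neutral", "mixed")
)

# Drop "mixed" labels, as in the pipeline above.
toy <- toy[toy$label != "mixed", ]

# Count per (Treatment, label), then divide each row by its Treatment total.
counts <- table(toy$Treatment, toy$label)
shares <- prop.table(counts, margin = 1)
print(round(shares, 3))

# Long format comparable to the merged data frame used for plotting.
shares_long <- as.data.frame(shares, responseName = "percentage")
```

Unlike the `merge()` approach, `table()` keeps zero-count cells, so a label that never occurs in a treatment shows up with a share of 0 rather than being absent.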

sent_distribution_df_gpt_followup <- merge(df_followup_panel_final_nonempty_with_sim_score %>% 
  dplyr::select(text_sentiment_gpt_followup, Treatment) %>%
  filter(text_sentiment_gpt_followup != "mixed") %>% 
  group_by(Treatment, text_sentiment_gpt_followup) %>%
  summarise(count = n(), .groups = "drop"), 
  df_followup_panel_final_nonempty_with_sim_score %>% 
  dplyr::select(text_sentiment_gpt_followup, Treatment) %>%
  filter(text_sentiment_gpt_followup != "mixed") %>% 
  group_by(Treatment) %>%
  summarise(sum = n(), .groups = "drop")) %>%
  mutate(percentage = count / sum)

ggplot(sent_distribution_df_gpt_followup, aes(x = Treatment, y = percentage, fill = text_sentiment_gpt_followup)) +
  geom_bar(stat = "identity", position = "dodge") + 
  scale_y_continuous(labels = scales::percent_format(scale = 100), limits = c(0, 1)) +
  geom_text(aes(label = paste0(round(percentage * 100, 2), "%")),
            position = position_dodge(width = 0.9), vjust = -1, size = 3) +
  labs(title = "Sentiment Label Distribution by Treatment (Followup)", y = "Percentage") +
  theme_minimal()

Sentiment Segment Distribution

sent_distribution_df <- df_followup_panel_final_nonempty_with_sim_score %>% 
  dplyr::select(text_sent_pos, text_sent_neu, text_sent_neg, 
                text_sent_pos_followup, text_sent_neu_followup, text_sent_neg_followup, 
                User.Id, Video.Id, Treatment)

sent_distribution_df_melt <- reshape2::melt(sent_distribution_df, id.vars = c("User.Id", "Video.Id", "Treatment"))
sent_distribution_df_melt$main_followup <- ifelse(str_detect(sent_distribution_df_melt$variable, "followup"), "followup", "main")
sent_distribution_df_melt <- sent_distribution_df_melt %>% rename(sentiment_type = variable, sentiment_value = value)

ggplot(sent_distribution_df_melt %>% filter(main_followup  == "main")) +
  geom_density(aes(x = sentiment_value, fill = sentiment_type, group = sentiment_type), alpha = 0.7, position = "identity") +
  scale_y_sqrt() +
  facet_wrap(~Treatment) + 
  theme_minimal() +
  labs(title = "Sentiment Distribution by Sentiment Type and Treatment", x = "Sentiment Value", y = "Square Root Scaled Density")

ggplot(sent_distribution_df_melt %>% filter(main_followup  == "followup")) +
  geom_density(aes(x = sentiment_value, fill = sentiment_type, group = sentiment_type), alpha = 0.7, position = "identity") +
  scale_y_sqrt() +
  facet_wrap(~Treatment) + 
  theme_minimal() +
  labs(title = "Sentiment Distribution by Sentiment Type and Treatment (Followup)", x = "Sentiment Value", y = "Square Root Scaled Density")

Sentiment Distribution Analysis for LLM

We run a sentiment distribution analysis on existing YouTube comments (30 from each video). We first extract keywords from each comment using LLMs, then feed those keywords into the exact prompts used in the main experiment and ask the LLM to generate reviews based on them.

sentiment_df <- read.csv("sentiment_distribution_analysis/comments_English.csv")
sentiment_df

# remove rows where author does not start with @
sentiment_df <- sentiment_df %>% filter(str_detect(author, "^@"))

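# convert publishedTimeText from relative text (e.g. "2 years ago") to a number of months
# if it contains "year", multiply the number by 12
# if it contains "week", multiply the number by 1/4
# if it contains "day", multiply the number by 1/30
# otherwise the number is taken as months as-is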

sentiment_df$publishedTimeText <- as.character(sentiment_df$publishedTimeText)
sentiment_df$publishedTimeNumeric <- round(ifelse(str_detect(sentiment_df$publishedTimeText, "year"), 
                                         as.numeric(str_extract(sentiment_df$publishedTimeText, "[0-9]+")) * 12,
                                         ifelse(str_detect(sentiment_df$publishedTimeText, "week"), 
                                                as.numeric(str_extract(sentiment_df$publishedTimeText, "[0-9]+")) * 1/4,
                                                ifelse(str_detect(sentiment_df$publishedTimeText, "day"), 
                                                       as.numeric(str_extract(sentiment_df$publishedTimeText, "[0-9]+")) * 1/30,
                                                       as.numeric(str_extract(sentiment_df$publishedTimeText, "[0-9]+"))))), 2)
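
The nested `ifelse()` above can be hard to audit. Below is a minimal sketch of the same rule as a standalone base R function. The function name `published_months` and the sample strings are hypothetical, and we assume the text contains a single number followed by a time unit. Note that any unit other than year, week, or day (for example an hour) falls through to the "months" branch, both here and in the code above.

```r
# Hypothetical helper: convert relative-time text to approximate months.
# Mirrors the rule above: years * 12, weeks * 1/4, days * 1/30, else months.
published_months <- function(txt) {
  txt <- as.character(txt)
  # Locate the first run of digits in each string
  hit <- regexpr("[0-9]+", txt)
  ok  <- !is.na(hit) & hit > 0
  num <- rep(NA_real_, length(txt))
  num[ok] <- as.numeric(regmatches(txt, hit))
  # Unit multiplier; anything unrecognised is assumed to be months
  mult <- ifelse(grepl("year", txt), 12,
          ifelse(grepl("week", txt), 1 / 4,
          ifelse(grepl("day",  txt), 1 / 30, 1)))
  round(num * mult, 2)
}

# Illustrative strings (assumed format, not taken from the data)
published_months(c("2 years ago", "3 weeks ago", "15 days ago", "5 months ago"))
# 24, 0.75, 0.5, 5 (months)
```

Keeping the conversion in one function makes it easier to add an explicit branch for further units if the data turn out to contain them.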

sentiment_df_col_of_interest <- c("publishedTimeNumeric", "isHearted", "isPinned",
                                  "Id", "Title", "description", 
                                  "simpleText", "keywords", "generatedReview", 
                                  "text_sent_pos", "text_sent_neu", "text_sent_neg", 
                                  "generated_sent_pos", "generated_sent_neu", "generated_sent_neg")

sentiment_df$isHearted <- factor(sentiment_df$isHearted, levels = c("False", "True"))
sentiment_df$isPinned <- factor(sentiment_df$isPinned, levels = c("False", "True"))

sentiment_df$generated_sent <- ifelse(sentiment_df$generated_sent_pos > sentiment_df$generated_sent_neu & 
                                        sentiment_df$generated_sent_pos > sentiment_df$generated_sent_neg, 
                                      sentiment_df$generated_sent_pos,
                                      ifelse(sentiment_df$generated_sent_neg > sentiment_df$generated_sent_pos & sentiment_df$generated_sent_neg > sentiment_df$generated_sent_neu, 
                                             -sentiment_df$generated_sent_neg, 0))
sentiment_df$text_sent <- ifelse(sentiment_df$text_sent_pos > sentiment_df$text_sent_neu & 
                                    sentiment_df$text_sent_pos > sentiment_df$text_sent_neg, 
                                  sentiment_df$text_sent_pos,
                                  ifelse(sentiment_df$text_sent_neg > sentiment_df$text_sent_pos & sentiment_df$text_sent_neg > sentiment_df$text_sent_neu, 
                                         -sentiment_df$text_sent_neg, 0))

sentiment_df$diff_sent_pos <- sentiment_df$generated_sent_pos - sentiment_df$text_sent_pos
sentiment_df$diff_sent_neu <- sentiment_df$generated_sent_neu - sentiment_df$text_sent_neu
sentiment_df$diff_sent_neg <- sentiment_df$generated_sent_neg - sentiment_df$text_sent_neg


sentiment_df$sent_score_diff <- sentiment_df$generated_sent - sentiment_df$text_sent
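
The single sentiment score built above keeps the probability of the dominant class and signs it: positive if the positive class strictly dominates, negative if the negative class strictly dominates, and 0 otherwise (including ties and neutral-dominant cases). A minimal sketch of that rule as a reusable function follows; the name `signed_sentiment` and the toy probabilities are hypothetical.

```r
# Hypothetical helper: collapse three class probabilities into one score in [-1, 1].
signed_sentiment <- function(pos, neu, neg) {
  ifelse(pos > neu & pos > neg,  pos,          # positive strictly dominates
  ifelse(neg > pos & neg > neu, -neg, 0))      # negative strictly dominates, else 0
}

# Toy probabilities for three comments (illustrative only)
pos <- c(0.7, 0.1, 0.2)
neu <- c(0.2, 0.8, 0.3)
neg <- c(0.1, 0.1, 0.5)
signed_sentiment(pos, neu, neg)
# 0.7, 0.0, -0.5
```

Applying the same function to the original and the generated columns would guarantee that both scores follow an identical rule.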

Sentiment Segments

Direct Test: Whether the difference between generated and original reviews’ sentiments is statistically different from 0

one sample t-test for difference in positive sentiments (generated - original)

# t test
t.test(sentiment_df$diff_sent_pos, mu = 0)

## 
##  One Sample t-test
## 
## data:  sentiment_df$diff_sent_pos
## t = 15.357, df = 358, p-value < 0.00000000000000022
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  0.260327 0.336795
## sample estimates:
## mean of x 
##  0.298561

one sample t-test for difference in neutral sentiments (generated - original)

# t test
t.test(sentiment_df$diff_sent_neu, mu = 0)

## 
##  One Sample t-test
## 
## data:  sentiment_df$diff_sent_neu
## t = -7.274, df = 358, p-value = 0.00000000000222
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.14235167 -0.08176004
## sample estimates:
##  mean of x 
## -0.1120559

one sample t-test for difference in negative sentiments (generated - original)

# t test
t.test(sentiment_df$diff_sent_neg, mu = 0)

## 
##  One Sample t-test
## 
## data:  sentiment_df$diff_sent_neg
## t = -11.661, df = 358, p-value < 0.00000000000000022
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.2114471 -0.1504211
## sample estimates:
##  mean of x 
## -0.1809341

one sample t-test for overall sentiment score

# t test
t.test(sentiment_df$sent_score_diff, mu = 0)

## 
##  One Sample t-test
## 
## data:  sentiment_df$sent_score_diff
## t = 14.795, df = 358, p-value < 0.00000000000000022
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  0.4161080 0.5436883
## sample estimates:
## mean of x 
## 0.4798981
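
Each direct test above is a one-sample t-test on the per-comment difference (generated - original), which is numerically the same as a paired t-test on the two columns. The sketch below checks this on simulated data; the vectors `original` and `generated` are made up for illustration and do not come from `sentiment_df`.

```r
set.seed(1)

# Simulated sentiment probabilities for 50 comments (illustrative only)
original  <- runif(50)
generated <- pmin(1, pmax(0, original + rnorm(50, mean = 0.1, sd = 0.2)))

# One-sample test on the differences, as in the analysis above
one_sample <- t.test(generated - original, mu = 0)

# Paired test on the two columns
paired <- t.test(generated, original, paired = TRUE)

# Same t statistic and p-value
all.equal(unname(one_sample$statistic), unname(paired$statistic))
all.equal(one_sample$p.value, paired$p.value)
```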

Panel Test (without associating original and generated reviews): we build a dataset of ~2,160 rows [3 sentiment types (pos, neu, neg) x 2 sentiment groups (original vs. generated) x 12 videos x 30 comments per video] and run three regressions (one per sentiment type) with the following specification:

Sentiment value ~ Sentiment Group + PublishedTime + isHearted + isPinned | VideoId

# melt
sentiment_df_melt <- reshape2::melt(sentiment_df %>% dplyr::select(all_of(sentiment_df_col_of_interest)), 
                                    id.vars = c("publishedTimeNumeric", "isHearted", "isPinned",
                                                "Id", "Title", "description", 
                                                "simpleText", "keywords", "generatedReview"), 
                                    variable.name = "sentiment_name", value.name = "sentiment_value")
sentiment_df_melt$sentiment_type <- ifelse(str_detect(sentiment_df_melt$sentiment_name, "sent_pos"), "positive", 
                                           ifelse(str_detect(sentiment_df_melt$sentiment_name, "sent_neu"), "neutral", "negative"))
sentiment_df_melt$sentiment_type <- factor(sentiment_df_melt$sentiment_type, levels = c("positive", "neutral", "negative"))

sentiment_df_melt$sentiment_group <- ifelse(str_detect(sentiment_df_melt$sentiment_name, "generated"), "generated", "original")
sentiment_df_melt$sentiment_group <- factor(sentiment_df_melt$sentiment_group, levels = c("original", "generated"))

# plot
ggplot(sentiment_df_melt %>% filter(!is.na(sentiment_value))) +
  geom_density(aes(x = sentiment_value, fill = sentiment_group, group = sentiment_group), alpha = 0.7, position = "identity") +
  scale_y_sqrt() +
  facet_wrap(~sentiment_type) + 
  theme_minimal() +
  labs(title = "Sentiment Distribution by Sentiment Type and Sentiment Group", x = "Sentiment Value", y = "Square Root Scaled Density")

The following three regressions use the panel data (without associating original and generated reviews).

# regression 1 of 3 (one per sentiment type): positive sentiment
feols(sentiment_value ~ sentiment_group + publishedTimeNumeric + isHearted + isPinned | Id ,
      data = sentiment_df_melt %>% 
        filter(!is.na(sentiment_value)) %>% 
        filter(sentiment_type == "positive"), cluster = ~Id)

## OLS estimation, Dep. Var.: sentiment_value
## Observations: 718
## Fixed-effects: Id: 12
## Standard-errors: Clustered (Id) 
##                          Estimate Std. Error   t value      Pr(>|t|)    
## sentiment_groupgenerated 0.298561   0.029335 10.177660 0.00000061983 ***
## publishedTimeNumeric     0.000781   0.002758  0.283271 0.78222931653    
## isHeartedTrue            0.331326   0.063468  5.220325 0.00028536296 ***
## isPinnedTrue             0.057504   0.061768  0.930963 0.37185410333    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.341345     Adj. R2: 0.259242
##                  Within R2: 0.193888

# regression 2 of 3 (one per sentiment type): neutral sentiment
feols(sentiment_value ~ sentiment_group + publishedTimeNumeric + isHearted + isPinned | Id ,
      data = sentiment_df_melt %>% 
        filter(!is.na(sentiment_value)) %>% 
        filter(sentiment_type == "neutral"), cluster = ~Id)

## OLS estimation, Dep. Var.: sentiment_value
## Observations: 718
## Fixed-effects: Id: 12
## Standard-errors: Clustered (Id) 
##                           Estimate Std. Error   t value   Pr(>|t|)    
## sentiment_groupgenerated -0.112056   0.022407 -5.000867 0.00040198 ***
## publishedTimeNumeric     -0.001795   0.000770 -2.330947 0.03980256 *  
## isHeartedTrue            -0.160517   0.036647 -4.380048 0.00109902 ** 
## isPinnedTrue              0.016735   0.036460  0.458998 0.65517232    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.215655     Adj. R2: 0.143103
##                  Within R2: 0.094848

# regression 3 of 3 (one per sentiment type): negative sentiment
feols(sentiment_value ~ sentiment_group + publishedTimeNumeric + isHearted + isPinned | Id ,
      data = sentiment_df_melt %>% 
        filter(!is.na(sentiment_value)) %>% 
        filter(sentiment_type == "negative"), cluster = ~Id)

## OLS estimation, Dep. Var.: sentiment_value
## Observations: 718
## Fixed-effects: Id: 12
## Standard-errors: Clustered (Id) 
##                           Estimate Std. Error   t value    Pr(>|t|)    
## sentiment_groupgenerated -0.180934   0.023729 -7.624975 0.000010281 ***
## publishedTimeNumeric      0.001032   0.002256  0.457226 0.656406539    
## isHeartedTrue            -0.164957   0.032809 -5.027776 0.000385300 ***
## isPinnedTrue             -0.080118   0.030687 -2.610770 0.024224213 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.263163     Adj. R2: 0.153679
##                  Within R2: 0.12216
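
The `sentiment_groupgenerated` estimates in the three regressions (0.298561, -0.112056, -0.180934) coincide with the mean differences from the direct tests. This is expected when every comment appears once per group with identical covariates: the group dummy is then orthogonal to the controls and fixed effects, so only the standard errors change. The sketch below illustrates this on simulated data, using base R `lm()` with `factor(id)` as a stand-in for the `feols` fixed effects. All column names are hypothetical and the standard errors here are not clustered.

```r
set.seed(42)
n <- 60

# Simulated wide data: one row per comment (illustrative only)
wide <- data.frame(
  id        = rep(1:6, each = 10),   # stand-in for the video Id (fixed effect)
  age       = runif(n, 0, 36),       # stand-in for publishedTimeNumeric
  hearted   = rbinom(n, 1, 0.3),     # stand-in for isHearted
  original  = runif(n),
  generated = runif(n)
)

# Stack into long format: each comment appears once per group
covs <- wide[c("id", "age", "hearted")]
long <- rbind(
  data.frame(covs, group = 0, value = wide$original),
  data.frame(covs, group = 1, value = wide$generated)
)

fit <- lm(value ~ group + age + hearted + factor(id), data = long)

coef_group <- unname(coef(fit)["group"])
mean_diff  <- mean(wide$generated - wide$original)
all.equal(coef_group, mean_diff)
```

Under these assumptions the panel regressions add information about the covariates and the clustered uncertainty, while the point estimate for the group effect is the same as in the direct test.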

Single Sentiment Score (-1 to 1)

# melt
sentiment_df_melt_sentscore <- reshape2::melt(sentiment_df %>% dplyr::select(c("publishedTimeNumeric", "isHearted", "isPinned",
                                                "Id", "Title", "description", 
                                                "simpleText", "keywords", "generatedReview", "generated_sent", "text_sent")), 
                                    id.vars = c("publishedTimeNumeric", "isHearted", "isPinned",
                                                "Id", "Title", "description", 
                                                "simpleText", "keywords", "generatedReview"), 
                                    variable.name = "sentiment_name", value.name = "sentiment_value")

sentiment_df_melt_sentscore$sentiment_group <- ifelse(str_detect(sentiment_df_melt_sentscore$sentiment_name, "generated"), "generated", "original")
sentiment_df_melt_sentscore$sentiment_group <- factor(sentiment_df_melt_sentscore$sentiment_group, levels = c("original", "generated"))

ggplot(sentiment_df_melt_sentscore %>% filter(!is.na(sentiment_value))) +
  geom_density(aes(x = sentiment_value, fill = sentiment_group, group = sentiment_group), alpha = 0.7, position = "identity") +
  scale_y_sqrt() +
  theme_minimal() +
  labs(title = "Sentiment Distribution by Sentiment Group", x = "Sentiment Value", y = "Square Root Scaled Density")

feols(sentiment_value ~ sentiment_group + publishedTimeNumeric + isHearted + isPinned | Id ,
      data = sentiment_df_melt_sentscore %>% 
        filter(!is.na(sentiment_value)), cluster = ~Id)

## OLS estimation, Dep. Var.: sentiment_value
## Observations: 718
## Fixed-effects: Id: 12
## Standard-errors: Clustered (Id) 
##                           Estimate Std. Error   t value      Pr(>|t|)    
## sentiment_groupgenerated  0.479898   0.047866 10.025881 0.00000072044 ***
## publishedTimeNumeric     -0.000039   0.005516 -0.007011 0.99453139867    
## isHeartedTrue             0.494162   0.094950  5.204476 0.00029244544 ***
## isPinnedTrue              0.159476   0.090614  1.759952 0.10615760723    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.581041     Adj. R2: 0.226602
##                  Within R2: 0.172025

Noise Analysis

We ask GPT to extract themes from the review using the following prompt:

“Extract the themes in the following review. Return a list of themes in one line, separated by commas.”

We count the number of themes in each review and take the absolute difference between the theme count of the main-experiment review and that of the follow-up review.
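
How the theme counts were computed is not shown in this section. A minimal sketch, assuming the model returns the themes as one comma-separated line as the prompt requests, is given below. The function `count_themes` and the example strings are hypothetical.

```r
# Hypothetical helper: count comma-separated themes in each string.
count_themes <- function(x) {
  parts <- strsplit(as.character(x), ",", fixed = TRUE)
  # Count non-empty, non-missing pieces after trimming whitespace
  vapply(parts, function(p) sum(!is.na(p) & nzchar(trimws(p))), integer(1))
}

# Illustrative model outputs for two reviews (main vs. follow-up)
themes_main     <- c("pacing, humor, editing", "music")
themes_followup <- c("humor", "music, visuals")

themes_count          <- count_themes(themes_main)      # 3, 1
themes_followup_count <- count_themes(themes_followup)  # 1, 2
abs(themes_count - themes_followup_count)               # 2, 1
```

Empty or missing responses count as zero themes under this rule, which may or may not match the handling used for the actual data.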

df_followup_panel_final_nonempty_with_sim_score$themes_count_diff <- abs(df_followup_panel_final_nonempty_with_sim_score$themes_count - df_followup_panel_final_nonempty_with_sim_score$themes_followup_count)

feols(similarity ~ themes_count_diff | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                   Estimate Std. Error  t value  Pr(>|t|)    
## themes_count_diff -0.02248   0.002611 -8.61057 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184481     Adj. R2: 0.044654
##                  Within R2: 0.031596

Subgroup Analysis

feols(similarity ~ Treatment * themes_count_diff | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                                                Estimate Std. Error   t value
## TreatmentHint Control                          0.011623   0.019478  0.596720
## TreatmentOne-Click Generate                    0.093422   0.021489  4.347435
## TreatmentChat Generate                         0.083899   0.021897  3.831598
## themes_count_diff                             -0.021397   0.007189 -2.976439
## TreatmentHint Control:themes_count_diff       -0.008783   0.009297 -0.944723
## TreatmentOne-Click Generate:themes_count_diff -0.012737   0.008536 -1.492160
## TreatmentChat Generate:themes_count_diff      -0.005778   0.008623 -0.670086
##                                                  Pr(>|t|)    
## TreatmentHint Control                         0.550838402    
## TreatmentOne-Click Generate                   0.000015283 ***
## TreatmentChat Generate                        0.000135807 ***
## themes_count_diff                             0.002991208 ** 
## TreatmentHint Control:themes_count_diff       0.345044225    
## TreatmentOne-Click Generate:themes_count_diff 0.135994047    
## TreatmentChat Generate:themes_count_diff      0.502968073    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.180895     Adj. R2: 0.079161
##                  Within R2: 0.068883

“Mediation Analysis”

We conduct mediation analysis for the following differences and similarities that we have identified:

Sentiment Label Similarities
Sentiment Segment Similarities
Single Sentiment Score Difference (Followup - Main)
Single Sentiment Score Absolute Difference
Content Length Difference
Theme Count Difference
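
For each candidate mediator, the subsections that follow regress the mediator on treatment and then regress similarity on the mediator plus treatment. In OLS with the same sample and controls, the total treatment effect splits exactly into a direct effect plus the product of the two paths. The sketch below shows this decomposition on simulated data with base R `lm()`; the variables are made up, there are no fixed effects or clustering, and it produces point estimates only.

```r
set.seed(7)
n <- 500

# Simulated data (illustrative only)
treat    <- rbinom(n, 1, 0.5)                        # binary treatment
mediator <- 0.8 * treat + rnorm(n)                   # treatment shifts the mediator
outcome  <- 0.3 * treat - 0.5 * mediator + rnorm(n)  # mediator lowers the outcome

total  <- lm(outcome ~ treat)             # total effect of treatment
first  <- lm(mediator ~ treat)            # first stage: treatment -> mediator
second <- lm(outcome ~ mediator + treat)  # second stage: mediator and direct effect

c_total  <- unname(coef(total)["treat"])
a_path   <- unname(coef(first)["treat"])
b_path   <- unname(coef(second)["mediator"])
c_direct <- unname(coef(second)["treat"])

# Total effect = direct effect + indirect effect (a * b)
all.equal(c_total, c_direct + a_path * b_path)
```

When the two paths have opposite signs, as in this simulation, the direct effect is larger than the total effect, so a treatment coefficient that grows after adding a mediator is consistent with the mediator offsetting part of the effect.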

We include an overall regression first.

feols(similarity ~ Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value   Pr(>|t|)    
## TreatmentHint Control       0.001576   0.013316 0.118332 0.90583010    
## TreatmentOne-Click Generate 0.048806   0.014047 3.474550 0.00053518 ***
## TreatmentChat Generate      0.048922   0.014317 3.417065 0.00066026 ***
## memory_numeric              0.009579   0.005845 1.638856 0.10157938    
## memory_perc                 0.061768   0.020332 3.038027 0.00244741 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184405     Adj. R2: 0.043869
##                  Within R2: 0.032398

Sentiment Label Similarities

We do not see a difference across treatments in the first stage here.

feols(sentiment_label_similarity ~ Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_label_similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value Pr(>|t|)    
## TreatmentHint Control       0.004606   0.028498 0.161613 0.871645    
## TreatmentOne-Click Generate 0.012879   0.028973 0.444518 0.656771    
## TreatmentChat Generate      0.010169   0.030316 0.335424 0.737381    
## memory_numeric              0.022571   0.011954 1.888084 0.059324 .  
## memory_perc                 0.044655   0.040703 1.097077 0.272890    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.459342     Adj. R2: 0.046857
##                  Within R2: 0.003737

Sentiment Segment Similarities

We run the regression separately for positive, neutral, and negative sentiment and do not observe meaningful differences across treatments.

feols(sentiment_similarity_pos ~ Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_pos
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value Pr(>|t|) 
## TreatmentHint Control        0.005909   0.018683  0.316301  0.75184 
## TreatmentOne-Click Generate -0.012968   0.019567 -0.662743  0.50766 
## TreatmentChat Generate       0.000971   0.020334  0.047743  0.96193 
## memory_numeric               0.012317   0.007845  1.569994  0.11675 
## memory_perc                  0.026811   0.026591  1.008267  0.31359 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.314806     Adj. R2: 0.022674
##                  Within R2: 0.002801

feols(sentiment_similarity_neu ~ Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_neu
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value Pr(>|t|)    
## TreatmentHint Control       0.000435   0.011639 0.037355 0.970210    
## TreatmentOne-Click Generate 0.008279   0.011658 0.710115 0.477810    
## TreatmentChat Generate      0.018365   0.012336 1.488667 0.136912    
## memory_numeric              0.012637   0.004942 2.557238 0.010707 *  
## memory_perc                 0.019138   0.017159 1.115319 0.265000    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.180165     Adj. R2: 0.028224
##                  Within R2: 0.00793

feols(sentiment_similarity_neg ~ Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_similarity_neg
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value Pr(>|t|)    
## TreatmentHint Control       0.015109   0.014594 1.035241 0.300824    
## TreatmentOne-Click Generate 0.011324   0.015271 0.741515 0.458567    
## TreatmentChat Generate      0.030265   0.015997 1.891913 0.058811 .  
## memory_numeric              0.012308   0.006115 2.012861 0.044415 *  
## memory_perc                 0.000132   0.019779 0.006683 0.994669    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.251499     Adj. R2: 0.054397
##                  Within R2: 0.003549

Single Sentiment Score Difference (Followup - Main)

feols(sentiment_score_diff ~ Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_score_diff
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value    Pr(>|t|)    
## TreatmentHint Control       -0.017154   0.040116 -0.427622 0.669024586    
## TreatmentOne-Click Generate -0.157286   0.042448 -3.705353 0.000223473 ***
## TreatmentChat Generate      -0.188945   0.043685 -4.325117 0.000016881 ***
## memory_numeric              -0.012901   0.017540 -0.735556 0.462185404    
## memory_perc                  0.003025   0.057903  0.052250 0.958340925    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.679325     Adj. R2: 0.027227
##                  Within R2: 0.014558

feols(similarity ~ sentiment_score_diff + Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error  t value   Pr(>|t|)    
## sentiment_score_diff        0.015048   0.005702 2.638813 0.00845790 ** 
## TreatmentHint Control       0.001834   0.013290 0.137991 0.89027749    
## TreatmentOne-Click Generate 0.051172   0.014044 3.643737 0.00028348 ***
## TreatmentChat Generate      0.051765   0.014314 3.616396 0.00031469 ***
## memory_numeric              0.009773   0.005862 1.667263 0.09579671 .  
## memory_perc                 0.061723   0.020238 3.049924 0.00235344 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.184121     Adj. R2: 0.046413
##                  Within R2: 0.035371

Single Sentiment Score Absolute Difference

We do not observe a difference across treatments here.

feols(sentiment_score_diff_abs ~ Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: sentiment_score_diff_abs
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value Pr(>|t|)    
## TreatmentHint Control       -0.021849   0.031451 -0.694703 0.487414    
## TreatmentOne-Click Generate -0.003921   0.033579 -0.116779 0.907060    
## TreatmentChat Generate      -0.024408   0.034590 -0.705640 0.480587    
## memory_numeric              -0.024362   0.013143 -1.853532 0.064121 .  
## memory_perc                 -0.029454   0.043619 -0.675254 0.499681    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.539498     Adj. R2: 0.03625
##                  Within R2: 0.00268

Content Length Difference

feols(ContentLengthDifference ~ Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: ContentLengthDifference
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                             Estimate Std. Error   t value  Pr(>|t|)    
## TreatmentHint Control       -4.52278    6.39170 -0.707602   0.47937    
## TreatmentOne-Click Generate 62.28697    7.01439  8.879887 < 2.2e-16 ***
## TreatmentChat Generate      92.63603    7.05714 13.126563 < 2.2e-16 ***
## memory_numeric               1.78466    2.84660  0.626946   0.53085    
## memory_perc                 -2.70265   10.56450 -0.255823   0.79814    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 83.3     Adj. R2: 0.18872 
##              Within R2: 0.189469

feols(similarity ~ ContentLengthDifference + Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value         Pr(>|t|)    
## ContentLengthDifference     -0.000381   0.000059 -6.482315 0.00000000014593 ***
## TreatmentHint Control       -0.000147   0.013514 -0.010914 0.99129440419222    
## TreatmentOne-Click Generate  0.072537   0.014718  4.928436 0.00000098006596 ***
## TreatmentChat Generate       0.084217   0.015180  5.547889 0.00000003763396 ***
## memory_numeric               0.010259   0.005744  1.785946 0.07443171344996 .  
## memory_perc                  0.060739   0.019945  3.045315 0.00238944636722 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.181651     Adj. R2: 0.071823
##                  Within R2: 0.061075

Theme Count Difference

feols(themes_count_diff ~ Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: themes_count_diff
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error   t value
## TreatmentHint Control       -0.132556   0.085757 -1.545713
## TreatmentOne-Click Generate  0.622250   0.094090  6.613351
## TreatmentChat Generate       0.814871   0.103147  7.900095
## memory_numeric              -0.024377   0.044870 -0.543282
## memory_perc                 -0.212895   0.150818 -1.411598
##                                          Pr(>|t|)    
## TreatmentHint Control       0.1225119101552196305    
## TreatmentOne-Click Generate 0.0000000000630793726 ***
## TreatmentChat Generate      0.0000000000000077936 ***
## memory_numeric              0.5870650708842575227    
## memory_perc                 0.1584007726284619444    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 1.42837     Adj. R2: 0.073824
##                 Within R2: 0.071483

feols(similarity ~ themes_count_diff + Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

## OLS estimation, Dep. Var.: similarity
## Observations: 2,450
## Fixed-effects: Video.Id: 12,  order: 9,  orderFollowup: 3
## Standard-errors: Clustered (User.Id) 
##                              Estimate Std. Error    t value     Pr(>|t|)    
## themes_count_diff           -0.028279   0.002797 -10.110005    < 2.2e-16 ***
## TreatmentHint Control       -0.002173   0.013025  -0.166822 0.8675461409    
## TreatmentOne-Click Generate  0.066402   0.013731   4.835776 0.0000015499 ***
## TreatmentChat Generate       0.071965   0.014081   5.110739 0.0000003890 ***
## memory_numeric               0.008890   0.005558   1.599415 0.1100659971    
## memory_perc                  0.055748   0.019367   2.878509 0.0040865522 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.179926     Adj. R2: 0.089369
##                  Within R2: 0.078824

Output Dataframe

theme_count_feols <- feols(similarity ~ themes_count_diff + Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

content_length_feols <- feols(similarity ~ ContentLengthDifference + Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

original_feols <- feols(similarity ~ Treatment + memory_numeric + memory_perc | Video.Id + order + orderFollowup, 
      data = df_followup_panel_final_nonempty_with_sim_score %>% 
        filter(!prolific_id %in% invalid_prolific_ids), cluster = ~User.Id) %>% summary()

model_list <- list(original_feols, content_length_feols, theme_count_feols)
texreg::texreg(model_list)

## 
## \begin{table}
## \begin{center}
## \begin{tabular}{l c c c}
## \hline
##  & Model 1 & Model 2 & Model 3 \\
## \hline
## TreatmentHint Control       & $0.00$       & $-0.00$       & $-0.00$       \\
##                             & $(0.01)$     & $(0.01)$      & $(0.01)$      \\
## TreatmentOne-Click Generate & $0.05^{***}$ & $0.07^{***}$  & $0.07^{***}$  \\
##                             & $(0.01)$     & $(0.01)$      & $(0.01)$      \\
## TreatmentChat Generate      & $0.05^{***}$ & $0.08^{***}$  & $0.07^{***}$  \\
##                             & $(0.01)$     & $(0.02)$      & $(0.01)$      \\
## memory\_numeric             & $0.01$       & $0.01$        & $0.01$        \\
##                             & $(0.01)$     & $(0.01)$      & $(0.01)$      \\
## memory\_perc                & $0.06^{**}$  & $0.06^{**}$   & $0.06^{**}$   \\
##                             & $(0.02)$     & $(0.02)$      & $(0.02)$      \\
## ContentLengthDifference     &              & $-0.00^{***}$ &               \\
##                             &              & $(0.00)$      &               \\
## themes\_count\_diff         &              &               & $-0.03^{***}$ \\
##                             &              &               & $(0.00)$      \\
## \hline
## Num. obs.                   & $2450$       & $2450$        & $2450$        \\
## Num. groups: Video.Id       & $12$         & $12$          & $12$          \\
## Num. groups: order          & $9$          & $9$           & $9$           \\
## Num. groups: orderFollowup  & $3$          & $3$           & $3$           \\
## R$^2$ (full model)          & $0.05$       & $0.08$        & $0.10$        \\
## R$^2$ (proj model)          & $0.03$       & $0.06$        & $0.08$        \\
## Adj. R$^2$ (full model)     & $0.04$       & $0.07$        & $0.09$        \\
## Adj. R$^2$ (proj model)     & $0.03$       & $0.06$        & $0.08$        \\
## \hline
## \multicolumn{4}{l}{\scriptsize{$^{***}p<0.001$; $^{**}p<0.01$; $^{*}p<0.05$}}
## \end{tabular}
## \caption{Statistical models}
## \label{table:coefficients}
## \end{center}
## \end{table}

Robustness Check

Whether memory is the same across all modes
Whether review length is the same across all modes (aggregated to the user level)
Whether social desirability bias is the same across all modes
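
These checks compare user-level covariate means between each treatment arm and Pure Control using standardized differences. A minimal base R sketch of the idea follows. The function `std_diff`, the toy data, and the pooling formula (difference in means over the square root of the average of the two group variances) are assumptions for illustration and may not match the exact pooling used by `xBalance`.

```r
# Hypothetical helper: standardized mean difference between two groups.
std_diff <- function(x, treated) {
  x_t <- x[treated]
  x_c <- x[!treated]
  pooled_sd <- sqrt((var(x_t, na.rm = TRUE) + var(x_c, na.rm = TRUE)) / 2)
  (mean(x_t, na.rm = TRUE) - mean(x_c, na.rm = TRUE)) / pooled_sd
}

# Toy panel: two reviews per user, two users per arm (illustrative only)
panel <- data.frame(
  user      = c(1, 1, 2, 2, 3, 3, 4, 4),
  treatment = rep(c("Pure Control", "Chat Generate"), each = 4),
  memory    = c(0.5, 0.7, 0.6, 0.8, 0.9, 0.7, 0.8, 1.0)
)

# Aggregate to one row per user before comparing arms
by_user <- aggregate(memory ~ user + treatment, data = panel, FUN = mean)

smd <- std_diff(by_user$memory, by_user$treatment == "Chat Generate")
smd
```

Aggregating to the user level first avoids counting users with more reviews several times in the comparison.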

cov_form <- c("memory_perc", "memory_numeric", "ContentFollowupLength", "sdb")
cov_name_fancy <- c("Memory (Observed)", "Memory (Self-Reported)", "Review Length (Follow-up)", "Social Desirability Bias")

# Create formula #  
balance_fmla_cov_followup = formula(paste("Treatment != 'Pure Control' ~",paste(cov_form,collapse="+")))
# Pre-allocate dataframe #
balance_plot_treatments_followup = data.frame(matrix(NA,0,3))
# Compute standardized differences for each of the treatment groups and fill dataframe #
for (t in c("Hint Control", "One-Click Generate", "Chat Generate")){
        balance_temp = xBalance(balance_fmla_cov_followup,
                                data=df_followup_panel_final_nonempty_with_sim_score %>% 
                                  filter(!prolific_id %in% invalid_prolific_ids) %>% 
                                  filter(Treatment %in% c(t, "Pure Control"))  %>%
                                  group_by(Treatment, User.Id) %>%
                                  summarise(memory_perc = mean(memory_perc), 
                                            memory_numeric = mean(memory_numeric),
                                            ContentFollowupLength = mean(ContentFollowupLength),
                                            sdb = mean(sdb)),
                                report="std.diffs",na.rm=TRUE)
        balance_temp = data.frame(balance_temp)
        balance_temp = cbind(cov_name_fancy, balance_temp[,1],rep(t,length(cov_name_fancy)))
    
        balance_plot_treatments_followup = rbind(balance_plot_treatments_followup,balance_temp)
        }

## `summarise()` has grouped output by 'Treatment'. You can override using the
## `.groups` argument.
## `summarise()` has grouped output by 'Treatment'. You can override using the
## `.groups` argument.
## `summarise()` has grouped output by 'Treatment'. You can override using the
## `.groups` argument.

# Fill dataframe to input into the plot function with the remaining columns #
colnames(balance_plot_treatments_followup) = c("covariates","diff","grouping")
balance_plot_treatments_followup[,2] = as.numeric(balance_plot_treatments_followup[,2])


balance_plot_treatments_followup$grouping <- factor(balance_plot_treatments_followup$grouping, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))

match_plot(balance_plot_treatments_followup, "Robustness Check: Standardized Differences by Treatment Group")

ggsave("tables_and_figures/std_diff_by_samples_robust_followup.png", width = 18, height = 20)

Check whether similarity is balanced across all videos (each video is compared against Video 15).

# Pre-allocate dataframe #
balance_plot_treatments_followup_video = data.frame(matrix(NA,0,3))
# Compute standardized differences for each video against Video 15 and fill dataframe #
for (t in c(13, 14, 11, 16, 17, 18, 19, 20, 21, 22, 23)){
        balance_temp = xBalance(formula("Video.Id != 15 ~ similarity + similarity_gpt"),
                                data=df_followup_panel_final_nonempty_with_sim_score %>% 
                                  filter(!prolific_id %in% invalid_prolific_ids) %>% 
                                  filter(Video.Id %in% c(15, t)),
                                report="std.diffs",na.rm=TRUE)
        balance_temp = data.frame(balance_temp)
        balance_temp = cbind(c("Similarity Score (SBERT)", "Similarity Score (GPT)"), balance_temp[,1],rep(t,2))
    
        balance_plot_treatments_followup_video = rbind(balance_plot_treatments_followup_video,balance_temp)
        }
# Fill dataframe to input into the plot function with the remaining columns #
colnames(balance_plot_treatments_followup_video) = c("covariates","diff","grouping")
balance_plot_treatments_followup_video[,2] = as.numeric(balance_plot_treatments_followup_video[,2])
balance_plot_treatments_followup_video[,3] = as.numeric(balance_plot_treatments_followup_video[,3])

balance_plot_treatments_followup_video$grouping <- case_when(
  balance_plot_treatments_followup_video$grouping == 17 ~ "Coin Operated",
  balance_plot_treatments_followup_video$grouping == 22 ~ "Crook",
  balance_plot_treatments_followup_video$grouping == 20 ~ "Forever Sleep",
  balance_plot_treatments_followup_video$grouping == 16 ~ "Soft Rain",
  balance_plot_treatments_followup_video$grouping == 23 ~ "Time Machine",
  balance_plot_treatments_followup_video$grouping == 14 ~ "Alternative Math",
  balance_plot_treatments_followup_video$grouping == 11 ~ "Radical Honesty",
  balance_plot_treatments_followup_video$grouping == 21 ~ "Different",
  balance_plot_treatments_followup_video$grouping == 18 ~ "The Cook",
  balance_plot_treatments_followup_video$grouping == 19 ~ "Skipped",
  balance_plot_treatments_followup_video$grouping == 13 ~ "Boom"
)

match_plot_video = function(data,title){
    pic = ggplot(data=data,aes(x=diff,y=factor(covariates,levels = unique(covariates)),group=grouping)) + 
        theme_bw()+
        theme(axis.line.y = element_line(colour="black"),axis.line.x = element_line(colour="black"),
              panel.border = element_blank(),
              panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        legend.title = element_text(face='bold', size=20, hjust=0.5, vjust=0.5),
        legend.position = c(0.97,.80),legend.justification = c("right", "bottom"),
        legend.key = element_rect(colour = "transparent"),
        legend.box.just = "right", legend.text = element_text(size=20), legend.margin = margin(0, 6, 6, 6),
        legend.box.background = element_rect( fill="transparent", size=1),legend.background = element_blank()) +
        geom_vline(aes(xintercept=-0.1),color="black", linetype="longdash", size=0.75) +
        geom_vline(aes(xintercept=0.1),color="black", linetype="longdash", size=0.75) +
        geom_vline(aes(xintercept=0.25),color="black", linetype="dashed", size=0.5) +
        geom_vline(aes(xintercept=0), color="black", linetype="solid", size=0.5) +
        geom_vline(aes(xintercept=-0.25),color="black", linetype="dashed", size=0.5) +
        geom_vline(aes(xintercept=-0.5),color="black", linetype="dotted", size=0.5) +
        geom_vline(aes(xintercept=0.5),color="black", linetype="dotted", size=0.5) +
        geom_point(aes(color=grouping,fill=grouping,shape=grouping),size=5) +
    
        scale_color_manual(name = "vs French Roast",
                           values = c("Coin Operated" = "black", "Crook" = "black",
                                      "Forever Sleep" = "black", "Soft Rain" = "black",
                                      "Time Machine" = "black", "Alternative Math" = "black",
                                      "Radical Honesty" = "black", "Different" = "black",
                                      "The Cook" = "black", "Skipped" = "black", "Boom" = "black"))+
        scale_fill_manual(name = "vs French Roast",
                           values = c("Coin Operated" = "turquoise1", "Crook" = "black",
                                      "Forever Sleep" = "pink", "Soft Rain" = "orange",
                                      "Time Machine" = "green3", "Alternative Math" = "red",
                                      "Radical Honesty" = "navy", "Different" = "yellow2",
                                      "The Cook" = "grey", "Skipped" = "purple3", "Boom" = "turquoise"))+
        
        scale_shape_manual(name = "vs French Roast",
                           values = c("Coin Operated" = 2, "Crook" = 3,
                                      "Forever Sleep" = 4, "Soft Rain" = 5,
                                      "Time Machine" = 6, "Alternative Math" = 7,
                                      "Radical Honesty" = 8, "Different" = 9,
                                      "The Cook" = 10, "Skipped" = 11, "Boom" = 12))+

    
        labs(y="Variable",x="Standardized Difference")+
        #xlim(-0.5,0.5) +
        scale_x_discrete(limits = c(-0.5,-0.25,-0.1,0,0.1,0.25,0.5), labels = c("-0.50","-0.25","-0.10","0","0.10","0.25","0.50")) +
        scale_y_discrete(limits = rev) +
        theme(axis.text.x = element_text(color = "black", size = 20, angle = 0, hjust = .5, vjust = 0, face = "plain"),
        axis.text.y = element_text(color = "black", size = 20, angle = 0, hjust = 1, vjust = .5, face = "plain",
                                   margin=unit(rep(0.5,4),"cm")),
        axis.title.x = element_text(color = "black", size = 25, angle = 0, hjust = .5, vjust = 0, face = "bold"),
        axis.title.y = element_text(color = "black", size = 30, angle = 90, hjust = .5, vjust = .5, face = "bold"),
        axis.ticks.length.y = unit(-0.25,"cm"), axis.ticks.x=element_blank()) +
        ggtitle(title) +
        theme(plot.title = element_text(face='bold', size=30, hjust=0.5, vjust=0.5)) +
        # 
        # facet_grid(rows = vars(cov_group),
        #      scales = "free_y", # Let the x axis vary across facets.
        #      space = "free_y",  # Let the width of facets vary and force all bars to have the same width.
        #      switch = "y")+
        theme(strip.placement = "outside",    # Place facet labels outside x axis labels.
         strip.background = element_blank(),  # Make facet label background white.
         strip.text.y.left = element_text(size = 21,face = "bold",angle = 0, hjust=0.5),
         axis.title = element_blank(),
              panel.border = element_rect(color = "grey70", fill = NA, size = 2))

pic
}

match_plot_video(balance_plot_treatments_followup_video, "Similarity Standardized Differences by Video")

ggsave("tables_and_figures/std_diff_by_samples_similarity_followup_video.png", width = 18, height = 20)

Primary Analysis Re-Done for Selected Participants

Keep only Mode 4 (Chat Generate) respondents who either said they knew they needed to click the button, or said they did not know but still saw the comment box (the comment box only appears after the button is clicked, so these respondents must have clicked it).

# mode 4 respondents to keep
mode4_respondents_to_keep <- df_followup_final %>%
  filter(Treatment == "Chat Generate") %>%
  filter(mode4_button == "No, I knew I needed to click this button." |
           (mode4_button == "Yes, I did NOT know I needed to click this button." & sawCommentBox == 1)) %>%
  pull(prolific_id)
mode123_respondents_to_keep <- df_wide_all %>% filter(Treatment != "Chat Generate") %>% pull(prolific_id) 
respondents_to_keep <- c(mode4_respondents_to_keep, mode123_respondents_to_keep)
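
The keep rule can be sanity-checked on a toy data frame. This is a minimal base-R sketch under assumed data: the four rows and the single-table layout are hypothetical (the real pipeline pulls IDs from `df_followup_final` and `df_wide_all`), and only the two answer strings are taken from the filter above.

```r
# Answer strings copied from the survey options used in the filter.
knew     <- "No, I knew I needed to click this button."
not_knew <- "Yes, I did NOT know I needed to click this button."

# Hypothetical respondents, one row each.
toy <- data.frame(
  prolific_id   = c("a", "b", "c", "d"),
  Treatment     = c("Chat Generate", "Chat Generate", "Chat Generate", "Hint Control"),
  mode4_button  = c(knew, not_knew, not_knew, NA),
  sawCommentBox = c(0, 1, 0, NA),
  stringsAsFactors = FALSE
)

is_mode4 <- toy$Treatment == "Chat Generate"
keep_mode4 <- is_mode4 &
  (toy$mode4_button == knew |
     (toy$mode4_button == not_knew & toy$sawCommentBox == 1))

# Respondents in the other three modes are always kept;
# `%in% TRUE` treats any NA comparison as "do not keep".
keep <- !is_mode4 | (keep_mode4 %in% TRUE)
kept_ids <- toy$prolific_id[keep]
kept_ids
```

In this toy case only respondent "c" is dropped: a Chat Generate respondent who did not know about the button and never saw the comment box.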

User Level Regression

All Videos

cov_form_lm <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "))
user_lm_all <- lm(cov_form_lm, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))

cov_form_lm_with_cov <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple, collapse = " + "), 
                               "+ willingness_to_pay")
user_lm_with_cov_all <- lm(cov_form_lm_with_cov, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))

stargazer(user_lm_all, user_lm_with_cov_all,  
          star.cutoffs = c(0.05, 0.01, 0.001), 
          dep.var.labels = "Number of Comments", 
          covariate.labels = c("Treatment: Hint Control", "Treatment: One-Click Generate", "Treatment: Chat Generate",
                               "Video 11", "Video 13", "Video 14", "Video 15", "Video 16", "Video 17", 
                               "Video 18", "Video 19", "Video 20", "Video 21", "Video 22", "Video 23",
                               covariates_simple_fancy, "Willingness to Pay"),
          column.labels = c("Without Covariates", "With Covariates"))

## 
## % Table created by stargazer v.5.2.3 by Marek Hlavac, Social Policy Institute. E-mail: marek.hlavac at gmail.com
## % Date and time: Fri, Mar 28, 2025 - 11:02:42
## \begin{table}[!htbp] \centering 
##   \caption{} 
##   \label{} 
## \begin{tabular}{@{\extracolsep{5pt}}lcc} 
## \\[-1.8ex]\hline 
## \hline \\[-1.8ex] 
##  & \multicolumn{2}{c}{\textit{Dependent variable:}} \\ 
## \cline{2-3} 
## \\[-1.8ex] & \multicolumn{2}{c}{Number of Comments} \\ 
##  & Without Covariates & With Covariates \\ 
## \\[-1.8ex] & (1) & (2)\\ 
## \hline \\[-1.8ex] 
##  Treatment: Hint Control & 0.229$^{*}$ & 0.246$^{**}$ \\ 
##   & (0.093) & (0.091) \\ 
##   & & \\ 
##  Treatment: One-Click Generate & 0.195$^{*}$ & 0.236$^{**}$ \\ 
##   & (0.092) & (0.090) \\ 
##   & & \\ 
##  Treatment: Chat Generate & 0.564$^{***}$ & 0.545$^{***}$ \\ 
##   & (0.110) & (0.107) \\ 
##   & & \\ 
##  Video 11 & 0.204$^{*}$ & 0.148 \\ 
##   & (0.084) & (0.082) \\ 
##   & & \\ 
##  Video 13 & 0.142 & 0.135 \\ 
##   & (0.084) & (0.083) \\ 
##   & & \\ 
##  Video 14 & 0.327$^{***}$ & 0.363$^{***}$ \\ 
##   & (0.092) & (0.091) \\ 
##   & & \\ 
##  Video 15 & 0.310$^{**}$ & 0.324$^{***}$ \\ 
##   & (0.098) & (0.096) \\ 
##   & & \\ 
##  Video 16 & 0.276$^{**}$ & 0.277$^{**}$ \\ 
##   & (0.085) & (0.086) \\ 
##   & & \\ 
##  Video 17 & 0.147 & 0.145 \\ 
##   & (0.086) & (0.085) \\ 
##   & & \\ 
##  Video 18 & 0.124 & 0.155 \\ 
##   & (0.089) & (0.088) \\ 
##   & & \\ 
##  Video 19 & 0.379$^{***}$ & 0.445$^{***}$ \\ 
##   & (0.095) & (0.094) \\ 
##   & & \\ 
##  Video 20 & 0.274$^{**}$ & 0.230$^{*}$ \\ 
##   & (0.094) & (0.093) \\ 
##   & & \\ 
##  Video 21 & 0.283$^{**}$ & 0.302$^{**}$ \\ 
##   & (0.098) & (0.097) \\ 
##   & & \\ 
##  Video 22 & 0.191$^{*}$ & 0.171 \\ 
##   & (0.096) & (0.094) \\ 
##   & & \\ 
##  Video 23 & 0.377$^{***}$ & 0.377$^{***}$ \\ 
##   & (0.086) & (0.085) \\ 
##   & & \\ 
##  Age &  & 0.003 \\ 
##   &  & (0.002) \\ 
##   & & \\ 
##  YouTube User &  & 0.243$^{*}$ \\ 
##   &  & (0.102) \\ 
##   & & \\ 
##  Social Media: Non-User &  & 0.468 \\ 
##   &  & (0.395) \\ 
##   & & \\ 
##  Social Media: User &  & $-$0.148 \\ 
##   &  & (0.124) \\ 
##   & & \\ 
##  Social Media Usage (1 - 4 Scale) &  & $-$0.080 \\ 
##   &  & (0.044) \\ 
##   & & \\ 
##  Online Usage (1 - 4 Scale) &  & 0.031 \\ 
##   &  & (0.036) \\ 
##   & & \\ 
##  Female &  & $-$0.175$^{*}$ \\ 
##   &  & (0.074) \\ 
##   & & \\ 
##  Race: Asian &  & 0.211 \\ 
##   &  & (0.167) \\ 
##   & & \\ 
##  Race: Black &  & $-$0.108 \\ 
##   &  & (0.137) \\ 
##   & & \\ 
##  Race: Hispanic &  & $-$0.105 \\ 
##   &  & (0.185) \\ 
##   & & \\ 
##  Race: White &  & $-$0.099 \\ 
##   &  & (0.118) \\ 
##   & & \\ 
##  Race: Other &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Education: High School or Less &  & 0.090 \\ 
##   &  & (0.128) \\ 
##   & & \\ 
##  Education: Some College &  & 0.092 \\ 
##   &  & (0.115) \\ 
##   & & \\ 
##  Education: Bachelor &  & 0.115 \\ 
##   &  & (0.097) \\ 
##   & & \\ 
##  Education: Postgraduate &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Political Party: Democrat &  & $-$0.077 \\ 
##   &  & (0.090) \\ 
##   & & \\ 
##  Political Party: Republican &  & $-$0.215$^{*}$ \\ 
##   &  & (0.104) \\ 
##   & & \\ 
##  Political Party: Other &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Political Ideology (1 - 5 Scale; 5 Strong Liberal) &  & $-$0.068 \\ 
##   &  & (0.041) \\ 
##   & & \\ 
##  Income (1 - 5 Scale) &  & $-$0.028 \\ 
##   &  & (0.035) \\ 
##   & & \\ 
##  Social Media Reply Frequency (1 - 6 Scale) &  & 0.128$^{***}$ \\ 
##   &  & (0.028) \\ 
##   & & \\ 
##  Review Frequency (1 - 6 Scale) &  & 0.148$^{***}$ \\ 
##   &  & (0.033) \\ 
##   & & \\ 
##  Willingness to Pay &  & 0.024 \\ 
##   &  & (0.029) \\ 
##   & & \\ 
##  Constant & 0.772$^{***}$ & 0.245 \\ 
##   & (0.160) & (0.343) \\ 
##   & & \\ 
## \hline \\[-1.8ex] 
## Observations & 1,660 & 1,660 \\ 
## R$^{2}$ & 0.042 & 0.101 \\ 
## Adjusted R$^{2}$ & 0.033 & 0.081 \\ 
## Residual Std. Error & 1.409 (df = 1644) & 1.373 (df = 1623) \\ 
## F Statistic & 4.789$^{***}$ (df = 15; 1644) & 5.088$^{***}$ (df = 36; 1623) \\ 
## \hline 
## \hline \\[-1.8ex] 
## \textit{Note:}  & \multicolumn{2}{r}{$^{*}$p$<$0.05; $^{**}$p$<$0.01; $^{***}$p$<$0.001} \\ 
## \end{tabular} 
## \end{table}

user_lm_basic <- lm(num_comment ~ Treatment, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))

coef_table <- summary(user_lm_basic)$coefficients
# row names contain Treatment or Intercept
coef_table <- coef_table[row.names(coef_table) %in% c("TreatmentHint Control", "TreatmentOne-Click Generate", "TreatmentChat Generate", "(Intercept)"),]

# ggplot
coef_df <- coef_table %>% as.data.frame()
coef_df$ci_lower <- coef_df$Estimate - coef_df$`Std. Error` * 1.96
coef_df$ci_upper <- coef_df$Estimate + coef_df$`Std. Error` * 1.96

coef_df$Treatment <- rownames(coef_df)
coef_df$Treatment <- str_replace(coef_df$Treatment, "Treatment", "")
coef_df$Treatment <- factor(coef_df$Treatment, levels = c( "(Intercept)", "Chat Generate", "One-Click Generate", "Hint Control"))

ggplot(coef_df %>% filter(Treatment != "(Intercept)"), aes(x = Estimate, y = Treatment)) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_point(aes(color = ifelse(ci_lower > 0 | ci_upper < 0, "Significant", "Not Significant")), size = 3) +
  geom_errorbarh(aes(xmin = ci_lower, xmax = ci_upper), height = 0.1) +
  theme_minimal() +
  theme(legend.position = "none") +
  # make fonts bigger
  theme(axis.text.x = element_text(face = "bold", size = 12.5),
        axis.text.y = element_text(face = "bold", size = 12.5),
        axis.title.x = element_text(face = "bold",size = 15),
        axis.title.y = element_text(face = "bold",size = 15)) +
  # make title bigger and bold
  theme(plot.title = element_text(face='bold', size=15),
        plot.subtitle = element_text(size = 12.5)) +
  labs(x = "Coefficient Estimate", y = "Treatment", title = "Number of Comments ATE: User Level Regression", subtitle = paste0("vs Pure Control (Baseline): ", round(coef_df$Estimate[1], 2)))

ggsave("tables_and_figures/num_comment_ate_redo.png", width = 8, height = 6)
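
The intervals in the plot use the normal approximation (estimate ± 1.96 × SE). As a cross-check, here is a minimal sketch on simulated data comparing that approximation with the t-based intervals from `confint()`. The arm labels reuse the study's names, but the data frame `toy` and its outcome values are made up for illustration.

```r
set.seed(42)
# Simulated user-level data (hypothetical values, 30 users per arm).
arms <- c("Pure Control", "Hint Control", "One-Click Generate", "Chat Generate")
toy <- data.frame(Treatment = factor(rep(arms, each = 30), levels = arms))
toy$num_comment <- rpois(nrow(toy),
                         lambda = 1 + 0.5 * (toy$Treatment == "Chat Generate"))

fit <- lm(num_comment ~ Treatment, data = toy)
est <- coef(summary(fit))

# Normal approximation, mirroring the manual calculation in the plot code.
ci_normal <- cbind(lower = est[, "Estimate"] - 1.96 * est[, "Std. Error"],
                   upper = est[, "Estimate"] + 1.96 * est[, "Std. Error"])

# t-based intervals, which use the residual degrees of freedom.
ci_t <- confint(fit, level = 0.95)

# Side-by-side comparison of the two sets of bounds.
cbind(ci_normal, ci_t)
```

The t-based intervals are slightly wider because the t quantile exceeds 1.96 at any finite degrees of freedom; with samples as large as the one analyzed here the gap should be small.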

First Three Videos

cov_form_lm <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "))
user_lm <- lm(cov_form_lm, data = df_wide %>% filter(prolific_id %in% respondents_to_keep))

cov_form_lm_with_cov <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple, collapse = " + "))
user_lm_with_cov <- lm(cov_form_lm_with_cov, data = df_wide %>% filter(prolific_id %in% respondents_to_keep))

stargazer(user_lm, user_lm_with_cov, type = "text", 
          star.cutoffs = c(0.05, 0.01, 0.001), 
          dep.var.labels = "Number of Comments (Upper Bound: 3)", 
          covariate.labels = c("Treatment: Hint Control", "Treatment: One-Click Generate", "Treatment: Chat Generate",
                               "Video 11", "Video 13", "Video 14", "Video 15", "Video 16", "Video 17", 
                               "Video 18", "Video 19", "Video 20", "Video 21", "Video 22", "Video 23",
                               covariates_simple_fancy),
          column.labels = c("Without Covariates", "With Covariates"))

## 
## ====================================================================================================
##                                                                   Dependent variable:               
##                                                    -------------------------------------------------
##                                                           Number of Comments (Upper Bound: 3)       
##                                                       Without Covariates        With Covariates     
##                                                              (1)                      (2)           
## ----------------------------------------------------------------------------------------------------
## Treatment: Hint Control                                     0.208*                  0.229**         
##                                                            (0.089)                  (0.087)         
##                                                                                                     
## Treatment: One-Click Generate                               0.135                    0.175*         
##                                                            (0.089)                  (0.087)         
##                                                                                                     
## Treatment: Chat Generate                                   0.576***                 0.564***        
##                                                            (0.105)                  (0.103)         
##                                                                                                     
## Video 11                                                    -0.156                  -0.208*         
##                                                            (0.099)                  (0.099)         
##                                                                                                     
## Video 13                                                   -0.223*                  -0.224*         
##                                                            (0.095)                  (0.094)         
##                                                                                                     
## Video 14                                                    -0.069                   -0.038         
##                                                            (0.105)                  (0.104)         
##                                                                                                     
## Video 15                                                    -0.065                   -0.055         
##                                                            (0.113)                  (0.112)         
##                                                                                                     
## Video 16                                                    -0.134                   -0.129         
##                                                            (0.098)                  (0.099)         
##                                                                                                     
## Video 17                                                   -0.214*                  -0.215*         
##                                                            (0.102)                  (0.101)         
##                                                                                                     
## Video 18                                                   -0.248*                  -0.212*         
##                                                            (0.104)                  (0.104)         
##                                                                                                     
## Video 19                                                    -0.035                   0.039          
##                                                            (0.114)                  (0.113)         
##                                                                                                     
## Video 20                                                    -0.135                   -0.168         
##                                                            (0.110)                  (0.109)         
##                                                                                                     
## Video 21                                                    -0.081                   -0.065         
##                                                            (0.117)                  (0.117)         
##                                                                                                     
## Video 22                                                    -0.216                  -0.234*         
##                                                            (0.113)                  (0.112)         
##                                                                                                     
## Video 23                                                                                            
##                                                                                                     
##                                                                                                     
## Age                                                                                  0.003          
##                                                                                     (0.002)         
##                                                                                                     
## YouTube User                                                                         0.218*         
##                                                                                     (0.098)         
##                                                                                                     
## Social Media: Non-User                                                               0.460          
##                                                                                     (0.380)         
##                                                                                                     
## Social Media: User                                                                   -0.151         
##                                                                                     (0.119)         
##                                                                                                     
## Social Media Usage (1 - 4 Scale)                                                     -0.082         
##                                                                                     (0.043)         
##                                                                                                     
## Online Usage (1 - 4 Scale)                                                           0.032          
##                                                                                     (0.034)         
##                                                                                                     
## Female                                                                              -0.164*         
##                                                                                     (0.071)         
##                                                                                                     
## Race: Asian                                                                          0.144          
##                                                                                     (0.161)         
##                                                                                                     
## Race: Black                                                                          -0.048         
##                                                                                     (0.130)         
##                                                                                                     
## Race: Hispanic                                                                       -0.057         
##                                                                                     (0.178)         
##                                                                                                     
## Race: White                                                                          -0.070         
##                                                                                     (0.113)         
##                                                                                                     
## Race: Other                                                                                         
##                                                                                                     
##                                                                                                     
## Education: High School or Less                                                       0.029          
##                                                                                     (0.123)         
##                                                                                                     
## Education: Some College                                                              0.096          
##                                                                                     (0.110)         
##                                                                                                     
## Education: Bachelor                                                                  0.111          
##                                                                                     (0.093)         
##                                                                                                     
## Education: Postgraduate                                                                             
##                                                                                                     
##                                                                                                     
## Political Party: Democrat                                                            -0.072         
##                                                                                     (0.087)         
##                                                                                                     
## Political Party: Republican                                                          -0.192         
##                                                                                     (0.099)         
##                                                                                                     
## Political Party: Other                                                                              
##                                                                                                     
##                                                                                                     
## Political Ideology (1 - 5 Scale; 5 Strong Liberal)                                   -0.070         
##                                                                                     (0.039)         
##                                                                                                     
## Income (1 - 5 Scale)                                                                 -0.025         
##                                                                                     (0.033)         
##                                                                                                     
## Social Media Reply Frequency (1 - 6 Scale)                                          0.112***        
##                                                                                     (0.027)         
##                                                                                                     
## Review Frequency (1 - 6 Scale)                                                      0.156***        
##                                                                                     (0.031)         
##                                                                                                     
## Constant                                                   1.932***                 1.455***        
##                                                            (0.211)                  (0.364)         
##                                                                                                     
## ----------------------------------------------------------------------------------------------------
## Observations                                                1,660                    1,660          
## R2                                                          0.026                    0.082          
## Adjusted R2                                                 0.017                    0.063          
## Residual Std. Error                                   1.354 (df = 1645)        1.322 (df = 1625)    
## F Statistic                                        3.088*** (df = 14; 1645) 4.278*** (df = 34; 1625)
## ====================================================================================================
## Note:                                                                  *p<0.05; **p<0.01; ***p<0.001

Panel Regression

Outcome: {Review or Not, Time Spent, Input Length, Informativeness}

Outcome ~ treatment + video + user (+ demographics) [adding a time term (order) to detect any time-dependent effect]

Note: we focus first on the outcome Review or Not (hasComment).
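A minimal base-R sketch of this specification on simulated data, assuming one treatment arm per user and several videos per user. The toy data frame, its effect sizes, and the hand-rolled CR0 cluster-robust covariance are illustrative assumptions. The notebook itself uses fixest::feols, whose small-sample adjustments will give slightly different standard errors.

```r
set.seed(1)
n_users <- 200
videos_per_user <- 3
arms <- c("Pure Control", "Hint Control", "One-Click Generate", "Chat Generate")

# Hypothetical toy panel: one treatment arm per user, several videos per user
users <- data.frame(
  User.Id = seq_len(n_users),
  Treatment = factor(sample(arms, n_users, replace = TRUE), levels = arms),
  user_effect = rnorm(n_users, sd = 0.1)
)
toy <- users[rep(seq_len(n_users), each = videos_per_user), ]
toy$order <- rep(seq_len(videos_per_user), times = n_users)
toy$Video.Id <- sample(1:12, nrow(toy), replace = TRUE)
p <- 0.3 + 0.15 * (toy$Treatment == "Chat Generate") - 0.03 * toy$order + toy$user_effect
toy$hasComment <- rbinom(nrow(toy), 1, pmin(pmax(p, 0.01), 0.99))

# Linear probability model; factor(Video.Id) plays the role of the video fixed effect
fit <- lm(hasComment ~ Treatment + order + factor(Video.Id), data = toy)

# Cluster-robust (CR0) covariance, clustering on user: bread %*% meat %*% bread
X <- model.matrix(fit)
scores <- rowsum(X * resid(fit), group = toy$User.Id)  # per-user sums of x_i * e_i
bread <- solve(crossprod(X))
vcov_user <- bread %*% crossprod(scores) %*% bread

keep <- c("TreatmentHint Control", "TreatmentOne-Click Generate",
          "TreatmentChat Generate", "order")
cluster_se <- sqrt(diag(vcov_user))[keep]
round(cbind(Estimate = coef(fit)[keep], `Cluster SE` = cluster_se), 4)
```

Clustering on User.Id allows the residuals of the same respondent to be correlated across videos, which is why the scores are summed within user before forming the middle term.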

All Videos

panel_lm <- feols(hasComment ~ Treatment + order | Video.Id, data = df_panel %>% filter(prolific_id %in% respondents_to_keep), cluster = ~User.Id)

# panel_lm_with_cov <- feols(hasComment ~ Treatment + order | Video.Id + social_media_nonUser + social_media_user + social_media_YT + social_media_use + website_use + gender + age + raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + edu + polparty + libcons + income + social_media_reply + review_freq, data = df_panel)
panel_lm_with_cov <- feols(hasComment ~ Treatment + order | Video.Id + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, data = df_panel %>% filter(prolific_id %in% respondents_to_keep), cluster = ~User.Id)

model_list <- list(panel_lm, panel_lm_with_cov)
texreg::screenreg(model_list, 
               stars = c(0.05, 0.01, 0.001), 
               caption = "Panel Regression Results", 
               label = "tab:panel_regression", 
               digits = 4,
               custom.note = "Standard errors are clustered at the user level.",
               custom.model.names = c("Without Covariates", "With Covariates"), 
               custom.coef.names = c("Hint Control", "One-Click Generate", "Chat Generate", "Order"))

## 
## ============================================================================
##                                          Without Covariates  With Covariates
## ----------------------------------------------------------------------------
## Hint Control                                0.0752 **           0.0705 *    
##                                            (0.0287)            (0.0274)     
## One-Click Generate                          0.0614 *            0.0662 *    
##                                            (0.0296)            (0.0278)     
## Chat Generate                               0.1832 ***          0.1749 ***  
##                                            (0.0343)            (0.0348)     
## Order                                      -0.0436 ***         -0.0421 ***  
##                                            (0.0097)            (0.0072)     
## ----------------------------------------------------------------------------
## Num. obs.                                5199                5199           
## Num. groups: Video.Id                      12                  12           
## R^2 (full model)                            0.0267              0.1233      
## R^2 (proj model)                            0.0241              0.0235      
## Adj. R^2 (full model)                       0.0239              0.1035      
## Adj. R^2 (proj model)                       0.0233              0.0227      
## Num. groups: age                                               60           
## Num. groups: social_media_YT                                    2           
## Num. groups: social_media_nonUser                               2           
## Num. groups: social_media_user                                  2           
## Num. groups: social_media_use_numeric                           4           
## Num. groups: website_use_numeric                                4           
## Num. groups: genderFemale                                       2           
## Num. groups: raceAsian                                          2           
## Num. groups: raceBlack                                          2           
## Num. groups: raceHispanic                                       2           
## Num. groups: raceWhite                                          2           
## Num. groups: raceOther                                          2           
## Num. groups: eduHighSchoolOrLess                                2           
## Num. groups: eduSomeCollege                                     2           
## Num. groups: eduBachelor                                        2           
## Num. groups: eduPostGrad                                        2           
## Num. groups: polpartyDem                                        2           
## Num. groups: polpartyRep                                        2           
## Num. groups: polpartyOther                                      2           
## Num. groups: libcons_numeric                                    5           
## Num. groups: income_numeric                                     6           
## Num. groups: social_media_reply_numeric                         6           
## Num. groups: review_freq_numeric                                6           
## ============================================================================
## Standard errors are clustered at the user level.

First Three Videos

panel_lm <- feols(hasComment ~ Treatment + order | Video.Id, data = df_panel %>% filter(firstThree == 1) %>% filter(prolific_id %in% respondents_to_keep), cluster = ~User.Id)

panel_lm_with_cov <- feols(hasComment ~ Treatment  + order | Video.Id + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, data = df_panel %>% filter(firstThree == 1) %>% filter(prolific_id %in% respondents_to_keep), cluster = ~User.Id)

model_list <- list(panel_lm, panel_lm_with_cov)
texreg::screenreg(model_list, 
               stars = c(0.05, 0.01, 0.001), 
               caption = "Panel Regression Results (First Three Videos)", 
               label = "tab:panel_regression_first_three", 
               digits = 4,
               custom.note = "Standard errors are clustered at the user level.",
               custom.model.names = c("Without Covariates", "With Covariates"), 
               custom.coef.names = c("Hint Control", "One-Click Generate", "Chat Generate", "Order"))

## 
## ============================================================================
##                                          Without Covariates  With Covariates
## ----------------------------------------------------------------------------
## Hint Control                                0.0693 *            0.0671 *    
##                                            (0.0292)            (0.0278)     
## One-Click Generate                          0.0455              0.0533      
##                                            (0.0297)            (0.0284)     
## Chat Generate                               0.1919 ***          0.1846 ***  
##                                            (0.0344)            (0.0350)     
## Order                                      -0.0244 ***         -0.0261 ***  
##                                            (0.0048)            (0.0048)     
## ----------------------------------------------------------------------------
## Num. obs.                                4980                4980           
## Num. groups: Video.Id                      12                  12           
## R^2 (full model)                            0.0196              0.1156      
## R^2 (proj model)                            0.0170              0.0167      
## Adj. R^2 (full model)                       0.0167              0.0947      
## Adj. R^2 (proj model)                       0.0162              0.0159      
## Num. groups: age                                               60           
## Num. groups: social_media_YT                                    2           
## Num. groups: social_media_nonUser                               2           
## Num. groups: social_media_user                                  2           
## Num. groups: social_media_use_numeric                           4           
## Num. groups: website_use_numeric                                4           
## Num. groups: genderFemale                                       2           
## Num. groups: raceAsian                                          2           
## Num. groups: raceBlack                                          2           
## Num. groups: raceHispanic                                       2           
## Num. groups: raceWhite                                          2           
## Num. groups: raceOther                                          2           
## Num. groups: eduHighSchoolOrLess                                2           
## Num. groups: eduSomeCollege                                     2           
## Num. groups: eduBachelor                                        2           
## Num. groups: eduPostGrad                                        2           
## Num. groups: polpartyDem                                        2           
## Num. groups: polpartyRep                                        2           
## Num. groups: polpartyOther                                      2           
## Num. groups: libcons_numeric                                    5           
## Num. groups: income_numeric                                     6           
## Num. groups: social_media_reply_numeric                         6           
## Num. groups: review_freq_numeric                                6           
## ============================================================================
## Standard errors are clustered at the user level.

Mediation Analysis (User Level)

First, conduct an overall correlation analysis of the mediators.

mediator_columns <- c("mech_popup", "mech_speed", "mech_wording", "mech_formulate","mech_difficulty",  "mech_AIaversion", "mech_trueop")
mech_fancy_names <- c("Pop-up (+)", "Speed (+)", "Help Wording (+)", "Help Formulate (+)", "Difficult to Use (-)", "AI Aversion (-)",  "True Opinion (-)")
mech_mapping <- c("Speed (+)" = "mech_speed", "Help Wording (+)" = "mech_wording", "Difficult to Use (-)" = "mech_difficulty", "Help Formulate (+)" = "mech_formulate", "AI Aversion (-)" = "mech_AIaversion", "Pop-up (+)" = "mech_popup", "True Opinion (-)" = "mech_trueop")

covariates_simple_without_baseline <- covariates_simple[!covariates_simple %in% c("raceOther", "eduHighSchoolOrLess", "polpartyOther")]
covariates_simple_without_baseline_fancy <- covariates_simple_fancy[!covariates_simple_fancy %in% c("Race: Other", "Political Party: Other", "Education: High School or Less")]

# visual correlation plot
mediator_corr <- cor(df_wide_all[df_wide_all$prolific_id %in% respondents_to_keep, c(mediator_columns, "willingness_to_pay", "review_exp")], use = "pairwise.complete.obs")
# change with fancy names
rownames(mediator_corr) <- c(mech_fancy_names, "Willingness to Pay", "Review Experience")
colnames(mediator_corr) <- c(mech_fancy_names, "Willingness to Pay", "Review Experience")

# visualize
corrplot(mediator_corr,method = 'number')

Step 1: Mediator Treatment Effect

Mediator: {faster, not reflect true opinion, right word, difficulty of usage, thought formulation, AI aversion, pop-up feature}

Mediator ~ treatment + video (+ demographics)

mediator_coef_df <- data.frame()
for (m in c("mech_speed", "mech_wording", "mech_difficulty", "mech_formulate", "mech_AIaversion", "mech_popup", "mech_trueop")){
  cov_form_lm <- paste0(m, " ~ Treatment + ", paste(video_columns, collapse = " + "))
  mediator_lm <- lm(cov_form_lm, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))
  mediator_lm_coef <- summary(mediator_lm)$coefficients[2:4, c("Estimate", "Std. Error")]
  # look up the display label by name; mech_mapping and mech_fancy_names are ordered differently
  mediator_coef_df <- rbind(mediator_coef_df, cbind(mediator_lm_coef, names(mech_mapping)[mech_mapping == m]))
}
colnames(mediator_coef_df) <- c("Estimate", "Std. Error", "Mediator")
mediator_coef_df$Mediator <- factor(mediator_coef_df$Mediator, levels = c("Pop-up (+)", "Speed (+)", "Help Wording (+)", "Help Formulate (+)", "Difficult to Use (-)","AI Aversion (-)", "True Opinion (-)"))
mediator_coef_df$Estimate <- as.numeric(as.character(mediator_coef_df$Estimate))
mediator_coef_df$`Std. Error` <- as.numeric(as.character(mediator_coef_df$`Std. Error`))
mediator_coef_df$Treatment <- rownames(mediator_coef_df)
mediator_coef_df$Treatment <- ifelse(str_detect(mediator_coef_df$Treatment, "Hint"), "Hint Control", 
                                     ifelse(str_detect(mediator_coef_df$Treatment, "One"), "One-Click Generate", "Chat Generate"))
mediator_coef_df$Treatment <- factor(mediator_coef_df$Treatment, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))

This plot is based on estimating {Mediator ~ treatment + video}. We use all videos rather than limiting the sample to the first three.

#plot bar plot
ggplot(mediator_coef_df, aes(x = Mediator, y = Estimate, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = Estimate - 1.96*`Std. Error`, ymax = Estimate + 1.96*`Std. Error`), width = 0.2, position = position_dodge(0.9)) +
  labs(title = "Mediator Analysis", x = "Mediator", y = "Estimate") +
  scale_fill_manual(name = "vs Pure Control",
                    values=c("Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2")) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        # bold all title text (plot, axis, and legend titles)
        title = element_text(face = "bold"))

mediator_coef_with_cov_df <- data.frame()
for (m in c("mech_speed", "mech_wording", "mech_difficulty", "mech_formulate", "mech_AIaversion", "mech_popup", "mech_trueop")){
  cov_form_lm_with_cov <- paste0(m, " ~ Treatment + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple_without_baseline, collapse = " + "))
  mediator_lm_with_cov <- lm(cov_form_lm_with_cov, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))
  # store the fitted model under a per-mediator name for the mediation step (avoids fitting it twice)
  assign(paste0("mediator_lm_with_cov_", m), mediator_lm_with_cov)
  mediator_lm_coef <- summary(mediator_lm_with_cov)$coefficients[2:4, c("Estimate", "Std. Error")]
  # look up the display label by name; mech_mapping and mech_fancy_names are ordered differently
  mediator_coef_with_cov_df <- rbind(mediator_coef_with_cov_df, cbind(mediator_lm_coef, names(mech_mapping)[mech_mapping == m]))
}
colnames(mediator_coef_with_cov_df) <- c("Estimate", "Std. Error", "Mediator")
mediator_coef_with_cov_df$Mediator <- factor(mediator_coef_with_cov_df$Mediator, levels = c("Pop-up (+)", "Speed (+)", "Help Wording (+)", "Help Formulate (+)", "Difficult to Use (-)","AI Aversion (-)", "True Opinion (-)"))
mediator_coef_with_cov_df$Estimate <- as.numeric(as.character(mediator_coef_with_cov_df$Estimate))
mediator_coef_with_cov_df$`Std. Error` <- as.numeric(as.character(mediator_coef_with_cov_df$`Std. Error`))
mediator_coef_with_cov_df$Treatment <- rownames(mediator_coef_with_cov_df)
mediator_coef_with_cov_df$Treatment <- ifelse(str_detect(mediator_coef_with_cov_df$Treatment, "Hint"), "Hint Control", 
                                     ifelse(str_detect(mediator_coef_with_cov_df$Treatment, "One"), "One-Click Generate", "Chat Generate"))
mediator_coef_with_cov_df$Treatment <- factor(mediator_coef_with_cov_df$Treatment, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))

#plot bar plot
ggplot(mediator_coef_with_cov_df, aes(x = Mediator, y = Estimate, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = Estimate - 1.96*`Std. Error`, ymax = Estimate + 1.96*`Std. Error`), width = 0.2, position = position_dodge(0.9)) +
  labs(title = "Mediator Analysis (With Covariates)", x = "Mediator", y = "Estimate") +
  scale_fill_manual(name = "vs Pure Control",
                    values=c("Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2")) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        # bold all title text (plot, axis, and legend titles)
        title = element_text(face = "bold"))

Step 2: Mediation Effect

Number of Reviews ~ treatment + mediator + video (+ demographics)

outcome_coef_with_cov_df <- data.frame()
for (m in mediator_columns){
  cov_form_lm_with_cov <- paste0("num_comment ~ Treatment + ", m, " + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple_without_baseline, collapse = " + "))
  assign(paste0("outcome_lm_with_cov_", m), lm(cov_form_lm_with_cov, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep)))
}

# include png
include_graphics("tables_and_figures/mediation_table_redone.png")

Individual Mediator Effect

ACME (Average Causal Mediation Effect): The indirect effect (IE) of the treatment through the mediator.
ADE (Average Direct Effect): The direct effect (DE) of the treatment on the outcome.
Proportion Mediated: The proportion of the total effect explained by the mediator.
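To make the three quantities concrete, here is a small base-R sketch on simulated data with a single binary treatment. It assumes linear models with no treatment-mediator interaction, in which case the ACME point estimate reduces to the product of the treatment coefficient in the mediator model and the mediator coefficient in the outcome model. The variable names and effect sizes are hypothetical; mediate() additionally simulates confidence intervals, which this sketch does not attempt.

```r
set.seed(42)
n <- 2000
treat <- rbinom(n, 1, 0.5)                          # 0 = control, 1 = treated (toy arm)
mediator <- 0.5 * treat + rnorm(n)                  # treatment shifts the mediator
outcome <- 0.3 * treat + 0.8 * mediator + rnorm(n)  # both paths affect the outcome

med_fit <- lm(mediator ~ treat)             # Step 1: mediator model
out_fit <- lm(outcome ~ treat + mediator)   # Step 2: outcome model
total_fit <- lm(outcome ~ treat)            # total effect, ignoring the mediator

acme <- coef(med_fit)[["treat"]] * coef(out_fit)[["mediator"]]  # indirect effect
ade <- coef(out_fit)[["treat"]]                                 # direct effect
total <- coef(total_fit)[["treat"]]
prop_mediated <- acme / total

round(c(ACME = acme, ADE = ade, Total = total, `Prop. Mediated` = prop_mediated), 3)
```

For OLS fits on the same sample, the total effect equals ACME + ADE exactly, so the proportion mediated is ACME / (ACME + ADE). In the function below these correspond to the d.avg, z.avg, and n.avg components of the mediate() result.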

Note that the y-axis range differs across the mediator plots.

conduct_mediation_analysis <- function(mediator_lm_with_cov, outcome_lm_with_cov, mediator, sims = 500){
  # pass the sims argument through instead of hard-coding 500
  mediation_hintcontrol <- mediate(mediator_lm_with_cov, outcome_lm_with_cov, treat = "Treatment", control.value = "Pure Control", treat.value = "Hint Control", mediator = mediator, sims = sims, robustSE = TRUE)
  
  mediation_oneclick <- mediate(mediator_lm_with_cov, outcome_lm_with_cov, treat = "Treatment", control.value = "Pure Control", treat.value = "One-Click Generate", mediator = mediator, sims = sims, robustSE = TRUE)
  
  mediation_chat <- mediate(mediator_lm_with_cov, outcome_lm_with_cov, treat = "Treatment", control.value = "Pure Control", treat.value = "Chat Generate", mediator = mediator, sims = sims, robustSE = TRUE)
  
  hintcontrol_mediate_df <- rbind(c(mediation_hintcontrol$d.avg, mediation_hintcontrol$d.avg.ci), 
        c(mediation_hintcontrol$z.avg, mediation_hintcontrol$z.avg.ci), 
        c(mediation_hintcontrol$n.avg, mediation_hintcontrol$n.avg.ci)) %>% as.data.frame()
  hintcontrol_mediate_df$type <- c("ACME", "ADE", "Prop. Mediated")
  hintcontrol_mediate_df$treatment <- "Hint Control"
  
  oneclick_mediate_df <- rbind(c(mediation_oneclick$d.avg, mediation_oneclick$d.avg.ci), 
        c(mediation_oneclick$z.avg, mediation_oneclick$z.avg.ci), 
        c(mediation_oneclick$n.avg, mediation_oneclick$n.avg.ci)) %>% as.data.frame()
  
  oneclick_mediate_df$type <- c("ACME", "ADE", "Prop. Mediated")
  oneclick_mediate_df$treatment <- "One-Click Generate"
  
  chat_mediate_df <- rbind(c(mediation_chat$d.avg, mediation_chat$d.avg.ci),
        c(mediation_chat$z.avg, mediation_chat$z.avg.ci), 
        c(mediation_chat$n.avg, mediation_chat$n.avg.ci)) %>% as.data.frame()
  chat_mediate_df$type <- c("ACME", "ADE", "Prop. Mediated")
  chat_mediate_df$treatment <- "Chat Generate"
  
  mediate_df_popup <- rbind(hintcontrol_mediate_df, oneclick_mediate_df, chat_mediate_df)
  colnames(mediate_df_popup) <- c("Estimate", "2.5% CI", "97.5% CI", "Estimate Type", "Treatment")
  mediate_df_popup$Treatment <- factor(mediate_df_popup$Treatment, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))
  
  # plot
  output_plot <- ggplot(mediate_df_popup, aes(x = `Treatment`, y = Estimate, fill = `Estimate Type`)) +
    geom_bar(stat = "identity", position = "dodge") +
    geom_errorbar(aes(ymin = `2.5% CI`, ymax = `97.5% CI`), width = 0.2, position = position_dodge(0.9)) +
    labs(title = paste0("Mediation Analysis: ", mech_mapping[which(mech_mapping == mediator)] %>% names()), x = "Treatment", y = "Estimate") +
    scale_fill_manual(name = "Estimate Type",
                      values=c("ACME" = "turquoise4","ADE"="pink4","Prop. Mediated" = "orange4")) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1),
          # bold all title text (plot, axis, and legend titles)
          title = element_text(face = "bold"))
  return(list(mediate_df_popup, output_plot))
}

Pop-up

mediate_analysis_popup <- conduct_mediation_analysis(mediator_lm_with_cov_mech_popup, outcome_lm_with_cov_mech_popup, "mech_popup")
mediate_analysis_popup[[2]]

Speed

mediate_analysis_speed <- conduct_mediation_analysis(mediator_lm_with_cov_mech_speed, outcome_lm_with_cov_mech_speed, "mech_speed")
mediate_analysis_speed[[2]]

Help Wording

mediate_analysis_wording <- conduct_mediation_analysis(mediator_lm_with_cov_mech_wording, outcome_lm_with_cov_mech_wording, "mech_wording")
mediate_analysis_wording[[2]]

Help Formulate

mediate_analysis_formulate <- conduct_mediation_analysis(mediator_lm_with_cov_mech_formulate, outcome_lm_with_cov_mech_formulate, "mech_formulate")
mediate_analysis_formulate[[2]]

Difficult to Use

mediate_analysis_difficulty <- conduct_mediation_analysis(mediator_lm_with_cov_mech_difficulty, outcome_lm_with_cov_mech_difficulty, "mech_difficulty")
mediate_analysis_difficulty[[2]]

AI Aversion

mediate_analysis_AIaversion <- conduct_mediation_analysis(mediator_lm_with_cov_mech_AIaversion, outcome_lm_with_cov_mech_AIaversion, "mech_AIaversion")
mediate_analysis_AIaversion[[2]]

Not Reflect True Opinion

mediate_analysis_trueop <- conduct_mediation_analysis(mediator_lm_with_cov_mech_trueop, outcome_lm_with_cov_mech_trueop, "mech_trueop")
mediate_analysis_trueop[[2]]

Subgroup Analysis

The outcome variable here is the number of videos that have comments.

Subgroups using a median split: social media usage, comment frequency, demographics, pop-up. Social media platform type is not currently included.
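The construction of the *_median columns is not shown in this section. A possible sketch, assuming values strictly above the sample median are labelled "Above Median", ties go to "Below Median", and "Above Median" is the reference level (which matches the "Below Median" coefficient names in the tables that follow). The helper median_split and the toy vector are hypothetical.

```r
# Hypothetical helper: split a numeric vector at its median into two labelled groups
median_split <- function(x) {
  med <- median(x, na.rm = TRUE)
  factor(ifelse(x > med, "Above Median", "Below Median"),
         levels = c("Above Median", "Below Median"))  # first level is the reference
}

toy_scores <- c(1, 2, 2, 3, 4, 5, 6, NA)  # e.g. answers on a 1 - 6 scale
toy_split <- median_split(toy_scores)
table(toy_split, useNA = "ifany")
```

With discrete scales many respondents can sit exactly on the median, so the tie rule can move a sizeable share of the sample between groups; it is worth checking the group sizes before interpreting the interactions.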

subgroups_columns <- c("social_media_use_numeric_median", "website_use_numeric_median", "social_media_reply_numeric_median", "review_freq_numeric_median", "age_median", "income_numeric_median", "libcons_numeric_median", "mech_popup_median", "edu_combined", "race_combined", "polparty_combined")

for (subgroup in subgroups_columns){

  subgroup_lm <- lm(paste0("num_comment ~ Treatment * ", subgroup, " + ", paste(video_columns, collapse = " + ")), data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))
  assign(paste0("subgroup_lm_", subgroup), subgroup_lm)
}

for (subgroup in subgroups_columns){
  subgroup_lm <- get(paste0("subgroup_lm_", subgroup))
  coef_table <- summary(subgroup_lm)$coefficients
  cat("#### ", subgroup, "\n")
  print(paste0("Values for this variable: ", paste(unique(df_wide_all[[subgroup]]), collapse = ", ")))
  rows_to_extract <- rownames(coef_table)[!str_detect(rownames(coef_table), "video") & (rownames(coef_table) != "(Intercept)")]
  subgroup_coef <- coef_table[rows_to_extract, ]
  print(kable(subgroup_coef, format = "markdown"))
  cat("\n")
}

social_media_use_numeric_median

[1] "Values for this variable: Below Median, Above Median"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1508350	0.1471220	1.0252375	0.3054023
TreatmentOne-Click Generate	0.2026259	0.1470442	1.3779931	0.1683934
TreatmentChat Generate	0.2997605	0.1849495	1.6207690	0.1052594
social_media_use_numeric_medianBelow Median	-0.1901414	0.1337946	-1.4211447	0.1554648
TreatmentHint Control:social_media_use_numeric_medianBelow Median	0.1215251	0.1889504	0.6431587	0.5202110
TreatmentOne-Click Generate:social_media_use_numeric_medianBelow Median	-0.0219085	0.1884382	-0.1162635	0.9074580
TreatmentChat Generate:social_media_use_numeric_medianBelow Median	0.4101947	0.2296697	1.7860201	0.0742807

website_use_numeric_median

[1] "Values for this variable: Above Median, Below Median"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.3283684	0.1508469	2.1768315	0.0296354
TreatmentOne-Click Generate	0.0551583	0.1491331	0.3698596	0.7115349
TreatmentChat Generate	0.6712905	0.1782302	3.7664243	0.0001715
website_use_numeric_medianBelow Median	-0.0972649	0.1346293	-0.7224648	0.4701118
TreatmentHint Control:website_use_numeric_medianBelow Median	-0.1652794	0.1908014	-0.8662380	0.3864863
TreatmentOne-Click Generate:website_use_numeric_medianBelow Median	0.2262604	0.1894917	1.1940385	0.2326356
TreatmentChat Generate:website_use_numeric_medianBelow Median	-0.1771663	0.2264243	-0.7824526	0.4340616

social_media_reply_numeric_median

[1] "Values for this variable: Below Median, Above Median"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1728444	0.1722986	1.0031675	0.3159280
TreatmentOne-Click Generate	0.3979965	0.1758543	2.2632175	0.0237525
TreatmentChat Generate	-0.1540949	0.2004682	-0.7686752	0.4421968
social_media_reply_numeric_medianBelow Median	-0.4462911	0.1413933	-3.1563795	0.0016264
TreatmentHint Control:social_media_reply_numeric_medianBelow Median	0.0847321	0.2029841	0.4174323	0.6764169
TreatmentOne-Click Generate:social_media_reply_numeric_medianBelow Median	-0.2448531	0.2052857	-1.1927426	0.2331427
TreatmentChat Generate:social_media_reply_numeric_medianBelow Median	1.0113863	0.2383797	4.2427537	0.0000233

review_freq_numeric_median

[1] "Values for this variable: Below Median, Above Median"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1349381	0.1651471	0.8170782	0.4140024
TreatmentOne-Click Generate	0.1763267	0.1666244	1.0582284	0.2901072
TreatmentChat Generate	-0.3098180	0.1961824	-1.5792343	0.1144752
review_freq_numeric_medianBelow Median	-0.6352782	0.1378514	-4.6084261	0.0000044
TreatmentHint Control:review_freq_numeric_medianBelow Median	0.1520215	0.1977095	0.7689132	0.4420555
TreatmentOne-Click Generate:review_freq_numeric_medianBelow Median	0.0555269	0.1981112	0.2802814	0.7792970
TreatmentChat Generate:review_freq_numeric_medianBelow Median	1.2523011	0.2345903	5.3382483	0.0000001
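Reading the interaction rows: because "Below Median" appears in the coefficient names, "Above Median" is the reference group, so the Treatment rows give the effect among above-median respondents and the effect among below-median respondents is the main effect plus the interaction. A short arithmetic sketch using the Chat Generate rows copied from the table above (the variable names are illustrative):

```r
# Coefficients copied from the review_freq_numeric_median table
main_chat <- -0.3098180        # TreatmentChat Generate (reference group: Above Median)
interaction_chat <- 1.2523011  # TreatmentChat Generate:review_freq_numeric_medianBelow Median

effect_above <- main_chat                     # Chat Generate vs Pure Control, Above Median
effect_below <- main_chat + interaction_chat  # Chat Generate vs Pure Control, Below Median

round(c(`Above Median` = effect_above, `Below Median` = effect_below), 4)
```

The standard error of the below-median effect is not the sum of the two standard errors; it needs the covariance between the two coefficients from vcov() of the fitted model, which is not printed here.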

age_median

[1] "Values for this variable: Above Median, Below Median"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2888197	0.1315517	2.1954843	0.0282685
TreatmentOne-Click Generate	0.1374881	0.1304824	1.0536908	0.2921797
TreatmentChat Generate	0.3408758	0.1589556	2.1444714	0.0321417
age_medianBelow Median	-0.0709854	0.1301728	-0.5453171	0.5856096
TreatmentHint Control:age_medianBelow Median	-0.1251085	0.1854153	-0.6747474	0.4999314
TreatmentOne-Click Generate:age_medianBelow Median	0.1110670	0.1838469	0.6041275	0.5458424
TreatmentChat Generate:age_medianBelow Median	0.4215936	0.2193829	1.9217251	0.0548135

income_numeric_median

[1] "Values for this variable: Below Median, Above Median"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2183693	0.1754816	1.2444004	0.2135301
TreatmentOne-Click Generate	0.1162543	0.1666547	0.6975757	0.4855414
TreatmentChat Generate	0.5458511	0.2116805	2.5786553	0.0100051
income_numeric_medianBelow Median	0.0817390	0.1421196	0.5751422	0.5652740
TreatmentHint Control:income_numeric_medianBelow Median	0.0116961	0.2063253	0.0566875	0.9548010
TreatmentOne-Click Generate:income_numeric_medianBelow Median	0.1154452	0.1998527	0.5776516	0.5635787
TreatmentChat Generate:income_numeric_medianBelow Median	0.0206267	0.2474946	0.0833420	0.9335898

libcons_numeric_median

[1] "Values for this variable: Below Median, Above Median"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2287490	0.1394149	1.6407778	0.1010353
TreatmentOne-Click Generate	0.3416249	0.1376354	2.4821008	0.0131605
TreatmentChat Generate	0.6847832	0.1584974	4.3204707	0.0000165
libcons_numeric_medianBelow Median	0.2395846	0.1309827	1.8291310	0.0675615
TreatmentHint Control:libcons_numeric_medianBelow Median	-0.0045879	0.1869218	-0.0245447	0.9804211
TreatmentOne-Click Generate:libcons_numeric_medianBelow Median	-0.2664468	0.1857307	-1.4345866	0.1515956
TreatmentChat Generate:libcons_numeric_medianBelow Median	-0.2191693	0.2199288	-0.9965467	0.3191316

mech_popup_median

[1] "Values for this variable: Below Median, Above Median"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.3658515	0.1349562	2.710891	0.0067803
TreatmentOne-Click Generate	0.4865368	0.1384421	3.514370	0.0004528
TreatmentChat Generate	0.3821181	0.1517818	2.517548	0.0119121
mech_popup_medianBelow Median	-0.4762149	0.1270124	-3.749356	0.0001834
TreatmentHint Control:mech_popup_medianBelow Median	-0.2027932	0.1816001	-1.116702	0.2642855
TreatmentOne-Click Generate:mech_popup_medianBelow Median	-0.4027106	0.1823382	-2.208592	0.0273409
TreatmentChat Generate:mech_popup_medianBelow Median	0.3341409	0.2139776	1.561570	0.1185823

edu_combined

[1] "Values for this variable: Bachelor’s Degree, High School or Less, Graduate Degree, Some College"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1957383	0.2443234	0.8011441	0.4231648
TreatmentOne-Click Generate	0.2805413	0.2623139	1.0694870	0.2850084
TreatmentChat Generate	0.6931569	0.2901608	2.3888720	0.0170132
edu_combinedSome College	0.1845947	0.2313182	0.7980121	0.4249796
edu_combinedBachelor’s Degree	0.0093550	0.2090512	0.0447498	0.9643122
edu_combinedGraduate Degree	-0.0462207	0.2341840	-0.1973693	0.8435631
TreatmentHint Control:edu_combinedSome College	-0.2550121	0.3148784	-0.8098749	0.4181301
TreatmentOne-Click Generate:edu_combinedSome College	-0.5350225	0.3283311	-1.6295213	0.1033957
TreatmentChat Generate:edu_combinedSome College	0.0228716	0.3798683	0.0602093	0.9519963
TreatmentHint Control:edu_combinedBachelor’s Degree	0.0664558	0.2813627	0.2361925	0.8133129
TreatmentOne-Click Generate:edu_combinedBachelor’s Degree	0.0337079	0.2951147	0.1142198	0.9090776
TreatmentChat Generate:edu_combinedBachelor’s Degree	-0.0027415	0.3319558	-0.0082585	0.9934117
TreatmentHint Control:edu_combinedGraduate Degree	0.3143028	0.3205477	0.9805182	0.3269757
TreatmentOne-Click Generate:edu_combinedGraduate Degree	0.0583927	0.3321757	0.1757886	0.8604819
TreatmentChat Generate:edu_combinedGraduate Degree	-0.6772368	0.3808468	-1.7782396	0.0755507

race_combined

[1] "Values for this variable: White, Hispanic, Black, Other, Asian"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2412597	0.1189426	2.0283715	0.0426849
TreatmentOne-Click Generate	0.1587837	0.1198295	1.3250802	0.1853305
TreatmentChat Generate	0.6851269	0.1428995	4.7944674	0.0000018
race_combinedBlack	0.2441061	0.1738135	1.4044139	0.1603864
race_combinedHispanic	-0.2591924	0.2986590	-0.8678541	0.3856021
race_combinedAsian	-0.1462594	0.2517847	-0.5808907	0.5613945
race_combinedOther	0.5590399	0.2344862	2.3841056	0.0172347
TreatmentHint Control:race_combinedBlack	0.0574304	0.2471644	0.2323569	0.8162900
TreatmentOne-Click Generate:race_combinedBlack	-0.0859740	0.2427430	-0.3541773	0.7232519
TreatmentChat Generate:race_combinedBlack	-0.8243798	0.2883724	-2.8587332	0.0043075
TreatmentHint Control:race_combinedHispanic	0.0083722	0.4309947	0.0194253	0.9845042
TreatmentOne-Click Generate:race_combinedHispanic	0.4751959	0.4193006	1.1333060	0.2572526
TreatmentChat Generate:race_combinedHispanic	0.6279406	0.4928982	1.2739762	0.2028537
TreatmentHint Control:race_combinedAsian	0.3409974	0.3530125	0.9659640	0.3342057
TreatmentOne-Click Generate:race_combinedAsian	0.8801678	0.3702073	2.3774999	0.0175454
TreatmentChat Generate:race_combinedAsian	0.7132055	0.4199831	1.6981768	0.0896655
TreatmentHint Control:race_combinedOther	-0.5849741	0.3284430	-1.7810523	0.0750903
TreatmentOne-Click Generate:race_combinedOther	-0.4491351	0.3139345	-1.4306648	0.1527182
TreatmentChat Generate:race_combinedOther	-0.6227573	0.3771381	-1.6512714	0.0988760

polparty_combined

[1] "Values for this variable: Other, Republican, Democrat"

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2847961	0.1490462	1.9107902	0.0562062
TreatmentOne-Click Generate	0.3245722	0.1503267	2.1591127	0.0309864
TreatmentChat Generate	0.7808051	0.1718087	4.5446195	0.0000059
polparty_combinedRepublican	0.1894080	0.1600329	1.1835569	0.2367605
polparty_combinedOther	0.2734455	0.1550615	1.7634648	0.0780088
TreatmentHint Control:polparty_combinedRepublican	-0.0338419	0.2269975	-0.1490848	0.8815051
TreatmentOne-Click Generate:polparty_combinedRepublican	-0.1502948	0.2279003	-0.6594759	0.5096830
TreatmentChat Generate:polparty_combinedRepublican	-0.4369696	0.2723475	-1.6044559	0.1088066
TreatmentHint Control:polparty_combinedOther	-0.1246941	0.2210988	-0.5639745	0.5728488
TreatmentOne-Click Generate:polparty_combinedOther	-0.2550246	0.2185917	-1.1666710	0.2435133
TreatmentChat Generate:polparty_combinedOther	-0.2931067	0.2591618	-1.1309794	0.2582295

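We look specifically at social media reply frequency and review frequency, each split into three groups (1-2, 3-4, and 5-6 on the 1-6 scale). The 5-6 group is the reference level, so the Treatment main effects in the models below apply to that group.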
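The three-way split can be sketched as follows. This is only an illustration: the `*_valuesplit` columns are created upstream of this section, so the vector `reply_freq`, the break points, and the use of `cut()` are assumptions rather than the code that was actually run. The level order is chosen so that "5-6" comes first and acts as the reference level in `lm()`.

```r
# Hypothetical 1-6 scale responses (the real column is built upstream)
reply_freq <- c(1, 2, 3, 4, 5, 6, NA)

# Bin into (0,2], (2,4], (4,6]; NA stays NA
valuesplit <- cut(
  reply_freq,
  breaks = c(0, 2, 4, 6),
  labels = c("1-2", "3-4", "5-6")
)

# Reorder levels so "5-6" is first and therefore the reference level in lm()
valuesplit <- factor(valuesplit, levels = c("5-6", "3-4", "1-2"))

table(valuesplit, useNA = "ifany")
```

With this ordering, a model such as `num_comment ~ Treatment * valuesplit` reports coefficients for "3-4" and "1-2" relative to "5-6".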

subgroup_lm_valuesplit_smr <- lm(paste0("num_comment ~ Treatment * social_media_reply_numeric_valuesplit + ", paste(video_columns, collapse = " + ")), data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))


summary(subgroup_lm_valuesplit_smr)

## 
## Call:
## lm(formula = paste0("num_comment ~ Treatment * social_media_reply_numeric_valuesplit + ", 
##     paste(video_columns, collapse = " + ")), data = df_wide_all %>% 
##     filter(prolific_id %in% respondents_to_keep))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.9288 -1.3927  0.4902  1.1932  7.5054 
## 
## Coefficients:
##                                                                      Estimate
## (Intercept)                                                           1.10432
## TreatmentHint Control                                                 0.17316
## TreatmentOne-Click Generate                                           0.39807
## TreatmentChat Generate                                               -0.15335
## social_media_reply_numeric_valuesplit3-4                             -0.24635
## social_media_reply_numeric_valuesplit1-2                             -0.82985
## video11                                                               0.16339
## video13                                                               0.15569
## video14                                                               0.35265
## video15                                                               0.30770
## video16                                                               0.26666
## video17                                                               0.12910
## video18                                                               0.12533
## video19                                                               0.36014
## video20                                                               0.25601
## video21                                                               0.26566
## video22                                                               0.21399
## video23                                                               0.39585
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4        0.06370
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4 -0.31941
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4       0.92383
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2        0.13841
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2 -0.07105
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2       1.13183
##                                                                      Std. Error
## (Intercept)                                                             0.18551
## TreatmentHint Control                                                   0.17087
## TreatmentOne-Click Generate                                             0.17440
## TreatmentChat Generate                                                  0.19881
## social_media_reply_numeric_valuesplit3-4                                0.15033
## social_media_reply_numeric_valuesplit1-2                                0.17506
## video11                                                                 0.08235
## video13                                                                 0.08261
## video14                                                                 0.09050
## video15                                                                 0.09621
## video16                                                                 0.08351
## video17                                                                 0.08484
## video18                                                                 0.08766
## video19                                                                 0.09335
## video20                                                                 0.09260
## video21                                                                 0.09658
## video22                                                                 0.09417
## video23                                                                 0.08408
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4          0.21579
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4    0.21861
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4         0.25184
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2          0.24904
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2    0.24757
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2         0.30218
##                                                                      t value
## (Intercept)                                                            5.953
## TreatmentHint Control                                                  1.013
## TreatmentOne-Click Generate                                            2.283
## TreatmentChat Generate                                                -0.771
## social_media_reply_numeric_valuesplit3-4                              -1.639
## social_media_reply_numeric_valuesplit1-2                              -4.740
## video11                                                                1.984
## video13                                                                1.885
## video14                                                                3.896
## video15                                                                3.198
## video16                                                                3.193
## video17                                                                1.522
## video18                                                                1.430
## video19                                                                3.858
## video20                                                                2.765
## video21                                                                2.751
## video22                                                                2.272
## video23                                                                4.708
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4         0.295
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4  -1.461
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4        3.668
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2         0.556
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2  -0.287
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2        3.746
##                                                                           Pr(>|t|)
## (Intercept)                                                          0.00000000322
## TreatmentHint Control                                                     0.311030
## TreatmentOne-Click Generate                                               0.022586
## TreatmentChat Generate                                                    0.440609
## social_media_reply_numeric_valuesplit3-4                                  0.101466
## social_media_reply_numeric_valuesplit1-2                             0.00000231992
## video11                                                                   0.047407
## video13                                                                   0.059659
## video14                                                                   0.000102
## video15                                                                   0.001410
## video16                                                                   0.001434
## video17                                                                   0.128259
## video18                                                                   0.152975
## video19                                                                   0.000119
## video20                                                                   0.005762
## video21                                                                   0.006014
## video22                                                                   0.023191
## video23                                                              0.00000271428
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4            0.767879
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4      0.144187
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4           0.000252
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2            0.578455
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2      0.774143
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2           0.000186
##                                                                         
## (Intercept)                                                          ***
## TreatmentHint Control                                                   
## TreatmentOne-Click Generate                                          *  
## TreatmentChat Generate                                                  
## social_media_reply_numeric_valuesplit3-4                                
## social_media_reply_numeric_valuesplit1-2                             ***
## video11                                                              *  
## video13                                                              .  
## video14                                                              ***
## video15                                                              ** 
## video16                                                              ** 
## video17                                                                 
## video18                                                                 
## video19                                                              ***
## video20                                                              ** 
## video21                                                              ** 
## video22                                                              *  
## video23                                                              ***
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4          
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4    
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4      ***
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2          
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2    
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2      ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.38 on 1636 degrees of freedom
## Multiple R-squared:  0.08567,    Adjusted R-squared:  0.07281 
## F-statistic: 6.664 on 23 and 1636 DF,  p-value: < 0.00000000000000022
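As a reading aid, the implied Chat Generate effect within each reply-frequency group is the main effect plus the matching interaction term. The short calculation below copies the point estimates from the summary above. It gives point estimates only, because a standard error for each sum would also need the coefficient covariances (for example from `vcov()`), which are not printed here.

```r
# Point estimates copied from the summary output above
chat_main <- -0.15335   # TreatmentChat Generate (reference group: 5-6)
chat_x_34 <-  0.92383   # TreatmentChat Generate : valuesplit 3-4
chat_x_12 <-  1.13183   # TreatmentChat Generate : valuesplit 1-2

# Implied Chat Generate effect (vs. Pure Control) within each group
chat_effects <- c(
  "5-6" = chat_main,
  "3-4" = chat_main + chat_x_34,
  "1-2" = chat_main + chat_x_12
)

round(chat_effects, 3)
```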

subgroup_lm_valuesplit_rf <- lm(paste0("num_comment ~ Treatment * review_freq_numeric_valuesplit + ", paste(video_columns, collapse = " + ")), data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))

summary(subgroup_lm_valuesplit_rf)

## 
## Call:
## lm(formula = paste0("num_comment ~ Treatment * review_freq_numeric_valuesplit + ", 
##     paste(video_columns, collapse = " + ")), data = df_wide_all %>% 
##     filter(prolific_id %in% respondents_to_keep))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.8043 -1.3146  0.4883  1.1444  7.6097 
## 
## Coefficients:
##                                                               Estimate
## (Intercept)                                                    1.18619
## TreatmentHint Control                                          0.15916
## TreatmentOne-Click Generate                                    0.41365
## TreatmentChat Generate                                        -0.48850
## review_freq_numeric_valuesplit3-4                             -0.23417
## review_freq_numeric_valuesplit1-2                             -0.70391
## video11                                                        0.15858
## video13                                                        0.15263
## video14                                                        0.34688
## video15                                                        0.31790
## video16                                                        0.25372
## video17                                                        0.15427
## video18                                                        0.10804
## video19                                                        0.42355
## video20                                                        0.28440
## video21                                                        0.29202
## video22                                                        0.18620
## video23                                                        0.42267
## TreatmentHint Control:review_freq_numeric_valuesplit3-4        0.14160
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit3-4 -0.07622
## TreatmentChat Generate:review_freq_numeric_valuesplit3-4       0.79473
## TreatmentHint Control:review_freq_numeric_valuesplit1-2        0.06200
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit1-2 -0.30721
## TreatmentChat Generate:review_freq_numeric_valuesplit1-2       1.54753
##                                                               Std. Error
## (Intercept)                                                      0.22703
## TreatmentHint Control                                            0.25437
## TreatmentOne-Click Generate                                      0.26116
## TreatmentChat Generate                                           0.31308
## review_freq_numeric_valuesplit3-4                                0.19965
## review_freq_numeric_valuesplit1-2                                0.19615
## video11                                                          0.08219
## video13                                                          0.08263
## video14                                                          0.09034
## video15                                                          0.09589
## video16                                                          0.08371
## video17                                                          0.08443
## video18                                                          0.08767
## video19                                                          0.09315
## video20                                                          0.09246
## video21                                                          0.09644
## video22                                                          0.09381
## video23                                                          0.08408
## TreatmentHint Control:review_freq_numeric_valuesplit3-4          0.29171
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit3-4    0.29851
## TreatmentChat Generate:review_freq_numeric_valuesplit3-4         0.35360
## TreatmentHint Control:review_freq_numeric_valuesplit1-2          0.28574
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit1-2    0.29081
## TreatmentChat Generate:review_freq_numeric_valuesplit1-2         0.35085
##                                                               t value
## (Intercept)                                                     5.225
## TreatmentHint Control                                           0.626
## TreatmentOne-Click Generate                                     1.584
## TreatmentChat Generate                                         -1.560
## review_freq_numeric_valuesplit3-4                              -1.173
## review_freq_numeric_valuesplit1-2                              -3.589
## video11                                                         1.929
## video13                                                         1.847
## video14                                                         3.840
## video15                                                         3.315
## video16                                                         3.031
## video17                                                         1.827
## video18                                                         1.232
## video19                                                         4.547
## video20                                                         3.076
## video21                                                         3.028
## video22                                                         1.985
## video23                                                         5.027
## TreatmentHint Control:review_freq_numeric_valuesplit3-4         0.485
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit3-4  -0.255
## TreatmentChat Generate:review_freq_numeric_valuesplit3-4        2.248
## TreatmentHint Control:review_freq_numeric_valuesplit1-2         0.217
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit1-2  -1.056
## TreatmentChat Generate:review_freq_numeric_valuesplit1-2        4.411
##                                                                  Pr(>|t|)    
## (Intercept)                                                   0.000000197 ***
## TreatmentHint Control                                            0.531600    
## TreatmentOne-Click Generate                                      0.113413    
## TreatmentChat Generate                                           0.118883    
## review_freq_numeric_valuesplit3-4                                0.241000    
## review_freq_numeric_valuesplit1-2                                0.000342 ***
## video11                                                          0.053858 .  
## video13                                                          0.064912 .  
## video14                                                          0.000128 ***
## video15                                                          0.000935 ***
## video16                                                          0.002477 ** 
## video17                                                          0.067862 .  
## video18                                                          0.218004    
## video19                                                       0.000005840 ***
## video20                                                          0.002133 ** 
## video21                                                          0.002501 ** 
## video22                                                          0.047329 *  
## video23                                                       0.000000553 ***
## TreatmentHint Control:review_freq_numeric_valuesplit3-4          0.627445    
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit3-4    0.798485    
## TreatmentChat Generate:review_freq_numeric_valuesplit3-4         0.024737 *  
## TreatmentHint Control:review_freq_numeric_valuesplit1-2          0.828263    
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit1-2    0.290955    
## TreatmentChat Generate:review_freq_numeric_valuesplit1-2      0.000010969 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.377 on 1636 degrees of freedom
## Multiple R-squared:  0.08912,    Adjusted R-squared:  0.07631 
## F-statistic: 6.959 on 23 and 1636 DF,  p-value: < 0.00000000000000022

We plot the mean number of comments by treatment and subgroup. Error bars show approximate 95% confidence intervals (mean ± 1.96 standard errors).

subgroup_lm_valuesplit_smr_df <- df_wide_all %>% filter(prolific_id %in% respondents_to_keep) %>%
  group_by(Treatment, social_media_reply_numeric_valuesplit) %>%
  summarise(mean_comments = mean(num_comment, na.rm = TRUE), se_comments = sd(num_comment, na.rm = TRUE)/sqrt(n()), n = n())

## `summarise()` has grouped output by 'Treatment'. You can override using the
## `.groups` argument.

# rename valuesplit
subgroup_lm_valuesplit_smr_df$social_media_reply_numeric_valuesplit <- factor(subgroup_lm_valuesplit_smr_df$social_media_reply_numeric_valuesplit, levels = c("1-2", "3-4", "5-6"), labels = c("Low", "Medium", "High"))

subgroup_lm_valuesplit_rf_df <- df_wide_all %>% filter(prolific_id %in% respondents_to_keep) %>%
  group_by(Treatment, review_freq_numeric_valuesplit) %>%
  summarise(mean_comments = mean(num_comment, na.rm = TRUE), se_comments = sd(num_comment, na.rm = TRUE)/sqrt(n()), n = n())

## `summarise()` has grouped output by 'Treatment'. You can override using the
## `.groups` argument.

# rename valuesplit
subgroup_lm_valuesplit_rf_df$review_freq_numeric_valuesplit <- factor(subgroup_lm_valuesplit_rf_df$review_freq_numeric_valuesplit, levels = c("1-2", "3-4", "5-6"), labels = c("Low", "Medium", "High"))
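The group summaries divide the standard deviation by `sqrt(n())`, which counts every row in the group, including rows where `num_comment` is missing. If there are no missing values the two counts agree. The base-R sketch below uses a made-up data frame `toy` (not the study data) to show the variant that divides by the number of non-missing observations.

```r
# Toy data: group "a" has one missing outcome
toy <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  y     = c(1, 2, NA, 4, 6, 8)
)

# Mean, standard error, and approximate 95% CI per group,
# using the count of non-missing values in the denominator
summ <- do.call(rbind, lapply(split(toy, toy$group), function(d) {
  n_obs <- sum(!is.na(d$y))
  m     <- mean(d$y, na.rm = TRUE)
  se    <- sd(d$y, na.rm = TRUE) / sqrt(n_obs)
  data.frame(
    group    = d$group[1],
    mean     = m,
    se       = se,
    n        = n_obs,
    ci_lower = m - 1.96 * se,
    ci_upper = m + 1.96 * se
  )
}))

summ
```

In a dplyr pipeline the same idea would be `sqrt(sum(!is.na(num_comment)))` in place of `sqrt(n())`.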

ggplot(subgroup_lm_valuesplit_smr_df, aes(x = social_media_reply_numeric_valuesplit, y = mean_comments, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = mean_comments - 1.96*se_comments, ymax = mean_comments + 1.96*se_comments), width = 0.2, position = position_dodge(0.9)) +
  labs(title = "Number of Comments by Social Media Reply Frequency", x = "Social Media Reply Frequency", y = "Number of Comments") +
  scale_fill_manual(name = "Treatment",
                    values=c("Pure Control" = "yellowgreen", "Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2")) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1),
        # bold all titles (plot, axis, and legend)
        title = element_text(face = "bold"))

ggplot(subgroup_lm_valuesplit_rf_df, aes(x = review_freq_numeric_valuesplit, y = mean_comments, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = mean_comments - 1.96*se_comments, ymax = mean_comments + 1.96*se_comments), width = 0.2, position = position_dodge(0.9)) +
  labs(title = "Number of Comments by Review Frequency", x = "Review Frequency", y = "Number of Comments") +
  scale_fill_manual(name = "Treatment",
                    values=c("Pure Control" = "yellowgreen", "Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2")) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1),
        # bold all titles (plot, axis, and legend)
        title = element_text(face = "bold"))

Primary Analysis Re-Done for Aware Participants

Keep only Mode 4 (Chat Generate) respondents who reported that they knew they needed to click the button. Respondents in the other three treatments are all retained.

# mode 4 respondents to keep
mode4_respondents_to_keep <- df_followup_final %>% filter(Treatment == "Chat Generate") %>% filter(mode4_button == "No, I knew I needed to click this button.") %>% pull(prolific_id)
mode123_respondents_to_keep <- df_wide_all %>% filter(Treatment != "Chat Generate") %>% pull(prolific_id) 
respondents_to_keep <- c(mode4_respondents_to_keep, mode123_respondents_to_keep)

User Level Regression

All Videos

cov_form_lm <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "))
user_lm_all <- lm(cov_form_lm, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))

cov_form_lm_with_cov <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple, collapse = " + "), 
                               "+ willingness_to_pay")
user_lm_with_cov_all <- lm(cov_form_lm_with_cov, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))

stargazer(user_lm_all, user_lm_with_cov_all,  
          star.cutoffs = c(0.05, 0.01, 0.001), 
          dep.var.labels = "Number of Comments", 
          covariate.labels = c("Treatment: Hint Control", "Treatment: One-Click Generate", "Treatment: Chat Generate",
                               "Video 11", "Video 13", "Video 14", "Video 15", "Video 16", "Video 17", 
                               "Video 18", "Video 19", "Video 20", "Video 21", "Video 22", "Video 23",
                               covariates_simple_fancy, "Willingness to Pay"),
          column.labels = c("Without Covariates", "With Covariates"))

## 
## % Table created by stargazer v.5.2.3 by Marek Hlavac, Social Policy Institute. E-mail: marek.hlavac at gmail.com
## % Date and time: Fri, Mar 28, 2025 - 11:03:51
## \begin{table}[!htbp] \centering 
##   \caption{} 
##   \label{} 
## \begin{tabular}{@{\extracolsep{5pt}}lcc} 
## \\[-1.8ex]\hline 
## \hline \\[-1.8ex] 
##  & \multicolumn{2}{c}{\textit{Dependent variable:}} \\ 
## \cline{2-3} 
## \\[-1.8ex] & \multicolumn{2}{c}{Number of Comments} \\ 
##  & Without Covariates & With Covariates \\ 
## \\[-1.8ex] & (1) & (2)\\ 
## \hline \\[-1.8ex] 
##  Treatment: Hint Control & 0.229$^{*}$ & 0.246$^{**}$ \\ 
##   & (0.094) & (0.092) \\ 
##   & & \\ 
##  Treatment: One-Click Generate & 0.192$^{*}$ & 0.233$^{*}$ \\ 
##   & (0.093) & (0.091) \\ 
##   & & \\ 
##  Treatment: Chat Generate & 0.313$^{*}$ & 0.300$^{*}$ \\ 
##   & (0.121) & (0.119) \\ 
##   & & \\ 
##  Video 11 & 0.225$^{*}$ & 0.158 \\ 
##   & (0.088) & (0.087) \\ 
##   & & \\ 
##  Video 13 & 0.152 & 0.150 \\ 
##   & (0.089) & (0.087) \\ 
##   & & \\ 
##  Video 14 & 0.367$^{***}$ & 0.408$^{***}$ \\ 
##   & (0.096) & (0.095) \\ 
##   & & \\ 
##  Video 15 & 0.343$^{***}$ & 0.357$^{***}$ \\ 
##   & (0.102) & (0.100) \\ 
##   & & \\ 
##  Video 16 & 0.295$^{**}$ & 0.303$^{***}$ \\ 
##   & (0.090) & (0.090) \\ 
##   & & \\ 
##  Video 17 & 0.167 & 0.162 \\ 
##   & (0.091) & (0.089) \\ 
##   & & \\ 
##  Video 18 & 0.157 & 0.187$^{*}$ \\ 
##   & (0.094) & (0.092) \\ 
##   & & \\ 
##  Video 19 & 0.378$^{***}$ & 0.444$^{***}$ \\ 
##   & (0.100) & (0.098) \\ 
##   & & \\ 
##  Video 20 & 0.296$^{**}$ & 0.253$^{**}$ \\ 
##   & (0.099) & (0.098) \\ 
##   & & \\ 
##  Video 21 & 0.309$^{**}$ & 0.318$^{**}$ \\ 
##   & (0.103) & (0.102) \\ 
##   & & \\ 
##  Video 22 & 0.232$^{*}$ & 0.214$^{*}$ \\ 
##   & (0.100) & (0.098) \\ 
##   & & \\ 
##  Video 23 & 0.393$^{***}$ & 0.408$^{***}$ \\ 
##   & (0.090) & (0.089) \\ 
##   & & \\ 
##  Age &  & 0.003 \\ 
##   &  & (0.003) \\ 
##   & & \\ 
##  YouTube User &  & 0.233$^{*}$ \\ 
##   &  & (0.104) \\ 
##   & & \\ 
##  Social Media: Non-User &  & 0.529 \\ 
##   &  & (0.399) \\ 
##   & & \\ 
##  Social Media: User &  & $-$0.138 \\ 
##   &  & (0.128) \\ 
##   & & \\ 
##  Social Media Usage (1 - 4 Scale) &  & $-$0.070 \\ 
##   &  & (0.046) \\ 
##   & & \\ 
##  Online Usage (1 - 4 Scale) &  & 0.033 \\ 
##   &  & (0.037) \\ 
##   & & \\ 
##  Female &  & $-$0.183$^{*}$ \\ 
##   &  & (0.076) \\ 
##   & & \\ 
##  Race: Asian &  & 0.184 \\ 
##   &  & (0.173) \\ 
##   & & \\ 
##  Race: Black &  & $-$0.094 \\ 
##   &  & (0.140) \\ 
##   & & \\ 
##  Race: Hispanic &  & $-$0.071 \\ 
##   &  & (0.188) \\ 
##   & & \\ 
##  Race: White &  & $-$0.104 \\ 
##   &  & (0.121) \\ 
##   & & \\ 
##  Race: Other &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Education: High School or Less &  & 0.079 \\ 
##   &  & (0.132) \\ 
##   & & \\ 
##  Education: Some College &  & 0.099 \\ 
##   &  & (0.118) \\ 
##   & & \\ 
##  Education: Bachelor &  & 0.127 \\ 
##   &  & (0.100) \\ 
##   & & \\ 
##  Education: Postgraduate &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Political Party: Democrat &  & $-$0.080 \\ 
##   &  & (0.093) \\ 
##   & & \\ 
##  Political Party: Republican &  & $-$0.230$^{*}$ \\ 
##   &  & (0.107) \\ 
##   & & \\ 
##  Political Party: Other &  &  \\ 
##   &  &  \\ 
##   & & \\ 
##  Political Ideology (1 - 5 Scale; 5 Strong Liberal) &  & $-$0.068 \\ 
##   &  & (0.042) \\ 
##   & & \\ 
##  Income (1 - 5 Scale) &  & $-$0.027 \\ 
##   &  & (0.036) \\ 
##   & & \\ 
##  Social Media Reply Frequency (1 - 6 Scale) &  & 0.127$^{***}$ \\ 
##   &  & (0.029) \\ 
##   & & \\ 
##  Review Frequency (1 - 6 Scale) &  & 0.156$^{***}$ \\ 
##   &  & (0.034) \\ 
##   & & \\ 
##  Willingness to Pay &  & 0.028 \\ 
##   &  & (0.030) \\ 
##   & & \\ 
##  Constant & 0.703$^{***}$ & 0.115 \\ 
##   & (0.174) & (0.356) \\ 
##   & & \\ 
## \hline \\[-1.8ex] 
## Observations & 1,599 & 1,599 \\ 
## R$^{2}$ & 0.032 & 0.094 \\ 
## Adjusted R$^{2}$ & 0.023 & 0.073 \\ 
## Residual Std. Error & 1.422 (df = 1583) & 1.385 (df = 1562) \\ 
## F Statistic & 3.473$^{***}$ (df = 15; 1583) & 4.511$^{***}$ (df = 36; 1562) \\ 
## \hline 
## \hline \\[-1.8ex] 
## \textit{Note:}  & \multicolumn{2}{r}{$^{*}$p$<$0.05; $^{**}$p$<$0.01; $^{***}$p$<$0.001} \\ 
## \end{tabular} 
## \end{table}

user_lm_basic <- lm(num_comment ~ Treatment, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))

coef_table <- summary(user_lm_basic)$coefficients
# row names contain Treatment or Intercept
coef_table <- coef_table[row.names(coef_table) %in% c("TreatmentHint Control", "TreatmentOne-Click Generate", "TreatmentChat Generate", "(Intercept)"),]

# ggplot
coef_df <- coef_table %>% as.data.frame()
coef_df$ci_lower <- coef_df$Estimate - coef_df$`Std. Error` * 1.96
coef_df$ci_upper <- coef_df$Estimate + coef_df$`Std. Error` * 1.96

coef_df$Treatment <- rownames(coef_df)
coef_df$Treatment <- str_replace(coef_df$Treatment, "Treatment", "")
coef_df$Treatment <- factor(coef_df$Treatment, levels = c( "(Intercept)", "Chat Generate", "One-Click Generate", "Hint Control"))

ggplot(coef_df %>% filter(Treatment != "(Intercept)"), aes(x = Estimate, y = Treatment)) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_point(aes(color = ifelse(ci_lower > 0 | ci_upper < 0, "Significant", "Not Significant")), size = 3) +
  geom_errorbarh(aes(xmin = ci_lower, xmax = ci_upper), height = 0.1) +
  theme_minimal() +
  theme(legend.position = "none") +
  # make fonts bigger
  theme(axis.text.x = element_text(face = "bold", size = 12.5),
        axis.text.y = element_text(face = "bold", size = 12.5),
        axis.title.x = element_text(face = "bold",size = 15),
        axis.title.y = element_text(face = "bold",size = 15)) +
  # make title bigger and bold
  theme(plot.title = element_text(face='bold', size=15),
        plot.subtitle = element_text(size = 12.5)) +
  labs(x = "Coefficient Estimate", y = "Treatment", title = "Number of Comments ATE: User Level Regression", subtitle = paste0("vs Pure Control (Baseline): ", round(coef_df["(Intercept)", "Estimate"], 2)))

ggsave("tables_and_figures/num_comment_ate_redo.png", width = 8, height = 6)
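The intervals in the plot use the normal approximation (estimate ± 1.96 standard errors). As a cross-check, `confint()` returns the exact t-based intervals for an `lm` fit. The sketch below uses the built-in `mtcars` data as a stand-in, since the study data are not reproduced here. With a sample as large as the one above, the two intervals should be nearly identical.

```r
# Stand-in model on built-in data (not the study data)
fit <- lm(mpg ~ factor(cyl), data = mtcars)

est <- coef(fit)
se  <- summary(fit)$coefficients[, "Std. Error"]

# Normal-approximation interval, as used for the coefficient plot
ci_normal <- cbind(lower = est - 1.96 * se, upper = est + 1.96 * se)

# Exact t-based interval from the fitted model
ci_t <- confint(fit, level = 0.95)

# Difference between the two; small, and shrinking as residual df grow
ci_t - ci_normal
```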

First Three Videos

cov_form_lm <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "))
user_lm <- lm(cov_form_lm, data = df_wide %>% filter(prolific_id %in% respondents_to_keep))

cov_form_lm_with_cov <- paste0("num_comment ~ Treatment + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple, collapse = " + "))
user_lm_with_cov <- lm(cov_form_lm_with_cov, data = df_wide %>% filter(prolific_id %in% respondents_to_keep))

stargazer(user_lm, user_lm_with_cov, type = "text", 
          star.cutoffs = c(0.05, 0.01, 0.001), 
          dep.var.labels = "Number of Comments (Upper Bound: 3)", 
          covariate.labels = c("Treatment: Hint Control", "Treatment: One-Click Generate", "Treatment: Chat Generate",
                               "Video 11", "Video 13", "Video 14", "Video 15", "Video 16", "Video 17", 
                               "Video 18", "Video 19", "Video 20", "Video 21", "Video 22", "Video 23",
                               covariates_simple_fancy),
          column.labels = c("Without Covariates", "With Covariates"))

## 
## =================================================================================================
##                                                                 Dependent variable:              
##                                                    ----------------------------------------------
##                                                         Number of Comments (Upper Bound: 3)      
##                                                     Without Covariates       With Covariates     
##                                                             (1)                    (2)           
## -------------------------------------------------------------------------------------------------
## Treatment: Hint Control                                   0.208*                 0.229**         
##                                                           (0.090)                (0.088)         
##                                                                                                  
## Treatment: One-Click Generate                              0.134                  0.174*         
##                                                           (0.089)                (0.088)         
##                                                                                                  
## Treatment: Chat Generate                                  0.316**                0.308**         
##                                                           (0.117)                (0.114)         
##                                                                                                  
## Video 11                                                  -0.152                 -0.227*         
##                                                           (0.102)                (0.102)         
##                                                                                                  
## Video 13                                                  -0.227*                -0.237*         
##                                                           (0.097)                (0.096)         
##                                                                                                  
## Video 14                                                  -0.047                  -0.026         
##                                                           (0.108)                (0.107)         
##                                                                                                  
## Video 15                                                  -0.053                  -0.059         
##                                                           (0.116)                (0.115)         
##                                                                                                  
## Video 16                                                  -0.132                  -0.133         
##                                                           (0.100)                (0.102)         
##                                                                                                  
## Video 17                                                  -0.206                 -0.225*         
##                                                           (0.105)                (0.104)         
##                                                                                                  
## Video 18                                                  -0.234*                -0.211*         
##                                                           (0.107)                (0.106)         
##                                                                                                  
## Video 19                                                  -0.049                  0.012          
##                                                           (0.119)                (0.117)         
##                                                                                                  
## Video 20                                                  -0.130                  -0.176         
##                                                           (0.113)                (0.113)         
##                                                                                                  
## Video 21                                                  -0.070                  -0.075         
##                                                           (0.121)                (0.120)         
##                                                                                                  
## Video 22                                                  -0.191                  -0.222         
##                                                           (0.116)                (0.115)         
##                                                                                                  
## Video 23                                                                                         
##                                                                                                  
##                                                                                                  
## Age                                                                               0.003          
##                                                                                  (0.002)         
##                                                                                                  
## YouTube User                                                                      0.203*         
##                                                                                  (0.100)         
##                                                                                                  
## Social Media: Non-User                                                            0.524          
##                                                                                  (0.384)         
##                                                                                                  
## Social Media: User                                                                -0.140         
##                                                                                  (0.123)         
##                                                                                                  
## Social Media Usage (1 - 4 Scale)                                                  -0.072         
##                                                                                  (0.044)         
##                                                                                                  
## Online Usage (1 - 4 Scale)                                                        0.036          
##                                                                                  (0.035)         
##                                                                                                  
## Female                                                                           -0.179*         
##                                                                                  (0.073)         
##                                                                                                  
## Race: Asian                                                                       0.126          
##                                                                                  (0.167)         
##                                                                                                  
## Race: Black                                                                       -0.022         
##                                                                                  (0.134)         
##                                                                                                  
## Race: Hispanic                                                                    -0.013         
##                                                                                  (0.181)         
##                                                                                                  
## Race: White                                                                       -0.068         
##                                                                                  (0.116)         
##                                                                                                  
## Race: Other                                                                                      
##                                                                                                  
##                                                                                                  
## Education: High School or Less                                                    0.019          
##                                                                                  (0.127)         
##                                                                                                  
## Education: Some College                                                           0.107          
##                                                                                  (0.113)         
##                                                                                                  
## Education: Bachelor                                                               0.121          
##                                                                                  (0.096)         
##                                                                                                  
## Education: Postgraduate                                                                          
##                                                                                                  
##                                                                                                  
## Political Party: Democrat                                                         -0.079         
##                                                                                  (0.089)         
##                                                                                                  
## Political Party: Republican                                                      -0.200*         
##                                                                                  (0.102)         
##                                                                                                  
## Political Party: Other                                                                           
##                                                                                                  
##                                                                                                  
## Political Ideology (1 - 5 Scale; 5 Strong Liberal)                                -0.070         
##                                                                                  (0.040)         
##                                                                                                  
## Income (1 - 5 Scale)                                                              -0.026         
##                                                                                  (0.034)         
##                                                                                                  
## Social Media Reply Frequency (1 - 6 Scale)                                       0.113***        
##                                                                                  (0.028)         
##                                                                                                  
## Review Frequency (1 - 6 Scale)                                                   0.161***        
##                                                                                  (0.032)         
##                                                                                                  
## Constant                                                 1.916***                1.422***        
##                                                           (0.217)                (0.373)         
##                                                                                                  
## -------------------------------------------------------------------------------------------------
## Observations                                               1,599                  1,599          
## R2                                                         0.013                  0.072          
## Adjusted R2                                                0.004                  0.052          
## Residual Std. Error                                  1.366 (df = 1584)      1.333 (df = 1564)    
## F Statistic                                        1.469 (df = 14; 1584) 3.591*** (df = 34; 1564)
## =================================================================================================
## Note:                                                               *p<0.05; **p<0.01; ***p<0.001
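
The empty rows for Video 23, Race: Other, Education: Postgraduate, and Political Party: Other in the table above are what lm() reports when a regressor is perfectly collinear with the others, for example when every level of a categorical variable enters as its own dummy alongside the intercept. That reading is an assumption about how video_columns and covariates_simple were built, which this section does not show. A minimal sketch on simulated data (all names below are hypothetical):

```r
set.seed(1)
n <- 300
party <- sample(c("Dem", "Rep", "Other"), n, replace = TRUE)

# hypothetical data: one dummy per party level, so the three dummies sum to 1
toy <- data.frame(
  y        = rnorm(n),
  polDem   = as.integer(party == "Dem"),
  polRep   = as.integer(party == "Rep"),
  polOther = as.integer(party == "Other")
)

# all three dummies plus the intercept: the last one is aliased and gets NA
fit_all <- lm(y ~ polDem + polRep + polOther, data = toy)
coef(fit_all)

# dropping one dummy by hand makes it the explicit baseline category
fit_base <- lm(y ~ polDem + polRep, data = toy)
coef(fit_base)
```

The two fits give the same estimates for the remaining terms; the only difference is whether the baseline is chosen explicitly or left to lm().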

Panel Regression

Outcome: {Review or Not, Time Spent, Input Length, Informativeness}

Outcome ~ treatment + video + user (+ demographics) [adding time fixed effect to detect any time-dependent effect]

Note: we are focusing on the outcome Review or Not first.
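
Because treatment is assigned per user and each user contributes several video-level rows, the rows from one user are not independent. The base-R sketch below illustrates the idea on simulated data: a linear probability model with video dummies, with standard errors clustered by user computed from the sandwich formula. It is a simplified stand-in for the feols() calls in this section (no small-sample correction is applied, so the numbers would not match fixest exactly), and every name in it is hypothetical.

```r
set.seed(7)
n_users  <- 200
per_user <- 3

# hypothetical panel: treatment is constant within user, order varies within user
sim <- data.frame(
  user  = rep(seq_len(n_users), each = per_user),
  treat = rep(rbinom(n_users, 1, 0.5), each = per_user),
  order = rep(seq_len(per_user), times = n_users),
  video = factor(sample(1:12, n_users * per_user, replace = TRUE))
)
user_effect    <- rep(rnorm(n_users, sd = 0.8), each = per_user)
sim$hasComment <- rbinom(nrow(sim), 1,
                         plogis(0.3 * sim$treat - 0.2 * sim$order + user_effect))

# linear probability model with video dummies
fit <- lm(hasComment ~ treat + order + video, data = sim)

X <- model.matrix(fit)
e <- residuals(fit)

# cluster-robust (CR0) sandwich: sum the score contributions within each user
bread   <- solve(crossprod(X))
scores  <- rowsum(X * e, group = sim$user)
vcov_cl <- bread %*% crossprod(scores) %*% bread

se_iid  <- sqrt(diag(vcov(fit)))
se_user <- sqrt(diag(vcov_cl))
round(cbind(estimate = coef(fit), se_iid, se_user)[c("treat", "order"), ], 4)
```

Clustering by user rather than by video reflects the level at which treatment varies; with only 12 videos, video-level clusters would also be too few for the usual asymptotics.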

All Videos

panel_lm <- feols(hasComment ~ Treatment + order | Video.Id, data = df_panel %>% filter(prolific_id %in% respondents_to_keep), cluster = ~User.Id)

# panel_lm_with_cov <- feols(hasComment ~ Treatment + order | Video.Id + social_media_nonUser + social_media_user + social_media_YT + social_media_use + website_use + gender + age + raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + edu + polparty + libcons + income + social_media_reply + review_freq, data = df_panel)
panel_lm_with_cov <- feols(hasComment ~ Treatment + order | Video.Id + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, data = df_panel %>% filter(prolific_id %in% respondents_to_keep), cluster = ~User.Id)

model_list <- list(panel_lm, panel_lm_with_cov)
texreg::screenreg(model_list, 
               stars = c(0.05, 0.01, 0.001), 
               caption = "Panel Regression Results", 
               label = "tab:panel_regression", 
               digits = 4,
               custom.note = "Standard errors are clustered at the user level.",
               custom.model.names = c("Without Covariates", "With Covariates"), 
               custom.coef.names = c("Hint Control", "One-Click Generate", "Chat Generate", "Order"))

## 
## ============================================================================
##                                          Without Covariates  With Covariates
## ----------------------------------------------------------------------------
## Hint Control                                0.0753 **           0.0699 *    
##                                            (0.0288)            (0.0274)     
## One-Click Generate                          0.0606 *            0.0643 *    
##                                            (0.0296)            (0.0277)     
## Chat Generate                               0.1029 **           0.0934 *    
##                                            (0.0396)            (0.0408)     
## Order                                      -0.0379 ***         -0.0369 ***  
##                                            (0.0100)            (0.0066)     
## ----------------------------------------------------------------------------
## Num. obs.                                4997                4997           
## Num. groups: Video.Id                      12                  12           
## R^2 (full model)                            0.0155              0.1171      
## R^2 (proj model)                            0.0127              0.0124      
## Adj. R^2 (full model)                       0.0125              0.0963      
## Adj. R^2 (proj model)                       0.0119              0.0116      
## Num. groups: age                                               60           
## Num. groups: social_media_YT                                    2           
## Num. groups: social_media_nonUser                               2           
## Num. groups: social_media_user                                  2           
## Num. groups: social_media_use_numeric                           4           
## Num. groups: website_use_numeric                                4           
## Num. groups: genderFemale                                       2           
## Num. groups: raceAsian                                          2           
## Num. groups: raceBlack                                          2           
## Num. groups: raceHispanic                                       2           
## Num. groups: raceWhite                                          2           
## Num. groups: raceOther                                          2           
## Num. groups: eduHighSchoolOrLess                                2           
## Num. groups: eduSomeCollege                                     2           
## Num. groups: eduBachelor                                        2           
## Num. groups: eduPostGrad                                        2           
## Num. groups: polpartyDem                                        2           
## Num. groups: polpartyRep                                        2           
## Num. groups: polpartyOther                                      2           
## Num. groups: libcons_numeric                                    5           
## Num. groups: income_numeric                                     6           
## Num. groups: social_media_reply_numeric                         6           
## Num. groups: review_freq_numeric                                6           
## ============================================================================
## Standard errors are clustered at the user level.
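
In fixest formulas, every term to the right of `|` is absorbed as a fixed effect. This is why the table above reports, for example, 60 groups for age: in the With Covariates column each covariate enters as a set of category indicators rather than as a single slope. The base-R sketch below, on simulated data with hypothetical names, contrasts the two ways of entering a numeric covariate. It only illustrates the distinction and does not suggest which specification the analysis should use.

```r
set.seed(3)
n <- 500

# hypothetical data: binary treatment and an integer-valued age
toy   <- data.frame(treat = rbinom(n, 1, 0.5),
                    age   = sample(18:30, n, replace = TRUE))
toy$y <- 0.2 * toy$treat + 0.01 * toy$age + rnorm(n)

# age as a regressor: a single slope coefficient
fit_slope <- lm(y ~ treat + age, data = toy)

# age as a fixed effect: one indicator per distinct age (analogous to `| age`)
fit_fe <- lm(y ~ treat + factor(age), data = toy)

length(coef(fit_slope))  # intercept, treat, age
length(coef(fit_fe))     # intercept, treat, one term per non-baseline age
```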

First Three Videos

panel_lm <- feols(hasComment ~ Treatment + order | Video.Id, data = df_panel %>% filter(firstThree == 1) %>% filter(prolific_id %in% respondents_to_keep), cluster = ~User.Id)

panel_lm_with_cov <- feols(hasComment ~ Treatment  + order | Video.Id + age + social_media_YT + social_media_nonUser + social_media_user + social_media_use_numeric + website_use_numeric + genderFemale +  raceAsian + raceBlack + raceHispanic + raceWhite + raceOther + eduHighSchoolOrLess + eduSomeCollege + eduBachelor + eduPostGrad + polpartyDem + polpartyRep + polpartyOther + libcons_numeric + income_numeric + social_media_reply_numeric + review_freq_numeric, data = df_panel %>% filter(firstThree == 1) %>% filter(prolific_id %in% respondents_to_keep), cluster = ~User.Id)

model_list <- list(panel_lm, panel_lm_with_cov)
texreg::screenreg(model_list, 
               stars = c(0.05, 0.01, 0.001), 
               caption = "Panel Regression Results (First Three Videos)", 
               label = "tab:panel_regression_first_three", 
               digits = 4,
               custom.note = "Standard errors are clustered at the user level.",
               custom.model.names = c("Without Covariates", "With Covariates"), 
               custom.coef.names = c("Hint Control", "One-Click Generate", "Chat Generate", "Order"))

## 
## ============================================================================
##                                          Without Covariates  With Covariates
## ----------------------------------------------------------------------------
## Hint Control                                0.0692 *            0.0668 *    
##                                            (0.0292)            (0.0278)     
## One-Click Generate                          0.0453              0.0528      
##                                            (0.0297)            (0.0284)     
## Chat Generate                               0.1042 **           0.0962 *    
##                                            (0.0395)            (0.0410)     
## Order                                      -0.0259 ***         -0.0277 ***  
##                                            (0.0049)            (0.0049)     
## ----------------------------------------------------------------------------
## Num. obs.                                4797                4797           
## Num. groups: Video.Id                      12                  12           
## R^2 (full model)                            0.0093              0.1097      
## R^2 (proj model)                            0.0066              0.0067      
## Adj. R^2 (full model)                       0.0062              0.0878      
## Adj. R^2 (proj model)                       0.0057              0.0059      
## Num. groups: age                                               60           
## Num. groups: social_media_YT                                    2           
## Num. groups: social_media_nonUser                               2           
## Num. groups: social_media_user                                  2           
## Num. groups: social_media_use_numeric                           4           
## Num. groups: website_use_numeric                                4           
## Num. groups: genderFemale                                       2           
## Num. groups: raceAsian                                          2           
## Num. groups: raceBlack                                          2           
## Num. groups: raceHispanic                                       2           
## Num. groups: raceWhite                                          2           
## Num. groups: raceOther                                          2           
## Num. groups: eduHighSchoolOrLess                                2           
## Num. groups: eduSomeCollege                                     2           
## Num. groups: eduBachelor                                        2           
## Num. groups: eduPostGrad                                        2           
## Num. groups: polpartyDem                                        2           
## Num. groups: polpartyRep                                        2           
## Num. groups: polpartyOther                                      2           
## Num. groups: libcons_numeric                                    5           
## Num. groups: income_numeric                                     6           
## Num. groups: social_media_reply_numeric                         6           
## Num. groups: review_freq_numeric                                6           
## ============================================================================
## Standard errors are clustered at the user level.

Mediation Analysis (User Level)

First, conduct an overall correlation analysis of the mediators.

mediator_columns <- c("mech_popup", "mech_speed", "mech_wording", "mech_formulate","mech_difficulty",  "mech_AIaversion", "mech_trueop")
mech_fancy_names <- c("Pop-up (+)", "Speed (+)", "Help Wording (+)", "Help Formulate (+)", "Difficult to Use (-)", "AI Aversion (-)",  "True Opinion (-)")
mech_mapping <- c("Speed (+)" = "mech_speed", "Help Wording (+)" = "mech_wording", "Difficult to Use (-)" = "mech_difficulty", "Help Formulate (+)" = "mech_formulate", "AI Aversion (-)" = "mech_AIaversion", "Pop-up (+)" = "mech_popup", "True Opinion (-)" = "mech_trueop")

covariates_simple_without_baseline <- covariates_simple[!covariates_simple %in% c("raceOther", "eduHighSchoolOrLess", "polpartyOther")]
covariates_simple_without_baseline_fancy <- covariates_simple_fancy[!covariates_simple_fancy %in% c("Race: Other", "Political Party: Other", "Education: High School or Less")]

# visual correlation plot
mediator_corr <- cor(df_wide_all[df_wide_all$prolific_id %in% respondents_to_keep, c(mediator_columns, "willingness_to_pay", "review_exp")], use = "pairwise.complete.obs")
# change with fancy names
rownames(mediator_corr) <- c(mech_fancy_names, "Willingness to Pay", "Review Experience")
colnames(mediator_corr) <- c(mech_fancy_names, "Willingness to Pay", "Review Experience")

# visualize
corrplot(mediator_corr,method = 'number')

Step 1: Mediator Treatment Effect

Mediator: {faster, not reflect true opinion, right word, difficulty of usage, thought formulation, AI aversion, pop-up feature}

Mediator ~ treatment + video (+ demographics)
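
The Step 1 specification, Mediator ~ treatment + video, can be sketched on simulated data as follows. The sketch fits one regression per mediator and stacks the treatment coefficients into a tidy data frame, looking up each display label by mediator name so the result does not depend on the order of any vector. The data, the two mediators, and the object names are all hypothetical.

```r
set.seed(42)
arms <- c("Pure Control", "Hint Control", "One-Click Generate", "Chat Generate")
n <- 400

# hypothetical user-level data with two mediators
sim <- data.frame(
  Treatment = factor(sample(arms, n, replace = TRUE), levels = arms),
  video     = factor(sample(paste0("v", 1:4), n, replace = TRUE))
)
sim$mech_speed   <- 3 + 0.5 * (sim$Treatment == "Chat Generate") + rnorm(n)
sim$mech_wording <- 3 + 0.3 * (sim$Treatment == "One-Click Generate") + rnorm(n)

# display labels keyed by column name
labels <- c(mech_speed = "Speed (+)", mech_wording = "Help Wording (+)")

coef_list <- lapply(names(labels), function(m) {
  fit  <- lm(reformulate(c("Treatment", "video"), response = m), data = sim)
  est  <- summary(fit)$coefficients
  rows <- grep("^Treatment", rownames(est))   # the three treatment contrasts
  data.frame(
    Mediator  = labels[[m]],
    Treatment = sub("^Treatment", "", rownames(est)[rows]),
    Estimate  = est[rows, "Estimate"],
    Std.Error = est[rows, "Std. Error"],
    row.names = NULL
  )
})
coef_df_sim <- do.call(rbind, coef_list)
coef_df_sim
```

Building each piece with data.frame() keeps the estimates numeric, so no conversion back from character is needed before plotting.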

mediator_coef_df <- data.frame()
for (m in c("mech_speed", "mech_wording", "mech_difficulty", "mech_formulate", "mech_AIaversion", "mech_popup", "mech_trueop")){
  cov_form_lm <- paste0(m, " ~ Treatment + ", paste(video_columns, collapse = " + "))
  mediator_lm <- lm(cov_form_lm, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))
  mediator_lm_coef <- summary(mediator_lm)$coefficients[2:4, c("Estimate", "Std. Error")]
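  # look up the display name through mech_mapping; mech_fancy_names is ordered differently, so indexing it by position mislabels mediators
  mediator_coef_df <- rbind(mediator_coef_df, cbind(mediator_lm_coef, names(mech_mapping)[mech_mapping == m]))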
}
colnames(mediator_coef_df) <- c("Estimate", "Std. Error", "Mediator")
mediator_coef_df$Mediator <- factor(mediator_coef_df$Mediator, levels = c("Pop-up (+)", "Speed (+)", "Help Wording (+)", "Help Formulate (+)", "Difficult to Use (-)","AI Aversion (-)", "True Opinion (-)"))
mediator_coef_df$Estimate <- as.numeric(as.character(mediator_coef_df$Estimate))
mediator_coef_df$`Std. Error` <- as.numeric(as.character(mediator_coef_df$`Std. Error`))
mediator_coef_df$Treatment <- rownames(mediator_coef_df)
mediator_coef_df$Treatment <- ifelse(str_detect(mediator_coef_df$Treatment, "Hint"), "Hint Control", 
                                     ifelse(str_detect(mediator_coef_df$Treatment, "One"), "One-Click Generate", "Chat Generate"))
mediator_coef_df$Treatment <- factor(mediator_coef_df$Treatment, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))

The plot below shows the treatment coefficients from estimating {Mediator ~ treatment + video}. We use all videos rather than restricting to the first three.

#plot bar plot
ggplot(mediator_coef_df, aes(x = Mediator, y = Estimate, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = Estimate - 1.96*`Std. Error`, ymax = Estimate + 1.96*`Std. Error`), width = 0.2, position = position_dodge(0.9)) +
  labs(title = "Mediator Analysis", x = "Mediator", y = "Estimate") +
  scale_fill_manual(name = "vs Pure Control",
                    values=c("Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2")) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        # bold titles
        title = element_text(face = "bold"))

mediator_coef_with_cov_df <- data.frame()
for (m in c("mech_speed", "mech_wording", "mech_difficulty", "mech_formulate", "mech_AIaversion", "mech_popup", "mech_trueop")){
  cov_form_lm_with_cov <- paste0(m, " ~ Treatment + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple_without_baseline, collapse = " + "))
  mediator_lm_with_cov <- lm(cov_form_lm_with_cov, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))
  # store the fitted model under a mediator-specific name instead of fitting it a second time
  assign(paste0("mediator_lm_with_cov_", m), mediator_lm_with_cov)
  mediator_lm_coef <- summary(mediator_lm_with_cov)$coefficients[2:4, c("Estimate", "Std. Error")]
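  # look up the display name through mech_mapping; mech_fancy_names is ordered differently, so indexing it by position mislabels mediators
  mediator_coef_with_cov_df <- rbind(mediator_coef_with_cov_df, cbind(mediator_lm_coef, names(mech_mapping)[mech_mapping == m]))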
}
colnames(mediator_coef_with_cov_df) <- c("Estimate", "Std. Error", "Mediator")
mediator_coef_with_cov_df$Mediator <- factor(mediator_coef_with_cov_df$Mediator, levels = c("Pop-up (+)", "Speed (+)", "Help Wording (+)", "Help Formulate (+)", "Difficult to Use (-)","AI Aversion (-)", "True Opinion (-)"))
mediator_coef_with_cov_df$Estimate <- as.numeric(as.character(mediator_coef_with_cov_df$Estimate))
mediator_coef_with_cov_df$`Std. Error` <- as.numeric(as.character(mediator_coef_with_cov_df$`Std. Error`))
mediator_coef_with_cov_df$Treatment <- rownames(mediator_coef_with_cov_df)
mediator_coef_with_cov_df$Treatment <- ifelse(str_detect(mediator_coef_with_cov_df$Treatment, "Hint"), "Hint Control", 
                                     ifelse(str_detect(mediator_coef_with_cov_df$Treatment, "One"), "One-Click Generate", "Chat Generate"))
mediator_coef_with_cov_df$Treatment <- factor(mediator_coef_with_cov_df$Treatment, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))

#plot bar plot
ggplot(mediator_coef_with_cov_df, aes(x = Mediator, y = Estimate, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = Estimate - 1.96*`Std. Error`, ymax = Estimate + 1.96*`Std. Error`), width = 0.2, position = position_dodge(0.9)) +
  labs(title = "Mediator Analysis (With Covariates)", x = "Mediator", y = "Estimate") +
  scale_fill_manual(name = "vs Pure Control",
                    values=c("Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2")) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        # bold titles
        title = element_text(face = "bold"))

Step 2: Mediation Effect

Number of Reviews ~ treatment + mediator + video (+ demographics)

outcome_coef_with_cov_df <- data.frame()
for (m in mediator_columns){
  cov_form_lm_with_cov <- paste0("num_comment ~ Treatment + ", m, " + ", paste(video_columns, collapse = " + "), "+ ", paste(covariates_simple_without_baseline, collapse = " + "))
  assign(paste0("outcome_lm_with_cov_", m), lm(cov_form_lm_with_cov, data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep)))
}

# include png
include_graphics("tables_and_figures/mediation_table_redone.png")

Individual Mediator Effect

ACME (Average Causal Mediation Effect): The indirect effect (IE) of the treatment through the mediator.
ADE (Average Direct Effect): The direct effect (DE) of the treatment on the outcome.
Proportion Mediated: The proportion of the total effect explained by the mediator.
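
For linear models with no treatment-mediator interaction, these three quantities reduce to simple functions of regression coefficients: ACME is the treatment effect on the mediator times the mediator effect on the outcome, ADE is the treatment coefficient in the outcome model, and the two add up to the total effect. The sketch below shows this product-of-coefficients shortcut on simulated data with a binary treatment. It is not the simulation-based procedure that mediate() runs in this section, and all variable names are hypothetical.

```r
set.seed(11)
n <- 1000

# hypothetical data: binary treatment, one covariate, one mediator
toy          <- data.frame(treat = rbinom(n, 1, 0.5), x = rnorm(n))
toy$mediator <- 0.5 * toy$treat + 0.3 * toy$x + rnorm(n)
toy$outcome  <- 0.2 * toy$treat + 0.4 * toy$mediator + 0.1 * toy$x + rnorm(n)

med_fit <- lm(mediator ~ treat + x, data = toy)             # Step 1
out_fit <- lm(outcome ~ treat + mediator + x, data = toy)   # Step 2
tot_fit <- lm(outcome ~ treat + x, data = toy)              # total effect

acme  <- unname(coef(med_fit)["treat"] * coef(out_fit)["mediator"])
ade   <- unname(coef(out_fit)["treat"])
total <- unname(coef(tot_fit)["treat"])

c(ACME = acme, ADE = ade, Total = total, Prop.Mediated = acme / total)
```

With OLS and the same covariates in all three models, Total equals ACME + ADE exactly. The proportion mediated is a ratio, so it becomes unstable when the total effect is close to zero.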

Note that the y-axis range differs across the mediator plots.

conduct_mediation_analysis <- function(mediator_lm_with_cov, outcome_lm_with_cov, mediator, sims = 500){
  # one mediate() call per treatment arm, each compared against Pure Control; pass the `sims` argument through instead of hard-coding 500
  mediation_hintcontrol <- mediate(mediator_lm_with_cov, outcome_lm_with_cov, treat = "Treatment", control.value = "Pure Control", treat.value = "Hint Control", mediator = mediator,  sims = sims, robustSE = TRUE)
  
  mediation_oneclick <- mediate(mediator_lm_with_cov, outcome_lm_with_cov, treat = "Treatment", control.value = "Pure Control", treat.value = "One-Click Generate", mediator = mediator, sims = sims, robustSE = TRUE)
  
  mediation_chat <- mediate(mediator_lm_with_cov, outcome_lm_with_cov, treat = "Treatment", control.value = "Pure Control", treat.value = "Chat Generate", mediator = mediator, sims = sims, robustSE = TRUE)
  
  hintcontrol_mediate_df <- rbind(c(mediation_hintcontrol$d.avg, mediation_hintcontrol$d.avg.ci), 
        c(mediation_hintcontrol$z.avg, mediation_hintcontrol$z.avg.ci), 
        c(mediation_hintcontrol$n.avg, mediation_hintcontrol$n.avg.ci)) %>% as.data.frame()
  hintcontrol_mediate_df$type <- c("ACME", "ADE", "Prop. Mediated")
  hintcontrol_mediate_df$treatment <- "Hint Control"
  
  oneclick_mediate_df <- rbind(c(mediation_oneclick$d.avg, mediation_oneclick$d.avg.ci), 
        c(mediation_oneclick$z.avg, mediation_oneclick$z.avg.ci), 
        c(mediation_oneclick$n.avg, mediation_oneclick$n.avg.ci)) %>% as.data.frame()
  
  oneclick_mediate_df$type <- c("ACME", "ADE", "Prop. Mediated")
  oneclick_mediate_df$treatment <- "One-Click Generate"
  
  chat_mediate_df <- rbind(c(mediation_chat$d.avg, mediation_chat$d.avg.ci),
        c(mediation_chat$z.avg, mediation_chat$z.avg.ci), 
        c(mediation_chat$n.avg, mediation_chat$n.avg.ci)) %>% as.data.frame()
  chat_mediate_df$type <- c("ACME", "ADE", "Prop. Mediated")
  chat_mediate_df$treatment <- "Chat Generate"
  
  mediate_df_popup <- rbind(hintcontrol_mediate_df, oneclick_mediate_df, chat_mediate_df)
  colnames(mediate_df_popup) <- c("Estimate", "2.5% CI", "97.5% CI", "Estimate Type", "Treatment")
  # treatment labels match the factor levels used in the models ("One-Click Generate")
  mediate_df_popup$Treatment <- factor(mediate_df_popup$Treatment, levels = c("Hint Control", "One-Click Generate", "Chat Generate"))
  
  # plot
  output_plot <- ggplot(mediate_df_popup, aes(x = `Treatment`, y = Estimate, fill = `Estimate Type`)) +
    geom_bar(stat = "identity", position = "dodge") +
    geom_errorbar(aes(ymin = `2.5% CI`, ymax = `97.5% CI`), width = 0.2, position = position_dodge(0.9)) +
    labs(title = paste0("Mediation Analysis: ", mech_mapping[which(mech_mapping == mediator)] %>% names()), x = "Treatment", y = "Estimate") +
    scale_fill_manual(name = "Estimate Type",
                      values=c("ACME" = "turquoise4","ADE"="pink4","Prop. Mediated" = "orange4")) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1),
          # bold the title text (this does not center it)
          title = element_text(face = "bold"))
  return(list(mediate_df_popup, output_plot))
}

Pop-up

mediate_analysis_popup <- conduct_mediation_analysis(mediator_lm_with_cov_mech_popup, outcome_lm_with_cov_mech_popup, "mech_popup")
mediate_analysis_popup[[2]]

Speed

mediate_analysis_speed <- conduct_mediation_analysis(mediator_lm_with_cov_mech_speed, outcome_lm_with_cov_mech_speed, "mech_speed")
mediate_analysis_speed[[2]]

Help Wording

mediate_analysis_wording <- conduct_mediation_analysis(mediator_lm_with_cov_mech_wording, outcome_lm_with_cov_mech_wording, "mech_wording")
mediate_analysis_wording[[2]]

Help Formulate

mediate_analysis_formulate <- conduct_mediation_analysis(mediator_lm_with_cov_mech_formulate, outcome_lm_with_cov_mech_formulate, "mech_formulate")
mediate_analysis_formulate[[2]]

Difficult to Use

mediate_analysis_difficulty <- conduct_mediation_analysis(mediator_lm_with_cov_mech_difficulty, outcome_lm_with_cov_mech_difficulty, "mech_difficulty")
mediate_analysis_difficulty[[2]]

AI Aversion

mediate_analysis_AIaversion <- conduct_mediation_analysis(mediator_lm_with_cov_mech_AIaversion, outcome_lm_with_cov_mech_AIaversion, "mech_AIaversion")
mediate_analysis_AIaversion[[2]]

Not Reflect True Opinion

mediate_analysis_trueop <- conduct_mediation_analysis(mediator_lm_with_cov_mech_trueop, outcome_lm_with_cov_mech_trueop, "mech_trueop")
mediate_analysis_trueop[[2]]

Subgroup Analysis

The outcome variable here is the number of videos that have comments.

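Subgroups are defined either by a median split (social media usage, website usage, social media reply frequency, review frequency, age, income, liberal-conservative ideology, pop-up) or by category (education, race, political party). Social media platform type is not currently included.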
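
A median split of this kind can be sketched as follows. The example values and the tie rule (values equal to the median are placed in "Below Median") are assumptions for illustration; the processed data may have been split differently. "Above Median" is listed as the first factor level because the coefficient tables report "Below Median" terms, which implies "Above Median" is the reference level.

```r
# Illustrative values; the tie rule below is an assumption, not the study's rule.
social_media_use_numeric <- c(1, 2, 2, 3, 4, 5, 5, 6)
med <- median(social_media_use_numeric, na.rm = TRUE)

social_media_use_numeric_median <- factor(
  ifelse(social_media_use_numeric > med, "Above Median", "Below Median"),
  levels = c("Above Median", "Below Median")  # first level is the reference in lm()
)

table(social_media_use_numeric_median)
```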

subgroups_columns <- c("social_media_use_numeric_median", "website_use_numeric_median", "social_media_reply_numeric_median", "review_freq_numeric_median", "age_median", "income_numeric_median", "libcons_numeric_median", "mech_popup_median", "edu_combined", "race_combined", "polparty_combined")

for (subgroup in subgroups_columns){

  subgroup_lm <- lm(paste0("num_comment ~ Treatment * ", subgroup, " + ", paste(video_columns, collapse = " + ")), data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))
  assign(paste0("subgroup_lm_", subgroup), subgroup_lm)
}

for (subgroup in subgroups_columns){
  subgroup_lm <- get(paste0("subgroup_lm_", subgroup))
  coef_table <- summary(subgroup_lm)$coefficients
  cat("#### ", subgroup, "\n")
  print(paste0("Values for this variable: ", paste(unique(df_wide_all[[subgroup]]), collapse = ", ")))
  rows_to_extract <- rownames(coef_table)[!str_detect(rownames(coef_table), "video") & (rownames(coef_table) != "(Intercept)")]
  subgroup_coef <- coef_table[rows_to_extract, ]
  print(kable(subgroup_coef, format = "markdown"))
  cat("\n")
}

social_media_use_numeric_median

[1] “Values for this variable: Below Median, Above Median”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1534181	0.1485937	1.0324665	0.3020117
TreatmentOne-Click Generate	0.2014310	0.1484915	1.3565151	0.1751293
TreatmentChat Generate	0.1474574	0.1958740	0.7528176	0.4516718
social_media_use_numeric_medianBelow Median	-0.1879162	0.1351155	-1.3907823	0.1644874
TreatmentHint Control:social_media_use_numeric_medianBelow Median	0.1168280	0.1908090	0.6122775	0.5404423
TreatmentOne-Click Generate:social_media_use_numeric_medianBelow Median	-0.0242671	0.1902861	-0.1275296	0.8985375
TreatmentChat Generate:social_media_use_numeric_medianBelow Median	0.2677902	0.2496919	1.0724824	0.2836673

website_use_numeric_median

[1] “Values for this variable: Above Median, Below Median”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.3296926	0.1522844	2.1649792	0.0305387
TreatmentOne-Click Generate	0.0538313	0.1505413	0.3575852	0.7207015
TreatmentChat Generate	0.4549837	0.1966880	2.3132251	0.0208382
website_use_numeric_medianBelow Median	-0.0962575	0.1358911	-0.7083429	0.4788369
TreatmentHint Control:website_use_numeric_medianBelow Median	-0.1674828	0.1925964	-0.8696052	0.3846483
TreatmentOne-Click Generate:website_use_numeric_medianBelow Median	0.2244945	0.1912721	1.1736918	0.2406955
TreatmentChat Generate:website_use_numeric_medianBelow Median	-0.2334817	0.2502534	-0.9329810	0.3509724

social_media_reply_numeric_median

[1] “Values for this variable: Below Median, Above Median”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1705128	0.1739537	0.9802195	0.3271280
TreatmentOne-Click Generate	0.3911395	0.1775675	2.2027659	0.0277553
TreatmentChat Generate	-0.4063896	0.2157953	-1.8832182	0.0598545
social_media_reply_numeric_medianBelow Median	-0.4494937	0.1427472	-3.1488798	0.0016697
TreatmentHint Control:social_media_reply_numeric_medianBelow Median	0.0886741	0.2049222	0.4327208	0.6652767
TreatmentOne-Click Generate:social_media_reply_numeric_medianBelow Median	-0.2393558	0.2072693	-1.1548057	0.2483447
TreatmentChat Generate:social_media_reply_numeric_medianBelow Median	1.0324384	0.2595483	3.9778276	0.0000727

review_freq_numeric_median

[1] “Values for this variable: Below Median, Above Median”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1332382	0.1665592	0.7999449	0.4238630
TreatmentOne-Click Generate	0.1727823	0.1680500	1.0281604	0.3040318
TreatmentChat Generate	-0.6616916	0.2159879	-3.0635591	0.0022242
review_freq_numeric_medianBelow Median	-0.6337279	0.1390343	-4.5580681	0.0000056
TreatmentHint Control:review_freq_numeric_medianBelow Median	0.1539231	0.1993939	0.7719552	0.4402565
TreatmentOne-Click Generate:review_freq_numeric_medianBelow Median	0.0562936	0.1997912	0.2817622	0.7781627
TreatmentChat Generate:review_freq_numeric_medianBelow Median	1.4037798	0.2588600	5.4229299	0.0000001

age_median

[1] “Values for this variable: Above Median, Below Median”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2922008	0.1327947	2.2003954	0.0279232
TreatmentOne-Click Generate	0.1358251	0.1316744	1.0315231	0.3024535
TreatmentChat Generate	0.0322631	0.1775943	0.1816677	0.8558668
age_medianBelow Median	-0.0669709	0.1314056	-0.5096499	0.6103680
TreatmentHint Control:age_medianBelow Median	-0.1318464	0.1871507	-0.7044933	0.4812295
TreatmentOne-Click Generate:age_medianBelow Median	0.1097160	0.1855191	0.5914002	0.5543370
TreatmentChat Generate:age_medianBelow Median	0.5209230	0.2430449	2.1433195	0.0322399

income_numeric_median

[1] “Values for this variable: Below Median, Above Median”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2166935	0.1771522	1.2232048	0.2214349
TreatmentOne-Click Generate	0.1157640	0.1682130	0.6881991	0.4914285
TreatmentChat Generate	0.2641804	0.2400378	1.1005780	0.2712481
income_numeric_medianBelow Median	0.0832645	0.1434636	0.5803877	0.5617360
TreatmentHint Control:income_numeric_medianBelow Median	0.0137850	0.2082701	0.0661880	0.9472366
TreatmentOne-Click Generate:income_numeric_medianBelow Median	0.1127083	0.2017302	0.5587084	0.5764400
TreatmentChat Generate:income_numeric_medianBelow Median	0.0591367	0.2782884	0.2125016	0.8317432

libcons_numeric_median

[1] “Values for this variable: Below Median, Above Median”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2271190	0.1407395	1.6137539	0.1067806
TreatmentOne-Click Generate	0.3369315	0.1389809	2.4243017	0.0154495
TreatmentChat Generate	0.5136217	0.1716515	2.9922350	0.0028124
libcons_numeric_medianBelow Median	0.2389086	0.1322396	1.8066340	0.0710097
TreatmentHint Control:libcons_numeric_medianBelow Median	-0.0019044	0.1887015	-0.0100919	0.9919492
TreatmentOne-Click Generate:libcons_numeric_medianBelow Median	-0.2620248	0.1875434	-1.3971423	0.1625671
TreatmentChat Generate:libcons_numeric_medianBelow Median	-0.3866854	0.2432306	-1.5897893	0.1120825

mech_popup_median

[1] “Values for this variable: Below Median, Above Median”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.3661234	0.1361869	2.6883882	0.0072553
TreatmentOne-Click Generate	0.4833410	0.1396941	3.4599962	0.0005546
TreatmentChat Generate	0.1511103	0.1669638	0.9050485	0.3655778
mech_popup_medianBelow Median	-0.4768151	0.1281523	-3.7206915	0.0002056
TreatmentHint Control:mech_popup_medianBelow Median	-0.2027350	0.1832268	-1.1064699	0.2686917
TreatmentOne-Click Generate:mech_popup_medianBelow Median	-0.4011263	0.1839799	-2.1802723	0.0293845
TreatmentChat Generate:mech_popup_medianBelow Median	0.2937770	0.2369939	1.2395972	0.2153086

edu_combined

[1] “Values for this variable: Bachelor’s Degree, High School or Less, Graduate Degree, Some College”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.1995615	0.2459774	0.8113003	0.4173162
TreatmentOne-Click Generate	0.2658819	0.2642294	1.0062541	0.3144483
TreatmentChat Generate	0.3745777	0.3231449	1.1591631	0.2465659
edu_combinedSome College	0.1814463	0.2328903	0.7791061	0.4360345
edu_combinedBachelor’s Degree	0.0085161	0.2104882	0.0404589	0.9677324
edu_combinedGraduate Degree	-0.0490747	0.2358328	-0.2080912	0.8351847
TreatmentHint Control:edu_combinedSome College	-0.2599289	0.3170159	-0.8199239	0.4123839
TreatmentOne-Click Generate:edu_combinedSome College	-0.5207247	0.3306587	-1.5748101	0.1155016
TreatmentChat Generate:edu_combinedSome College	0.2169158	0.4192920	0.5173384	0.6049928
TreatmentHint Control:edu_combinedBachelor’s Degree	0.0619922	0.2832797	0.2188375	0.8268050
TreatmentOne-Click Generate:edu_combinedBachelor’s Degree	0.0462258	0.2972384	0.1555176	0.8764333
TreatmentChat Generate:edu_combinedBachelor’s Degree	0.1391223	0.3679153	0.3781366	0.7053802
TreatmentHint Control:edu_combinedGraduate Degree	0.3037820	0.3227341	0.9412764	0.3467079
TreatmentOne-Click Generate:edu_combinedGraduate Degree	0.0705953	0.3345375	0.2110235	0.8328963
TreatmentChat Generate:edu_combinedGraduate Degree	-0.9436688	0.4271257	-2.2093470	0.0272943

race_combined

[1] “Values for this variable: White, Hispanic, Black, Other, Asian”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2398376	0.1201502	1.9961479	0.0460906
TreatmentOne-Click Generate	0.1589094	0.1210345	1.3129264	0.1894000
TreatmentChat Generate	0.4093815	0.1616323	2.5327950	0.0114126
race_combinedBlack	0.2429908	0.1755617	1.3840760	0.1665323
race_combinedHispanic	-0.2561254	0.3016342	-0.8491257	0.3959410
race_combinedAsian	-0.1508905	0.2542883	-0.5933835	0.5530101
race_combinedOther	0.5564640	0.2368141	2.3497925	0.0189074
TreatmentHint Control:race_combinedBlack	0.0583995	0.2496219	0.2339517	0.8150531
TreatmentOne-Click Generate:race_combinedBlack	-0.0907815	0.2451896	-0.3702503	0.7112460
TreatmentChat Generate:race_combinedBlack	-0.7452324	0.3092277	-2.4099793	0.0160680
TreatmentHint Control:race_combinedHispanic	0.0089891	0.4352624	0.0206521	0.9835258
TreatmentOne-Click Generate:race_combinedHispanic	0.4655536	0.4235088	1.0992774	0.2718159
TreatmentChat Generate:race_combinedHispanic	0.8724832	0.5139603	1.6975694	0.0897876
TreatmentHint Control:race_combinedAsian	0.3470756	0.3565605	0.9733989	0.3305053
TreatmentOne-Click Generate:race_combinedAsian	0.8745007	0.3739562	2.3385110	0.0194859
TreatmentChat Generate:race_combinedAsian	0.8397920	0.4890811	1.7170812	0.0861619
TreatmentHint Control:race_combinedOther	-0.5783844	0.3316859	-1.7437716	0.0813950
TreatmentOne-Click Generate:race_combinedOther	-0.4553110	0.3171189	-1.4357739	0.1512663
TreatmentChat Generate:race_combinedOther	-0.6059246	0.4203505	-1.4414748	0.1496504

polparty_combined

[1] “Values for this variable: Other, Republican, Democrat”

Term	Estimate	Std. Error	t value	Pr(>|t|)
TreatmentHint Control	0.2834796	0.1503794	1.8850961	0.0596008
TreatmentOne-Click Generate	0.3206150	0.1517101	2.1133397	0.0347286
TreatmentChat Generate	0.6000651	0.1876178	3.1983376	0.0014097
polparty_combinedRepublican	0.1874162	0.1614837	1.1605888	0.2459850
polparty_combinedOther	0.2749459	0.1564452	1.7574579	0.0790340
TreatmentHint Control:polparty_combinedRepublican	-0.0303254	0.2290397	-0.1324024	0.8946830
TreatmentOne-Click Generate:polparty_combinedRepublican	-0.1450116	0.2299881	-0.6305182	0.5284471
TreatmentChat Generate:polparty_combinedRepublican	-0.6339879	0.3019314	-2.0997745	0.0359074
TreatmentHint Control:polparty_combinedOther	-0.1248957	0.2230657	-0.5599057	0.5756234
TreatmentOne-Click Generate:polparty_combinedOther	-0.2552607	0.2205544	-1.1573592	0.2473011
TreatmentChat Generate:polparty_combinedOther	-0.3662553	0.2862267	-1.2795986	0.2008748
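
When reading the tables above, note that each Treatment row is the effect in the reference subgroup only, and each interaction row is the difference in that effect for the named subgroup. For example, in the review_freq_numeric_median table, the Chat Generate effect is -0.6616916 for Above Median respondents, and the implied effect for Below Median respondents is -0.6616916 + 1.4037798 = 0.7420882. The sketch below reproduces this arithmetic and then checks the same logic on simulated data. The simulated variable names and effect sizes are illustrative assumptions, and the simulated model omits the video covariates used in the actual models.

```r
# Worked arithmetic with coefficients copied from the review_freq_numeric_median table.
chat_main  <- -0.6616916   # Chat Generate vs. Pure Control, Above Median (reference)
chat_inter <-  1.4037798   # Chat Generate x Below Median interaction
chat_effect_below <- chat_main + chat_inter   # implied effect for Below Median

# Simulated check (names and effect sizes are illustrative assumptions).
set.seed(2)
d <- data.frame(
  treat = factor(sample(c("Control", "Chat"), 400, replace = TRUE),
                 levels = c("Control", "Chat")),
  group = factor(sample(c("Above Median", "Below Median"), 400, replace = TRUE))
)
d$y <- rnorm(400) + ifelse(d$treat == "Chat" & d$group == "Below Median", 1, 0)

fit <- lm(y ~ treat * group, data = d)
b <- coef(fit)
V <- vcov(fit)
main_term  <- "treatChat"
inter_term <- "treatChat:groupBelow Median"

# Subgroup effect = main effect + interaction; its SE needs the covariance term.
effect_below <- unname(b[main_term] + b[inter_term])
se_below <- sqrt(V[main_term, main_term] + V[inter_term, inter_term] +
                 2 * V[main_term, inter_term])

# In this saturated model the sum equals the raw difference in subgroup means.
below <- d[d$group == "Below Median", ]
diff_means <- mean(below$y[below$treat == "Chat"]) -
  mean(below$y[below$treat == "Control"])

c(effect_below = effect_below, se_below = se_below, diff_means = diff_means)
```

The p-values in the interaction rows test whether the subgroups differ, not whether the effect within the non-reference subgroup is itself different from zero. That second test needs the combined standard error computed as in `se_below`.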

We look more closely at social media reply frequency and review frequency, splitting each into three groups by response value (1-2, 3-4, and 5-6).
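
A three-group split of this kind can be sketched with `cut()`. The example responses, the assumption of a 1 to 6 response scale, and the construction itself are illustrative; the `valuesplit` columns in the processed data may have been built differently. The level order places "5-6" first because the model output reports coefficients for "3-4" and "1-2", which implies "5-6" is the reference level.

```r
# Hypothetical 1-6 responses; the scale and variable name are assumptions.
social_media_reply_numeric <- c(1, 2, 3, 4, 5, 6, 2, 5, NA)

valuesplit <- cut(
  social_media_reply_numeric,
  breaks = c(0, 2, 4, 6),          # intervals (0,2], (2,4], (4,6]
  labels = c("1-2", "3-4", "5-6")
)

# Put "5-6" first so it becomes the reference level in lm().
valuesplit <- factor(valuesplit, levels = c("5-6", "3-4", "1-2"))

table(valuesplit, useNA = "ifany")
```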

subgroup_lm_valuesplit_smr <- lm(paste0("num_comment ~ Treatment * social_media_reply_numeric_valuesplit + ", paste(video_columns, collapse = " + ")), data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))


summary(subgroup_lm_valuesplit_smr)

## 
## Call:
## lm(formula = paste0("num_comment ~ Treatment * social_media_reply_numeric_valuesplit + ", 
##     paste(video_columns, collapse = " + ")), data = df_wide_all %>% 
##     filter(prolific_id %in% respondents_to_keep))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.1496 -1.3775  0.3885  1.2244  7.2863 
## 
## Coefficients:
##                                                                      Estimate Std. Error t value    Pr(>|t|)
## (Intercept)                                                           1.03387    0.19790   5.224 0.000000198 ***
## TreatmentHint Control                                                 0.17038    0.17256   0.987    0.323633
## TreatmentOne-Click Generate                                           0.39119    0.17615   2.221    0.026509 *
## TreatmentChat Generate                                               -0.40662    0.21407  -1.899    0.057686 .
## social_media_reply_numeric_valuesplit3-4                             -0.24857    0.15181  -1.637    0.101763
## social_media_reply_numeric_valuesplit1-2                             -0.83468    0.17680  -4.721 0.000002554 ***
## video11                                                               0.19033    0.08659   2.198    0.028093 *
## video13                                                               0.17047    0.08690   1.962    0.049979 *
## video14                                                               0.38977    0.09449   4.125 0.000039030 ***
## video15                                                               0.34048    0.10052   3.387    0.000724 ***
## video16                                                               0.27930    0.08822   3.166    0.001576 **
## video17                                                               0.15584    0.08919   1.747    0.080761 .
## video18                                                               0.15961    0.09178   1.739    0.082238 .
## video19                                                               0.36369    0.09789   3.715    0.000210 ***
## video20                                                               0.27625    0.09711   2.845    0.004505 **
## video21                                                               0.29771    0.10107   2.946    0.003270 **
## video22                                                               0.25784    0.09852   2.617    0.008954 **
## video23                                                               0.40737    0.08885   4.585 0.000004900 ***
## TreatmentHint Control:social_media_reply_numeric_valuesplit3-4        0.06782    0.21791   0.311    0.755653
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit3-4 -0.31555    0.22079  -1.429    0.153153
## TreatmentChat Generate:social_media_reply_numeric_valuesplit3-4       0.92989    0.27701   3.357    0.000807 ***
## TreatmentHint Control:social_media_reply_numeric_valuesplit1-2        0.14312    0.25150   0.569    0.569410
## TreatmentOne-Click Generate:social_media_reply_numeric_valuesplit1-2 -0.06278    0.25004  -0.251    0.801772
## TreatmentChat Generate:social_media_reply_numeric_valuesplit1-2       1.22576    0.32620   3.758    0.000178 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.393 on 1575 degrees of freedom
## Multiple R-squared:  0.07549,    Adjusted R-squared:  0.06199 
## F-statistic: 5.591 on 23 and 1575 DF,  p-value: 0.0000000000000006568

subgroup_lm_valuesplit_rf <- lm(paste0("num_comment ~ Treatment * review_freq_numeric_valuesplit + ", paste(video_columns, collapse = " + ")), data = df_wide_all %>% filter(prolific_id %in% respondents_to_keep))

summary(subgroup_lm_valuesplit_rf)

## 
## Call:
## lm(formula = paste0("num_comment ~ Treatment * review_freq_numeric_valuesplit + ", 
##     paste(video_columns, collapse = " + ")), data = df_wide_all %>% 
##     filter(prolific_id %in% respondents_to_keep))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.9541 -1.3254  0.3406  1.1780  7.4623 
## 
## Coefficients:
##                                                               Estimate Std. Error t value   Pr(>|t|)
## (Intercept)                                                    1.14129    0.23805   4.794 0.00000179 ***
## TreatmentHint Control                                          0.15641    0.25664   0.609   0.542317
## TreatmentOne-Click Generate                                    0.40383    0.26350   1.533   0.125580
## TreatmentChat Generate                                        -0.91093    0.35037  -2.600   0.009412 **
## review_freq_numeric_valuesplit3-4                             -0.23333    0.20143  -1.158   0.246893
## review_freq_numeric_valuesplit1-2                             -0.70370    0.19788  -3.556   0.000387 ***
## video11                                                        0.17486    0.08625   2.027   0.042792 *
## video13                                                        0.15660    0.08669   1.806   0.071056 .
## video14                                                        0.37575    0.09425   3.987 0.00007009 ***
## video15                                                        0.34643    0.10016   3.459   0.000557 ***
## video16                                                        0.25746    0.08812   2.922   0.003530 **
## video17                                                        0.17333    0.08871   1.954   0.050888 .
## video18                                                        0.13413    0.09162   1.464   0.143391
## video19                                                        0.41697    0.09773   4.266 0.00002105 ***
## video20                                                        0.29429    0.09685   3.039   0.002416 **
## video21                                                        0.31925    0.10093   3.163   0.001591 **
## video22                                                        0.22231    0.09797   2.269   0.023390 *
## video23                                                        0.42268    0.08858   4.772 0.00000199 ***
## TreatmentHint Control:review_freq_numeric_valuesplit3-4        0.14327    0.29429   0.487   0.626438
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit3-4 -0.06821    0.30118  -0.226   0.820856
## TreatmentChat Generate:review_freq_numeric_valuesplit3-4       0.99040    0.39347   2.517   0.011930 *
## TreatmentHint Control:review_freq_numeric_valuesplit1-2        0.06599    0.28826   0.229   0.818969
## TreatmentOne-Click Generate:review_freq_numeric_valuesplit1-2 -0.29904    0.29340  -1.019   0.308253
## TreatmentChat Generate:review_freq_numeric_valuesplit1-2       1.74495    0.39234   4.448 0.00000930 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.389 on 1575 degrees of freedom
## Multiple R-squared:  0.08087,    Adjusted R-squared:  0.06745 
## F-statistic: 6.025 on 23 and 1575 DF,  p-value: < 0.00000000000000022

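We plot the mean number of comments for each treatment within each subgroup (social media reply frequency and review frequency). Error bars show approximate 95% confidence intervals (mean ± 1.96 SE).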

# Mean and SE of comment counts per treatment x reply-frequency cell
subgroup_lm_valuesplit_smr_df <- df_wide_all %>%
  filter(prolific_id %in% respondents_to_keep) %>%
  group_by(Treatment, social_media_reply_numeric_valuesplit) %>%
  summarise(mean_comments = mean(num_comment, na.rm = TRUE),
            se_comments = sd(num_comment, na.rm = TRUE) / sqrt(n()), n = n())

## `summarise()` has grouped output by 'Treatment'. You can override using the
## `.groups` argument.

# relabel the value-split levels as Low / Medium / High
subgroup_lm_valuesplit_smr_df$social_media_reply_numeric_valuesplit <- factor(subgroup_lm_valuesplit_smr_df$social_media_reply_numeric_valuesplit, levels = c("1-2", "3-4", "5-6"), labels = c("Low", "Medium", "High"))

# Mean and SE of comment counts per treatment x review-frequency cell
subgroup_lm_valuesplit_rf_df <- df_wide_all %>%
  filter(prolific_id %in% respondents_to_keep) %>%
  group_by(Treatment, review_freq_numeric_valuesplit) %>%
  summarise(mean_comments = mean(num_comment, na.rm = TRUE),
            se_comments = sd(num_comment, na.rm = TRUE) / sqrt(n()), n = n())

## `summarise()` has grouped output by 'Treatment'. You can override using the
## `.groups` argument.

# relabel the value-split levels as Low / Medium / High
subgroup_lm_valuesplit_rf_df$review_freq_numeric_valuesplit <- factor(subgroup_lm_valuesplit_rf_df$review_freq_numeric_valuesplit, levels = c("1-2", "3-4", "5-6"), labels = c("Low", "Medium", "High"))

ggplot(subgroup_lm_valuesplit_smr_df, aes(x = social_media_reply_numeric_valuesplit, y = mean_comments, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = mean_comments - 1.96*se_comments, ymax = mean_comments + 1.96*se_comments), width = 0.2, position = position_dodge(0.9)) +
  labs(title = "Number of Comments by Social Media Reply Frequency", x = "Social Media Reply Frequency", y = "Number of Comments") +
  scale_fill_manual(name = "Treatment",
                    values=c("Pure Control" = "yellowgreen", "Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2")) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1),
        # bold all titles (plot, axis, and legend)
        title = element_text(face = "bold"))

ggplot(subgroup_lm_valuesplit_rf_df, aes(x = review_freq_numeric_valuesplit, y = mean_comments, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = mean_comments - 1.96*se_comments, ymax = mean_comments + 1.96*se_comments), width = 0.2, position = position_dodge(0.9)) +
  labs(title = "Number of Comments by Review Frequency", x = "Review Frequency", y = "Number of Comments") +
  scale_fill_manual(name = "Treatment",
                    values=c("Pure Control" = "yellowgreen", "Hint Control" = "turquoise2","One-Click Generate"="pink2","Chat Generate" = "orange2")) +
  theme_minimal() +
  theme(axis.text.x = element_text(hjust = 1),
        # bold all titles (plot, axis, and legend)
        title = element_text(face = "bold"))
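
The two summary tables above differ only in the grouping column, so the computation could be wrapped in one function. The following is a minimal sketch, not part of the original analysis: the helper name `summarise_by_subgroup`, the `n_obs`, `ci_low`, and `ci_high` columns, and the toy data frame are all hypothetical. Unlike the code above, it divides by the number of non-missing outcomes rather than `n()`, which only matters if `num_comment` contains missing values.

```r
library(dplyr)

# Hypothetical helper (an assumption, not the authors' code): mean, SE, and
# normal-approximation 95% CI of `outcome` in each Treatment x subgroup cell.
summarise_by_subgroup <- function(data, subgroup, outcome = "num_comment") {
  data %>%
    group_by(across(all_of(c("Treatment", subgroup)))) %>%
    summarise(
      n_obs         = sum(!is.na(.data[[outcome]])),        # non-missing outcomes
      mean_comments = mean(.data[[outcome]], na.rm = TRUE),
      se_comments   = sd(.data[[outcome]], na.rm = TRUE) / sqrt(n_obs),
      ci_low        = mean_comments - 1.96 * se_comments,
      ci_high       = mean_comments + 1.96 * se_comments,
      .groups = "drop"                                      # return an ungrouped tibble
    )
}

# Toy data for illustration only; these values are made up.
toy <- data.frame(
  Treatment = rep(c("Pure Control", "Chat Generate"), each = 4),
  review_freq_numeric_valuesplit = rep(c("1-2", "5-6"), times = 4),
  num_comment = c(1, 2, 3, NA, 2, 4, 6, 8)
)

res <- summarise_by_subgroup(toy, "review_freq_numeric_valuesplit")
print(res)
```

Setting `.groups = "drop"` also avoids the "`summarise()` has grouped output" message. Cells with a single non-missing observation get an `NA` standard error, so they would be drawn without error bars.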

USTM Main Analysis

2025-03-28

Main Experiment

Load Processed Data

Additional Data Cleaning

Data Pre-processing

Final Dataframes Creation

Followup Dataframe Generation

Analysis

Randomization check

User Level Regression

All Videos

First Three Videos

Panel Regression

All Videos

First Three Videos

Mediation Analysis (User Level)

Individual Mediator Effect

Pop-up

Speed

Help Wording

Help Formulate

Difficult to Use

AI Aversion

Not Reflect True Opinion

Subgroup Analysis

social_media_use_numeric_median

website_use_numeric_median

social_media_reply_numeric_median

review_freq_numeric_median

age_median

income_numeric_median

libcons_numeric_median

mech_popup_median

edu_combined

race_combined

polparty_combined

Mode 3 and 4 Give-up Analysis (Commented Out For Now)

Mode 3

Mode 4

Follow-up Experiment

Load Data

Analysis

User Level Regression

Panel Regression

Review Similarities

Sentiment Similarities

Similarity Mechanism Understanding - Sentiment

Sentiment Label

Followup Reviews

Main Reviews

Label Similarity

Sentiment Segments

Followup Reviews

Main Reviews

Sentiment Similarity

Sentiment Similarity (Together)

Single Sentiment Score (-1 to 1)

Followup Reviews

Main Reviews

Absolute Difference

Difference (Followup - Main)

Similarity Mechanism Understanding - Review Length

Followup

Main

Absolute Difference

Sentiment Distribution

Sentiment Label Distribution

Sentiment Segment Distribution

Sentiment Distribution Analysis for LLM

Sentiment Segments

Single Sentiment Score (-1 to 1)

Noise Analysis

“Mediation Analysis”

Sentiment Label Similarities

Sentiment Segment Similarities

Single Sentiment Score Difference (Followup - Main)

Single Sentiment Score Absolute Difference

Content Length Difference

Theme Count Difference