library(purrr)
library(dplyr)
library(tidytable)
library(stringr)
library(ggplot2)

data_path <- "/Users/apple/Downloads/energy_pilot2_raw"
file_paths <- list.files(path = data_path, full.names = TRUE)

# creates a joined df with a separate column for filename
d <- map_df(file_paths, ~ read.csv(.x) %>% mutate(filename = basename(.x)))

# data cleaning
# subset to relevant trials
d_filtered <- d |>
  filter(task %in% c("energy-compcheck", "diff-compcheck", "task2", "task3"))

# remove comprehension check non-passers
d_filtered <- d_filtered %>%
  group_by(mturk_participant_id) %>%
  filter(!any(correct == "false"))

length(unique(d_filtered$mturk_participant_id))
# length(unique(task2$stimulus))

# create new label for energy level
task2 <- task2 %>%
  mutate(energy_level = case_when(
    str_detect(stimulus, "extremely tired") ~ "0",
    str_detect(stimulus, "extremely energetic") ~ "100",
    TRUE ~ NA # Default case if no keyword is found
  ))

# removing NAs
task2 <- task2 |>
  filter(if_any(c(energy_level, act_type, act_diff), ~ !is.na(.)))
Difficulty ratings clearly differ between the two energy levels. In “extremely tired” trials, there also seem to be differences across activity types (mental/physical) and across difficulty levels (easy/medium/hard). These differences are less prominent in “extremely energetic” trials, where ratings are much more variable.
Adding the interaction terms increases AIC slightly and reduces adjusted R-squared (multiple R-squared cannot decrease when predictors are added), so the best model for predicting the overall response should be the one with energy_level as the single predictor?
Difficulty ratings differ significantly between the energy-level conditions: the grand mean difficulty rating is 58.18, mean(energy_level = 100) is 27.42, and mean(energy_level = 0) is 88.8. People seem to think that any activity would be easier when the individual is energetic.
Call:
lm(formula = response ~ act_type * act_diff, data = energetic,
contrasts = list(act_type = "contr.sum", act_diff = "contr.sum"))
Residuals:
Min 1Q Median 3Q Max
-35.417 -24.333 -3.417 14.583 65.667
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 27.6861 3.1848 8.693 5.71e-13 ***
act_type1 1.6806 3.1848 0.528 0.599
act_diff1 3.1556 4.5039 0.701 0.486
act_diff2 -1.8111 4.5039 -0.402 0.689
act_type1:act_diff1 2.8944 4.5039 0.643 0.522
act_type1:act_diff2 -0.1389 4.5039 -0.031 0.975
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 28.49 on 75 degrees of freedom
Multiple R-squared: 0.01616, Adjusted R-squared: -0.04943
F-statistic: 0.2464 on 5 and 75 DF, p-value: 0.9404
In the “extremely tired” condition, there are significant differences across activity types: the grand mean difficulty rating is 89, mean(act_type = mental) = 84.6, and mean(act_type = physical) = 93.4. This means people think that someone who is extremely tired will find physical activities harder than mental activities. There are also significant differences across difficulty levels: mean(act_diff = hard) = 95 and mean(act_diff = easy) = 84, so people do judge hard activities as more difficult and easy activities as easier. No interaction effect was found.
No effect of activity type or difficulty level was found in the “extremely energetic” condition: people think that someone who is very energetic will find all activities similarly difficult or easy, regardless of type and difficulty level.
Task 3
Visualization
# length(unique(task3$stimulus))

# create new label for expressed difficulty
task3 <- task3 %>%
  mutate(difficulty = case_when(
    str_detect(stimulus, "so hard") ~ "hard",
    str_detect(stimulus, "so easy") ~ "easy",
    TRUE ~ NA # Default case if no keyword is found
  ))

# removing NAs
task3 <- task3 |>
  filter(if_any(c(difficulty, act_type, act_diff), ~ !is.na(.)))
task3$act_diff <- factor(task3$act_diff, levels = c("easy", "medium", "hard"))

# plotting
ggplot(task3, aes(x = act_type, y = response, fill = act_diff)) +
  stat_boxplot(geom = "errorbar") +
  geom_boxplot() +
  facet_wrap(
    ~difficulty,
    labeller = labeller(difficulty = c(
      "easy" = "That's going to be so easy",
      "hard" = "That's going to be so hard"
    ))
  ) +
  # stat_summary(fun.y=mean, geom="point", size=1)+
  theme_minimal() +
  scale_fill_brewer(palette = "Oranges") +
  labs(
    title = "Inferred Energy Level Given Expressed Difficulty",
    x = "Expressed Difficulty",
    y = "Inferred Energy",
    fill = "difficulty level"
  ) +
  theme(plot.title = element_text(size = 15))
Energy ratings clearly differ between the two expressed-difficulty conditions. In “so easy” trials, there seem to be differences across activity types (mental/physical), but not so much across difficulty levels. It is hard to tell whether energy ratings differ across activity type or difficulty level in “so hard” trials.
Call:
lm(formula = response ~ difficulty, data = task3, contrasts = list(difficulty = "contr.sum"))
Residuals:
Min 1Q Median 3Q Max
-57.284 -10.284 -0.346 14.420 52.654
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 54.815 1.272 43.09 <2e-16 ***
difficulty1 30.469 1.272 23.95 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 16.19 on 160 degrees of freedom
Multiple R-squared: 0.7819, Adjusted R-squared: 0.7805
F-statistic: 573.6 on 1 and 160 DF, p-value: < 2.2e-16
summary(m3.2)
Call:
lm(formula = response ~ difficulty * act_type, data = task3,
contrasts = list(difficulty = "contr.sum", act_type = "contr.sum"))
Residuals:
Min 1Q Median 3Q Max
-62.095 -10.988 0.357 9.905 53.357
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 54.736 1.249 43.814 <2e-16 ***
difficulty1 30.363 1.249 24.305 <2e-16 ***
act_type1 -2.863 1.249 -2.292 0.0232 *
difficulty1:act_type1 -2.133 1.249 -1.708 0.0897 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 15.89 on 158 degrees of freedom
Multiple R-squared: 0.7926, Adjusted R-squared: 0.7887
F-statistic: 201.3 on 3 and 158 DF, p-value: < 2.2e-16
AIC(m3.1)
[1] 1365.93
AIC(m3.2)
[1] 1361.764
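Adding act_type and its interaction with difficulty reduces AIC (1365.93 for m3.1 vs. 1361.764 for m3.2) and increases adjusted R-squared slightly (0.7805 vs. 0.7887). Both criteria therefore favor m3.2 over the model with difficulty as the single predictor, although the improvement is small and the interaction term itself is not significant (p = 0.0897).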
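The comparison between a single-predictor model and an interaction model can be sketched as follows. This is a minimal illustration on simulated data, not the pilot data: the data frame `sim`, the effect sizes, and the names `fit_small` and `fit_large` are all assumptions made up for the example.

```r
# Simulated stand-in for the task 3 data (hypothetical values, not the pilot data)
set.seed(42)
n <- 160
sim <- data.frame(
  difficulty = factor(rep(c("easy", "hard"), each = n / 2)),
  act_type = factor(rep(c("mental", "physical"), times = n / 2))
)
sim$response <- ifelse(sim$difficulty == "easy", 85, 25) +
  ifelse(sim$act_type == "mental", -3, 3) +
  rnorm(n, sd = 16)

# Nested models with sum-to-zero coding, mirroring the structure of m3.1 and m3.2
fit_small <- lm(response ~ difficulty, data = sim,
                contrasts = list(difficulty = "contr.sum"))
fit_large <- lm(response ~ difficulty * act_type, data = sim,
                contrasts = list(difficulty = "contr.sum", act_type = "contr.sum"))

# Lower AIC and higher adjusted R-squared both point to the preferred model
comparison <- data.frame(
  model = c("difficulty", "difficulty * act_type"),
  AIC = c(AIC(fit_small), AIC(fit_large)),
  adj_r2 = c(summary(fit_small)$adj.r.squared, summary(fit_large)$adj.r.squared)
)
print(comparison)

# F-test for the extra terms, since the models are nested
print(anova(fit_small, fit_large))
```

Because the models are nested, multiple R-squared can only stay the same or go up when terms are added, which is why adjusted R-squared and AIC are the more informative criteria here.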
Energy ratings differ significantly between the expressed-difficulty conditions: the grand mean energy rating is 54.8, mean(difficulty = easy) = 85.3, and mean(difficulty = hard) = 24.35. People think that if something is easy for someone to do, that person likely has a high energy level, and vice versa.
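The condition means quoted above follow from the sum-coded coefficients of the single-predictor model. A small worked sketch, using the printed estimates and assuming that "easy" is the first factor level (the default alphabetical ordering), so that it receives the +1 code:

```r
# Estimates copied from the printed summary of the single-predictor model
intercept <- 54.815    # grand mean under contr.sum
difficulty1 <- 30.469  # deviation of the first level ("easy", assumed) from the grand mean

# With two levels and sum coding, the second level gets the opposite sign
mean_easy <- intercept + difficulty1
mean_hard <- intercept - difficulty1

round(c(easy = mean_easy, hard = mean_hard), 2)
# easy = 85.28, hard = 24.35
```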
Call:
lm(formula = response ~ act_type * act_diff, data = easy, contrasts = list(act_type = "contr.sum",
act_diff = "contr.sum"))
Residuals:
Min 1Q Median 3Q Max
-60.917 -6.333 4.667 10.200 21.667
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 84.9750 1.6447 51.667 < 2e-16 ***
act_type1 -5.0417 1.6447 -3.065 0.00302 **
act_diff1 0.8583 2.3259 0.369 0.71315
act_diff2 -1.4083 2.3259 -0.605 0.54668
act_type1:act_diff1 -2.4583 2.3259 -1.057 0.29394
act_type1:act_diff2 0.8083 2.3259 0.348 0.72916
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 14.71 on 75 degrees of freedom
Multiple R-squared: 0.1278, Adjusted R-squared: 0.06969
F-statistic: 2.199 on 5 and 75 DF, p-value: 0.06322
In the “so easy” condition, there are significant differences across activity types: the grand mean energy rating is 85, mean(act_type = mental) = 80, and mean(act_type = physical) = 90. This means people think that others’ energy levels tend to be higher if they find a physical activity easy than if they find a mental activity easy. No significant differences across difficulty levels were found: people think that others’ energy levels are high when they find activities easy, regardless of how difficult the activities are.
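How the marginal means relate to the coefficients of a sum-coded two-factor model can be checked on a toy example. The sketch below uses a made-up balanced 2 x 3 data set (`toy`, with invented response values), not the pilot data. It shows that in the full interaction model the intercept equals the unweighted mean of the cell means, that a level's marginal mean is the intercept plus its coefficient, and that the effect of the last level is the negative sum of the printed ones.

```r
# Hypothetical balanced 2 x 3 design; response values are invented for illustration
toy <- expand.grid(
  act_type = c("mental", "physical"),
  act_diff = c("easy", "medium", "hard"),
  rep = 1:4,
  stringsAsFactors = FALSE
)
toy$act_type <- factor(toy$act_type, levels = c("mental", "physical"))
toy$act_diff <- factor(toy$act_diff, levels = c("easy", "medium", "hard"))
set.seed(1)
toy$response <- 85 + ifelse(toy$act_type == "mental", -5, 5) +
  rnorm(nrow(toy), sd = 10)

fit <- lm(response ~ act_type * act_diff, data = toy,
          contrasts = list(act_type = "contr.sum", act_diff = "contr.sum"))
b <- coef(fit)

# Cell means: rows are activity types, columns are difficulty levels
cell_means <- tapply(toy$response, list(toy$act_type, toy$act_diff), mean)

# Intercept is the unweighted mean of the six cell means
grand_mean <- unname(b["(Intercept)"])

# Marginal mean for "mental" (first level) is intercept + act_type1
mental_mean <- unname(b["(Intercept)"] + b["act_type1"])

# The last level ("hard") has no printed coefficient; its effect is implied
hard_effect <- unname(-(b["act_diff1"] + b["act_diff2"]))
hard_mean <- grand_mean + hard_effect

print(c(grand = grand_mean, mental = mental_mean, hard = hard_mean))
```

Applied to the output above, the same arithmetic gives roughly 84.98 - 5.04 = 79.9 for mental and 84.98 + 5.04 = 90.0 for physical activities, assuming "mental" is the first level of act_type.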
No effect of activity type or difficulty level was found in the “so hard” condition: people think that when someone finds it extremely hard to do something, their energy level is low regardless of activity type or difficulty level.