Dear reader :)
This is a first pass at our analyses. The graphs and regression
outputs are super ugly (I ran out of time to make pretty graphs and
tables). Sorry! Please forgive!
I also didn’t get around to doing one of the exploratory models:
“We will test if the order participants are shown affects the
relationship between dominance and perceived social impact:
dominance_beliefs ~ industry_social_good*order + section_number +
industry_worked_in”
Thanks! Sam
Load packages
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Load data
# Drop the two extra header rows that sit below the column names (Qualtrics-style export)
numeric <- read.csv("~/Downloads/(2024 num) Negotiations+-+Initial+assessment+-+Spring+2024.csv") %>%
  slice(-(1:2))
text <- read.csv("~/Downloads/(2024 txt) Negotiations+-+Initial+assessment+-+Spring+2024.csv") %>%
  slice(-(1:2))
Data cleaning
Select variables and merge dataframes
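# Scale items and age come from the numeric export
numeric_select <- numeric %>%
  select(ResponseId, Q133, Q129:Q132_17, age) %>%
  mutate(across(starts_with("Q132"), as.numeric)) %>%
  mutate(age = as.numeric(age))
# Industry, profit status, and demographics come from the text export
text_select <- text %>%
  select(ResponseId, Q126:Q127_5_TEXT, gender:race_6_TEXT)
# Merge the two exports on the respondent ID
full_data <- numeric_select %>%
  left_join(text_select, by = "ResponseId")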
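Because both exports describe the same respondents, the merge should leave the row count unchanged. A minimal sketch of that check on toy data; the IDs, ages, and gender labels are made up for illustration and are not from the survey:

```r
library(dplyr)

# Toy stand-ins for the two exports (hypothetical values)
numeric_toy <- data.frame(ResponseId = c("R_1", "R_2", "R_3"),
                          age = c(27, 31, 29))
text_toy <- data.frame(ResponseId = c("R_1", "R_2", "R_3"),
                       gender = c("Woman", "Man", "Woman"))

# A left join keeps every row of the left table; duplicated keys in the
# right table would add rows, so confirm the key is unique first.
stopifnot(anyDuplicated(text_toy$ResponseId) == 0)

merged <- left_join(numeric_toy, text_toy, by = "ResponseId")

# TRUE when the merge neither dropped nor duplicated respondents
same_rows <- nrow(merged) == nrow(numeric_toy)
same_rows
```

The same two checks could be run on text_select and full_data before moving on.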
Rename variables and reverse code
full_data <- full_data %>%
  dplyr::rename(section = Q133,
                industry = Q126,
                industry_other = Q126_23_TEXT,
                profit_status_prevEmploy = Q127,
                profit_status_prevEmploy_other = Q127_5_TEXT,
                socialImpact_1 = Q129,
                socialImpact_2 = Q130,
                socialImpact_3 = Q131,
                dominance_1 = Q132_1,
                dominance_2 = Q132_2,
                dominance_3 = Q132_3,
                dominance_4 = Q132_4,
                dominance_5 = Q132_5,
                dominance_6 = Q132_6,
                dominance_7 = Q132_7,
                dominance_8 = Q132_8,
                prestige_1 = Q132_9,
                prestige_2 = Q132_10,
                prestige_3 = Q132_11,
                prestige_4 = Q132_12,
                prestige_5 = Q132_13,
                prestige_6 = Q132_14,
                prestige_7 = Q132_15,
                prestige_8 = Q132_16,
                prestige_9 = Q132_17) %>%
  # Reverse-code the negatively keyed items (6 - x flips a 1-5 response scale)
  mutate(dominance_5_R = 6 - dominance_5,
         dominance_7_R = 6 - dominance_7,
         prestige_2_R = 6 - prestige_2,
         prestige_4_R = 6 - prestige_4,
         prestige_9_R = 6 - prestige_9) %>%
  # Composite scores: row means across items, ignoring missing responses.
  # Note: starts_with("dom") and starts_with("pres") match both the raw and the
  # _R columns, so reverse-coded items enter each mean in both forms.
  mutate(dominance_beliefs = rowMeans(select(., starts_with("dom")), na.rm = TRUE)) %>%
  mutate(prestige_beliefs = rowMeans(select(., starts_with("pres")), na.rm = TRUE)) %>%
  mutate(across(starts_with("social"), as.numeric)) %>%
  mutate(industry_social_good = rowMeans(select(., starts_with("social")), na.rm = TRUE))
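One thing to double-check in the scoring step: when a raw item and its reversed copy are averaged together, the pair always sums to 6 and so contributes a constant to the mean. A minimal sketch of an alternative, assuming the intent is for each reverse-keyed item to count once in its reversed form. The toy data frame and its values are hypothetical, and this is not necessarily the scoring the study intended:

```r
library(dplyr)

# Toy data: three respondents, four items on a 1-5 scale (hypothetical values)
toy <- data.frame(
  dominance_1 = c(1, 3, 5),
  dominance_5 = c(5, 3, 1),  # reverse-keyed item
  dominance_7 = c(4, 3, 2),  # reverse-keyed item
  dominance_8 = c(2, 3, 4)
)

toy_scored <- toy %>%
  # Overwrite the reverse-keyed items in place so each item enters the mean once
  mutate(across(c(dominance_5, dominance_7), ~ 6 - .x)) %>%
  # Row mean across all item columns (pick() requires dplyr >= 1.1.0)
  mutate(dominance_beliefs = rowMeans(pick(starts_with("dominance_")), na.rm = TRUE))

toy_scored$dominance_beliefs
```

Switching to this scoring would change the composites, so the models below would need to be rerun.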
Primary analysis
We will run the following model in R: dominance_beliefs ~
industry_social_good + section_number + industry_worked_in
QUESTION: What should the reference level be for section number and
industry worked in?
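On the reference-level question: with character or factor predictors, lm() uses the first factor level as the baseline, which by default is the first value in sorted order. A minimal sketch of setting the baseline explicitly with base R's relevel(). The toy labels are hypothetical, and using the most frequent industry as the baseline is one common convention, not a decision made in this analysis:

```r
# Toy data with hypothetical section and industry labels
toy <- data.frame(
  section  = c("1", "2", "3", "2"),
  industry = c("Finance", "Tech", "Consulting", "Finance")
)

# Option 1: name the baseline directly
toy$section <- relevel(factor(toy$section), ref = "1")

# Option 2: use the most frequent category as the baseline
most_common <- names(which.max(table(toy$industry)))
toy$industry <- relevel(factor(toy$industry), ref = most_common)

# The first level is the reference that every coefficient is compared against
levels(toy$section)[1]
levels(toy$industry)[1]
```

Whichever baseline is chosen, the model fit is identical; only the interpretation of the section and industry coefficients changes.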
model <- full_data %>%
  lm(dominance_beliefs ~ industry_social_good + section + industry, data = .)
model %>%
  summary()
##
## Call:
## lm(formula = dominance_beliefs ~ industry_social_good + section +
## industry, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.060 -0.371 0.000 0.312 1.249
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.724257 0.399230 6.824 4.64e-10 ***
## industry_social_good -0.022889 0.046329 -0.494 0.6222
## section2 0.007666 0.115918 0.066 0.9474
## section3 -0.090330 0.111453 -0.810 0.4194
## industryE-commerce -0.106057 0.424839 -0.250 0.8033
## industryEngineering 0.489457 0.453224 1.080 0.2825
## industryEntertainment & Sports 0.386421 0.398242 0.970 0.3340
## industryFinance 0.418173 0.372096 1.124 0.2635
## industryGovernment 0.546798 0.438706 1.246 0.2152
## industryHealthcare 0.121204 0.425513 0.285 0.7763
## industryHospitality 0.420671 0.633421 0.664 0.5080
## industryHuman Resources -0.396592 0.612003 -0.648 0.5183
## industryInformation Technology (IT) 0.194583 0.462804 0.420 0.6750
## industryManufacturing 0.178300 0.534279 0.334 0.7392
## industryMilitary 0.754716 0.463445 1.628 0.1062
## industryOther: 0.277502 0.381586 0.727 0.4686
## industryPharmaceutical 0.254716 0.463445 0.550 0.5837
## industryReal Estate 0.113787 0.434469 0.262 0.7939
## industryRetail 0.775478 0.424615 1.826 0.0704 .
## industrySales 0.163078 0.517521 0.315 0.7533
## industryTech 0.169301 0.387446 0.437 0.6630
## industryTransportation 0.105411 0.626324 0.168 0.8666
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4918 on 113 degrees of freedom
## (30 observations deleted due to missingness)
## Multiple R-squared: 0.1809, Adjusted R-squared: 0.02862
## F-statistic: 1.188 on 21 and 113 DF, p-value: 0.2752
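No effect: perceived social good of the industry does not significantly
predict dominance beliefs (b = -0.02, SE = 0.05, p = .62).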
ggplot(full_data,
       aes(x = industry_social_good,
           y = dominance_beliefs)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm",
              fullrange = TRUE) +
  theme_bw()
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 30 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 30 rows containing missing values (`geom_point()`).
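Both the model summary and the plot warnings report 30 rows dropped for missing values. A minimal sketch of how one might see which variables drive that, using toy data with hypothetical values. It also converts blank strings to NA first, since read.csv() keeps blank text cells as "" by default and lm() would otherwise treat them as a category of their own:

```r
# Toy data with hypothetical values; the column names mirror the model variables
toy <- data.frame(
  dominance_beliefs    = c(2.5, NA, 3.0, 3.5),
  industry_social_good = c(4.0, 3.0, NA, 2.0),
  section              = c("1", "2", "3", NA),
  industry             = c("Finance", "Tech", "", "Retail")
)

# Treat blank text responses as missing
toy$industry[which(toy$industry == "")] <- NA

# Missing values per variable
na_counts <- colSums(is.na(toy))
na_counts

# Rows lm() would drop: any row with at least one missing model variable
n_dropped <- sum(!complete.cases(toy))
n_dropped
```

Running the same two summaries on the model columns of full_data would show whether the 30 dropped rows are mostly incomplete scale responses or missing section and industry answers.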

Exploratory analyses
We will test whether perceptions of the social good of the industry in
which participants last worked predict prestige beliefs, controlling for
the section students are in and the industry they last worked in:
prestige_beliefs ~ industry_social_good + section_number +
industry_worked_in
model <- full_data %>%
  lm(prestige_beliefs ~ industry_social_good + section + industry, data = .)
model %>%
  summary()
##
## Call:
## lm(formula = prestige_beliefs ~ industry_social_good + section +
## industry, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.59231 -0.16669 0.00238 0.16340 0.45577
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.455674 0.212015 16.299 <2e-16 ***
## industry_social_good 0.035351 0.024604 1.437 0.154
## section2 0.036296 0.061560 0.590 0.557
## section3 -0.085295 0.059188 -1.441 0.152
## industryE-commerce -0.003044 0.225615 -0.013 0.989
## industryEngineering -0.197222 0.240690 -0.819 0.414
## industryEntertainment & Sports -0.058387 0.211491 -0.276 0.783
## industryFinance -0.052851 0.197606 -0.267 0.790
## industryGovernment 0.121709 0.232980 0.522 0.602
## industryHealthcare -0.150172 0.225973 -0.665 0.508
## industryHospitality -0.310977 0.336385 -0.924 0.357
## industryHuman Resources -0.499153 0.325011 -1.536 0.127
## industryInformation Technology (IT) -0.215141 0.245777 -0.875 0.383
## industryManufacturing 0.052239 0.283735 0.184 0.854
## industryMilitary 0.029211 0.246118 0.119 0.906
## industryOther: 0.046446 0.202645 0.229 0.819
## industryPharmaceutical 0.029211 0.246118 0.119 0.906
## industryReal Estate -0.207400 0.230730 -0.899 0.371
## industryRetail -0.227078 0.225497 -1.007 0.316
## industrySales -0.292780 0.274835 -1.065 0.289
## industryTech -0.098062 0.205757 -0.477 0.635
## industryTransportation -0.287410 0.332616 -0.864 0.389
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2612 on 113 degrees of freedom
## (30 observations deleted due to missingness)
## Multiple R-squared: 0.1851, Adjusted R-squared: 0.03367
## F-statistic: 1.222 on 21 and 113 DF, p-value: 0.2466
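No effect: perceived social good of the industry does not significantly
predict prestige beliefs (b = 0.04, SE = 0.02, p = .15).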
ggplot(full_data,
       aes(x = industry_social_good,
           y = prestige_beliefs)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm",
              fullrange = TRUE) +
  theme_bw()
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 30 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 30 rows containing missing values (`geom_point()`).

We will run the primary analysis also controlling for prestige
beliefs: dominance_beliefs ~ industry_social_good + prestige_beliefs +
section_number + industry_worked_in
model <- full_data %>%
  lm(dominance_beliefs ~ industry_social_good + prestige_beliefs + section + industry, data = .)
model %>%
  summary()
##
## Call:
## lm(formula = dominance_beliefs ~ industry_social_good + prestige_beliefs +
## section + industry, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.9142 -0.3411 0.0000 0.3105 1.2479
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.44026 0.71964 2.001 0.0478 *
## industry_social_good -0.03602 0.04604 -0.783 0.4356
## prestige_beliefs 0.37156 0.17443 2.130 0.0353 *
## section2 -0.00582 0.11432 -0.051 0.9595
## section3 -0.05864 0.11075 -0.529 0.5975
## industryE-commerce -0.10493 0.41834 -0.251 0.8024
## industryEngineering 0.56274 0.44762 1.257 0.2113
## industryEntertainment & Sports 0.40812 0.39228 1.040 0.3004
## industryFinance 0.43781 0.36652 1.195 0.2348
## industryGovernment 0.50158 0.43252 1.160 0.2487
## industryHealthcare 0.17700 0.41982 0.422 0.6741
## industryHospitality 0.53622 0.62609 0.856 0.3936
## industryHuman Resources -0.21113 0.60890 -0.347 0.7294
## industryInformation Technology (IT) 0.27452 0.45727 0.600 0.5495
## industryManufacturing 0.15889 0.52619 0.302 0.7632
## industryMilitary 0.74386 0.45639 1.630 0.1059
## industryOther: 0.26024 0.37584 0.692 0.4901
## industryPharmaceutical 0.24386 0.45639 0.534 0.5942
## industryReal Estate 0.19085 0.42935 0.445 0.6575
## industryRetail 0.85985 0.41999 2.047 0.0430 *
## industrySales 0.27186 0.51216 0.531 0.5966
## industryTech 0.20574 0.38190 0.539 0.5912
## industryTransportation 0.21220 0.61878 0.343 0.7323
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4843 on 112 degrees of freedom
## (30 observations deleted due to missingness)
## Multiple R-squared: 0.2127, Adjusted R-squared: 0.05811
## F-statistic: 1.376 on 22 and 112 DF, p-value: 0.1421
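The order-moderation model quoted in the opening note was not run. A minimal sketch of how it could be specified, assuming a hypothetical `order` column that records which block each participant saw first. No such column has been pulled into full_data yet, and the data below are simulated purely to show the model syntax, not to suggest any result:

```r
set.seed(1)
n <- 60

# Simulated stand-in data; `order` and its labels are hypothetical
sim <- data.frame(
  industry_social_good = runif(n, 1, 5),
  order    = factor(sample(c("dominance_first", "social_first"), n, replace = TRUE)),
  section  = factor(sample(1:3, n, replace = TRUE)),
  industry = factor(sample(c("Finance", "Tech", "Retail"), n, replace = TRUE))
)
sim$dominance_beliefs <- rnorm(n, mean = 3, sd = 0.5)

# `a * b` expands to a + b + a:b, so both main effects and the interaction are fitted
fit <- lm(dominance_beliefs ~ industry_social_good * order + section + industry,
          data = sim)

# The interaction row tests whether the social-good slope differs by order
interaction_terms <- grep("industry_social_good:order", names(coef(fit)),
                          fixed = TRUE, value = TRUE)
interaction_terms
summary(fit)
```

With the real data, the interaction coefficient would be the term of interest; the main effect of industry_social_good would then describe the slope in the reference order condition only.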