The code in this section imports and cleans the data. It is hidden by default, as many of the code chunks are; if you’d like to view them, click the “Show” button.
dat <- read.csv("~/Desktop/Junk/Google Drive/Research/--Research -- GLP1/02 Follow up initial study/GLP1 - Short - Part 2_October 8, 2025_19.56.csv", comment.char="#")
# First 11 lines are extra headers, preview data, labbies doing test runs
# The first ~40 real participants didn't see all of the questions because
# of a mistake in the JS code. Everyone past row 53 saw all the questions
dat=dat[-c(1:11,52:53),]
# Recode maps (named vectors that convert response text to numeric codes)
likely_map=c(
"Extremely unlikely" = 1,
"Somewhat unlikely" = 2,
"Neither likely nor unlikely" = 3,
"Somewhat likely" = 4,
"Extremely likely" = 5
)
agree_map=c(
"Strognly disagree" = 1,
"Somewhat disagree" = 2,
"Neither agree nor disagree" = 3,
"Somewhat agree" = 4,
"Strongly agree" = 5
)
wtp_map=c(
"Nothing more" = 0,
"$0.50 more" = .5,
"$1.00 more" = 1,
"$1.50 more" = 1.5,
"$2.00 more" = 2,
"$2.50 more" = 2.5,
"$3.00 more" = 3,
"$3.50 more" = 3.5,
"$4.00 more" = 4,
"$4.50 more" = 4.5,
"$5.00+ more" = 5
)
yes_no_map=c(
"Definitely no" = 1,
"Probably no" = 2,
"Not sure" = 3,
"Probably yes" = 4,
"Definitely yes" = 5
)
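As a quick illustration of how these lookup vectors behave (a minimal sketch; the response strings below are made-up examples, not survey data), indexing by response text returns the numeric code, and anything that doesn’t exactly match a name silently becomes NA, which is why the labels have to match the survey wording character for character:
# Hypothetical responses, purely to illustrate the lookup behavior:
example_responses <- c("Strongly agree", "Somewhat disagree", "strongly agree")
agree_map[example_responses]
# returns 5, 2, NA -- the lowercase "strongly agree" matches no name, so it maps to NA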
For now, we’ll split off the ~40 participants who saw all of the questions. The first ~40 still have some usable data, which we can fold back in later for research questions that don’t depend on the items they missed.
full.dat=subset(dat,dat$product.type != "")
# filter out that person who answered the attention/memory check incorrectly
full.dat=subset(dat,dat$product.type=="Smoothie Supplement Powder")
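If we later want to fold the earlier partial responders back in, one way to keep track of who saw what (a minimal sketch; the saw.all column is an assumption, not a field in the export) is an indicator based on the same blank product.type check used above:
# Hypothetical flag: TRUE for participants shown the full question set
# (assumes a blank product.type means the later block was never displayed)
dat$saw.all <- dat$product.type != ""
table(dat$saw.all)
# Items everyone saw can then be analyzed on all of dat; the later items on full.dat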
In the following figures, 1 = “Strongly disagree” and 5 = “Strongly agree.”
full.dat$effect.like.glp1=agree_map[full.dat$effect.like.glp1]
barplot(table(factor(full.dat$effect.like.glp1,levels=1:5)), main="Effects like GLP-1")
full.dat$similar.to.glp1=agree_map[full.dat$similar.to.glp1]
barplot(table(factor(full.dat$similar.to.glp1,levels=1:5)), main="Effects similar to GLP-1")
full.dat$works.like.glp1=agree_map[full.dat$works.like.glp1]
barplot(table(factor(full.dat$works.like.glp1,levels=1:5)), main="Works like GLP-1")
contain.glp1=rowMeans(cbind(
full.dat$effect.like.glp1,
full.dat$similar.to.glp1,
full.dat$works.like.glp1
),na.rm=T)
summary(lm(contain.glp1~full.dat$condition))
##
## Call:
## lm(formula = contain.glp1 ~ full.dat$condition)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.6667 -0.3111 0.2500 0.3556 1.3556
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.66667 0.26119 14.038 3.16e-15 ***
## full.dat$conditionmock -0.02222 0.32343 -0.069 0.946
## full.dat$conditionsmoothie_king 0.41667 0.33720 1.236 0.226
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7388 on 32 degrees of freedom
## (4 observations deleted due to missingness)
## Multiple R-squared: 0.07756, Adjusted R-squared: 0.01991
## F-statistic: 1.345 on 2 and 32 DF, p-value: 0.2748
No evidence that GLP-1 beliefs are higher or lower for the different products, though sample sizes are very small as of this writing.
prefer=rowMeans(
cbind(
likely_map[full.dat$purchase1],
agree_map[full.dat$purchase2],
agree_map[full.dat$appealing]
),na.rm=TRUE
)
alt.prefer=rowMeans(
cbind(
likely_map[dat$purchase1],
agree_map[dat$purchase2],
agree_map[dat$appealing]
),na.rm=TRUE
)
summary(lm(prefer~full.dat$condition))
##
## Call:
## lm(formula = prefer ~ full.dat$condition)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.98333 -1.25000 0.01667 1.15556 2.01667
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.9833 0.4166 7.161 2.02e-08 ***
## full.dat$conditionmock -0.1389 0.5378 -0.258 0.798
## full.dat$conditionsmoothie_king -0.7333 0.5455 -1.344 0.187
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.317 on 36 degrees of freedom
## Multiple R-squared: 0.05886, Adjusted R-squared: 0.006571
## F-statistic: 1.126 on 2 and 36 DF, p-value: 0.3356
summary(lm(alt.prefer~dat$condition))
##
## Call:
## lm(formula = alt.prefer ~ dat$condition)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.98485 -1.25000 0.01515 1.15556 2.01515
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.9848 0.3918 7.618 4.33e-09 ***
## dat$conditionmock -0.1404 0.5158 -0.272 0.787
## dat$conditionsmoothie_king -0.7348 0.5236 -1.404 0.169
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.299 on 37 degrees of freedom
## (40 observations deleted due to missingness)
## Multiple R-squared: 0.06039, Adjusted R-squared: 0.009596
## F-statistic: 1.189 on 2 and 37 DF, p-value: 0.3159
No evidence so far that there’s an overall preference for one of the products, but sample size is small at this point.
wtp=wtp_map[full.dat$wtp]
alt.wtp=wtp_map[dat$wtp]
summary(lm(alt.wtp~dat$condition))
##
## Call:
## lm(formula = alt.wtp ~ dat$condition)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.6818 -1.1947 -0.2333 0.7982 3.7667
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.6818 0.4694 3.583 0.000973 ***
## dat$conditionmock -0.4485 0.6179 -0.726 0.472538
## dat$conditionsmoothie_king -0.5747 0.6272 -0.916 0.365466
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.557 on 37 degrees of freedom
## (40 observations deleted due to missingness)
## Multiple R-squared: 0.02379, Adjusted R-squared: -0.02898
## F-statistic: 0.4508 on 2 and 37 DF, p-value: 0.6406
No evidence so far that anyone’s willing to pay more for one of the products. Again, though, the sample size is still small.
understand.glp1=rowMeans(
cbind(
agree_map[full.dat$understand1],
agree_map[full.dat$understand2],
agree_map[full.dat$understand3]
),na.rm=TRUE
)
alt.understand.glp1=rowMeans(
cbind(
agree_map[dat$understand1],
agree_map[dat$understand2],
agree_map[dat$understand3]
),na.rm=TRUE
)
hist(alt.understand.glp1,main="Self-reported understanding of GLP-1")
wttGLP1=yes_no_map[full.dat$wttGLP1]
alt.wttGLP1=yes_no_map[dat$wttGLP1]
barplot(table(factor(alt.wttGLP1,levels=1:5)), main="Willingness to try GLP-1")
taking.glp1=yes_no_map[full.dat$taking.glp1]
alt.taking.glp1=yes_no_map[dat$taking.glp1]
barplot(table(factor(alt.taking.glp1,levels=1:5)), main="Currently taking GLP-1")
know.any.glp1.peeps=yes_no_map[full.dat$know.any.glp1.peeps]
alt.know.any.glp1.peeps=yes_no_map[dat$know.any.glp1.peeps]
barplot(table(factor(alt.know.any.glp1.peeps,levels=1:5)), main="Know anyone on GLP-1")
script.required=yes_no_map[full.dat$script.required]
alt.script.required=yes_no_map[dat$script.required]
barplot(table(factor(alt.script.required,levels=1:5)), main="Believe GLP-1 req's script")
This correlation matrix includes only the ~40 participants who answered the “contains GLP-1” questions.
# install.packages("Hmisc") # if not already installed
library(Hmisc)
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
##
## format.pval, units
library(corrplot)
## corrplot 0.95 loaded
vars=cbind(contain.glp1,prefer,wtp,understand.glp1,wttGLP1,taking.glp1,know.any.glp1.peeps,script.required)
# Get correlations and p-values
res <- rcorr(vars)
# Plot only the lower triangle
corrplot(
res$r,
method = "color", # color-coded squares
type = "lower", # << shows only lower diagonal
addCoef.col = "black", # add r values
tl.col = "black", # text color for labels
tl.srt = 45, # rotate labels
p.mat = res$P, # matrix of p-values
sig.level = 0.05, # highlight p < .05
insig = "blank", # hide non-significant cells
diag = FALSE # optional: remove diagonal
)
A couple of takeaways:
The next plot includes all participants and excludes the “contains GLP-1” questions, since roughly half of the participants never saw those; that same group also didn’t see the “prefer” or “wtp” items, so those are excluded as well.
vars=cbind(alt.understand.glp1,alt.wttGLP1,alt.taking.glp1,alt.know.any.glp1.peeps,alt.script.required)
# Get correlations and p-values
res <- rcorr(vars)
# Plot only the lower triangle
corrplot(
res$r,
method = "color", # color-coded squares
type = "lower", # << shows only lower diagonal
addCoef.col = "black", # add r values
tl.col = "black", # text color for labels
tl.srt = 45, # rotate labels
p.mat = res$P, # matrix of p-values
sig.level = 0.05, # highlight p < .05
insig = "blank", # hide non-significant cells
diag = FALSE # optional: remove diagonal
)
age=as.numeric(dat$age)
bmi=(as.numeric(dat$weight) / (as.numeric(dat$height_1)^2)) * 703 # imperial BMI: 703 * weight (lb) / height (in)^2
income = as.numeric(factor(dat$income,
levels = c(
"$0 - $19,999",
"$20,000 - $39,999",
"$40,000 - $59,999",
"$60,000 - $89,999",
"$90,000 - $119,999",
"$120,000 - $149,999",
"$150,000+"
),
ordered = TRUE
))
gender=factor(dat$gender)
race <- dat$race
# Collapse any multi-race (contains a comma) or "prefer not to say" responses
race[grepl(",", race, ignore.case = TRUE) |
grepl("prefer not to say", race, ignore.case = TRUE)] <- "Other / Prefer not to say"
# Convert to factor
race <- factor(race)
summary(lm(alt.understand.glp1~age+bmi+income+gender+race))
##
## Call:
## lm(formula = alt.understand.glp1 ~ age + bmi + income + gender +
## race)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.55853 -0.55554 0.01634 0.49225 1.79485
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.4905304 0.6109856 4.076 0.000124 ***
## age 0.0008837 0.0086300 0.102 0.918746
## bmi 0.0180331 0.0077534 2.326 0.023063 *
## income 0.0597941 0.0587782 1.017 0.312678
## genderMale -0.3062500 0.2110677 -1.451 0.151458
## genderPrefer not to say -1.1555626 1.0545312 -1.096 0.277087
## raceBlack or African American 0.7506907 0.5233551 1.434 0.156115
## raceOther / Prefer not to say 0.8235830 0.7290235 1.130 0.262628
## raceWhite or European American 0.4787787 0.4220262 1.134 0.260637
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.84 on 67 degrees of freedom
## (4 observations deleted due to missingness)
## Multiple R-squared: 0.1396, Adjusted R-squared: 0.03685
## F-statistic: 1.359 on 8 and 67 DF, p-value: 0.2309
summary(lm(alt.prefer~age+bmi+income+gender+race))
##
## Call:
## lm(formula = alt.prefer ~ age + bmi + income + gender + race)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.9488 -1.0976 0.1479 1.1086 2.0740
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.5716254 1.5468938 1.662 0.107
## age 0.0069635 0.0207305 0.336 0.739
## bmi 0.0001385 0.0159234 0.009 0.993
## income -0.0814206 0.1356379 -0.600 0.553
## genderMale 0.2258491 0.5327200 0.424 0.675
## genderPrefer not to say 1.1215103 1.8201641 0.616 0.542
## raceBlack or African American 0.2096313 1.1259339 0.186 0.854
## raceOther / Prefer not to say -0.8045913 1.5256276 -0.527 0.602
## raceWhite or European American 0.0427547 0.9843125 0.043 0.966
##
## Residual standard error: 1.419 on 31 degrees of freedom
## (40 observations deleted due to missingness)
## Multiple R-squared: 0.06156, Adjusted R-squared: -0.1806
## F-statistic: 0.2542 on 8 and 31 DF, p-value: 0.9759
summary(lm(alt.wtp~age+bmi+income+gender+race))
##
## Call:
## lm(formula = alt.wtp ~ age + bmi + income + gender + race)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.7371 -1.2446 -0.3678 0.8873 3.5550
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.639426 1.827596 0.897 0.377
## age -0.003061 0.024492 -0.125 0.901
## bmi -0.018754 0.018813 -0.997 0.327
## income 0.098066 0.160251 0.612 0.545
## genderMale -0.027444 0.629388 -0.044 0.965
## genderPrefer not to say 0.318253 2.150454 0.148 0.883
## raceBlack or African American 0.244173 1.330248 0.184 0.856
## raceOther / Prefer not to say -0.666202 1.802470 -0.370 0.714
## raceWhite or European American 0.007887 1.162927 0.007 0.995
##
## Residual standard error: 1.676 on 31 degrees of freedom
## (40 observations deleted due to missingness)
## Multiple R-squared: 0.05158, Adjusted R-squared: -0.1932
## F-statistic: 0.2108 on 8 and 31 DF, p-value: 0.9866