The code in this section imports and cleans the data. It is hidden by default, as many of the code chunks are; if you’d like to view them, click the “Show” button.
dat <- read.csv("~/Desktop/Junk/Google Drive/Research/--Research -- GLP1/02 Follow up initial study/GLP1 - Short - Part 2_October 8, 2025_19.56.csv", comment.char="#")
# First 11 lines are extra headers, preview data, labbies doing test runs
# The first ~40 real participants didn't see all of the questions because
# of a mistake in the JS code. Everyone past row 53 saw all the questions
dat=dat[-c(1:11,52:53),]
# Recode maps (named vectors that convert response text to numeric codes)
likely_map=c(
"Extremely unlikely" = 1,
"Somewhat unlikely" = 2,
"Neither likely nor unlikely" = 3,
"Somewhat likely" = 4,
"Extremely likely" = 5
)
agree_map=c(
"Strognly disagree" = 1,
"Somewhat disagree" = 2,
"Neither agree nor disagree" = 3,
"Somewhat agree" = 4,
"Strongly agree" = 5
)
wtp_map=c(
"Nothing more" = 0,
"$0.50 more" = .5,
"$1.00 more" = 1,
"$1.50 more" = 1.5,
"$2.00 more" = 2,
"$2.50 more" = 2.5,
"$3.00 more" = 3,
"$3.50 more" = 3.5,
"$4.00 more" = 4,
"$4.50 more" = 4.5,
"$5.00+ more" = 5
)
yes_no_map=c(
"Definitely no" = 1,
"Probably no" = 2,
"Not sure" = 3,
"Probably yes" = 4,
"Definitely yes" = 5
)
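As a quick illustration of how these lookup vectors behave (a minimal sketch; the response strings below are made-up examples, not survey data), indexing by response text returns the numeric code, and anything that doesn’t exactly match a name silently becomes NA, which is why the labels have to match the survey wording character for character:
# Hypothetical responses, purely to illustrate the lookup behavior:
example_responses <- c("Strongly agree", "Somewhat disagree", "strongly agree")
agree_map[example_responses]
# returns 5, 2, NA -- the lowercase "strongly agree" matches no name, so it maps to NA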
For now, we’ll split off the ~40 participants who saw all of the questions. The first ~40 still have some usable data, which we can fold back in later for research questions that don’t depend on the items they missed.
full.dat=subset(dat,dat$product.type != "")
# filter out that person who answered the attention/memory check incorrectly
full.dat=subset(dat,dat$product.type=="Smoothie Supplement Powder")
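If we later want to fold the earlier partial responders back in, one way to keep track of who saw what (a minimal sketch; the saw.all column is an assumption, not a field in the export) is an indicator based on the same blank product.type check used above:
# Hypothetical flag: TRUE for participants shown the full question set
# (assumes a blank product.type means the later block was never displayed)
dat$saw.all <- dat$product.type != ""
table(dat$saw.all)
# Items everyone saw can then be analyzed on all of dat; the later items on full.dat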
In the following figures, 1 = “Strongly disagree” and 5 = “Strongly agree.”
full.dat$effect.like.glp1=agree_map[full.dat$effect.like.glp1]
barplot(table(factor(full.dat$effect.like.glp1,levels=1:5)), main="Effects like GLP-1")
full.dat$similar.to.glp1=agree_map[full.dat$similar.to.glp1]
barplot(table(factor(full.dat$similar.to.glp1,levels=1:5)), main="Effects similar to GLP-1")
full.dat$works.like.glp1=agree_map[full.dat$works.like.glp1]
barplot(table(factor(full.dat$works.like.glp1,levels=1:5)), main="Works like GLP-1")
contain.glp1=rowMeans(cbind(
full.dat$effect.like.glp1,
full.dat$similar.to.glp1,
full.dat$works.like.glp1
),na.rm=T)
summary(lm(contain.glp1~full.dat$condition))
##
## Call:
## lm(formula = contain.glp1 ~ full.dat$condition)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.6667 -0.3111 0.2500 0.3556 1.3556
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.66667 0.26119 14.038 3.16e-15 ***
## full.dat$conditionmock -0.02222 0.32343 -0.069 0.946
## full.dat$conditionsmoothie_king 0.41667 0.33720 1.236 0.226
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7388 on 32 degrees of freedom
## (4 observations deleted due to missingness)
## Multiple R-squared: 0.07756, Adjusted R-squared: 0.01991
## F-statistic: 1.345 on 2 and 32 DF, p-value: 0.2748
No evidence that GLP-1 beliefs are higher or lower for the different products, though sample sizes are very small as of this writing.
prefer=rowMeans(
cbind(
likely_map[full.dat$purchase1],
agree_map[full.dat$purchase2],
agree_map[full.dat$appealing]
),na.rm=TRUE
)
alt.prefer=rowMeans(
cbind(
likely_map[dat$purchase1],
agree_map[dat$purchase2],
agree_map[dat$appealing]
),na.rm=TRUE
)
summary(lm(prefer~full.dat$condition))
##
## Call:
## lm(formula = prefer ~ full.dat$condition)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.98333 -1.25000 0.01667 1.15556 2.01667
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.9833 0.4166 7.161 2.02e-08 ***
## full.dat$conditionmock -0.1389 0.5378 -0.258 0.798
## full.dat$conditionsmoothie_king -0.7333 0.5455 -1.344 0.187
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.317 on 36 degrees of freedom
## Multiple R-squared: 0.05886, Adjusted R-squared: 0.006571
## F-statistic: 1.126 on 2 and 36 DF, p-value: 0.3356
summary(lm(alt.prefer~dat$condition))
##
## Call:
## lm(formula = alt.prefer ~ dat$condition)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.98485 -1.25000 0.01515 1.15556 2.01515
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.9848 0.3918 7.618 4.33e-09 ***
## dat$conditionmock -0.1404 0.5158 -0.272 0.787
## dat$conditionsmoothie_king -0.7348 0.5236 -1.404 0.169
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.299 on 37 degrees of freedom
## (40 observations deleted due to missingness)
## Multiple R-squared: 0.06039, Adjusted R-squared: 0.009596
## F-statistic: 1.189 on 2 and 37 DF, p-value: 0.3159
No evidence so far that there’s an overall preference for one of the products, but sample size is small at this point.
wtp=wtp_map[full.dat$wtp]
alt.wtp=wtp_map[dat$wtp]
summary(lm(alt.wtp~dat$condition))
##
## Call:
## lm(formula = alt.wtp ~ dat$condition)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.6818 -1.1947 -0.2333 0.7982 3.7667
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.6818 0.4694 3.583 0.000973 ***
## dat$conditionmock -0.4485 0.6179 -0.726 0.472538
## dat$conditionsmoothie_king -0.5747 0.6272 -0.916 0.365466
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.557 on 37 degrees of freedom
## (40 observations deleted due to missingness)
## Multiple R-squared: 0.02379, Adjusted R-squared: -0.02898
## F-statistic: 0.4508 on 2 and 37 DF, p-value: 0.6406
No evidence so far that anyone’s willing to pay more for one of the products. Again, though, the sample size is still small.
understand.glp1=rowMeans(
cbind(
agree_map[full.dat$understand1],
agree_map[full.dat$understand2],
agree_map[full.dat$understand3]
),na.rm=TRUE
)
alt.understand.glp1=rowMeans(
cbind(
agree_map[dat$understand1],
agree_map[dat$understand2],
agree_map[dat$understand3]
),na.rm=TRUE
)
hist(alt.understand.glp1,main="Self-reported understanding of GLP-1")
wttGLP1=yes_no_map[full.dat$wttGLP1]
alt.wttGLP1=yes_no_map[dat$wttGLP1]
barplot(table(factor(alt.wttGLP1,levels=1:5)), main="Willingness to try GLP-1")
taking.glp1=yes_no_map[full.dat$taking.glp1]
alt.taking.glp1=yes_no_map[dat$taking.glp1]
barplot(table(factor(alt.taking.glp1,levels=1:5)), main="Currently taking GLP-1")
know.any.glp1.peeps=yes_no_map[full.dat$know.any.glp1.peeps]
alt.know.any.glp1.peeps=yes_no_map[dat$know.any.glp1.peeps]
barplot(table(factor(alt.know.any.glp1.peeps,levels=1:5)), main="Know anyone on GLP-1")
script.required=yes_no_map[full.dat$script.required]
alt.script.required=yes_no_map[dat$script.required]
barplot(table(factor(alt.script.required,levels=1:5)), main="Believe GLP-1 req's script")
This correlation matrix includes only the ~40 participants who answered the “contains GLP-1” questions.
# install.packages("Hmisc") # if not already installed
library(Hmisc)
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
##
## format.pval, units
library(corrplot)
## corrplot 0.95 loaded
vars=cbind(contain.glp1,prefer,wtp,understand.glp1,wttGLP1,taking.glp1,know.any.glp1.peeps,script.required)
# Get correlations and p-values
res <- rcorr(vars)
# Plot only the lower triangle
corrplot(
res$r,
method = "color", # color-coded squares
type = "lower", # << shows only lower diagonal
addCoef.col = "black", # add r values
tl.col = "black", # text color for labels
tl.srt = 45, # rotate labels
p.mat = res$P, # matrix of p-values
sig.level = 0.05, # highlight p < .05
insig = "blank", # hide non-significant cells
diag = FALSE # optional: remove diagonal
)
A couple of takeaways:
The next plot includes all participants and excludes the “contains GLP-1” questions, since roughly half of the participants never saw those; that same group also didn’t see the “prefer” or “wtp” items, so those are excluded as well.
vars=cbind(alt.understand.glp1,alt.wttGLP1,alt.taking.glp1,alt.know.any.glp1.peeps,alt.script.required)
# Get correlations and p-values
res <- rcorr(vars)
# Plot only the lower triangle
corrplot(
res$r,
method = "color", # color-coded squares
type = "lower", # << shows only lower diagonal
addCoef.col = "black", # add r values
tl.col = "black", # text color for labels
tl.srt = 45, # rotate labels
p.mat = res$P, # matrix of p-values
sig.level = 0.05, # highlight p < .05
insig = "blank", # hide non-significant cells
diag = FALSE # optional: remove diagonal
)
age=as.numeric(dat$age)
bmi=(as.numeric(dat$weight) / (as.numeric(dat$height_1)^2)) * 703 # imperial BMI: 703 * weight (lb) / height (in)^2
income = as.numeric(factor(dat$income,
levels = c(
"$0 - $19,999",
"$20,000 - $39,999",
"$40,000 - $59,999",
"$60,000 - $89,999",
"$90,000 - $119,999",
"$120,000 - $149,999",
"$150,000+"
),
ordered = TRUE
))
gender=factor(dat$gender)
race <- dat$race
# Collapse any multi-race (contains a comma) or "prefer not to say" responses
race[grepl(",", race, ignore.case = TRUE) |
grepl("prefer not to say", race, ignore.case = TRUE)] <- "Other / Prefer not to say"
# Convert to factor
race <- factor(race)
summary(lm(alt.understand.glp1~age+bmi+income+gender+race))
##
## Call:
## lm(formula = alt.understand.glp1 ~ age + bmi + income + gender +
## race)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.55853 -0.55554 0.01634 0.49225 1.79485
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.4905304 0.6109856 4.076 0.000124 ***
## age 0.0008837 0.0086300 0.102 0.918746
## bmi 0.0180331 0.0077534 2.326 0.023063 *
## income 0.0597941 0.0587782 1.017 0.312678
## genderMale -0.3062500 0.2110677 -1.451 0.151458
## genderPrefer not to say -1.1555626 1.0545312 -1.096 0.277087
## raceBlack or African American 0.7506907 0.5233551 1.434 0.156115
## raceOther / Prefer not to say 0.8235830 0.7290235 1.130 0.262628
## raceWhite or European American 0.4787787 0.4220262 1.134 0.260637
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.84 on 67 degrees of freedom
## (4 observations deleted due to missingness)
## Multiple R-squared: 0.1396, Adjusted R-squared: 0.03685
## F-statistic: 1.359 on 8 and 67 DF, p-value: 0.2309
summary(lm(alt.prefer~age+bmi+income+gender+race))
##
## Call:
## lm(formula = alt.prefer ~ age + bmi + income + gender + race)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.9488 -1.0976 0.1479 1.1086 2.0740
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.5716254 1.5468938 1.662 0.107
## age 0.0069635 0.0207305 0.336 0.739
## bmi 0.0001385 0.0159234 0.009 0.993
## income -0.0814206 0.1356379 -0.600 0.553
## genderMale 0.2258491 0.5327200 0.424 0.675
## genderPrefer not to say 1.1215103 1.8201641 0.616 0.542
## raceBlack or African American 0.2096313 1.1259339 0.186 0.854
## raceOther / Prefer not to say -0.8045913 1.5256276 -0.527 0.602
## raceWhite or European American 0.0427547 0.9843125 0.043 0.966
##
## Residual standard error: 1.419 on 31 degrees of freedom
## (40 observations deleted due to missingness)
## Multiple R-squared: 0.06156, Adjusted R-squared: -0.1806
## F-statistic: 0.2542 on 8 and 31 DF, p-value: 0.9759
summary(lm(alt.wtp~age+bmi+income+gender+race))
##
## Call:
## lm(formula = alt.wtp ~ age + bmi + income + gender + race)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.7371 -1.2446 -0.3678 0.8873 3.5550
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.639426 1.827596 0.897 0.377
## age -0.003061 0.024492 -0.125 0.901
## bmi -0.018754 0.018813 -0.997 0.327
## income 0.098066 0.160251 0.612 0.545
## genderMale -0.027444 0.629388 -0.044 0.965
## genderPrefer not to say 0.318253 2.150454 0.148 0.883
## raceBlack or African American 0.244173 1.330248 0.184 0.856
## raceOther / Prefer not to say -0.666202 1.802470 -0.370 0.714
## raceWhite or European American 0.007887 1.162927 0.007 0.995
##
## Residual standard error: 1.676 on 31 degrees of freedom
## (40 observations deleted due to missingness)
## Multiple R-squared: 0.05158, Adjusted R-squared: -0.1932
## F-statistic: 0.2108 on 8 and 31 DF, p-value: 0.9866