0) Intro Notes:

-Hi Frances! This HTML file illustrates the initial data cleaning and primarily shows output. Several large code chunks have been hidden from the HTML file to improve readability, but I can re-include them if you’d like.

1) Packages & R Markdown Setup

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.7     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(naniar)
library(gtsummary)
knitr::opts_chunk$set(include = FALSE)

2) Data Read-In

-Code hidden

Dimensions of Initial Data Set

dim(df)
## [1] 892 616
#892 people with 616 variables

3) Inattentive Responders

## 
## Do not use my data. I did not devote my full attention. 
##                                                      87 
##               Use my data. I devoted my full attention. 
##                                                     708
## 
##   A little likely A little unlikely            Likely          Unlikely 
##                23                21                11                16 
##       Very likely     Very unlikely 
##                23               746
## 
##    1 - Agree Strongly    2 - Agree Somewhat 3 - Disagree Somewhat 
##                    14                    42                    23 
## 4 - Disagree Strongly 
##                   755
## 
## (1) Not at all   (2) A little   (3) Somewhat       (4) Well  (5) Very well 
##              3              5             18              9            235
## 
## (1) Not at all   (2) A little   (3) Somewhat       (4) Well  (5) Very well 
##              2              3             15             17            236
## 
## (1) Not at all   (2) A little   (3) Somewhat       (4) Well  (5) Very well 
##              3             10             12             12            229
## # A tibble: 1 × 14
##   Total_Participants Failed_ATTN_Checks_Count Failed_ATTN_Chec… Data_Use_Exclud…
##                <int>                    <dbl>             <dbl>            <dbl>
## 1                892                      168             0.199               87
## # … with 10 more variables: Data_Use_Exclude_Percent <dbl>,
## #   BCaffEQ_19_Exclude_Count <dbl>, BCaffEQ_19_Exclude_Percent <dbl>,
## #   UPPS_P_57_Exclude_Count <dbl>, UPPS_P_57_Exclude_Percent <dbl>,
## #   SMS_ATTN_Neg_Exclude_Count <dbl>, SMS_ATTN_Neg_Exclude_Percent <dbl>,
## #   SMS8_Neutral_Exclude_Count <dbl>, SMS8_Neutral_Exclude_Percent <dbl>,
## #   SMS8_Positive_Exclude_Count <dbl>
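
-Sketch (not the hidden code): the chunk for this section is hidden, so the following is only a minimal illustration of how a failed-attention-check flag and the summary counts could be built. The column names `Data_Use` and `Check_Item`, the passing answers, and the toy responses are all hypothetical; the real item names and pass criteria are not shown in this report.

```r
# Toy data: hypothetical columns standing in for the real attention-check items
toy <- data.frame(
  Data_Use   = c("Use", "Do not use", "Use", "Use", NA),
  Check_Item = c("Very unlikely", "Very unlikely", "Likely",
                 "Very unlikely", "Very unlikely"),
  stringsAsFactors = FALSE
)

# A check is failed when a response is present and is not the passing answer
fail_data_use <- !is.na(toy$Data_Use)   & toy$Data_Use   != "Use"
fail_item     <- !is.na(toy$Check_Item) & toy$Check_Item != "Very unlikely"

# Flag anyone who failed at least one check
toy$Failed_Any <- fail_data_use | fail_item

attn_summary <- data.frame(
  Total_Participants       = nrow(toy),
  Failed_ATTN_Checks_Count = sum(toy$Failed_Any),
  Failed_ATTN_Checks_Prop  = mean(toy$Failed_Any)
)
print(attn_summary)
```

-The choice of denominator matters here: 168 / 892 is about 0.188, so the 0.199 in the summary row above presumably divides by a smaller count than the full 892 (this is a guess, since the code is hidden).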

4) People Who are Missing (have “NA” for) Condition Variable

-Code hidden.

5) Removing Non-Drinkers

## [1] 597 624
## 
## Four or more times a week           Monthly or less                     Never 
##                        22                       171                       169 
## Two to four times a month Two to three times a week 
##                       129                       106
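
-Sketch (not the hidden code): the frequency table above sums to 597, and dropping the 169 "Never" responders leaves 428, which matches the sample size used in the later tables. A minimal base R illustration of that filter, rebuilt from the printed counts, is below. The column name `Drink_Freq` is hypothetical.

```r
# Rebuild the drinking-frequency responses from the counts printed above
freq_counts <- c(
  "Four or more times a week" = 22,
  "Monthly or less"           = 171,
  "Never"                     = 169,
  "Two to four times a month" = 129,
  "Two to three times a week" = 106
)
toy <- data.frame(Drink_Freq = rep(names(freq_counts), times = freq_counts))

# Keep only participants who report any drinking
drinkers <- toy[toy$Drink_Freq != "Never", , drop = FALSE]

nrow(toy)       # 597
nrow(drinkers)  # 428
```

-With real data, missing responses would need their own rule (for example `!is.na(Drink_Freq)`), because `!=` returns NA for them.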

5) Demographics

Demographics Missingness & Table

## Warning: `gather_()` was deprecated in tidyr 1.2.0.
## Please use `gather()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.

## Warning: The `fmt_missing()` function is deprecated and will soon be removed
## * Use the `sub_missing()` function instead
Characteristic                      Full Sample         By Condition
                                    N = 428¹            Negative, N = 139¹  Neutral, N = 149¹   Positive, N = 140¹
Age                                 M(SD)=19.49(1.93)   M(SD)=19.70(2.18)   M(SD)=19.52(1.82)   M(SD)=19.25(1.75)
Sex-at-Birth
  Female                            247 (58%)           75 (54%)            100 (67%)           72 (51%)
  Male                              181 (42%)           64 (46%)            49 (33%)            68 (49%)
Gender
  Female                            248 (58%)           75 (54%)            101 (68%)           72 (51%)
  Male                              179 (42%)           63 (45%)            48 (32%)            68 (49%)
  Non-binary                        1 (0.2%)            1 (0.7%)            0 (0%)              0 (0%)
Sexual Orientation
  Asexual                           3 (0.7%)            2 (1.4%)            1 (0.7%)            0 (0%)
  Bisexual                          22 (5.1%)           3 (2.2%)            8 (5.4%)            11 (7.9%)
  Heterosexual                      391 (91%)           130 (94%)           136 (91%)           125 (89%)
  Homosexual                        12 (2.8%)           4 (2.9%)            4 (2.7%)            4 (2.9%)
Race/Ethnicity
  American Indian or Alaska Native  5 (1.2%)            2 (1.4%)            3 (2.0%)            0 (0%)
  Asian                             8 (1.9%)            1 (0.7%)            5 (3.4%)            2 (1.4%)
  Black or African American         19 (4.5%)           6 (4.3%)            7 (4.7%)            6 (4.3%)
  Hispanic or Latino                30 (7.0%)           10 (7.2%)           10 (6.7%)           10 (7.2%)
  Middle Eastern                    2 (0.5%)            0 (0%)              1 (0.7%)            1 (0.7%)
  Multiracial                       10 (2.3%)           4 (2.9%)            4 (2.7%)            2 (1.4%)
  White (non-Hispanic)              352 (83%)           116 (83%)           119 (80%)           117 (85%)
Student Status
  Yes                               428 (100%)          139 (100%)          149 (100%)          140 (100%)
Student Year
  Freshman                          241 (56%)           71 (51%)            80 (54%)            90 (64%)
  Junior                            46 (11%)            14 (10%)            21 (14%)            11 (7.9%)
  Senior                            39 (9.1%)           18 (13%)            12 (8.1%)           9 (6.4%)
  Sophomore                         102 (24%)           36 (26%)            36 (24%)            30 (21%)
¹ M(SD) = Mean(SD); n (%)

Demographic Statistics

##      
##       Negative Neutral Positive
##   Yes      139     149      140
## # A tibble: 8 × 4
##   Variable                df `Chi_Square/F_Value` p_value
##   <chr>                <int>                <dbl>   <dbl>
## 1 SAB.f                    2                 8.46  0.0145
## 2 Gender.f                 4                11.2   0.0244
## 3 Marital_Status.f         6                11.4   0.0758
## 4 Student_Year.f           6                 9.27  0.159 
## 5 Sexual_Orientation.f     6                 6.64  0.355 
## 6 Employment.f             6                 5.93  0.431 
## 7 Native_Language.f        6                 4.87  0.561 
## 8 Race_Ethnicity.f        12                 7.44  0.827
##              Df Sum Sq Mean Sq F value Pr(>F)
## Condition     2   14.5   7.242   1.961  0.142
## Residuals   424 1566.2   3.694               
## 1 observation deleted due to missingness
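
-Sketch (not the hidden code): as a check on the first row of the chi-square tibble, the Sex-at-Birth test can be rebuilt from the counts printed in the demographics table. This assumes the reported statistic is a plain Pearson chi-square on the 2 x 3 table of Sex-at-Birth by Condition; the hidden code may have been written differently.

```r
# Sex-at-birth counts by condition, copied from the demographics table
sab <- matrix(
  c(75, 100, 72,
    64,  49, 68),
  nrow = 2, byrow = TRUE,
  dimnames = list(SAB       = c("Female", "Male"),
                  Condition = c("Negative", "Neutral", "Positive"))
)

# Pearson chi-square test of independence
# (chisq.test only applies a continuity correction to 2 x 2 tables)
res <- chisq.test(sab)

round(unname(res$statistic), 2)  # chi-square, about 8.46
unname(res$parameter)            # df = 2
round(res$p.value, 4)            # about 0.0145
```

-These values agree with the SAB.f row above (df = 2, chi-square = 8.46, p = 0.0145).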

6) Substance Use

-Note: Some code has been hidden to improve readability

Substance Use Missingness & Table

-NOTE: Non-drinkers have been filtered out of this data-set

-Substance Use Missingness

-Substance Use Descriptives

## Warning: The `fmt_missing()` function is deprecated and will soon be removed
## * Use the `sub_missing()` function instead
Characteristic              Full Sample         By Condition
                            N = 428¹            Negative, N = 139¹  Neutral, N = 149¹   Positive, N = 140¹
Drinking Frequency
  Never                     0 (0%)              0 (0%)              0 (0%)              0 (0%)
  Monthly or less           171 (40%)           65 (47%)            58 (39%)            48 (34%)
  2-4x/month                129 (30%)           33 (24%)            48 (32%)            48 (34%)
  2-3x/week                 106 (25%)           32 (23%)            38 (26%)            36 (26%)
  4+ x/week                 22 (5.1%)           9 (6.5%)            5 (3.4%)            8 (5.7%)
Drinking Quantity
  1-2                       150 (35%)           53 (38%)            56 (38%)            41 (29%)
  3-4                       148 (35%)           44 (32%)            53 (36%)            51 (36%)
  5-6                       81 (19%)            28 (20%)            29 (19%)            24 (17%)
  7-9                       40 (9.3%)           12 (8.6%)           8 (5.4%)            20 (14%)
  10+                       9 (2.1%)            2 (1.4%)            3 (2.0%)            4 (2.9%)
Binge Drinking Frequency
  Never                     155 (36%)           49 (35%)            63 (42%)            43 (31%)
  < Monthly                 141 (33%)           49 (35%)            46 (31%)            46 (33%)
  Monthly                   77 (18%)            23 (17%)            27 (18%)            27 (19%)
  Weekly                    54 (13%)            17 (12%)            13 (8.7%)           24 (17%)
  Daily or ~Daily           1 (0.2%)            1 (0.7%)            0 (0%)              0 (0%)
AUDIT Total                 M(SD)=6.5(5.2)      M(SD)=6.4(5.0)      M(SD)=6.0(5.0)      M(SD)=7.2(5.4)
DUDIT_Total                 M(SD)=2.4(4.6)      M(SD)=2.2(4.6)      M(SD)=2.6(5.4)      M(SD)=2.4(3.8)
AUD Criteria Endorsed       M(SD)=2.10(2.14)    M(SD)=1.96(2.07)    M(SD)=2.18(2.30)    M(SD)=2.16(2.06)
SUD Criteria Endorsed       M(SD)=1.28(2.26)    M(SD)=0.95(1.69)    M(SD)=1.35(2.44)    M(SD)=1.52(2.53)
AUD Diagnostic Status
  Mild                      126 (29%)           29 (21%)            41 (28%)            56 (40%)
  Moderate                  66 (15%)            25 (18%)            24 (16%)            17 (12%)
  None                      203 (47%)           76 (55%)            71 (48%)            56 (40%)
  Severe                    33 (7.7%)           9 (6.5%)            13 (8.7%)           11 (7.9%)
SUD Diagnostic Status
  Mild                      62 (14%)            16 (12%)            18 (12%)            28 (20%)
  Moderate                  28 (6.5%)           12 (8.6%)           10 (6.7%)           6 (4.3%)
  None                      312 (73%)           108 (78%)           110 (74%)           94 (67%)
  Severe                    26 (6.1%)           3 (2.2%)            11 (7.4%)           12 (8.6%)
¹ n (%); M(SD) = Mean(SD)

-Chi-Square Test of Categorical SUD Variables by Condition

## # A tibble: 7 × 4
##   Variable              df `Chi_Square/F_Value` p_value
##   <chr>              <int>                <dbl>   <dbl>
## 1 MINI_AUD_Dx            6                14.2   0.0272
## 2 MINI_SUD_Dx            6                12.9   0.0452
## 3 Favorite_Caff.f        8                 9.91  0.271 
## 4 AUDIT2.f               8                 9.80  0.280 
## 5 AUDIT1.f               6                 7.43  0.283 
## 6 AUDIT3.f               8                 9.59  0.295 
## 7 Favorite_Alcohol.f     6                 3.82  0.701

-ANOVAs for Continuous SUD Variables by Condition

##       Variable   F_value df_n df_d    p_value
## 1 MINI_SUD_Sum 2.3582996    2  425 0.09581754
## 2    AUDIT_Sum 2.0582221    2  425 0.12895159
## 3 MINI_AUD_Sum 0.4344687    2  425 0.64789596
## 4    DUDIT_Sum 0.2091036    2  425 0.81139458
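
-Sketch (not the hidden code): a table with these columns (Variable, F_value, df_n, df_d, p_value) could be produced by fitting one one-way ANOVA per outcome and stacking the results. The helper `anova_row()`, the outcome names `Score_A` and `Score_B`, and the simulated data are all hypothetical.

```r
set.seed(1)  # simulated data only; not the study data

toy <- data.frame(
  Condition = factor(rep(c("Negative", "Neutral", "Positive"), each = 20)),
  Score_A   = rnorm(60),
  Score_B   = rnorm(60)
)

# Fit a one-way ANOVA for one outcome and pull out F, both df, and p
anova_row <- function(var, data) {
  fit <- aov(reformulate("Condition", response = var), data = data)
  tab <- summary(fit)[[1]]  # ANOVA table: row 1 = Condition, row 2 = Residuals
  data.frame(
    Variable = var,
    F_value  = tab[["F value"]][1],
    df_n     = tab[["Df"]][1],
    df_d     = tab[["Df"]][2],
    p_value  = tab[["Pr(>F)"]][1]
  )
}

# One row per outcome, sorted by p-value as in the output above
results <- do.call(rbind, lapply(c("Score_A", "Score_B"), anova_row, data = toy))
results[order(results$p_value), ]
```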

7) Mood Induction

Mood Induction Effectiveness Across Conditions

Paired T-tests of mood valence pre-post induction

##    Variable    T_stat T_df    T_p_value    T_Mdiff
## t  Negative 11.417252  138 1.136635e-21  2.3453237
## t1  Neutral -3.710182  148 2.924186e-04 -0.6644295
## t2 Positive -9.287051  139 2.901962e-16 -1.6000000
## Adding missing grouping variables: `Condition`
## # A tibble: 3 × 5
##   Condition AG1_Valence_M AG2_Valence_M AG1_Valence_SD AG2_Valence_SD
##   <fct>             <dbl>         <dbl>          <dbl>          <dbl>
## 1 Negative           5.71          3.36           2.05           2.07
## 2 Neutral            5.79          6.46           2.10           1.98
## 3 Positive           5.39          6.99           2.00           1.87
## # A tibble: 3 × 3
##   Condition SD_Ratio Cor_Ratio
##   <fct>        <dbl>     <dbl>
## 1 Negative     0.991     0.308
## 2 Neutral      1.06      0.426
## 3 Positive     1.07      0.448
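
-Sketch (not the hidden code): the paired t statistic for the Negative condition can be approximately recovered from the summary values printed above. This assumes that `Cor_Ratio` is the pre-post correlation of the valence ratings and that the mean difference is AG1 minus AG2; neither is confirmed by the visible output. The inputs are rounded, so the result will only be close to the reported value.

```r
# Summary values for the Negative condition, copied from the output above
m_pre  <- 5.71; m_post  <- 3.36   # AG1 and AG2 valence means
sd_pre <- 2.05; sd_post <- 2.07   # AG1 and AG2 valence SDs
r      <- 0.308                   # ASSUMED to be the pre-post correlation (Cor_Ratio)
n      <- 139                     # T_df + 1

# SD of the paired differences, then the paired t statistic
sd_diff <- sqrt(sd_pre^2 + sd_post^2 - 2 * r * sd_pre * sd_post)
t_stat  <- (m_pre - m_post) / (sd_diff / sqrt(n))

round(t_stat, 2)  # close to the reported T_stat of 11.42
```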

8) UPPS-P: Negative Urgency & Positive Urgency Info

Urgency Missingness, Condition T-tests, Means, & SDs (Note: some code hidden)

## # A tibble: 3 × 7
##   Condition NU_Avg_M PU_Avg_M NU_Avg_med PU_Avg_med NU_Avg_SD PU_Avg_SD
##   <fct>        <dbl>    <dbl>      <dbl>      <dbl>     <dbl>     <dbl>
## 1 Negative      2.32     1.91       2.33       1.93     0.603     0.549
## 2 Neutral       2.26     1.81       2.25       1.64     0.625     0.582
## 3 Positive      2.38     1.96       2.33       1.93     0.578     0.583
##              Df Sum Sq Mean Sq F value Pr(>F)
## Condition     2   0.86  0.4303   1.185  0.307
## Residuals   425 154.39  0.3633
##              Df Sum Sq Mean Sq F value Pr(>F)  
## Condition     2   1.67  0.8331   2.549 0.0794 .
## Residuals   424 138.59  0.3269                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1 observation deleted due to missingness

9) Momentary Distress Intolerance and Mindfulness Measures

-Note: Some code hidden

## # A tibble: 3 × 5
##   Condition MAAS_M MAAS_SD MDIS_Pre_M MDIS_Pre_SD
##   <fct>      <dbl>   <dbl>      <dbl>       <dbl>
## 1 Negative    3.68    3.68       3.16        1.38
## 2 Neutral     3.67    3.67       2.75        1.25
## 3 Positive    3.59    3.59       2.91        1.11
##              Df Sum Sq Mean Sq F value Pr(>F)
## Condition     2   0.65  0.3247    0.49  0.613
## Residuals   423 280.11  0.6622               
## 2 observations deleted due to missingness
##              Df Sum Sq Mean Sq F value Pr(>F)  
## Condition     2    8.2   4.116   2.827 0.0603 .
## Residuals   425  618.7   1.456                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##              Df Sum Sq Mean Sq F value Pr(>F)  
## Condition     2   12.0   6.020   3.849 0.0221 *
## Residuals   424  663.2   1.564                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1 observation deleted due to missingness
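
-Sketch (not the hidden code): in the summary tibble above, MAAS_SD is identical to MAAS_M in every row, which suggests the SD column may have been computed with `mean()` by mistake. If the first ANOVA above is the MAAS model, its residual mean square of 0.6622 would imply a pooled SD of roughly 0.81. Below is a minimal base R illustration of computing the two columns separately, using simulated scores.

```r
set.seed(2)  # simulated scores only; not the study data

toy <- data.frame(
  Condition = factor(rep(c("Negative", "Neutral", "Positive"), each = 30)),
  MAAS_Avg  = rnorm(90, mean = 3.6, sd = 0.8)
)

# Mean and SD per condition, each computed with its own function
maas_summary <- data.frame(
  Condition = levels(toy$Condition),
  MAAS_M    = as.numeric(tapply(toy$MAAS_Avg, toy$Condition, mean, na.rm = TRUE)),
  MAAS_SD   = as.numeric(tapply(toy$MAAS_Avg, toy$Condition, sd,   na.rm = TRUE))
)
maas_summary

# Sanity check: an SD column should not simply duplicate the mean column
stopifnot(!isTRUE(all.equal(maas_summary$MAAS_M, maas_summary$MAAS_SD)))
```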

Joining Dataframes

## Rows: 428
## Columns: 35
## $ ID                   <int> 23, 24, 27, 28, 47, 48, 49, 50, 51, 52, 53, 54, 5…
## $ Condition            <fct> Neutral, Negative, Neutral, Negative, Negative, N…
## $ Age                  <dbl> 22, 21, 21, 24, 20, 19, 19, 19, 20, 19, 19, 20, 2…
## $ SAB.f                <fct> Female, Male, Female, Female, Male, Male, Female,…
## $ Gender.f             <fct> Female, Male, Female, Non-binary, Male, Male, Fem…
## $ Sexual_Orientation.f <fct> Heterosexual, Heterosexual, Heterosexual, Asexual…
## $ Race_Ethnicity.f     <fct> Hispanic or Latino, White (non-Hispanic), White (…
## $ Student_Status.f     <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes,…
## $ Student_Year.f       <fct> Senior, Junior, Senior, Senior, Sophomore, Freshm…
## $ Marital_Status.f     <fct> Single, Single, Single, Married, Single, Single, …
## $ Employment.f         <fct> Unemployed, Employed 1-20 hours per week, Employe…
## $ Native_Language.f    <fct> English, English, English, English, English, Engl…
## $ AUDIT1.f             <fct> 2-4x/month, Monthly or less, 2-3x/week, 2-4x/mont…
## $ AUDIT2.f             <fct> 3-4, 1-2, 1-2, 3-4, 3-4, 3-4, 3-4, 5-6, 1-2, 1-2,…
## $ AUDIT3.f             <fct> < Monthly, Never, < Monthly, < Monthly, Never, Mo…
## $ AUDIT_Sum            <dbl> 6, 1, 6, 14, 3, 10, 4, 9, 1, 1, 3, 7, 8, 2, 20, 1…
## $ DUDIT_Sum            <dbl> 0, 9, 4, 24, 0, 3, 0, 0, 0, 0, 0, 3, 3, 0, 1, 0, …
## $ MINI_AUD_Sum         <dbl> 7, 0, 9, 6, 1, 1, 1, 3, 0, 0, 2, 4, 4, 2, 7, 0, 1…
## $ MINI_SUD_Sum         <dbl> 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 3, 0, 3, 0, 5…
## $ MINI_AUD_Dx          <fct> Severe, None, Severe, Severe, None, None, None, M…
## $ MINI_SUD_Dx          <fct> None, None, None, None, None, None, None, None, N…
## $ AG1                  <dbl> 16, 42, 73, 77, 61, 62, 59, 60, 52, 60, 52, 62, 5…
## $ AG2                  <dbl> 41, 40, 73, 73, 56, 35, 41, 29, 52, 21, 62, 68, 3…
## $ AG1_Valence          <dbl> 7, 6, 1, 5, 7, 8, 5, 6, 7, 6, 7, 8, 8, 9, 5, 7, 7…
## $ AG1_Arousal          <dbl> 2, 5, 9, 9, 7, 7, 7, 7, 6, 7, 6, 7, 6, 6, 9, 7, 7…
## $ AG2_Valence          <dbl> 5, 4, 1, 1, 2, 8, 5, 2, 7, 3, 8, 5, 3, 9, 8, 4, 7…
## $ AG2_Arousal          <dbl> 5, 5, 9, 9, 7, 4, 5, 4, 6, 3, 7, 8, 5, 7, 7, 4, 9…
## $ NU_Avg               <dbl> 2.083333, 1.250000, 1.666667, 2.833333, 2.833333,…
## $ PU_Avg               <dbl> 1.500000, 1.142857, 1.142857, 3.000000, 2.428571,…
## $ SS_Avg               <dbl> 3.250000, 2.916667, 1.583333, 2.750000, 3.166667,…
## $ LoPM_Avg             <dbl> 2.000000, 1.545455, 1.363636, 2.818182, 1.818182,…
## $ LoPER_Avg            <dbl> 1.5, 1.4, 2.4, 2.3, 2.0, 2.1, 1.4, 1.6, 1.5, 1.4,…
## $ MDIS_Pre_Avg         <dbl> 1.333333, 1.333333, 5.000000, 4.000000, 4.333333,…
## $ MDIS_Post_Avg        <dbl> 2.666667, 1.666667, 5.666667, 7.000000, 4.666667,…
## $ MAAS_Avg             <dbl> 3.400000, 5.266667, 2.400000, 2.666667, 2.866667,…
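
-Sketch (not the hidden code): one way to combine per-section data frames into a single frame keyed on `ID`, in base R. The data frame names `demo_df`, `sub_df`, and `mood_df` are hypothetical; the few values are the first entries shown in the output above.

```r
# Hypothetical per-section data frames that share an ID column
demo_df <- data.frame(ID = c(23, 24, 27), Age = c(22, 21, 21))
sub_df  <- data.frame(ID = c(23, 24, 27), AUDIT_Sum = c(6, 1, 6))
mood_df <- data.frame(ID = c(23, 24, 27), AG1_Valence = c(7, 6, 1))

# Merge the list pairwise on ID, keeping every row of the first data frame
full_toy <- Reduce(
  function(x, y) merge(x, y, by = "ID", all.x = TRUE),
  list(demo_df, sub_df, mood_df)
)

dim(full_toy)               # 3 rows, 4 columns
anyDuplicated(full_toy$ID)  # 0 means the joins did not duplicate any ID
```

-Checking the row count and duplicated IDs after each join is a cheap way to confirm the result still has one row per participant (428 here).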

10) Correlations Amongst Measures

Each figure shows univariate distributions along the diagonal, correlations with p-values above the diagonal, and bivariate scatterplots with LOESS curves below the diagonal.

-Full Data Set

-For Negative condition

-For Neutral condition

-For Positive condition
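
-Sketch (not the plotting code): the correlations and p-values shown in the upper panels of such figures can be tabulated with `cor.test()`. The scores below are simulated, and Pearson correlations are an assumption, since the figure code is not shown.

```r
set.seed(3)  # simulated scores only; not the study data

toy <- data.frame(
  NU_Avg   = rnorm(50),
  PU_Avg   = rnorm(50),
  MAAS_Avg = rnorm(50)
)

# Every unordered pair of measures
pair_list <- combn(names(toy), 2, simplify = FALSE)

# Pearson correlation and its p-value for each pair
cor_table <- do.call(rbind, lapply(pair_list, function(p) {
  ct <- cor.test(toy[[p[1]]], toy[[p[2]]])
  data.frame(Var1 = p[1], Var2 = p[2],
             r = unname(ct$estimate), p_value = ct$p.value)
}))
cor_table
```

-Running the same code on each condition's subset would give the per-condition values.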

11) Writing data to csv

-Code below hashed out to prevent the csv from being re-written every time the markdown is published.

#Full_df %>% write_csv("/Users/noahwolkowicz/Desktop/CT/West Haven/Postdoc/Postdoc Research/F&N #Collab/FN_Collab_6.28.22.csv")
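
-Sketch: an alternative to hashing the line out is to guard the write with a flag, so the call stays in the script but only runs when asked. The flag name `write_output`, the toy data frame, and the temporary output path are hypothetical. Setting the knitr chunk option `eval = FALSE` on the chunk is another way to get the same effect.

```r
write_output <- FALSE  # flip to TRUE only when a fresh csv is wanted

toy      <- data.frame(ID = c(23, 24), Condition = c("Neutral", "Negative"))
out_path <- file.path(tempdir(), "example_output.csv")  # hypothetical path

# The file is written only when the flag is switched on
if (write_output) {
  write.csv(toy, out_path, row.names = FALSE)
}
```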