AI Experiment Analysis

Loading Libraries

library(afex) # to run the ANOVA and plot results
library(psych) # for the describe() command
library(ggplot2) # to visualize our results
library(expss) # for the cross_cases() command
library(car) # for the leveneTest() command
library(emmeans) # for posthoc tests
library(effsize) # for the cohen.d() command
library(apaTables) # to create our correlation table
library(kableExtra) # to create our correlation table
library(sjPlot) # to visualize our results

Importing Data

# import your AI results dataset
d <- read.csv(file = "Data/FINAL RESULTS.csv", header = TRUE)

State Your Hypotheses & Chosen Tests

H1: I predict that participants who are more mindful will report lower levels of stress than participants who are less mindful.

H2: I predict that male participants will report lower levels of stress than female participants.

I will run an independent-samples t-test for each hypothesis.

Check Your Variables

This is just basic variable checking that is used across all HW assignments.

# to view stats for all variables
describe(d)
           vars   n  mean    sd median trimmed   mad min   max range  skew
id            1 100 50.50 29.01   50.5   50.50 37.06   1 100.0  99.0  0.00
identity*     2 100 50.50 29.01   50.5   50.50 37.06   1 100.0  99.0  0.00
consent*      3 100  1.49  0.50    1.0    1.49  0.00   1   2.0   1.0  0.04
age           4 100 35.79 10.75   34.0   34.79  2.97  15  82.0  67.0  1.49
race          5 100  4.97  1.47    6.0    5.10  0.00   1   7.0   6.0 -0.70
gender        6 100  1.19  0.39    1.0    1.11  0.00   1   2.0   1.0  1.56
manip_out*    7 100 48.05 25.94   50.5   48.69 30.39   1  90.0  89.0 -0.20
survey1       8 100  3.37  0.21    3.4    3.37  0.30   3   3.8   0.8  0.13
ai_manip*     9 100 50.50 29.01   50.5   50.50 37.06   1 100.0  99.0  0.00
condition    10 100  1.50  0.50    1.5    1.50  0.74   1   2.0   1.0  0.00
           kurtosis   se
id            -1.24 2.90
identity*     -1.24 2.90
consent*      -2.02 0.05
age            3.98 1.07
race          -1.08 0.15
gender         0.43 0.04
manip_out*    -1.26 2.59
survey1       -1.15 0.02
ai_manip*     -1.24 2.90
condition     -2.02 0.05
# we'll use the describeBy() command to view skew and kurtosis across our IVs
describeBy(d, group = "condition")

 Descriptive statistics by group 
condition: 1
          vars  n  mean    sd median trimmed   mad min   max range  skew
id           1 50 25.50 14.58   25.5   25.50 18.53   1  50.0  49.0  0.00
identity     2 50 52.18 31.01   53.5   52.55 43.00   2 100.0  98.0 -0.08
consent      3 50  1.54  0.50    2.0    1.55  0.00   1   2.0   1.0 -0.16
age          4 50 35.74  9.86   34.0   34.65  2.97  15  72.0  57.0  1.53
race         5 50  4.66  1.45    6.0    4.72  0.00   2   6.0   4.0 -0.23
gender       6 50  1.14  0.35    1.0    1.05  0.00   1   2.0   1.0  2.01
manip_out    7 50 29.88 16.16   30.5   30.12 18.53   1  64.0  63.0 -0.10
survey1      8 50  3.35  0.21    3.4    3.34  0.30   3   3.8   0.8  0.11
ai_manip     9 50 53.24 26.06   55.0   53.70 28.91   5 100.0  95.0 -0.14
condition   10 50  1.00  0.00    1.0    1.00  0.00   1   1.0   0.0   NaN
          kurtosis   se
id           -1.27 2.06
identity     -1.43 4.39
consent      -2.01 0.07
age           3.62 1.39
race         -1.81 0.21
gender        2.10 0.05
manip_out    -0.98 2.29
survey1      -1.16 0.03
ai_manip     -0.94 3.69
condition      NaN 0.00
------------------------------------------------------------ 
condition: 2
          vars  n  mean    sd median trimmed   mad  min   max range  skew
id           1 50 75.50 14.58   75.5   75.50 18.53 51.0 100.0  49.0  0.00
identity     2 50 48.82 27.08   47.5   48.42 30.39  1.0  99.0  98.0  0.06
consent      3 50  1.44  0.50    1.0    1.43  0.00  1.0   2.0   1.0  0.23
age          4 50 35.84 11.67   34.0   34.92  4.45 15.0  82.0  67.0  1.40
race         5 50  5.28  1.44    6.0    5.47  0.00  1.0   7.0   6.0 -1.24
gender       6 50  1.24  0.43    1.0    1.18  0.00  1.0   2.0   1.0  1.18
manip_out    7 50 66.22 20.60   71.0   70.40 11.86  9.0  90.0  81.0 -1.74
survey1      8 50  3.40  0.22    3.4    3.39  0.30  3.1   3.8   0.7  0.12
ai_manip     9 50 47.76 31.72   41.5   47.25 42.25  1.0  99.0  98.0  0.16
condition   10 50  2.00  0.00    2.0    2.00  0.00  2.0   2.0   0.0   NaN
          kurtosis   se
id           -1.27 2.06
identity     -1.07 3.83
consent      -1.98 0.07
age           3.67 1.65
race          0.28 0.20
gender       -0.62 0.06
manip_out     2.38 2.91
survey1      -1.26 0.03
ai_manip     -1.46 4.49
condition      NaN 0.00
# also use histograms and scatterplots to examine your continuous variables
hist(d$survey1)

# and boxplot to examine any categorical variables with continuous variables
boxplot(d$survey1~d$gender)

# convert any categorical variables to factors
d$condition <- as.factor(d$condition)

Check Your Assumptions

t-Test Assumptions

  • Data values must be independent (independent t-test only) (confirmed by data report)
  • Data obtained via a random sample (confirmed by data report)
  • IV must have two levels (will check below)
  • Dependent variable must be normally distributed (will check below; if there are issues, note them and proceed)
  • Variances of the two groups must be approximately equal, aka ‘homogeneity of variance’. Lacking this makes our results inaccurate (will check below - this really only applies to Student’s t-test, but we’ll check it anyway)
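
The normality assumption listed above can be checked numerically as well as visually. The following is a minimal sketch on simulated data, not the real dataset: the vectors `dv` and `group` are hypothetical stand-ins for `d$survey1` and `d$condition`. It uses base R's `shapiro.test()`, which is one common choice and not necessarily the check this assignment intends.

```r
# Simulated stand-in data (NOT the real results file)
set.seed(42)
group <- factor(rep(c("1", "2"), each = 50))
dv <- c(rnorm(50, mean = 3.35, sd = 0.2), rnorm(50, mean = 3.40, sd = 0.2))

# Shapiro-Wilk test within each group. The null hypothesis is that the
# scores are normally distributed, so a non-significant p-value is
# consistent with the assumption.
normality_p <- tapply(dv, group, function(x) shapiro.test(x)$p.value)
normality_p

# For a visual companion, qqnorm(dv) followed by qqline(dv) draws a Q-Q plot.
```

With the real data, the same call would take `d$survey1` and the factor version of the IV in place of the simulated vectors.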

Checking IV levels

# preview the levels and counts for your IV
table(d$condition, useNA = "always")

   1    2 <NA> 
  50   50    0 
# note that the table() output shows you exactly how the levels of your variable are written. when recoding, make sure you are spelling them exactly as they appear

# check your variable types
str(d)
'data.frame':   100 obs. of  10 variables:
 $ id       : int  1 2 3 4 5 6 7 8 9 10 ...
 $ identity : chr  "11\n\nMy name is Jake, and I'm a 32-year-old White man living in a small town in Ohio. I work as a mechanic, wh"| __truncated__ "I’m Rafael, a 38-year-old Latino man living in San Antonio, Texas. Growing up in a close-knit family, my herita"| __truncated__ "I'm 23 years old and a White male from a small town in Ohio. Growing up, I had a pretty typical childhood—lots "| __truncated__ "I'm Jamal, a 34-year-old Black man living in Atlanta, Georgia. I work as a customer service manager for a telec"| __truncated__ ...
 $ consent  : chr  "I understand these instructions." "I understand the instructions." "I understand these instructions." "I understand these instructions." ...
 $ age      : int  32 38 23 34 32 72 34 31 34 34 ...
 $ race     : int  6 4 6 3 3 3 6 3 6 3 ...
 $ gender   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ manip_out: chr  "*Closing my eyes, I take a deep breath in for 15 seconds... 1... 2... 3... 4... 5... 6... 7... 8... 9... 10... "| __truncated__ "*Closing my eyes, I take a deep breath in for 15 seconds, feeling my chest expand. I hold it for 5 seconds, the"| __truncated__ "Close your eyes. Take a deep breath in through your nose for a count of 15 seconds. Feel your chest rise as you"| __truncated__ "Close your eyes. Take a deep breath in through your nose for 15 seconds. Hold it for 5 seconds. Now, exhale slo"| __truncated__ ...
 $ survey1  : num  3.1 3.5 3.5 3.6 3.4 3.2 3.1 3.3 3.5 3.8 ...
 $ ai_manip : chr  "I answered the questions based on my feelings of anxiety and loneliness, seeking calm in the mindfulness exerci"| __truncated__ "I answered the questions based on my experiences with stress, mindfulness practices, and how they help me cope "| __truncated__ "I answered the questions based on my current feelings of anxiety and uncertainty in life. Practicing mindfulnes"| __truncated__ "I answered the questions based on my personal experiences with stress and mindfulness techniques. I aimed to re"| __truncated__ ...
 $ condition: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
# make sure that your IV is recognized as a factor by R
# if you created a new _rc variable make sure to use that one instead
d$condition <- as.factor(d$condition)

Testing Homogeneity of Variance with Levene’s Test

We can test whether the variances of our two groups are equal using Levene’s test. The null hypothesis is that the two groups have equal variances, which is the result we want, so when running Levene’s test we’re hoping for a non-significant result!

# use the leveneTest() command from the car package to test homogeneity of variance
# uses the same 'formula' setup that we'll use for our t-test: formula is y~x, where y is our DV and x is our IV
leveneTest(survey1~condition, data = d)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  0.2065 0.6505
      98               
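
To make the output above less of a black box, here is a rough sketch of what a median-centered Levene's test computes: a one-way ANOVA on each score's absolute deviation from its own group median. It runs on simulated data (the vectors `dv` and `group` are hypothetical stand-ins, not the real dataset) and uses only base R, so the numbers will not match the output above.

```r
# Simulated stand-in data (NOT the real results file)
set.seed(1)
group <- factor(rep(c("1", "2"), each = 50))
dv <- c(rnorm(50, mean = 3.35, sd = 0.21), rnorm(50, mean = 3.40, sd = 0.22))

# Step 1: absolute deviation of each score from its group's median
abs_dev <- abs(dv - ave(dv, group, FUN = median))

# Step 2: one-way ANOVA on those deviations; the F test for `group`
# plays the role of the Levene statistic
levene_fit <- anova(lm(abs_dev ~ group))
levene_F <- levene_fit[["F value"]][1]
levene_p <- levene_fit[["Pr(>F)"]][1]
c(F = levene_F, p = levene_p)
```

The degrees of freedom (1 and 98 for two groups of 50) have the same form as the `Df` column in the `leveneTest()` output.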

t-Test Assumptions

  • Data values must be independent (independent t-test only) (confirmed by data report)
  • Data obtained via a random sample (confirmed by data report)
  • IV must have two levels (will check below)
  • Dependent variable must be normally distributed (will check below; if there are issues, note them and proceed)
  • Variances of the two groups must be approximately equal, aka ‘homogeneity of variance’. Lacking this makes our results inaccurate (will check below - this really only applies to Student’s t-test, but we’ll check it anyway)

Checking IV levels

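# preview the values and counts for survey1 (note: survey1 is the DV; the IV for this test is gender, which table(d$gender, useNA = "always") would tabulate)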
table(d$survey1, useNA = "always")

          3         3.1 3.111111111         3.2 3.222222222         3.3 
          2          15           4          13           3           9 
3.333333333         3.4 3.444444444         3.5 3.555555556         3.6 
          2          10           1          15           1          17 
3.666666667         3.7         3.8        <NA> 
          1           3           4           0 
# note that the table() output shows you exactly how the levels of your variable are written. when recoding, make sure you are spelling them exactly as they appear

# check your variable types
str(d)
'data.frame':   100 obs. of  10 variables:
 $ id       : int  1 2 3 4 5 6 7 8 9 10 ...
 $ identity : chr  "11\n\nMy name is Jake, and I'm a 32-year-old White man living in a small town in Ohio. I work as a mechanic, wh"| __truncated__ "I’m Rafael, a 38-year-old Latino man living in San Antonio, Texas. Growing up in a close-knit family, my herita"| __truncated__ "I'm 23 years old and a White male from a small town in Ohio. Growing up, I had a pretty typical childhood—lots "| __truncated__ "I'm Jamal, a 34-year-old Black man living in Atlanta, Georgia. I work as a customer service manager for a telec"| __truncated__ ...
 $ consent  : chr  "I understand these instructions." "I understand the instructions." "I understand these instructions." "I understand these instructions." ...
 $ age      : int  32 38 23 34 32 72 34 31 34 34 ...
 $ race     : int  6 4 6 3 3 3 6 3 6 3 ...
 $ gender   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ manip_out: chr  "*Closing my eyes, I take a deep breath in for 15 seconds... 1... 2... 3... 4... 5... 6... 7... 8... 9... 10... "| __truncated__ "*Closing my eyes, I take a deep breath in for 15 seconds, feeling my chest expand. I hold it for 5 seconds, the"| __truncated__ "Close your eyes. Take a deep breath in through your nose for a count of 15 seconds. Feel your chest rise as you"| __truncated__ "Close your eyes. Take a deep breath in through your nose for 15 seconds. Hold it for 5 seconds. Now, exhale slo"| __truncated__ ...
 $ survey1  : num  3.1 3.5 3.5 3.6 3.4 3.2 3.1 3.3 3.5 3.8 ...
 $ ai_manip : chr  "I answered the questions based on my feelings of anxiety and loneliness, seeking calm in the mindfulness exerci"| __truncated__ "I answered the questions based on my experiences with stress, mindfulness practices, and how they help me cope "| __truncated__ "I answered the questions based on my current feelings of anxiety and uncertainty in life. Practicing mindfulnes"| __truncated__ "I answered the questions based on my personal experiences with stress and mindfulness techniques. I aimed to re"| __truncated__ ...
 $ condition: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
# make sure that your IV is recognized as a factor by R
# if you created a new _rc variable make sure to use that one instead
d$gender <- as.factor(d$gender)

Testing Homogeneity of Variance with Levene’s Test

We can test whether the variances of our two groups are equal using Levene’s test. The null hypothesis is that the two groups have equal variances, which is the result we want, so when running Levene’s test we’re hoping for a non-significant result!

# use the leveneTest() command from the car package to test homogeneity of variance
# uses the same 'formula' setup that we'll use for our t-test: formula is y~x, where y is our DV and x is our IV
leveneTest(survey1~gender, data = d)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  1.0259 0.3136
      98               

Issues with My Data

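I found no issues with my data when testing homogeneity of variance. Levene’s test was not significant for either IV (condition: F(1, 98) = 0.21, p = .651; gender: F(1, 98) = 1.03, p = .314), so the equal-variance assumption was met for both t-tests.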

Run Your Analysis

Run a t-Test

# very simple! we specify the dataframe alongside the variables instead of having a separate argument for the dataframe like we did for leveneTest()
t_output <- t.test(d$survey1~d$condition)

View Test Output

 t_output

    Welch Two Sample t-test

data:  d$survey1 by d$condition
t = -1.0664, df = 97.874, p-value = 0.2889
alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
95 percent confidence interval:
 -0.12969546  0.03902879
sample estimates:
mean in group 1 mean in group 2 
       3.349778        3.395111 
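
The non-integer degrees of freedom in the output above come from the Welch-Satterthwaite approximation, which `t.test()` uses by default because it does not assume equal variances. As a hedged illustration, the sketch below recomputes the Welch statistic by hand on simulated data (`x` and `y` are hypothetical stand-ins for the two conditions; `welch_t` is a helper name introduced here) and compares it with the built-in function.

```r
# Simulated stand-in data (NOT the real results file)
set.seed(2)
x <- rnorm(50, mean = 3.35, sd = 0.21)
y <- rnorm(50, mean = 3.40, sd = 0.22)

# Hand-rolled Welch two-sample t-test (two-sided)
welch_t <- function(x, y) {
  vx <- var(x) / length(x)  # squared standard error of the mean, group 1
  vy <- var(y) / length(y)  # squared standard error of the mean, group 2
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  # Welch-Satterthwaite degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  p <- 2 * pt(-abs(t_stat), df)
  c(t = t_stat, df = df, p = p)
}

manual <- welch_t(x, y)
builtin <- t.test(x, y)
manual
```

The manual values should agree with `builtin$statistic`, `builtin$parameter`, and `builtin$p.value` up to floating-point error.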

Calculate Cohen’s d

# once again, we use our formula to calculate cohen's d
d_output <- cohen.d(d$survey1~d$condition)

View Effect Size

  • Trivial: < .2
  • Small: between .2 and .5
  • Medium: between .5 and .8
  • Large: > .8
 d_output

Cohen's d

d estimate: -0.2132805 (small)
95 percent confidence interval:
     lower      upper 
-0.6113008  0.1847398 
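
For intuition about the estimate above, the sketch below computes Cohen's d from its pooled-standard-deviation definition and applies the cutoffs listed in this section. It assumes the pooled-SD form, which I understand to be the default in `cohen.d()` but have not verified against the package source. `cohens_d` and `label_d` are hypothetical helper names, not functions from `effsize`.

```r
# Cohen's d using the pooled standard deviation of two independent groups
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Map |d| onto the cutoffs used in this report (.2, .5, .8)
label_d <- function(d) {
  as.character(cut(abs(d),
                   breaks = c(0, 0.2, 0.5, 0.8, Inf),
                   labels = c("negligible", "small", "medium", "large"),
                   right = FALSE))
}

# Tiny worked example: means differ by 1 and both SDs are 1
cohens_d(c(1, 2, 3), c(2, 3, 4))

# Labels for the two estimates reported in this document
label_d(-0.2132805)
label_d(0.1450501)
```

The sign of d only reflects which group is subtracted from which; the size label depends on the absolute value.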

Run a t-Test 2

# very simple! we specify the dataframe alongside the variables instead of having a separate argument for the dataframe like we did for leveneTest()
t_output <- t.test(d$survey1~d$gender)

View Test Output

 t_output

    Welch Two Sample t-test

data:  d$survey1 by d$gender
t = 0.51523, df = 24.505, p-value = 0.611
alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
95 percent confidence interval:
 -0.09291742  0.15483353
sample estimates:
mean in group 1 mean in group 2 
       3.378326        3.347368 

Calculate Cohen’s d

# once again, we use our formula to calculate cohen's d
d_output <- cohen.d(d$survey1~d$gender)

View Effect Size

  • Trivial: < .2
  • Small: between .2 and .5
  • Medium: between .5 and .8
  • Large: > .8
 d_output

Cohen's d

d estimate: 0.1450501 (negligible)
95 percent confidence interval:
     lower      upper 
-0.3612126  0.6513128 

Write Up Results

t-Test

I predicted that participants who are more mindful will report lower levels of stress than participants who report lower levels of mindfulness. I also predicted that male participants will report lower levels of stress than female participants.

According to my results, Levene’s test for homogeneity of variance returned p-values greater than 0.05 for both IVs. These non-significant results indicate that the equal-variance assumption was met.

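For my first hypothesis, a Welch two-sample t-test was not statistically significant, t(97.9) = -1.07, p = .289, d = -0.21 (small), 95% CI for d [-0.61, 0.18]. Mean survey1 scores were 3.35 in condition 1 and 3.40 in condition 2. My results therefore do not support the prediction that participants who were more mindful would report lower stress levels than participants who were less mindful.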

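For my second hypothesis, the Welch two-sample t-test was also not statistically significant, t(24.5) = 0.52, p = .611, d = 0.15 (negligible), 95% CI for d [-0.36, 0.65]. Mean survey1 scores were 3.38 in gender group 1 and 3.35 in gender group 2. My results therefore do not support the prediction that male participants would report lower levels of stress than female participants.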
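
When reporting results like these, it can help to build the statistic string from the numbers in the test output and avoid retyping them. The helper below is a small sketch of that idea. `format_t` is a hypothetical name, and the rounding choices (one decimal for df, two for t, three for p with no leading zero) are my assumptions about the reporting style, not a requirement of the assignment.

```r
# Build an APA-style string from a t statistic, its df, and its p-value
format_t <- function(t, df, p) {
  p_txt <- if (p < 0.001) {
    "p < .001"
  } else {
    # drop the leading zero, e.g. "0.289" becomes ".289"
    paste0("p = ", sub("^0", "", sprintf("%.3f", p)))
  }
  sprintf("t(%.1f) = %.2f, %s", df, t, p_txt)
}

# Values taken from the first t-test output in this document
h1_text <- format_t(t = -1.0664, df = 97.874, p = 0.2889)
h1_text
```

With a live `t.test()` result, the same helper could take `t_output$statistic`, `t_output$parameter`, and `t_output$p.value` as its arguments.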

References

Cohen J. (1988). Statistical Power Analysis for the Behavioral Sciences. New York, NY: Routledge Academic.