In this project, we’re interested in determining whether specific serotonergic regulator genes interact with early life stress to predict hoarding symptoms (difficulty discarding, excessive acquiring, and clutter).
##
##       SS  SL  LL
##   GG   0   0   1
##   AG   1  24  30
##   AA  67 147  86
Reading the table row by row: 0 people are SgSg, 0 are SgLg, and 1 is LgLg; 1 person is SaSg, 24 are SaLg, and 30 are LaLg; 67 people are SaSa, 147 are SaLa, and 86 are LaLa.
Create effect-coded variables for 3 genotype groups with combined 5-HTTLPR and rs25531: S/S vs. La/La vs. all other genotypes
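The coding step itself isn't shown here, so the following is only a sketch of one way it could be done. It assumes a hypothetical `genotype` column with made-up labels and assumes the "all other genotypes" group is the reference level coded -1 on both variables. Only the output names `SS` and `LaLa` come from the models further down.

```r
# Toy data; the `genotype` column and its labels are hypothetical
toy <- data.frame(genotype = c("SS", "LaLa", "SLa", "SLg", "LaLg"))

# Collapse to three groups: S/S, La/La, and everything else
toy$group <- ifelse(toy$genotype == "SS", "SS",
                    ifelse(toy$genotype == "LaLa", "LaLa", "other"))

# Effect codes (assumed scheme): the "other" group gets -1 on both variables
toy$SS   <- ifelse(toy$group == "SS",   1, ifelse(toy$group == "other", -1, 0))
toy$LaLa <- ifelse(toy$group == "LaLa", 1, ifelse(toy$group == "other", -1, 0))

print(toy)
```

Under this scheme each genotype coefficient compares one group with the unweighted mean of the three groups rather than with a single baseline group.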
hist(genes$BDI_tot)
hist(genes$NLEQ_tot)
hist(genes$SC_tot)
skew(genes$BDI_tot)
##  skew (g1)         se          z          p
##  2.1999871  0.1298227 16.9460900  0.0000000
skew(genes$NLEQ_tot)
##  skew (g1)         se          z          p
##  1.1583839  0.1298227  8.9228149  0.0000000
skew(genes$SC_tot)
##   skew (g1)          se           z           p
## -0.02452497  0.21997067 -0.11149199  0.91122621
kurtosis(genes$BDI_tot)
## Excess Kur (g2)              se               z               p
##       7.3880089       0.2596454      28.4542267       0.0000000
kurtosis(genes$NLEQ_tot)
## Excess Kur (g2)              se               z               p
##    1.435857e+00    2.596454e-01    5.530070e+00    3.201037e-08
kurtosis(genes$SC_tot)
## Excess Kur (g2)              se               z               p
##      -0.2600210       0.4399413      -0.5910356       0.5544965
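The `skew()` and `kurtosis()` helpers aren't defined in this section. As a rough sketch of what they might compute, the functions below (hypothetical names `skew_sketch()` and `kurtosis_sketch()`) use moment-based g1 and g2 with the large-sample standard errors sqrt(6/n) and sqrt(24/n). The standard errors are an inference from the printed output (sqrt(6/356) is about 0.1298227), and the exact g1/g2 formulas in the original helpers may differ.

```r
# Hypothetical re-implementations; the original helpers are not shown
skew_sketch <- function(x) {
  x  <- x[!is.na(x)]
  n  <- length(x)
  m2 <- mean((x - mean(x))^2)   # second central moment
  m3 <- mean((x - mean(x))^3)   # third central moment
  g1 <- m3 / m2^1.5
  se <- sqrt(6 / n)             # assumed large-sample SE
  z  <- g1 / se
  c("skew (g1)" = g1, se = se, z = z, p = 2 * pnorm(-abs(z)))
}

kurtosis_sketch <- function(x) {
  x  <- x[!is.na(x)]
  n  <- length(x)
  m2 <- mean((x - mean(x))^2)
  m4 <- mean((x - mean(x))^4)   # fourth central moment
  g2 <- m4 / m2^2 - 3           # excess kurtosis
  se <- sqrt(24 / n)            # assumed large-sample SE
  z  <- g2 / se
  c("Excess Kur (g2)" = g2, se = se, z = z, p = 2 * pnorm(-abs(z)))
}

# Illustrative call on made-up, right-skewed values
skew_sketch(c(0, 0, 1, 1, 2, 3, 5, 12, NA))
```

The p-values here are two-tailed, which matches the printed output (for example, z = -0.111 with p = 0.911).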
BDI and NLEQ are both positively skewed. Use square root transformation to normalize both variables.
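The transformation code isn't shown, so here is a minimal sketch on a toy data frame with made-up values. The output names `bdi_sq` and `nleq_sq` are taken from the centering code in this report; on the real data the same two assignments would be applied to `genes`.

```r
# Toy stand-in for the data frame; values are illustrative only
toy <- data.frame(BDI_tot  = c(0, 1, 4, 9, NA, 25),
                  NLEQ_tot = c(0, 4, 16, 36, 49, NA))

# Square-root transform pulls in the long right tail; NAs stay NA
toy$bdi_sq  <- sqrt(toy$BDI_tot)
toy$nleq_sq <- sqrt(toy$NLEQ_tot)

print(toy)
```

The square root is defined at zero, so it works for scores with many zeros without adding a constant first.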
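Mean-center the continuous predictors (NLEQ, BDI, and SC totals, plus the square-root-transformed versions). Centering helps to reduce collinearity between the main effects and their interaction terms.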
genes$bdi_c <- genes$BDI_tot - mean(genes$BDI_tot, na.rm = TRUE)
genes$nleq_c <- genes$NLEQ_tot - mean(genes$NLEQ_tot, na.rm = TRUE)
genes$nleq_sq_c <- genes$nleq_sq - mean(genes$nleq_sq, na.rm = TRUE)
genes$bdi_sq_c <- genes$bdi_sq - mean(genes$bdi_sq, na.rm = TRUE)
genes$sc_tot_c <- genes$SC_tot - mean(genes$SC_tot, na.rm = TRUE)
hist(genes$sc_tot_c)
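To illustrate why centering matters for interaction terms, here is a small simulation. Everything in it is made up (the variable names `stress` and `code` are hypothetical, and none of it is study data): it compares how strongly an effect-coded predictor correlates with its product term before and after the continuous variable is centered.

```r
set.seed(1)
n <- 200

# Simulated predictors (not study data)
stress <- rnorm(n, mean = 10, sd = 2)            # continuous, mean far from 0
code   <- sample(c(-1, 0, 1), n, replace = TRUE) # effect-coded group variable

stress_c <- stress - mean(stress)                # mean-centered version

# Correlation between the group code and its interaction term
r_raw      <- cor(code, code * stress)
r_centered <- cor(code, code * stress_c)

round(c(raw = r_raw, centered = r_centered), 3)
```

With the raw variable, the product term is almost a rescaled copy of the group code, so the two are nearly collinear. After centering, that overlap largely disappears.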
hoard <- lm(SIR_tot ~ SS*nleq_c + LaLa*nleq_c + sc_tot_c + bdi_c, data=genes)
summary(hoard)
##
## Call:
## lm(formula = SIR_tot ~ SS * nleq_c + LaLa * nleq_c + sc_tot_c +
## bdi_c, data = genes)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -20.039  -4.965  -0.559   4.849  52.196
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 19.23832    1.00432  19.156  < 2e-16 ***
## SS           0.16084    1.58230   0.102  0.91921
## nleq_c       0.09327    0.05053   1.846  0.06752 .
## LaLa         0.23121    1.42248   0.163  0.87117
## sc_tot_c    -0.44037    0.10153  -4.337 3.13e-05 ***
## bdi_c        0.47292    0.17740   2.666  0.00879 **
## SS:nleq_c    0.04899    0.07356   0.666  0.50676
## nleq_c:LaLa -0.11682    0.06846  -1.706  0.09066 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.304 on 114 degrees of freedom
## (237 observations deleted due to missingness)
## Multiple R-squared: 0.4439, Adjusted R-squared: 0.4097
## F-statistic: 13 on 7 and 114 DF, p-value: 3.299e-12
discard <- lm(SIR_discarding ~ SS*nleq_c + LaLa*nleq_c + sc_tot_c + bdi_c, data=genes)
summary(discard)
##
## Call:
## lm(formula = SIR_discarding ~ SS * nleq_c + LaLa * nleq_c + sc_tot_c +
## bdi_c, data = genes)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -8.3578 -2.3676 -0.1154  1.9947 16.9382
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  6.462325   0.446540  14.472  < 2e-16 ***
## SS          -0.506879   0.703523  -0.720  0.47270
## nleq_c       0.005195   0.022468   0.231  0.81757
## LaLa         0.774295   0.632464   1.224  0.22338
## sc_tot_c    -0.154704   0.045144  -3.427  0.00085 ***
## bdi_c        0.160986   0.078874   2.041  0.04356 *
## SS:nleq_c    0.018683   0.032707   0.571  0.56897
## nleq_c:LaLa -0.065380   0.030439  -2.148  0.03384 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.137 on 114 degrees of freedom
## (237 observations deleted due to missingness)
## Multiple R-squared: 0.3183, Adjusted R-squared: 0.2765
## F-statistic: 7.605 on 7 and 114 DF, p-value: 1.63e-07
clutter <- lm(SIR_clutter ~ SS*nleq_c + LaLa*nleq_c + sc_tot_c + bdi_c, data=genes)
summary(clutter)
##
## Call:
## lm(formula = SIR_clutter ~ SS * nleq_c + LaLa * nleq_c + sc_tot_c +
## bdi_c, data = genes)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7.1518 -2.5395 -0.3637  1.3725 14.8876
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  4.746531   0.408578  11.617   <2e-16 ***
## SS           0.571863   0.643715   0.888   0.3762
## nleq_c       0.040916   0.020558   1.990   0.0490 *
## LaLa        -0.215545   0.578696  -0.372   0.7102
## sc_tot_c    -0.120361   0.041306  -2.914   0.0043 **
## bdi_c        0.110481   0.072169   1.531   0.1286
## SS:nleq_c   -0.004991   0.029926  -0.167   0.8678
## nleq_c:LaLa -0.012877   0.027851  -0.462   0.6447
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.785 on 114 degrees of freedom
## (237 observations deleted due to missingness)
## Multiple R-squared: 0.2948, Adjusted R-squared: 0.2515
## F-statistic: 6.809 on 7 and 114 DF, p-value: 9.387e-07
acquire <- lm(SIR_acquisitioning ~ SS*nleq_c + LaLa*nleq_c + sc_tot_c + bdi_c, data=genes)
summary(acquire)
##
## Call:
## lm(formula = SIR_acquisitioning ~ SS * nleq_c + LaLa * nleq_c +
## sc_tot_c + bdi_c, data = genes)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -8.1145 -1.9898 -0.0336  1.6548 18.9346
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  7.56362    0.38340  19.728  < 2e-16 ***
## SS           0.16251    0.60405   0.269 0.788395
## nleq_c       0.04044    0.01929   2.096 0.038258 *
## LaLa        -0.34887    0.54304  -0.642 0.521881
## sc_tot_c    -0.14609    0.03876  -3.769 0.000261 ***
## bdi_c        0.18889    0.06772   2.789 0.006194 **
## SS:nleq_c    0.04086    0.02808   1.455 0.148373
## nleq_c:LaLa -0.04584    0.02614  -1.754 0.082135 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.552 on 114 degrees of freedom
## (237 observations deleted due to missingness)
## Multiple R-squared: 0.3978, Adjusted R-squared: 0.3608
## F-statistic: 10.76 on 7 and 114 DF, p-value: 2.369e-10
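There appears to be an interaction between genotype and early life stress (NLEQ) in predicting difficulty discarding (nleq_c:LaLa b = -0.065, SE = 0.030, t = -2.15, p = .034), whereby the non-risk group (LaLa) is less susceptible to the effects of stress in terms of later symptom development. The same interaction term was only marginal for total hoarding symptoms (p = .091) and acquiring (p = .082), and was not significant for clutter (p = .645).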
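One way to unpack this interaction is to compute the implied slope of stress within each genotype group from the discarding model's coefficients. This is a sketch that assumes the standard effect-coding scheme, with the "all other genotypes" group coded -1 on both `SS` and `LaLa`. The coding isn't shown in this report, so treat that as an assumption. The vector names (`b`, `codes`, `slopes`) are just for illustration.

```r
# Coefficients copied from the SIR_discarding model summary
b <- c(nleq = 0.005195, ss_x_nleq = 0.018683, lala_x_nleq = -0.065380)

# Assumed effect codes for each group (rows) on the two genotype variables
codes <- matrix(c( 1,  0,
                   0,  1,
                  -1, -1),
                ncol = 2, byrow = TRUE,
                dimnames = list(c("SS", "LaLa", "other"), c("SS", "LaLa")))

# Simple slope of nleq_c within each group:
# b_nleq + SS code * b_SS:nleq + LaLa code * b_nleq:LaLa
slopes <- b[["nleq"]] +
  codes[, "SS"]   * b[["ss_x_nleq"]] +
  codes[, "LaLa"] * b[["lala_x_nleq"]]

round(slopes, 3)
# SS = 0.024, LaLa = -0.060, other = 0.052
```

Under this assumption the stress slope is slightly positive in the SS and other groups and slightly negative in the LaLa group. These are point estimates only. Testing whether each slope differs from zero would need the coefficient covariance matrix from the fitted model, which isn't reproduced here.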