1 Introduction

In this assignment, we analyze an incomplete data set. For reference, we also received the complete data, so that we can compare the different strategies we use to treat the missing data. The observed data are derived from the National Health and Nutrition Examination Survey (NHANES). First, we investigate the scope of the missingness problem. After inspecting and analyzing the data, we formulate a research question involving a variable that holds missing values. We then describe both the ad hoc and the (multiple) imputation strategies we use throughout our analysis.

2 Incomplete data inspection

We start investigating the data by computing summary statistics. The data set contains 500 units. As a quick overview, we display the minimum, quartiles, mean, maximum and number of NAs for each variable below.

summary(data_incomplete) #viewing means, median, mins, maxs and NA's for each variable
##        id            sex           age                     ethnicity  
##  Min.   :41487   male  :256   Min.   :20.00   mexican_american  :101  
##  1st Qu.:44313   female:244   1st Qu.:32.00   other_hispanic    : 66  
##  Median :46962                Median :45.00   non-hispanic_white:220  
##  Mean   :46743                Mean   :44.48   non-hispanic_black: 98  
##  3rd Qu.:49245                3rd Qu.:57.00   other             : 15  
##  Max.   :51614                Max.   :69.00                           
##                                                                       
##             education                  marital    household_size 
##  no_high_school  : 59   married            :264   Min.   :1.000  
##  some_high_school: 93   widowed            : 13   1st Qu.:2.000  
##  high_school_grad:113   divorced           : 58   Median :3.000  
##  some_college    :151   separated          : 28   Mean   :3.304  
##  college_grad    : 84   never_married      :100   3rd Qu.:4.000  
##                         living_with_partner: 37   Max.   :7.000  
##                                                                  
##     household_income     weight           height           bmi       
##  100000+    : 78     Min.   : 48.00   Min.   :143.3   Min.   :17.20  
##  25000:34999: 74     1st Qu.: 68.33   1st Qu.:161.1   1st Qu.:24.55  
##  75000:99999: 57     Median : 81.45   Median :168.1   Median :28.07  
##  35000:44999: 43     Mean   : 83.58   Mean   :168.1   Mean   :28.99  
##  10000:14999: 42     3rd Qu.: 95.75   3rd Qu.:176.3   3rd Qu.:32.16  
##  45000:54999: 41     Max.   :195.80   Max.   :192.9   Max.   :58.59  
##  (Other)    :165     NA's   :150      NA's   :96      NA's   :222    
##      pulse           bp_sys1         bp_dia1          bp_sys2     
##  Min.   : 46.00   Min.   : 82.0   Min.   : 28.00   Min.   : 76.0  
##  1st Qu.: 66.00   1st Qu.:110.0   1st Qu.: 64.00   1st Qu.:108.0  
##  Median : 74.00   Median :120.0   Median : 72.00   Median :118.0  
##  Mean   : 74.86   Mean   :122.2   Mean   : 72.39   Mean   :119.5  
##  3rd Qu.: 82.00   3rd Qu.:132.0   3rd Qu.: 80.00   3rd Qu.:130.0  
##  Max.   :128.00   Max.   :210.0   Max.   :110.00   Max.   :180.0  
##                                                    NA's   :100    
##     bp_dia2          time_sed      drink_regularly days_drinking   
##  Min.   : 40.00   Min.   :   0.0   yes :335        Min.   :  0.00  
##  1st Qu.: 62.00   1st Qu.: 180.0   no  :115        1st Qu.:  0.75  
##  Median : 70.00   Median : 240.0   NA's: 50        Median :  7.50  
##  Mean   : 70.57   Mean   : 307.4                   Mean   : 50.69  
##  3rd Qu.: 78.00   3rd Qu.: 480.0                   3rd Qu.: 52.00  
##  Max.   :108.00   Max.   :1080.0                   Max.   :365.00  
##  NA's   :100                                                       
##       dep1            dep2             dep3             dep4      
##  Min.   :0.000   Min.   :0.0000   Min.   :0.0000   Min.   :0.000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.000  
##  Median :0.000   Median :0.0000   Median :0.0000   Median :0.000  
##  Mean   :0.368   Mean   :0.3153   Mean   :0.6918   Mean   :0.794  
##  3rd Qu.:1.000   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.000  
##  Max.   :3.000   Max.   :3.0000   Max.   :3.0000   Max.   :3.000  
##                  NA's   :75       NA's   :75                      
##       dep5             dep6             dep7            dep8       
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.000   Median :0.0000  
##  Mean   :0.4094   Mean   :0.2682   Mean   :0.286   Mean   :0.1674  
##  3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:0.000   3rd Qu.:0.0000  
##  Max.   :3.0000   Max.   :3.0000   Max.   :3.000   Max.   :3.0000  
##  NA's   :75       NA's   :75                       NA's   :40      
##       dep9        
##  Min.   :0.00000  
##  1st Qu.:0.00000  
##  Median :0.00000  
##  Mean   :0.06205  
##  3rd Qu.:0.00000  
##  Max.   :3.00000  
##  NA's   :81

3 Research Question

We are interested in whether age has a significant effect on the depression screening score. We think this is an interesting question, since the screening score is built from all depression variables together. Furthermore, the score contains a substantial number of NAs, which is in line with the purpose of this assignment.

Our full research question and hypotheses are as follows.

RQ: Does age have a negative impact on the depression screening score?

H1: Age has a negative impact on the depression screening score.

H0: Age has no significant correlation with the depression screening score.

3.1 Depression score

As we want to know whether participants’ age affects their overall depression score, we add a new column to our data frame that sums the scores given to the 9 depression screening questions.

data_complete1 <- data_complete %>% mutate(depression_score = dep1 + dep2 + dep3 + dep4 + dep5 + dep6 + dep7 + dep8 + dep9)

data_incomplete1 <- data_incomplete %>% mutate(depression_score = dep1 + dep2 + dep3 + dep4 + dep5 + dep6 + dep7 + dep8 + dep9)
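One consequence of summing with `+` is worth making explicit: a single missing item makes the whole score missing. The sketch below illustrates this on a small hypothetical data frame (`toy` and its values are made up for illustration, not taken from the NHANES data) and contrasts it with `rowSums(..., na.rm = TRUE)`, which would silently treat a missing item as 0.

```r
# Hypothetical toy data: three depression items, two units with one missing item each
toy <- data.frame(dep1 = c(0, 1, 2),
                  dep2 = c(1, NA, 0),
                  dep3 = c(0, 3, NA))

# Summing with `+` propagates NA: the score is missing if any item is missing
toy$score_sum <- toy$dep1 + toy$dep2 + toy$dep3

# rowSums() with na.rm = TRUE counts a missing item as 0 instead
toy$score_rowsums <- rowSums(toy[, c("dep1", "dep2", "dep3")], na.rm = TRUE)

print(toy)
```

With the `+` approach used here, the number of missing scores equals the number of units with at least one missing item, which is why the summed score has more NAs than any individual item.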

The correlation test on the complete data shows that the relation is not significant at the 5% level. However, with a p-value of 0.171 it is not that far from 0.05. This is interesting, since we would like to find out whether different treatments of the NA values give another result, that is, whether they suggest the effect is significant.

cor.test(data_complete1$age, data_complete1$depression_score)
## 
##  Pearson's product-moment correlation
## 
## data:  data_complete1$age and data_complete1$depression_score
## t = -1.3695, df = 498, p-value = 0.1714
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.14815038  0.02657777
## sample estimates:
##         cor 
## -0.06125558

4 Compare the incomplete data and the complete data

Now that we know what our data looks like and what our research focus is, we start comparing the distributions and descriptive statistics of the incomplete and the complete data set.

4.1 An overall comparison

Of the 28 variables, 12 differ between the two data sets because they contain NA values in the incomplete data; the other 16 are identical. Of the 500 observations, 367 have at least one unequal value. In total, 1139 values differ across these 12 variables, so 8.1% of the data is missing. The missing values concern weight, height, BMI, blood pressure, drinking habits and depression.
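The 8.1% can be reproduced from the NA counts in the summary output of Section 2. The following is a small check of that arithmetic; the counts are copied by hand from that output, and the variable names in the vector are only labels.

```r
# NA counts per variable, copied from the summary() output in Section 2
na_counts <- c(weight = 150, height = 96, bmi = 222,
               bp_sys2 = 100, bp_dia2 = 100, drink_regularly = 50,
               dep2 = 75, dep3 = 75, dep5 = 75, dep6 = 75,
               dep8 = 40, dep9 = 81)

n_rows <- 500  # units in the data set
n_cols <- 28   # variables in the data set

total_missing <- sum(na_counts)                            # number of missing cells
pct_missing <- 100 * total_missing / (n_rows * n_cols)     # share of all cells

total_missing          # 1139
round(pct_missing, 1)  # 8.1
```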

4.2 Compare means, variances and correlations

We do some computations to compare the means, variances and correlations between the two datasets. First, we show the mean of each variable in each dataset. Categorical variables enter this comparison as their integer codes.

##                  mean_complete mean_incomplete
## id                 46743.24400  46743.24400000
## sex                    1.48800      1.48800000
## age                   44.48400     44.48400000
## ethnicity              2.72000      2.72000000
## education              3.21600      3.21600000
## marital                2.59600      2.59600000
## household_size         3.30400      3.30400000
## household_income       7.49800      7.49800000
## weight                83.05860     83.58057143
## height               168.11680    168.13143564
## bmi                   29.30561     28.98733871
## pulse                 74.86400     74.86400000
## bp_sys1              122.15600    122.15600000
## bp_dia1               72.38800     72.38800000
## bp_sys2              120.86800    119.55000000
## bp_dia2               71.86000     70.57000000
## time_sed             307.42400    307.42400000
## drink_regularly        1.26000      1.25555556
## days_drinking         50.69400     50.69400000
## dep1                   0.36800      0.36800000
## dep2                   0.38200      0.31529412
## dep3                   0.73800      0.69176471
## dep4                   0.79400      0.79400000
## dep5                   0.45000      0.40941176
## dep6                   0.31200      0.26823529
## dep7                   0.28600      0.28600000
## dep8                   0.16400      0.16739130
## dep9                   0.07000      0.06205251

Here we create a table that compares the variances for our two datasets, followed by a relative comparison in the form of a barplot. The Difference column equals Variance1 minus Variance2.

##                          Variable       Variance1        Variance2   Difference
## id                             id 8365721.7158958 8365721.71589579  0.000000000
## sex                           sex       0.2503567       0.25035671  0.000000000
## age                           age     206.6510461     206.65104609  0.000000000
## ethnicity               ethnicity       1.1799599       1.17995992  0.000000000
## education               education       1.5885210       1.58852104  0.000000000
## marital                   marital       3.5037916       3.50379158  0.000000000
## household_size     household_size       2.8813467       2.88134669  0.000000000
## household_income household_income      10.4669299      10.46692986  0.000000000
## weight                     weight     469.5249560     421.58965010 47.935305850
## height                     height     100.1350679     100.29843863 -0.163370733
## bmi                           bmi      47.7791803      40.51594373  7.263236522
## pulse                       pulse     162.9393828     162.93938277  0.000000000
## bp_sys1                   bp_sys1     327.4706052     327.47060521  0.000000000
## bp_dia1                   bp_dia1     141.8050661     141.80506613  0.000000000
## bp_sys2                   bp_sys2     304.5837435     272.35839599 32.225347497
## bp_dia2                   bp_dia2     138.9062124     125.30837093 13.597841498
## time_sed                 time_sed   41084.3970180   41084.39701804  0.000000000
## drink_regularly   drink_regularly       0.1927856       0.19067063  0.002114945
## days_drinking       days_drinking    7769.4592826    7769.45928257  0.000000000
## dep1                         dep1       0.5296353       0.52963527  0.000000000
## dep2                         dep2       0.5932625       0.48054384  0.112718685
## dep3                         dep3       0.9712986       0.96844617  0.002852426
## dep4                         dep4       0.9895431       0.98954309  0.000000000
## dep5                         dep5       0.6968938       0.63859046  0.058303333
## dep6                         dep6       0.4876313       0.41372919  0.073902073
## dep7                         dep7       0.4410862       0.44108617  0.000000000
## dep8                         dep8       0.2576192       0.26603675 -0.008417514
## dep9                         dep9       0.1093186       0.09183405  0.017484583

We now do the same for the correlations. Correlation1 denotes the correlation coefficient in the complete dataset and Correlation2 the coefficient in the incomplete dataset. The resulting table is quite long and is therefore included in the appendix.

4.3 Missingness Inspection

The visualization of the missing data patterns has been simplified by excluding variables without any missing data. The patterns reveal that missing values in some variables occur together. For example, NAs for bp_dia2 and bp_sys2 are always observed together in a unit. Overall, there are 62 distinct combinations of variables with missing data. The most common pattern is the one in which no values are missing, which occurs 127 times, followed by a pattern with 9 missing values that only occurs once.

4.4 MCAR test

Now we run an MCAR test to see whether the data are missing completely at random. The test rejects this hypothesis (p < 0.001): the missing values are not missing completely at random.

# not MCAR if we look at all our data
out <- mcar_test(data_incomplete)
out$statistic  # 1448.785
## [1] 1448.785
out$p.value # 1.968048e-10
## [1] 0.0000000001968048

4.4.1 Depression score MCAR

Because we sum the individual depression items into one score, we investigate whether the missingness in the individual items is related. To check this, we run an MCAR test on a data frame that includes only these individual depression items. The p-value is smaller than 0.05, which indicates that the missing values are not missing completely at random.

data_dep <- data_incomplete1[,c("dep1", "dep2", "dep3", "dep4", "dep5", "dep6", "dep7", "dep8", "dep9")]
out <- mcar_test(data_dep)
out$statistic  
## [1] 79.12127
out$p.value 
## [1] 0.00004495763

4.5 Check the dependency of the missingness

Here we compare the mean age of individuals with missing and with observed depression scores to see whether the missingness relates to the observed data. The p-value of 0.783 indicates that there is no significant difference in age between the two groups: the missingness in the depression score does not significantly depend on age.

# Create a missingness vector for dependent variables
mDep <- is.na(data_incomplete1$depression_score)

# compare mean age between units with missing and observed depression_score
out <- t.test(age ~ mDep, data = data_incomplete1)
out$statistic # -0.2754501 
##          t 
## -0.2754501
out$p.value # 0.7831269
## [1] 0.7831269

In the visualization of the missing values, the missing depression scores are spread evenly along the age axis. The logistic regression model further shows that the missingness in the depression score does not depend on age (p = 0.783). Both results suggest that, with respect to age, the missingness in the depression score is consistent with missing completely at random (MCAR). They do not rule out dependence on other variables, as the MCAR tests above indicate.

## 
## Call:
## glm(formula = missingness ~ age, family = "binomial", data = incomplete_missingness)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.4609  -1.4331   0.9257   0.9392   0.9516  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  0.681464   0.305121   2.233   0.0255 *
## age         -0.001795   0.006513  -0.276   0.7828  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 649.89  on 499  degrees of freedom
## Residual deviance: 649.81  on 498  degrees of freedom
## AIC: 653.81
## 
## Number of Fisher Scoring iterations: 4
## `geom_smooth()` using formula = 'y ~ x'
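The logistic regression above regresses a missingness indicator on age. A minimal, self-contained sketch of that idea follows. It uses simulated data in which missingness is independent of age by construction; the object names (`toy`, `missing`) and all numbers are hypothetical and do not reproduce the model fitted on the NHANES data.

```r
set.seed(42)
n <- 500

# Hypothetical data: age and a count-like score
toy <- data.frame(age = sample(20:69, n, replace = TRUE),
                  score = rpois(n, 3))

# Make about 35% of the scores missing, regardless of age (MCAR by construction)
toy$score[runif(n) < 0.35] <- NA

# Missingness indicator: 1 = missing, 0 = observed
toy$missing <- as.integer(is.na(toy$score))

# Logistic regression of missingness on age
fit <- glm(missing ~ age, family = binomial, data = toy)
coef(summary(fit))
```

A small and non-significant age coefficient in such a model only says that missingness is unrelated to age; it says nothing about the other variables in the data.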

4.6 Analysis variables missingness

We want to gain insight into the missingness in the variables that occur in our analysis model. First, we compute summary statistics. The minimum age of participants is 20, the maximum age is 69 and the mean age is 44.48. The lowest depression score is 0, the highest is 22 and the mean is 3.24. There are no missing values for age and 177 missing values for depression score.

summary(subset_incomplete)
##       age        depression_score
##  Min.   :20.00   Min.   : 0.000  
##  1st Qu.:32.00   1st Qu.: 0.000  
##  Median :45.00   Median : 2.000  
##  Mean   :44.48   Mean   : 3.238  
##  3rd Qu.:57.00   3rd Qu.: 4.000  
##  Max.   :69.00   Max.   :22.000  
##                  NA's   :177

Here we visualize the distribution of missing values between depression score and age:

5 Complete Data Analysis

In the complete subset, the mean is 44.484 for age and 3.564 for depression score. The variance is 206.651 for age and 20.603 for depression score. The correlation between them is -0.061. According to our regression model, the depression score goes down by 0.019 for every additional year of age. The p-value of this effect is 0.171, which is 0.121 above the 0.05 threshold, so the effect is not statistically significant. This is interesting, as we would like to find out whether different treatments of our missing data can lead us to think the relation is statistically significant when in reality it is not.

As the estimates of the linear model show, the linear association between age and depression score is not significant and the model accounts for only 0.38% of the variability. Therefore, this linear model describes their relation poorly. The visualization agrees: many points lie far away from the line.

# compute means, variances, and correlations of all variables in the complete data
colMeans(subset_complete)
##              age depression_score 
##           44.484            3.564
sapply(subset_complete, var)
##              age depression_score 
##        206.65105         20.60311
cor(subset_complete)
##                          age depression_score
## age               1.00000000      -0.06125558
## depression_score -0.06125558       1.00000000
# build a linear regression model
model1 <- lm(depression_score ~ age, data = subset_complete)
summary(model1)
# visualize the linear regression model
ggmice(subset_complete, aes(age, depression_score)) + geom_point() + geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula = 'y ~ x'
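Since the output of summary(model1) is not printed, it may help to see that the slope, the explained variability and the p-value quoted above follow directly from the correlation and variances shown. The sketch below is a check using the standard simple-regression identities, with the numbers copied from the output above; the variable names are chosen here for illustration.

```r
# Values copied from the complete-data output above
r <- -0.06125558      # correlation between age and depression score
var_age <- 206.65105  # variance of age
var_dep <- 20.60311   # variance of depression score
n <- 500              # number of units

# In simple linear regression: slope = r * sd(y) / sd(x)
slope <- r * sqrt(var_dep / var_age)

# Share of variability explained equals the squared correlation
r_squared <- r^2

# t statistic and two-sided p-value for the slope (same as for the correlation)
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

round(slope, 3)            # -0.019
round(100 * r_squared, 2)  # 0.38 (percent)
round(t_stat, 4)
round(p_value, 4)
```

The t statistic and p-value agree with the cor.test output in Section 3.1, as they should, because the test of the slope and the test of the correlation are equivalent in a model with one predictor.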

6 Incomplete Data Analysis with Imputation Methods

Now we analyze our incomplete data to see whether we get different results. First, we apply listwise deletion of the NA values, then mean imputation, and finally regression imputation.

6.1 Deletion-Based Treatments

Neither the mean nor the variance of age changes, since it has no missing data. However, the variance of the depression score decreases from 20.603 to 17.306 after 35.4% of its values are deleted, and its mean decreases from 3.564 to 3.238. The negative correlation between age and depression score remains weak but gets slightly stronger, from -0.061 in the complete data to -0.098 in the observed data. Comparing the linear models estimated from the complete and the observed data, the R² slightly increases and the regression estimates are slightly biased in the model with deleted data.

# the proportion of values deleted per variable in the deletion-based treatment
pm <- colMeans(is.na(subset_incomplete))
pm['age']
## age 
##   0
# compute means, variances, and correlations of all variables in the observed data
colMeans(subset_incomplete, na.rm = TRUE)
##              age depression_score 
##         44.48400          3.23839
sapply(subset_incomplete, var, na.rm = TRUE)
##              age depression_score 
##        206.65105         17.30635
cor(subset_incomplete, use = "pairwise.complete.obs")
##                          age depression_score
## age               1.00000000      -0.09782773
## depression_score -0.09782773       1.00000000
# visualise the linear regression model
ggmice(subset_incomplete, aes(age, depression_score)) + geom_point() + geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula = 'y ~ x'

6.2 Mean Substitution

The Pearson’s product-moment correlation is -0.079, which indicates a weak negative correlation. With a p-value of 0.078, there is not enough evidence to reject the null hypothesis of no correlation at the 5% significance level. Compared to the complete data, the correlation has become slightly stronger, from -0.061 to -0.079, and the p-value has gone down from 0.171 to 0.078. So, although mean imputation normally is not the smartest thing to do, unless you know what you are doing, in this particular case it does not change the conclusion of the correlation test.

# correlation test:
cor.test(mean_imputed_data$age, mean_imputed_data$depression_score)
## 
##  Pearson's product-moment correlation
## 
## data:  mean_imputed_data$age and mean_imputed_data$depression_score
## t = -1.7632, df = 498, p-value = 0.07847
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.165315474  0.008985917
## sample estimates:
##         cor 
## -0.07876674

Now we show two histograms that compare the depression score frequencies in the mean imputed data set and the complete data set. They show that lower depression scores seem more prevalent in the mean imputed data.

Here we show by means of a t-test that the difference in mean depression score between the complete data set and the mean imputed data set is not significant, as indicated by the p-value of 0.197.

## 
##  Welch Two Sample t-test
## 
## data:  data_complete1$depression_score and mean_imputed_data$depression_score
## t = 1.2917, df = 917.11, p-value = 0.1968
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1690996  0.8203194
## sample estimates:
## mean of x mean of y 
##   3.56400   3.23839

Here we show our regression model with the mean imputed data. Compared to the regression model with the complete subset, the coefficient changes from -0.019 to -0.018, indicating that mean imputation yields a slightly less negative relationship. The p-value goes down from 0.171 to 0.0785, indicating that mean imputation brings us closer to a statistically significant effect than is warranted by the complete data.

## 
## Call:
## glm(formula = depression_score ~ age, data = mean_imputed_data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.6867  -2.3617  -0.2194   0.3390  18.5880  
## 
## Coefficients:
##             Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)  4.05292    0.48543   8.349 0.000000000000000685 ***
## age         -0.01831    0.01038  -1.763               0.0785 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 11.12062)
## 
##     Null deviance: 5572.6  on 499  degrees of freedom
## Residual deviance: 5538.1  on 498  degrees of freedom
## AIC: 2627.3
## 
## Number of Fisher Scoring iterations: 2
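A side effect of mean imputation that the output above hints at is variance shrinkage: every imputed value sits exactly at the observed mean, so the sum of squared deviations stays the same while the sample size grows. The sketch below checks this identity on simulated data (hypothetical values) and then applies it to the reported numbers; it is an illustration, not the code used to create mean_imputed_data.

```r
# Part 1: check the identity on hypothetical simulated data
set.seed(1)
y <- rpois(50, 3)
y[sample(50, 15)] <- NA                              # 15 of 50 values missing
y_imp <- ifelse(is.na(y), mean(y, na.rm = TRUE), y)  # mean imputation

var_expected <- var(y, na.rm = TRUE) * (35 - 1) / (50 - 1)
all.equal(var(y_imp), var_expected)

# Part 2: apply it to the numbers reported in this document
var_obs <- 17.30635   # variance of the observed depression scores
n <- 500              # all units
n_obs <- 500 - 177    # units with an observed depression score

var_mean_imputed <- var_obs * (n_obs - 1) / (n - 1)
var_mean_imputed            # about 11.17, down from 17.31
var_mean_imputed * (n - 1)  # about 5572.6, the null deviance reported above
```

The variance of the mean imputed depression score is thus roughly 11.2, well below both the observed-data variance (17.3) and the complete-data variance (20.6).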

6.3 Regression Imputation

With this data, regression imputation clearly has a substantial impact on the results when compared with the true data. The mean of age remains the same as stated above, but the mean depression score changes from 3.564 in the complete subset to 3.235 in the imputed data. The correlation between age and depression score is negative in both the complete and the imputed data. However, it strengthens from -0.061 to -0.121, meaning that the imputation makes the relationship appear stronger. Comparing the p-value in the complete subset, 0.171, to that in the imputed data, 0.007, the conclusion changes greatly: from not significant to significant.

Now that we have used regression imputation to impute our missing data, we compare our imputed data to our true data. In each pair of outputs below (means, variances, correlations), the first is the imputed data and the second the complete data.

##              age depression_score 
##        44.484000         3.234688
##              age depression_score 
##           44.484            3.564
##              age depression_score 
##        206.65105         11.22561
##              age depression_score 
##        206.65105         20.60311
##                         age depression_score
## age               1.0000000       -0.1211871
## depression_score -0.1211871        1.0000000
##                          age depression_score
## age               1.00000000      -0.06125558
## depression_score -0.06125558       1.00000000

Now we fit a linear regression on the imputed data to obtain the p-value of the age effect.

## 
## Call:
## lm(formula = depression_score ~ age, data = inc1)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.926 -2.425  0.000  0.000 18.497 
## 
## Coefficients:
##             Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)  4.49114    0.48461   9.268 < 0.0000000000000002 ***
## age         -0.02825    0.01037  -2.724              0.00667 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.329 on 498 degrees of freedom
## Multiple R-squared:  0.01469,    Adjusted R-squared:  0.01271 
## F-statistic: 7.423 on 1 and 498 DF,  p-value: 0.006667
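Why regression imputation strengthens the correlation can be illustrated with a small sketch. It assumes deterministic regression imputation, in which each missing score is replaced by its predicted value from a model fitted on the observed cases, which is consistent with the zero residuals in the output above. The data are simulated and all names (`toy`, `fit_obs`, `imputed`) are hypothetical.

```r
set.seed(123)
n <- 200

# Hypothetical data with a weak negative relation between age and score
age <- sample(20:69, n, replace = TRUE)
score <- pmax(0, round(5 - 0.03 * age + rnorm(n, sd = 3)))
score[sample(n, 70)] <- NA   # 70 scores missing
toy <- data.frame(age, score)

# Fit the imputation model on the observed cases (lm drops rows with NA)
fit_obs <- lm(score ~ age, data = toy)

# Replace each missing score by its predicted value
miss <- is.na(toy$score)
imputed <- toy
imputed$score[miss] <- predict(fit_obs, newdata = toy[miss, ])

cor_obs <- cor(toy$age, toy$score, use = "complete.obs")
cor_imp <- cor(imputed$age, imputed$score)
c(observed = cor_obs, imputed = cor_imp)
```

The imputed points lie exactly on the regression line and add no residual variation, so the correlation in the imputed data is at least as strong as in the observed data, and the standard error of the slope is too small.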

7 Multiple imputation

To improve on the earlier methods, we now employ multiple imputation rather than single imputation. Multiple imputation is a more flexible and robust strategy for handling missing data that helps to reduce bias and to increase precision and statistical power. We first use the default MICE imputation method and then use passive imputation to perform multiple imputations on our data. We then pool and analyze the data, and compare the effect of age on depression score with the effect of age and BMI on depression score. BMI is included to make our analysis more interesting, since we found that it has a higher correlation with depression score.

Throughout, we assess the convergence and plausibility of the imputed data, as both are needed to judge the validity and quality of the imputations. Convergence refers to the stability of the imputation, that is, how much the imputed values have stabilized over iterations. Plausibility refers to how reasonable or “plausible” the imputed values appear when compared to the observed data.
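Pooling combines the estimates from the m imputed data sets with Rubin's rules: the pooled estimate is the average of the m estimates, and its variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The sketch below implements these two formulas in base R; the helper `pool_rubin` and the input numbers are hypothetical and only illustrate the calculation that a pooling function performs.

```r
# Rubin's rules for a single parameter (hypothetical helper)
pool_rubin <- function(estimates, variances) {
  m <- length(estimates)
  q_bar <- mean(estimates)            # pooled estimate
  w <- mean(variances)                # within-imputation variance
  b <- var(estimates)                 # between-imputation variance
  total <- w + (1 + 1 / m) * b        # total variance
  c(estimate = q_bar, se = sqrt(total))
}

# Hypothetical slope estimates and standard errors from m = 5 imputed data sets
est <- c(-0.020, -0.025, -0.018, -0.022, -0.021)
se <- c(0.010, 0.011, 0.010, 0.012, 0.010)

pooled <- pool_rubin(est, se^2)
pooled
```

The pooled standard error is larger than the average single-imputation standard error, because it also carries the uncertainty about the missing values.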

7.1 Default MICE Imputation Method

We begin by imputing the data with the MICE algorithm using its default settings. By default, mice uses “pmm” (predictive mean matching) for numeric variables, “logreg” (logistic regression) for binary variables and “polyreg” (polytomous logistic regression) for unordered categorical variables with more than two levels. We chose this as a starting point for our imputations.

7.1.1 Naive imputation

# default multiple imputation with 10 imputations and 20 iterations:
imp <- mice(data_incomplete1, m = 10, maxit = 20, seed = 1234567, print = FALSE)

# checking the logged events: depression_score is removed as a predictor (column "out"), presumably because it is the exact sum of dep1 to dep9:
imp$loggedEvents %>% head()
##   it im             dep   meth              out
## 1  2  3          height    pmm depression_score
## 2  2  3             bmi    pmm depression_score
## 3  2  3         bp_sys2    pmm depression_score
## 4  2  3         bp_dia2    pmm depression_score
## 5  2  6 drink_regularly logreg depression_score
## 6  2  8          weight    pmm depression_score

7.1.2 Convergence and Plausibility

Now we assess the convergence and plausibility of this imputation, using plots to visualize both.

Convergence can be assessed with a trace plot, which shows the mean and standard deviation of the imputed values of each variable across iterations, with one line per imputation. To interpret it, we check whether the lines stabilize, intermingle and show no trend. Convergence can also be quantified with the “convergence” function in MICE, which returns the auto-correlation and the potential scale reduction factor for each variable at each iteration. Ideally, the auto-correlation should be close to 0 and the potential scale reduction factor close to 1.

Plausibility is assessed by comparing the imputed data to the observed data or to external sources. This can be done with a density plot, which visualizes the distribution of the imputed values against that of the observed values. If the imputed values are plausible, the two distributions should be similar. However, similarity does not guarantee that the imputation is correct or accurate, for example when the imputation model is misspecified or when the missingness mechanism is not MAR.

# looking at the plots from our initial model:
plot(imp)

densityplot(imp)

stripplot(imp)

# auto-correlation should ideally be 0, but deviates from that for some variables:
conv <- convergence(imp)
ggplot(conv, aes(x = .it, y = ac)) +
  geom_hline(yintercept = 0, color = "grey", linetype = "dashed") +
  geom_line() + 
  facet_wrap(~vrb, scales = "free") + 
  theme_classic() + 
  labs(x = "Iteration", y = "Auto-correlation")

# potential scale reduction factors are high; ideally they should be 1 for each variable:

ggplot(conv, aes(x = .it, y = psrf)) +
  geom_hline(yintercept = 1, color = "grey", linetype = "dashed") +
  geom_line() + 
  facet_wrap(~vrb, scales = "free") + 
  theme_classic() + 
  labs(x = "Iteration", y = "Potential scale reduction factor")

It is clear that this imputation model shows non-convergence, which we have quantified and displayed. The plot produced by the “plot” function shows little or no convergence over iterations for depression score, weight, height and bmi. In the output of the “convergence” function, the auto-correlation is high for weight, height and depression score, which indicates non-convergence. The density plot also demonstrates low plausibility for the weight and height variables. Depression score has better plausibility, but some of the individual depression variables are further away from the observed data, which could indicate lower plausibility.

7.2 Passive Imputation

Now we try to improve our multiple imputation to obtain better convergence. Since we added a column for the depression score, which is the sum of the nine depression items of a respondent, we use passive imputation for this variable. Passive imputation keeps a derived variable consistent with its components during the imputation, so that the derived variable can also serve as a predictor for other variables, which is an advantage over methods such as impute, then transform. This should give us a more accurate imputation of the depression score variable. To implement passive imputation, we adjust two features of the mice() setup. In the method vector, we define the deterministic relation. In the predictor matrix, we keep the transformed variable from being used as a predictor of its raw version.

7.2.1 Passive Imputation for depression score

Currently, missing numeric data are imputed with pmm and the binary variable drink_regularly with logreg. pmm is mostly fine here, since it only imputes values that were actually observed. First, we exclude the id from our predictor matrix and method.

Then, we change the method to implement passive imputation for depression_score.
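The two steps just described can be sketched as follows. This is a minimal illustration on a small simulated data frame with three items, not the code used for the analysis: `toy`, `meth`, `pred` and the item values are hypothetical, and the mice package is assumed to be installed.

```r
library(mice)

set.seed(1)
n <- 100

# Hypothetical toy data: id, age and three depression items
toy <- data.frame(id = seq_len(n),
                  age = sample(20:69, n, replace = TRUE),
                  dep1 = sample(0:3, n, replace = TRUE),
                  dep2 = sample(0:3, n, replace = TRUE),
                  dep3 = sample(0:3, n, replace = TRUE))
toy$dep2[sample(n, 20)] <- NA
toy$dep3[sample(n, 20)] <- NA
toy$score <- toy$dep1 + toy$dep2 + toy$dep3  # NA whenever an item is missing

# Dry run to obtain the default method vector and predictor matrix
ini <- mice(toy, maxit = 0, printFlag = FALSE)
meth <- ini$method
pred <- ini$predictorMatrix

# Step 1: id should not be used as a predictor (rows = targets, columns = predictors)
pred[, "id"] <- 0

# Step 2: passive imputation, score is recomputed from its items in every iteration
meth["score"] <- "~ I(dep1 + dep2 + dep3)"

# Avoid circularity: the derived score must not predict its own items
pred[c("dep1", "dep2", "dep3"), "score"] <- 0

imp_toy <- mice(toy, m = 2, maxit = 5, method = meth,
                predictorMatrix = pred, seed = 1, printFlag = FALSE)

# In a completed data set, the score equals the sum of its items
done <- complete(imp_toy, 1)
all(done$score == done$dep1 + done$dep2 + done$dep3)
```

For the actual data, the same pattern applies with all nine items in the formula and the rows dep1 to dep9 of the depression_score column set to 0.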

The convergence and plausibility look like this after the passive imputation.

After performing passive imputation on our data, we reassess the convergence and plausibility. The convergence plot produced by the “plot” function is similar for weight, height and bmi, while depression score shows much better convergence. The density plot, on the other hand, appears relatively unchanged compared with our first imputation: the weight and height variables still display low plausibility. Therefore, we now apply passive imputation to BMI, which is a deterministic function of weight and height.

7.2.2 Passive Imputation for BMI

We now do passive imputation for BMI.

# BMI, height and weight could be better; the rest looks good. We add one more passive imputation, for BMI:
method["bmi"] <-  "~ I(weight / (height / 100)^2)"
method

# adjusting the predictor matrix to avoid circularity:
(pm1 <- imp1$predictorMatrix)
# bmi is derived from weight and height, so it must not be used to predict them
# (rows are the variables being imputed, columns are their predictors)
pm1[c("weight", "height"), "bmi"] <- 0
pm1
imp2 <- mice(data_incomplete1,
            m = 10, 
            method = method, 
            predictorMatrix = pm1, 
            maxit = 20, 
            seed = 1234567, 
            print = FALSE)


# Checking if the deterministic definition of depression score is maintained in the imputed data:
xyplot(imp1, depression_score ~  I(dep1 + dep2 + dep3 + dep4 + dep5 + dep6 + dep7 + dep8 + dep9), ylab="Imputed depression score", xlab="Calculated depression score")

# Checking if the deterministic definition of BMI is maintained in the imputed data:
xyplot(imp2, bmi ~ I(weight / (height / 100)^2), ylab="Imputed BMI", xlab="Calculated BMI")

# looking at the plots from our model with 2 passive imputations:
plot(imp2) #convergence is better now: also good for bmi, height, weight

densityplot(imp2) #plausibility

stripplot(imp2) #plausibility

# checking the default prediction matrix to find correlating variables: BMI (and height + weight) correlate with depression_score (and two blood pressure variables and household income correlate with BMI). Since it does not include a correlation between age and depression_score, as we have in our research question, we stick to keeping all variables, except the ones removed to avoid circularity from passive imputation. In other words, we stick with the predictor matrix in imp2 due to its fit with our research question.
(pred <- quickpred(data_incomplete1))

The plausibility and convergence for this final imputation have improved. When looking at the convergence plot for weight, height, bmi, and depression score, the convergence is now acceptable. As seen in the density plot, the plausibility of the weight and height variables has greatly improved. We can now be satisfied with the validity and quality of our imputation.

8 Linear regression models with multiple imputation

With our chosen multiple imputation model, we run the analysis for our research question on each imputed data set and pool the results. Pooling combines the estimates from the separate analyses into a single set of estimates and standard errors; it does not produce a single set of imputed values. The pooled estimates reflect the uncertainty that is due to the missing data.
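As an illustration of what pooling does, the following base R sketch applies Rubin's rules by hand to one coefficient. The estimates and standard errors are hypothetical values chosen for illustration, not results from our imputations, and the sketch covers only the pooled estimate and its standard error, not the degrees of freedom that pool() also computes.

```r
# hypothetical per-imputation results for one coefficient (m = 5 imputations)
est <- c(-0.0141, -0.0131, -0.0155, -0.0149, -0.0160)  # estimates
se  <- c(0.0148, 0.0147, 0.0148, 0.0147, 0.0148)       # standard errors
m   <- length(est)

q_bar <- mean(est)                  # pooled estimate: average of the m estimates
u_bar <- mean(se^2)                 # within-imputation variance
b     <- var(est)                   # between-imputation variance
t_var <- u_bar + (1 + 1 / m) * b    # total variance

pooled_se <- sqrt(t_var)
c(estimate = q_bar, std.error = pooled_se)
```

The between-imputation variance b is what makes the pooled standard error larger than the average standard error of the separate analyses: it carries the extra uncertainty caused by the missing values.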

8.1 Linear regression model with age

# analysis for each imp to see if age is a predictor for depression score:
fit <- with(imp2, lm(depression_score ~ age))

# results:
fit$analyses[[1]] %>% summary()
## 
## Call:
## lm(formula = depression_score ~ age)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.987 -3.395 -1.832  1.136 18.224 
## 
## Coefficients:
##             Estimate Std. Error t value     Pr(>|t|)    
## (Intercept)  4.26896    0.69028   6.184 0.0000000013 ***
## age         -0.01409    0.01477  -0.954         0.34    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.742 on 498 degrees of freedom
## Multiple R-squared:  0.001826,   Adjusted R-squared:  -0.0001785 
## F-statistic: 0.9109 on 1 and 498 DF,  p-value: 0.3403
fit$analyses[[2]] %>% coef()
## (Intercept)         age 
##  4.21520922 -0.01311054
# pooling the results and checking them:
poolFit <- pool(fit)
summary(poolFit)
##          term    estimate  std.error  statistic       df           p.value
## 1 (Intercept)  4.27960613 0.68921050  6.2094326 488.5156 0.000000001137687
## 2         age -0.01471554 0.01476219 -0.9968402 485.9189 0.319338232738308

The p-value of the pooled age coefficient is 0.319. Assuming that the null hypothesis is true, an estimate at least this far from zero would be quite likely to occur by chance, so the data provide little evidence in support of the alternative hypothesis. Because the estimate is negative (-0.015), it points to a negative association between age and depression score. However, because the absolute value of -0.015 is very low, any association is weak.
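The p-value can be checked directly from the pooled table. The sketch below copies the estimate, standard error and degrees of freedom for age from the pooled summary above and recomputes the test statistic, the two-sided p-value and a 95% confidence interval; the variable names are our own.

```r
# values copied from the pooled summary for age
estimate  <- -0.01471554
std_error <- 0.01476219
df        <- 485.9189

t_stat  <- estimate / std_error
p_value <- 2 * pt(-abs(t_stat), df = df)   # two-sided p-value

# 95% confidence interval based on the same t distribution
ci <- estimate + c(-1, 1) * qt(0.975, df = df) * std_error

round(c(statistic = t_stat, p.value = p_value, lower = ci[1], upper = ci[2]), 4)
```

The interval includes zero, which is another way of saying that the age coefficient is not significant at the 5% level.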

8.2 Linear regression model with age and bmi

As the regression model above, with the single predictor “age”, does not predict the depression score of individuals well, we decided to add an extra variable, “bmi”, to our analysis. We did this to see whether the independent variables “age” and “bmi” together explain the outcome better. Therefore, we perform a multiple linear regression on the multiply imputed data. We hope that this will provide more interesting results from our imputed data. We again use the imputed data with the adjusted method vector and predictor matrix for the transformed variables “depression score” and “bmi”.

#  Analysis phase: fit the model
fit1  <- with(imp2, lm(`depression_score` ~ age + bmi))

# pooling phase (the result is not named "pool", to avoid masking the pool() function)
poolFit1 <- pool(fit1)
summary(poolFit1)
##          term    estimate  std.error statistic        df   p.value
## 1 (Intercept)  3.14502816 1.03285080  3.044998 124.55489 0.0028400
## 2         age -0.01644018 0.01475595 -1.114139 483.98559 0.2657725
## 3         bmi  0.04046774 0.02755236  1.468758  49.21701 0.1482615

After pooling the estimates from each model into a single set of estimates, we can see that age does not have a significant effect on an individual’s depression score, with a p-value of 0.266. The pooled coefficient for bmi is positive: for each unit that bmi increases, the depression score increases by 0.040 points after controlling for age. However, this relation is not significant either (p = 0.148), so the imputed data give only weak evidence of a higher depression score in those with higher BMIs.

8.3 Linear regression model with completed data

We also fitted a linear regression model on the complete data, to compare with the linear models based on the imputed data. The interpretation of the estimates is given in the discussion section.

model3 <- lm(`depression_score` ~ age + bmi, data = data_complete1)
summary(model3)
## 
## Call:
## lm(formula = depression_score ~ age + bmi, data = data_complete1)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -5.766 -3.112 -1.682  1.209 20.784 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  1.73578    1.04508   1.661  0.09736 . 
## age         -0.02220    0.01401  -1.585  0.11372   
## bmi          0.09609    0.02914   3.297  0.00105 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.491 on 497 degrees of freedom
## Multiple R-squared:  0.02508,    Adjusted R-squared:  0.02116 
## F-statistic: 6.393 on 2 and 497 DF,  p-value: 0.001815

8.4 Compare the models (age and depression score vs. age, bmi and depression score).

# compare R2
pool.r.squared(fit)
##           est       lo 95     hi 95        fmi
## R^2 0.0020169 0.001893261 0.0175916 0.01268108
pool.r.squared(fit1)
##             est         lo 95      hi 95       fmi
## R^2 0.009378508 0.00006144138 0.03977005 0.3122186
summary(model3)$r.squared
## [1] 0.02508027
# compare estimates
D1(fit1,fit)
##    test statistic df1      df2 dfcom   p.value       riv
##  1 ~~ 2   2.15725   1 26.15017   497 0.1538263 0.6424518
anova(model3) # sequential ANOVA for the complete-data model; fit1 is a mira object, not an lm fit
## Analysis of Variance Table
## 
## Response: depression_score
##            Df  Sum Sq Mean Sq F value   Pr(>F)   
## age         1    38.6  38.577  1.9128 0.167269   
## bmi         1   219.3 219.272 10.8727 0.001046 **
## Residuals 497 10023.1  20.167                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In the first model, using only age, the pooled R² is 0.002, whereas for the second, with age and bmi, it is 0.009. The R² is therefore still low in the multiple linear model: adding the predictor “bmi” explains about 0.7 percentage points more of the variability in depression score. The D1 test comparing the two models gives p = 0.154, so in the imputed data adding bmi does not significantly improve the model. In the complete data, the model with age and bmi has an R² of 0.025 and bmi is a significant predictor (p = 0.001), which suggests that a person’s BMI is a better predictor of depression than a person’s age. In the discussion, we will compare our imputed data with the complete data using the tests above.

9 Conclusion and discussion

9.1 Discussion: complete data

When looking at the complete data, we found that the mean age is 44.484 and the mean depression score is 3.564. The variance is 206.651 for age and 20.603 for the depression score. Their correlation is -0.061, which indicates a weak negative correlation between age and depression_score. According to the regression model we made, if age goes up by one year, the depression score goes down by 0.019. The p-value of this effect is 0.171, so there is no strong evidence of a relationship between age and depression_score: in the complete data, the relation is not significant at the 5% level.

9.2 Discussion: deletion based treatments vs. complete data

The negative correlation between age and depression score remains weak but becomes slightly stronger, going from -0.061 in the complete data to -0.1 in the observed data. The estimates of the two linear models show that the R² is slightly higher after deletion; the slope becomes more negative, from -0.019 in the complete data to -0.028 after deletion, and the p-value decreases to 0.0792. In general, the deletion-based treatment slightly strengthens the negative relation between age and depression, but this linear relation remains insignificant.

9.3 Discussion: mean imputed data vs. complete data

Doing the same analysis with the mean imputed data, the correlation between depression score and age goes from -0.061 to -0.0788, so it becomes slightly more negative. Comparing the regression model on the mean imputed data set with the one on the complete data, the coefficient goes up from -0.019 to -0.018, indicating that mean imputation yields a slightly less negative slope. The p-value goes down from 0.171 to 0.0785, indicating that mean imputation puts us closer to a statistically significant relationship than there is in reality.

9.4 Discussion: regression imputation vs. complete data

When performing regression imputation on the missing data, the correlation between age and the depression score changes from -0.061 to -0.121. The percent difference between these two correlations is 65.699%. This means that the relationship is still negative, but the imputation makes the correlation appear stronger than in the complete data. The p-value also changes: in the complete data it is 0.171, whereas after regression imputation it is 0.007. This means that a researcher who used regression imputation on this data set might incorrectly conclude that there is a significant, albeit weak, negative relationship between age and depression score, when in reality the relationship is not significant.
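The percent difference can be reproduced approximately from the rounded correlations. The sketch below assumes that the figure is a symmetric percent difference, that is, the absolute difference divided by the mean of the two magnitudes; this definition is our assumption, and with rounded inputs it gives roughly 65.9%, close to the reported value, which was presumably computed from unrounded correlations.

```r
r_complete   <- -0.061  # correlation in the complete data (rounded)
r_regression <- -0.121  # correlation after regression imputation (rounded)

# symmetric percent difference: absolute difference relative to the mean magnitude
pct_diff <- abs(r_regression - r_complete) /
  mean(abs(c(r_regression, r_complete))) * 100
pct_diff
```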

9.5 Discussion: multiple imputation

We cannot reject the null hypothesis of the first research question, because we found no significant relationship between age and depression score using our imputed data. For the second research question, the pooled estimates in section 8.2 also show no significant relationship between bmi and depression score (p = 0.148), so we cannot reject that null hypothesis either. In the complete data, the p-value of the relationship between age and depression score was 0.114, whereas in the imputed data it was 0.266. For bmi and depression score, the p-value was 0.001 in the complete data and 0.148 in the imputed data. For age, this difference does not affect whether or not the null hypothesis is rejected; for bmi, it does. The estimate for age in the complete data was -0.022, whereas it was -0.016 in the imputed data, so the relationship is negative and weak in both cases. For bmi, the estimate was 0.096 in the complete data and 0.040 in the imputed data, so the relationship is positive in both cases but smaller after imputation. A researcher studying either the complete data or our imputed data would therefore reach the same conclusion about age: there is a weak, negative, insignificant relationship between age and depression score. For bmi, both data sets show a weak, positive relationship with depression score, but it is significant only in the complete data. We conclude that our MI was adequate for the first research question, while for the second it recovered the direction of the relationship but not its significance.

9.6 Conclusion: comparing the complete data to all the imputed versions

The complete data show no significant relationship between age and depression score: we obtain a correlation estimate of -0.061 with a p-value of 0.171. The ad-hoc method that comes closest to this is mean imputation, with a correlation of -0.0788 between age and depression score. This is closer than both regression imputation and the deletion-based treatment, which had correlations of -0.121 and -0.1, respectively. The p-values were similar for the mean imputed data and the deletion-based treatment (0.0785 and 0.0792). Both are closer to significance (p < 0.05) than the complete-data p-value of 0.171, but both are still insignificant. The gap with the complete data indicates that both treatments may have introduced bias or altered the relationship between the variables.

Listwise deletion removes all observations that have missing values in any of the variables used in the analysis. This can lead to a reduction in the sample size and a change in the composition of the sample. As a result, the relationships between the variables in the analysis may differ from those in the complete data. Mean imputation can change the mean and variance of the imputed variable, which can affect the correlation or regression coefficients between the imputed variable and other variables in the analysis. The regression imputation had a very different p-value of 0.007, which would indicate that the relationship between age and depression score was in fact significant. This is because regression imputation replaces missing values by predictions, which artificially increases correlations.

The multiple imputation method reproduced the complete-data conclusion for the first research question: the imputed data showed a weak, negative, insignificant relationship between age and depression score. For BMI, the imputed data showed a weak, positive relationship with depression score, as in the complete data, but the pooled estimate was not significant (p = 0.148), whereas the complete-data estimate was (p = 0.001). Of the methods we compared, regression imputation was the farthest from our true data. This shows how certain methods of dealing with missing data can be dangerous in statistics, and may lead a researcher to incorrect conclusions. None of the ad-hoc imputation methods came particularly close to the true data; but while the deletion-based treatment and mean imputation would have led a researcher to the same conclusion as with the complete data (cannot reject the null hypothesis), regression imputation would have led a researcher to an incorrect conclusion (can reject the null hypothesis).
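The effects of mean and regression imputation described above can be illustrated with a small simulation in base R. This is a sketch on simulated data, not on the NHANES data: the variables x and y, the missingness mechanism (completely at random) and the 30% missingness rate are all assumptions made for illustration.

```r
set.seed(42)
n <- 500
x <- rnorm(n)
y <- -0.1 * x + rnorm(n)          # weak negative relation, as a stand-in
miss <- sample(n, 150)            # 30% of y set to missing, completely at random
y_obs <- y
y_obs[miss] <- NA

# mean imputation: every missing value gets the observed mean
y_mean <- ifelse(is.na(y_obs), mean(y_obs, na.rm = TRUE), y_obs)

# regression imputation: missing values are placed exactly on the fitted line
fit_cc <- lm(y_obs ~ x)           # lm() drops the incomplete rows
y_reg  <- ifelse(is.na(y_obs),
                 predict(fit_cc, newdata = data.frame(x = x)),
                 y_obs)

c(var_observed   = var(y_obs, na.rm = TRUE),
  var_mean_imp   = var(y_mean),
  cor_listwise   = cor(x, y_obs, use = "complete.obs"),
  cor_regression = cor(x, y_reg))
```

Mean imputation adds values with zero deviation from the mean, so the variance of the imputed variable shrinks. Regression imputation adds points without residual noise, so the correlation with the predictor becomes stronger than in the complete cases.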

9.7 Limitations

There are some limitations to the models and analysis we used in this assignment that should be discussed. For example, as seen in the first imputation, the default imputation settings in MICE were not appropriate for our missing data and led to non-convergence and somewhat lower plausibility on several variables. This non-convergence could imply that the imputation process did not work effectively, and the imputed data may not be accurate. Using the default methods in MICE may also limit flexibility and make it more difficult to fine-tune the imputation process, and the defaults are difficult to apply to larger data sets.

While our final imputation had acceptable convergence and plausibility that was an improvement over the default MICE imputation, it also presents its own issues. For example, while the variables height, bmi, and weight all converged more from the first to the final passive imputation, the depression score had weaker convergence. While it was still acceptable, this should be taken into account.

9.8 Potential improvements

In general, our imputed values were plausible. However, there were some outliers in the imputed data that did not occur in the observed data. This could be because the imputed values for some variables have a strong correlation that is not supported by the observed data. More investigation into this matter is a potential improvement.

10 Appendix

Comparison of the correlation coefficients per variable. The Correlation1 versions indicate the correlation coefficient for the variable in the complete dataset. The Correlation2 versions refer to the incomplete variant.

##                          Variable Correlation1.id Correlation1.sex
## id                             id     1.000000000     -0.009061094
## sex                           sex    -0.009061094      1.000000000
## age                           age    -0.025562771      0.023098114
## ethnicity               ethnicity    -0.014312468     -0.039378345
## education               education     0.021615281      0.035896163
## marital                   marital    -0.038175760     -0.024443810
## household_size     household_size    -0.004221008      0.011382283
## household_income household_income     0.026733200     -0.061294379
## weight                     weight    -0.025072187     -0.272949497
## height                     height     0.021715970     -0.689623117
## bmi                           bmi    -0.038143481      0.042259004
## pulse                       pulse    -0.035046305      0.114582565
## bp_sys1                   bp_sys1    -0.043128147     -0.152286887
## bp_dia1                   bp_dia1     0.027575190     -0.171757617
## bp_sys2                   bp_sys2    -0.085713697     -0.162891182
## bp_dia2                   bp_dia2     0.016405141     -0.165781630
## time_sed                 time_sed    -0.003417041     -0.010580475
## drink_regularly   drink_regularly    -0.061661981      0.242276137
## days_drinking       days_drinking     0.086961928     -0.242475355
## dep1                         dep1     0.043327849      0.078192362
## dep2                         dep2    -0.003542063      0.076917166
## dep3                         dep3    -0.049868469      0.093177194
## dep4                         dep4    -0.074186403      0.117824652
## dep5                         dep5     0.042081421      0.149689429
## dep6                         dep6     0.018947108      0.125447659
## dep7                         dep7    -0.059105937      0.061608268
## dep8                         dep8    -0.013813271      0.039328621
## dep9                         dep9     0.014889936     -0.061537017
##                  Correlation1.age Correlation1.ethnicity Correlation1.education
## id                   -0.025562771           -0.014312468          0.02161528085
## sex                   0.023098114           -0.039378345          0.03589616322
## age                   1.000000000            0.050276740         -0.03166380020
## ethnicity             0.050276740            1.000000000          0.27114656866
## education            -0.031663800            0.271146569          1.00000000000
## marital              -0.357051255            0.046756433         -0.11158952372
## household_size       -0.256527457           -0.200457902         -0.21996950963
## household_income      0.084346917            0.085376036          0.40409105330
## weight                0.007704952            0.121989730          0.08578563886
## height               -0.089705102            0.306217362          0.24425090272
## bmi                   0.061898028           -0.005645942         -0.02006648434
## pulse                -0.133970475           -0.001017479         -0.09184197983
## bp_sys1               0.381669741            0.075425374         -0.11078463752
## bp_dia1               0.172956118            0.046217061          0.00001281824
## bp_sys2               0.362182470            0.061260550         -0.11513535782
## bp_dia2               0.148466542            0.072380706          0.01148348609
## time_sed             -0.077221674            0.146806236          0.28944748770
## drink_regularly       0.053365358            0.110925780         -0.09444376507
## days_drinking         0.073618466            0.033219351          0.13794354640
## dep1                 -0.018016843           -0.016426778         -0.16548657026
## dep2                 -0.027229023           -0.029987944         -0.10168072895
## dep3                 -0.041670731            0.015574452         -0.09954968863
## dep4                 -0.014735481            0.026261028         -0.09390922774
## dep5                 -0.091662265            0.015469653         -0.11351831383
## dep6                 -0.065780258            0.015006119         -0.04712415911
## dep7                 -0.055878734            0.027889313         -0.07873697227
## dep8                 -0.071050606           -0.029223534         -0.15573099493
## dep9                  0.022371793           -0.079233176         -0.13734538365
##                  Correlation1.marital Correlation1.household_size
## id                       -0.038175760                -0.004221008
## sex                      -0.024443810                 0.011382283
## age                      -0.357051255                -0.256527457
## ethnicity                 0.046756433                -0.200457902
## education                -0.111589524                -0.219969510
## marital                   1.000000000                -0.013618380
## household_size           -0.013618380                 1.000000000
## household_income         -0.266191931                 0.051199072
## weight                   -0.029178776                -0.095064813
## height                    0.037691375                -0.150761239
## bmi                      -0.050470988                -0.031230985
## pulse                     0.064625767                 0.069983608
## bp_sys1                  -0.042862289                -0.109454330
## bp_dia1                  -0.081959670                -0.010605769
## bp_sys2                  -0.027155085                -0.110530816
## bp_dia2                  -0.058525366                -0.040340758
## time_sed                  0.118101884                -0.199365701
## drink_regularly          -0.049937065                 0.060445050
## days_drinking            -0.009034375                -0.139235885
## dep1                      0.076991432                 0.003348289
## dep2                      0.075286584                -0.024622488
## dep3                      0.078296909                -0.043335704
## dep4                      0.076831300                -0.011497899
## dep5                      0.149920678                -0.051477777
## dep6                      0.095092075                -0.017623426
## dep7                      0.068949030                -0.031058651
## dep8                      0.120500874                 0.049013792
## dep9                      0.078166467                 0.154825990
##                  Correlation1.household_income Correlation1.weight
## id                                  0.02673320        -0.025072187
## sex                                -0.06129438        -0.272949497
## age                                 0.08434692         0.007704952
## ethnicity                           0.08537604         0.121989730
## education                           0.40409105         0.085785639
## marital                            -0.26619193        -0.029178776
## household_size                      0.05119907        -0.095064813
## household_income                    1.00000000        -0.039477663
## weight                             -0.03947766         1.000000000
## height                              0.16008286         0.415429394
## bmi                                -0.12162978         0.886379420
## pulse                              -0.11942971         0.152457915
## bp_sys1                            -0.06657151         0.202607336
## bp_dia1                             0.06082786         0.237188551
## bp_sys2                            -0.05956108         0.216514575
## bp_dia2                             0.06363893         0.209744411
## time_sed                            0.12569705         0.118132995
## drink_regularly                    -0.09979702        -0.069555839
## days_drinking                       0.19435804         0.022801412
## dep1                               -0.14523184         0.054062624
## dep2                               -0.16656517         0.035842786
## dep3                               -0.19720311         0.128597838
## dep4                               -0.13992236         0.083566675
## dep5                               -0.22931647         0.148976260
## dep6                               -0.13011837         0.060056746
## dep7                               -0.07947655         0.054910755
## dep8                               -0.13404338         0.018019985
## dep9                               -0.05513571         0.011873820
##                  Correlation1.height Correlation1.bmi Correlation1.pulse
## id                       0.021715970     -0.038143481       -0.035046305
## sex                     -0.689623117      0.042259004        0.114582565
## age                     -0.089705102      0.061898028       -0.133970475
## ethnicity                0.306217362     -0.005645942       -0.001017479
## education                0.244250903     -0.020066484       -0.091841980
## marital                  0.037691375     -0.050470988        0.064625767
## household_size          -0.150761239     -0.031230985        0.069983608
## household_income         0.160082860     -0.121629779       -0.119429710
## weight                   0.415429394      0.886379420        0.152457915
## height                   1.000000000     -0.038822222       -0.082254825
## bmi                     -0.038822222      1.000000000        0.211810098
## pulse                   -0.082254825      0.211810098        1.000000000
## bp_sys1                  0.060416606      0.204099015       -0.007525159
## bp_dia1                  0.195226206      0.167590329        0.170867088
## bp_sys2                  0.084863522      0.206465963        0.011703570
## bp_dia2                  0.182779568      0.144101143        0.150902776
## time_sed                 0.136526507      0.059674333        0.062559283
## drink_regularly         -0.213908045      0.029184378        0.067822017
## days_drinking            0.271969296     -0.108446083       -0.083106304
## dep1                    -0.019397815      0.069563577        0.086510403
## dep2                    -0.033075015      0.057931789        0.120457212
## dep3                    -0.040010495      0.163298313        0.175416295
## dep4                    -0.034741878      0.110121565        0.122311217
## dep5                    -0.083071197      0.203279834        0.182721912
## dep6                    -0.039009098      0.080526726        0.087055045
## dep7                     0.016825197      0.044896165        0.089460416
## dep8                    -0.050534781      0.039038055        0.127174396
## dep9                     0.009213944      0.009429586        0.023627604
##                  Correlation1.bp_sys1 Correlation1.bp_dia1 Correlation1.bp_sys2
## id                       -0.043128147        0.02757519015         -0.085713697
## sex                      -0.152286887       -0.17175761718         -0.162891182
## age                       0.381669741        0.17295611793          0.362182470
## ethnicity                 0.075425374        0.04621706087          0.061260550
## education                -0.110784638        0.00001281824         -0.115135358
## marital                  -0.042862289       -0.08195966954         -0.027155085
## household_size           -0.109454330       -0.01060576933         -0.110530816
## household_income         -0.066571510        0.06082786269         -0.059561079
## weight                    0.202607336        0.23718855145          0.216514575
## height                    0.060416606        0.19522620621          0.084863522
## bmi                       0.204099015        0.16759032873          0.206465963
## pulse                    -0.007525159        0.17086708756          0.011703570
## bp_sys1                   1.000000000        0.48300367406          0.926381496
## bp_dia1                   0.483003674        1.00000000000          0.497291337
## bp_sys2                   0.926381496        0.49729133716          1.000000000
## bp_dia2                   0.466530196        0.86854092345          0.487850540
## time_sed                 -0.003782454        0.00217839475          0.022736306
## drink_regularly          -0.026805726       -0.05766066444         -0.013818842
## days_drinking             0.148590830        0.11720639751          0.124082412
## dep1                     -0.029932158        0.06072577747         -0.021097356
## dep2                     -0.061219730        0.04323727956         -0.045289054
## dep3                      0.009937245        0.05751556453         -0.001432162
## dep4                     -0.026042674        0.02740026232         -0.022462736
## dep5                     -0.047371796        0.03118606632         -0.051347675
## dep6                     -0.059681989       -0.02615469201         -0.050878215
## dep7                     -0.102765982       -0.04345256967         -0.071946080
## dep8                     -0.025918588       -0.00458085726          0.013760427
## dep9                      0.055780801        0.07656178684          0.067937920
##                  Correlation1.bp_dia2 Correlation1.time_sed
## id                        0.016405141          -0.003417041
## sex                      -0.165781630          -0.010580475
## age                       0.148466542          -0.077221674
## ethnicity                 0.072380706           0.146806236
## education                 0.011483486           0.289447488
## marital                  -0.058525366           0.118101884
## household_size           -0.040340758          -0.199365701
## household_income          0.063638934           0.125697050
## weight                    0.209744411           0.118132995
## height                    0.182779568           0.136526507
## bmi                       0.144101143           0.059674333
## pulse                     0.150902776           0.062559283
## bp_sys1                   0.466530196          -0.003782454
## bp_dia1                   0.868540923           0.002178395
## bp_sys2                   0.487850540           0.022736306
## bp_dia2                   1.000000000          -0.002446444
## time_sed                 -0.002446444           1.000000000
## drink_regularly          -0.032452292          -0.055598865
## days_drinking             0.112310342           0.116689916
## dep1                      0.066765423           0.026518491
## dep2                      0.054028153           0.068057904
## dep3                      0.065502380           0.030903810
## dep4                      0.076847235           0.078405683
## dep5                      0.031672746           0.051348399
## dep6                      0.019440766           0.084268961
## dep7                      0.028167515           0.052525857
## dep8                      0.026626086          -0.032895905
## dep9                      0.106402543          -0.101067153
##                  Correlation1.drink_regularly Correlation1.days_drinking
## id                                -0.06166198                0.086961928
## sex                                0.24227614               -0.242475355
## age                                0.05336536                0.073618466
## ethnicity                          0.11092578                0.033219351
## education                         -0.09444377                0.137943546
## marital                           -0.04993706               -0.009034375
## household_size                     0.06044505               -0.139235885
## household_income                  -0.09979702                0.194358042
## weight                            -0.06955584                0.022801412
## height                            -0.21390805                0.271969296
## bmi                                0.02918438               -0.108446083
## pulse                              0.06782202               -0.083106304
## bp_sys1                           -0.02680573                0.148590830
## bp_dia1                           -0.05766066                0.117206398
## bp_sys2                           -0.01381884                0.124082412
## bp_dia2                           -0.03245229                0.112310342
## time_sed                          -0.05559887                0.116689916
## drink_regularly                    1.00000000               -0.311575058
## days_drinking                     -0.31157506                1.000000000
## dep1                               0.01354651               -0.091712111
## dep2                               0.04349455               -0.103534470
## dep3                               0.04195793               -0.091562615
## dep4                               0.04487285               -0.106563193
## dep5                               0.01366844               -0.060791679
## dep6                               0.01594800               -0.071961869
## dep7                               0.03999663               -0.102261637
## dep8                               0.04208418               -0.042997630
## dep9                               0.06764118               -0.034195320
##                  Correlation1.dep1 Correlation1.dep2 Correlation1.dep3
## id                     0.043327849      -0.003542063      -0.049868469
## sex                    0.078192362       0.076917166       0.093177194
## age                   -0.018016843      -0.027229023      -0.041670731
## ethnicity             -0.016426778      -0.029987944       0.015574452
## education             -0.165486570      -0.101680729      -0.099549689
## marital                0.076991432       0.075286584       0.078296909
## household_size         0.003348289      -0.024622488      -0.043335704
## household_income      -0.145231837      -0.166565166      -0.197203111
## weight                 0.054062624       0.035842786       0.128597838
## height                -0.019397815      -0.033075015      -0.040010495
## bmi                    0.069563577       0.057931789       0.163298313
## pulse                  0.086510403       0.120457212       0.175416295
## bp_sys1               -0.029932158      -0.061219730       0.009937245
## bp_dia1                0.060725777       0.043237280       0.057515565
## bp_sys2               -0.021097356      -0.045289054      -0.001432162
## bp_dia2                0.066765423       0.054028153       0.065502380
## time_sed               0.026518491       0.068057904       0.030903810
## drink_regularly        0.013546512       0.043494545       0.041957926
## days_drinking         -0.091712111      -0.103534470      -0.091562615
## dep1                   1.000000000       0.531659479       0.355425881
## dep2                   0.531659479       1.000000000       0.430426607
## dep3                   0.355425881       0.430426607       1.000000000
## dep4                   0.478628650       0.516163119       0.515146065
## dep5                   0.389892773       0.380391220       0.516265612
## dep6                   0.451876269       0.612566401       0.416029788
## dep7                   0.407885666       0.506828795       0.371891354
## dep8                   0.302860623       0.387922544       0.310417331
## dep9                   0.200882083       0.359070197       0.277796046
##                  Correlation1.dep4 Correlation1.dep5 Correlation1.dep6
## id                     -0.07418640        0.04208142        0.01894711
## sex                     0.11782465        0.14968943        0.12544766
## age                    -0.01473548       -0.09166227       -0.06578026
## ethnicity               0.02626103        0.01546965        0.01500612
## education              -0.09390923       -0.11351831       -0.04712416
## marital                 0.07683130        0.14992068        0.09509207
## household_size         -0.01149790       -0.05147778       -0.01762343
## household_income       -0.13992236       -0.22931647       -0.13011837
## weight                  0.08356667        0.14897626        0.06005675
## height                 -0.03474188       -0.08307120       -0.03900910
## bmi                     0.11012156        0.20327983        0.08052673
## pulse                   0.12231122        0.18272191        0.08705505
## bp_sys1                -0.02604267       -0.04737180       -0.05968199
## bp_dia1                 0.02740026        0.03118607       -0.02615469
## bp_sys2                -0.02246274       -0.05134768       -0.05087821
## bp_dia2                 0.07684724        0.03167275        0.01944077
## time_sed                0.07840568        0.05134840        0.08426896
## drink_regularly         0.04487285        0.01366844        0.01594800
## days_drinking          -0.10656319       -0.06079168       -0.07196187
## dep1                    0.47862865        0.38989277        0.45187627
## dep2                    0.51616312        0.38039122        0.61256640
## dep3                    0.51514607        0.51626561        0.41602979
## dep4                    1.00000000        0.50762303        0.41582328
## dep5                    0.50762303        1.00000000        0.40840103
## dep6                    0.41582328        0.40840103        1.00000000
## dep7                    0.44728951        0.28789904        0.52018848
## dep8                    0.31313090        0.33627657        0.42075653
## dep9                    0.22062947        0.16880761        0.33052440
##                  Correlation1.dep7 Correlation1.dep8 Correlation1.dep9
## id                     -0.05910594      -0.013813271       0.014889936
## sex                     0.06160827       0.039328621      -0.061537017
## age                    -0.05587873      -0.071050606       0.022371793
## ethnicity               0.02788931      -0.029223534      -0.079233176
## education              -0.07873697      -0.155730995      -0.137345384
## marital                 0.06894903       0.120500874       0.078166467
## household_size         -0.03105865       0.049013792       0.154825990
## household_income       -0.07947655      -0.134043377      -0.055135708
## weight                  0.05491076       0.018019985       0.011873820
## height                  0.01682520      -0.050534781       0.009213944
## bmi                     0.04489616       0.039038055       0.009429586
## pulse                   0.08946042       0.127174396       0.023627604
## bp_sys1                -0.10276598      -0.025918588       0.055780801
## bp_dia1                -0.04345257      -0.004580857       0.076561787
## bp_sys2                -0.07194608       0.013760427       0.067937920
## bp_dia2                 0.02816752       0.026626086       0.106402543
## time_sed                0.05252586      -0.032895905      -0.101067153
## drink_regularly         0.03999663       0.042084183       0.067641179
## days_drinking          -0.10226164      -0.042997630      -0.034195320
## dep1                    0.40788567       0.302860623       0.200882083
## dep2                    0.50682879       0.387922544       0.359070197
## dep3                    0.37189135       0.310417331       0.277796046
## dep4                    0.44728951       0.313130895       0.220629469
## dep5                    0.28789904       0.336276573       0.168807606
## dep6                    0.52018848       0.420756525       0.330524395
## dep7                    1.00000000       0.461019386       0.237190291
## dep8                    0.46101939       1.000000000       0.229995578
## dep9                    0.23719029       0.229995578       1.000000000
##                  Correlation2.id Correlation2.sex Correlation2.age
## id                   1.000000000      -0.05317545     -0.002276105
## sex                 -0.053175454       1.00000000     -0.063952534
## age                 -0.002276105      -0.06395253      1.000000000
## ethnicity           -0.035437642      -0.17985483      0.094081873
## education           -0.003002854      -0.04289781     -0.002330056
## marital              0.027369336       0.12969303     -0.347414999
## household_size       0.061231696       0.15521282     -0.346197982
## household_income    -0.033751166      -0.10301234     -0.030968901
## weight              -0.020770620      -0.30441606      0.053511978
## height              -0.003927402      -0.72253226     -0.021830956
## bmi                 -0.021902239       0.02515022      0.095499834
## pulse               -0.054038979       0.16614422     -0.129248337
## bp_sys1              0.021249306      -0.28159678      0.395539507
## bp_dia1              0.035232574      -0.03699273      0.176862727
## bp_sys2              0.013395807      -0.31524051      0.397004628
## bp_dia2              0.038168565      -0.04151953      0.157760752
## time_sed             0.091144103      -0.09773092     -0.171538179
## drink_regularly      0.116209870       0.12679009     -0.011966665
## days_drinking       -0.011595180      -0.24739987      0.051273908
## dep1                 0.131402973       0.01769036     -0.125731081
## dep2                 0.035251327       0.02870827     -0.106066244
## dep3                -0.013550269       0.13594785     -0.018909433
## dep4                -0.047593785       0.15351734     -0.130332994
## dep5                 0.106406993       0.16571963     -0.169892392
## dep6                 0.086418021       0.03935623     -0.022402285
## dep7                -0.047095282       0.15010310     -0.151210760
## dep8                 0.101114771       0.10655202     -0.141801472
## dep9                 0.001234142      -0.01798982     -0.015020876
##                  Correlation2.ethnicity Correlation2.education
## id                         -0.035437642           -0.003002854
## sex                        -0.179854832           -0.042897808
## age                         0.094081873           -0.002330056
## ethnicity                   1.000000000            0.205012197
## education                   0.205012197            1.000000000
## marital                     0.118635238           -0.059525948
## household_size             -0.315564020           -0.307730824
## household_income            0.043563631            0.324451009
## weight                      0.241620888            0.077728264
## height                      0.404234125            0.287553219
## bmi                         0.089883360           -0.050273227
## pulse                       0.116095769           -0.087687403
## bp_sys1                     0.132275904            0.044164422
## bp_dia1                     0.114125051            0.033329734
## bp_sys2                     0.121183962            0.033293549
## bp_dia2                     0.115024449            0.105443445
## time_sed                    0.101396736            0.329169644
## drink_regularly             0.050131549           -0.046961454
## days_drinking               0.153414092            0.130718354
## dep1                        0.009757537           -0.007678133
## dep2                        0.124161734            0.021260763
## dep3                        0.070259535           -0.242021635
## dep4                        0.093364780           -0.052406423
## dep5                        0.060258975           -0.199214036
## dep6                        0.044150074           -0.043504103
## dep7                       -0.050950161           -0.001121101
## dep8                        0.032973997           -0.146332016
## dep9                        0.054876685           -0.031959361
##                  Correlation2.marital Correlation2.household_size
## id                      0.02736933567                 0.061231696
## sex                     0.12969303476                 0.155212815
## age                    -0.34741499943                -0.346197982
## ethnicity               0.11863523770                -0.315564020
## education              -0.05952594778                -0.307730824
## marital                 1.00000000000                 0.060803128
## household_size          0.06080312781                 1.000000000
## household_income       -0.26054516459                -0.036424322
## weight                  0.05686866445                -0.169090175
## height                  0.02308751299                -0.281971238
## bmi                     0.03896272508                -0.067980008
## pulse                   0.00004389539                 0.128496999
## bp_sys1                -0.06932594639                -0.173709027
## bp_dia1                -0.00087850761                 0.008893365
## bp_sys2                -0.06459353492                -0.128000070
## bp_dia2                 0.00612434602                -0.042865748
## time_sed                0.20853666431                -0.161443254
## drink_regularly        -0.01543425792                 0.125171867
## days_drinking           0.10353540727                -0.164332991
## dep1                    0.12922839057                -0.018798947
## dep2                    0.19778947669                -0.001860202
## dep3                    0.09738091521                 0.044880660
## dep4                    0.15615839353                 0.039257738
## dep5                    0.23620555104                 0.029350780
## dep6                    0.14533888582                -0.043106569
## dep7                    0.11296017305                 0.088616396
## dep8                    0.19322949107                 0.033152707
## dep9                    0.14519070167                 0.122896158
##                  Correlation2.household_income Correlation2.weight
## id                                -0.033751166         -0.02077062
## sex                               -0.103012341         -0.30441606
## age                               -0.030968901          0.05351198
## ethnicity                          0.043563631          0.24162089
## education                          0.324451009          0.07772826
## marital                           -0.260545165          0.05686866
## household_size                    -0.036424322         -0.16909017
## household_income                   1.000000000          0.04816387
## weight                             0.048163873          1.00000000
## height                             0.172106834          0.47330089
## bmi                               -0.030371914          0.88505114
## pulse                              0.045798547          0.20259261
## bp_sys1                            0.029065977          0.27989266
## bp_dia1                           -0.013716254          0.16079838
## bp_sys2                            0.053269188          0.26796311
## bp_dia2                            0.033881106          0.14935624
## time_sed                           0.247685742          0.14077805
## drink_regularly                   -0.111498516         -0.02618593
## days_drinking                      0.120322164          0.18221234
## dep1                              -0.083624100          0.24874122
## dep2                              -0.122138483          0.15931382
## dep3                              -0.108340910          0.17249174
## dep4                              -0.012093979          0.17099272
## dep5                              -0.207468754          0.22497615
## dep6                              -0.144415291          0.07451680
## dep7                              -0.023387590          0.08447968
## dep8                              -0.097552281          0.12106640
## dep9                               0.009153002          0.15863236
##                  Correlation2.height Correlation2.bmi Correlation2.pulse
## id                      -0.003927402      -0.02190224     -0.05403897861
## sex                     -0.722532256       0.02515022      0.16614422499
## age                     -0.021830956       0.09549983     -0.12924833661
## ethnicity                0.404234125       0.08988336      0.11609576927
## education                0.287553219      -0.05027323     -0.08768740294
## marital                  0.023087513       0.03896273      0.00004389539
## household_size          -0.281971238      -0.06798001      0.12849699862
## household_income         0.172106834      -0.03037191      0.04579854651
## weight                   0.473300889       0.88505114      0.20259261187
## height                   1.000000000       0.02406659     -0.03645864211
## bmi                      0.024066587       1.00000000      0.24084018635
## pulse                   -0.036458642       0.24084019      1.00000000000
## bp_sys1                  0.203543405       0.21588066     -0.02418255263
## bp_dia1                  0.044673159       0.16128707      0.16230171222
## bp_sys2                  0.255127402       0.18314912     -0.02280450108
## bp_dia2                  0.068085621       0.13628992      0.09133460881
## time_sed                 0.220536773       0.03993516      0.02534172064
## drink_regularly         -0.111972367       0.03071495      0.02103098794
## days_drinking            0.340214150       0.02402465     -0.05042143991
## dep1                     0.075345786       0.22507851      0.09866183429
## dep2                     0.018000752       0.17029592      0.05637505074
## dep3                    -0.078834207       0.22649222      0.25532351821
## dep4                    -0.064666372       0.21935462      0.02756444068
## dep5                    -0.133141369       0.31084715      0.09709907081
## dep6                    -0.009180345       0.08400255     -0.00708692131
## dep7                    -0.036462633       0.11564580      0.12192189315
## dep8                    -0.057950727       0.13907595      0.13871640910
## dep9                     0.057628941       0.14729224      0.02413566675
##                  Correlation2.bp_sys1 Correlation2.bp_dia1 Correlation2.bp_sys2
## id                        0.021249306         0.0352325744          0.013395807
## sex                      -0.281596777        -0.0369927263         -0.315240512
## age                       0.395539507         0.1768627275          0.397004628
## ethnicity                 0.132275904         0.1141250515          0.121183962
## education                 0.044164422         0.0333297342          0.033293549
## marital                  -0.069325946        -0.0008785076         -0.064593535
## household_size           -0.173709027         0.0088933651         -0.128000070
## household_income          0.029065977        -0.0137162536          0.053269188
## weight                    0.279892656         0.1607983834          0.267963107
## height                    0.203543405         0.0446731587          0.255127402
## bmi                       0.215880663         0.1612870653          0.183149116
## pulse                    -0.024182553         0.1623017122         -0.022804501
## bp_sys1                   1.000000000         0.4949457172          0.919915343
## bp_dia1                   0.494945717         1.0000000000          0.482730495
## bp_sys2                   0.919915343         0.4827304946          1.000000000
## bp_dia2                   0.450601249         0.8845695997          0.459109573
## time_sed                  0.031224411        -0.0383178177          0.046892537
## drink_regularly          -0.095005305         0.0343748192         -0.119492100
## days_drinking             0.192123023         0.1583876956          0.194125369
## dep1                     -0.080327579         0.0008762567         -0.023457433
## dep2                      0.007780815         0.1014952948         -0.001674213
## dep3                      0.086869817         0.1845356805          0.072666014
## dep4                      0.020892273         0.0601421231          0.022045134
## dep5                      0.070289844         0.1351529625          0.032105826
## dep6                     -0.052080069        -0.0211730064         -0.041019017
## dep7                     -0.125935014        -0.0752176411         -0.067651846
## dep8                      0.021267349         0.0481398418         -0.008103926
## dep9                      0.087144809         0.1012306737          0.065230696
##                  Correlation2.bp_dia2 Correlation2.time_sed
## id                        0.038168565           0.091144103
## sex                      -0.041519534          -0.097730920
## age                       0.157760752          -0.171538179
## ethnicity                 0.115024449           0.101396736
## education                 0.105443445           0.329169644
## marital                   0.006124346           0.208536664
## household_size           -0.042865748          -0.161443254
## household_income          0.033881106           0.247685742
## weight                    0.149356243           0.140778051
## height                    0.068085621           0.220536773
## bmi                       0.136289921           0.039935162
## pulse                     0.091334609           0.025341721
## bp_sys1                   0.450601249           0.031224411
## bp_dia1                   0.884569600          -0.038317818
## bp_sys2                   0.459109573           0.046892537
## bp_dia2                   1.000000000          -0.001216683
## time_sed                 -0.001216683           1.000000000
## drink_regularly          -0.002095607          -0.218078067
## days_drinking             0.131839456           0.288441702
## dep1                      0.036848526           0.047947832
## dep2                      0.063765883           0.120509227
## dep3                      0.143204430          -0.036670858
## dep4                      0.105911308           0.015434764
## dep5                      0.101878824           0.042338131
## dep6                     -0.033636785           0.005146929
## dep7                     -0.076340986           0.040743386
## dep8                      0.054925393           0.088082824
## dep9                      0.088322140           0.054415820
##                  Correlation2.drink_regularly Correlation2.days_drinking
## id                                0.116209870                -0.01159518
## sex                               0.126790089                -0.24739987
## age                              -0.011966665                 0.05127391
## ethnicity                         0.050131549                 0.15341409
## education                        -0.046961454                 0.13071835
## marital                          -0.015434258                 0.10353541
## household_size                    0.125171867                -0.16433299
## household_income                 -0.111498516                 0.12032216
## weight                           -0.026185929                 0.18221234
## height                           -0.111972367                 0.34021415
## bmi                               0.030714949                 0.02402465
## pulse                             0.021030988                -0.05042144
## bp_sys1                          -0.095005305                 0.19212302
## bp_dia1                           0.034374819                 0.15838770
## bp_sys2                          -0.119492100                 0.19412537
## bp_dia2                          -0.002095607                 0.13183946
## time_sed                         -0.218078067                 0.28844170
## drink_regularly                   1.000000000                -0.30302913
## days_drinking                    -0.303029134                 1.00000000
## dep1                             -0.167616110                 0.01606481
## dep2                             -0.037649562                -0.05630038
## dep3                             -0.018145893                -0.06264327
## dep4                             -0.156416401                -0.08323949
## dep5                             -0.083941151                -0.02280718
## dep6                             -0.003833638                -0.11723723
## dep7                             -0.052426078                -0.16428494
## dep8                             -0.012342702                 0.08353104
## dep9                              0.132178372                -0.08778648
##                  Correlation2.dep1 Correlation2.dep2 Correlation2.dep3
## id                    0.1314029735       0.035251327       -0.01355027
## sex                   0.0176903573       0.028708275        0.13594785
## age                  -0.1257310813      -0.106066244       -0.01890943
## ethnicity             0.0097575373       0.124161734        0.07025954
## education            -0.0076781334       0.021260763       -0.24202163
## marital               0.1292283906       0.197789477        0.09738092
## household_size       -0.0187989473      -0.001860202        0.04488066
## household_income     -0.0836241002      -0.122138483       -0.10834091
## weight                0.2487412200       0.159313815        0.17249174
## height                0.0753457859       0.018000752       -0.07883421
## bmi                   0.2250785129       0.170295924        0.22649222
## pulse                 0.0986618343       0.056375051        0.25532352
## bp_sys1              -0.0803275788       0.007780815        0.08686982
## bp_dia1               0.0008762567       0.101495295        0.18453568
## bp_sys2              -0.0234574335      -0.001674213        0.07266601
## bp_dia2               0.0368485263       0.063765883        0.14320443
## time_sed              0.0479478318       0.120509227       -0.03667086
## drink_regularly      -0.1676161096      -0.037649562       -0.01814589
## days_drinking         0.0160648051      -0.056300378       -0.06264327
## dep1                  1.0000000000       0.449989580        0.31272982
## dep2                  0.4499895804       1.000000000        0.31103697
## dep3                  0.3127298241       0.311036973        1.00000000
## dep4                  0.4006425699       0.376959408        0.49823240
## dep5                  0.4163066105       0.437692284        0.61815208
## dep6                  0.4702227247       0.654100215        0.25789713
## dep7                  0.3883858961       0.465352056        0.27027105
## dep8                  0.3050335689       0.239866474        0.32056362
## dep9                  0.0992030215       0.450969361        0.18524844
##                  Correlation2.dep4 Correlation2.dep5 Correlation2.dep6
## id                     -0.04759379        0.10640699       0.086418021
## sex                     0.15351734        0.16571963       0.039356233
## age                    -0.13033299       -0.16989239      -0.022402285
## ethnicity               0.09336478        0.06025898       0.044150074
## education              -0.05240642       -0.19921404      -0.043504103
## marital                 0.15615839        0.23620555       0.145338886
## household_size          0.03925774        0.02935078      -0.043106569
## household_income       -0.01209398       -0.20746875      -0.144415291
## weight                  0.17099272        0.22497615       0.074516799
## height                 -0.06466637       -0.13314137      -0.009180345
## bmi                     0.21935462        0.31084715       0.084002546
## pulse                   0.02756444        0.09709907      -0.007086921
## bp_sys1                 0.02089227        0.07028984      -0.052080069
## bp_dia1                 0.06014212        0.13515296      -0.021173006
## bp_sys2                 0.02204513        0.03210583      -0.041019017
## bp_dia2                 0.10591131        0.10187882      -0.033636785
## time_sed                0.01543476        0.04233813       0.005146929
## drink_regularly        -0.15641640       -0.08394115      -0.003833638
## days_drinking          -0.08323949       -0.02280718      -0.117237230
## dep1                    0.40064257        0.41630661       0.470222725
## dep2                    0.37695941        0.43769228       0.654100215
## dep3                    0.49823240        0.61815208       0.257897135
## dep4                    1.00000000        0.57074546       0.261577176
## dep5                    0.57074546        1.00000000       0.349493047
## dep6                    0.26157718        0.34949305       1.000000000
## dep7                    0.42079208        0.45082819       0.463516413
## dep8                    0.30972311        0.55906700       0.312811391
## dep9                    0.22360680        0.28211304       0.521096600
##                  Correlation2.dep7 Correlation2.dep8 Correlation2.dep9
## id                    -0.047095282       0.101114771       0.001234142
## sex                    0.150103101       0.106552019      -0.017989824
## age                   -0.151210760      -0.141801472      -0.015020876
## ethnicity             -0.050950161       0.032973997       0.054876685
## education             -0.001121101      -0.146332016      -0.031959361
## marital                0.112960173       0.193229491       0.145190702
## household_size         0.088616396       0.033152707       0.122896158
## household_income      -0.023387590      -0.097552281       0.009153002
## weight                 0.084479677       0.121066400       0.158632364
## height                -0.036462633      -0.057950727       0.057628941
## bmi                    0.115645801       0.139075945       0.147292236
## pulse                  0.121921893       0.138716409       0.024135667
## bp_sys1               -0.125935014       0.021267349       0.087144809
## bp_dia1               -0.075217641       0.048139842       0.101230674
## bp_sys2               -0.067651846      -0.008103926       0.065230696
## bp_dia2               -0.076340986       0.054925393       0.088322140
## time_sed               0.040743386       0.088082824       0.054415820
## drink_regularly       -0.052426078      -0.012342702       0.132178372
## days_drinking         -0.164284937       0.083531036      -0.087786479
## dep1                   0.388385896       0.305033569       0.099203022
## dep2                   0.465352056       0.239866474       0.450969361
## dep3                   0.270271047       0.320563620       0.185248436
## dep4                   0.420792081       0.309723105       0.223606798
## dep5                   0.450828190       0.559067004       0.282113038
## dep6                   0.463516413       0.312811391       0.521096600
## dep7                   1.000000000       0.487490569       0.416927913
## dep8                   0.487490569       1.000000000       0.313209182
## dep9                   0.416927913       0.313209182       1.000000000
##                  Difference.id Difference.sex Difference.age
## id                 0.000000000    0.044114360   -0.023286666
## sex                0.044114360    0.000000000    0.087050648
## age               -0.023286666    0.087050648    0.000000000
## ethnicity          0.021125175    0.140476488   -0.043805133
## education          0.024618135    0.078793972   -0.029333744
## marital           -0.065545095   -0.154136845   -0.009636256
## household_size    -0.065452704   -0.143830532    0.089670525
## household_income   0.060484367    0.041717963    0.115315819
## weight            -0.004301567    0.031466565   -0.045807026
## height             0.025643372    0.032909139   -0.067874147
## bmi               -0.016241242    0.017108787   -0.033601806
## pulse              0.018992674   -0.051561660   -0.004722139
## bp_sys1           -0.064377452    0.129309891   -0.013869766
## bp_dia1           -0.007657384   -0.134764891   -0.003906610
## bp_sys2           -0.099109504    0.152349330   -0.034822158
## bp_dia2           -0.021763424   -0.124262097   -0.009294210
## time_sed          -0.094561144    0.087150445    0.094316505
## drink_regularly   -0.177871851    0.115486049    0.065332023
## days_drinking      0.098557108    0.004924517    0.022344558
## dep1              -0.088075125    0.060502005    0.107714238
## dep2              -0.038793391    0.048208891    0.078837221
## dep3              -0.036318200   -0.042770656   -0.022761297
## dep4              -0.026592618   -0.035692687    0.115597513
## dep5              -0.064325572   -0.016030206    0.078230127
## dep6              -0.067470913    0.086091426   -0.043377973
## dep7              -0.012010655   -0.088494834    0.095332027
## dep8              -0.114928042   -0.067223398    0.070750866
## dep9               0.013655795   -0.043547193    0.037392669
##                  Difference.ethnicity Difference.education Difference.marital
## id                         0.02112517          0.024618135       -0.065545095
## sex                        0.14047649          0.078793972       -0.154136845
## age                       -0.04380513         -0.029333744       -0.009636256
## ethnicity                  0.00000000          0.066134371       -0.071878804
## education                  0.06613437          0.000000000       -0.052063576
## marital                   -0.07187880         -0.052063576        0.000000000
## household_size             0.11510612          0.087761315       -0.074421507
## household_income           0.04181241          0.079640044       -0.005646766
## weight                    -0.11963116          0.008057375       -0.086047440
## height                    -0.09801676         -0.043302316        0.014603862
## bmi                       -0.09552930          0.030206742       -0.089433713
## pulse                     -0.11711325         -0.004154577        0.064581872
## bp_sys1                   -0.05685053         -0.154949060        0.026463658
## bp_dia1                   -0.06790799         -0.033316916       -0.081081162
## bp_sys2                   -0.05992341         -0.148428906        0.037438450
## bp_dia2                   -0.04264374         -0.093959959       -0.064649712
## time_sed                   0.04540950         -0.039722156       -0.090434780
## drink_regularly            0.06079423         -0.047482311       -0.034502807
## days_drinking             -0.12019474          0.007225192       -0.112569782
## dep1                      -0.02618432         -0.157808437       -0.052236958
## dep2                      -0.15414968         -0.122941492       -0.122502892
## dep3                      -0.05468508          0.142471946       -0.019084006
## dep4                      -0.06710375         -0.041502804       -0.079327093
## dep5                      -0.04478932          0.085695723       -0.086284873
## dep6                      -0.02914395         -0.003620056       -0.050246811
## dep7                       0.07883947         -0.077615871       -0.044011143
## dep8                      -0.06219753         -0.009398979       -0.072728617
## dep9                      -0.13410986         -0.105386022       -0.067024235
##                  Difference.household_size Difference.household_income
## id                            -0.065452704                 0.060484367
## sex                           -0.143830532                 0.041717963
## age                            0.089670525                 0.115315819
## ethnicity                      0.115106118                 0.041812405
## education                      0.087761315                 0.079640044
## marital                       -0.074421507                -0.005646766
## household_size                 0.000000000                 0.087623394
## household_income               0.087623394                 0.000000000
## weight                         0.074025362                -0.087641536
## height                         0.131209999                -0.012023974
## bmi                            0.036749022                -0.091257864
## pulse                         -0.058513391                -0.165228256
## bp_sys1                        0.064254697                -0.095637487
## bp_dia1                       -0.019499134                 0.074544116
## bp_sys2                        0.017469254                -0.112830267
## bp_dia2                        0.002524989                 0.029757828
## time_sed                      -0.037922447                -0.121988693
## drink_regularly               -0.064726817                 0.011701492
## days_drinking                  0.025097106                 0.074035878
## dep1                           0.022147236                -0.061607736
## dep2                          -0.022762286                -0.044426684
## dep3                          -0.088216364                -0.088862201
## dep4                          -0.050755636                -0.127828378
## dep5                          -0.080828557                -0.021847716
## dep6                           0.025483144                 0.014296921
## dep7                          -0.119675048                -0.056088964
## dep8                           0.015861086                -0.036491096
## dep9                           0.031929832                -0.064288710
##                  Difference.weight Difference.height Difference.bmi
## id                    -0.004301567       0.025643372   -0.016241242
## sex                    0.031466565       0.032909139    0.017108787
## age                   -0.045807026      -0.067874147   -0.033601806
## ethnicity             -0.119631157      -0.098016763   -0.095529302
## education              0.008057375      -0.043302316    0.030206742
## marital               -0.086047440       0.014603862   -0.089433713
## household_size         0.074025362       0.131209999    0.036749022
## household_income      -0.087641536      -0.012023974   -0.091257864
## weight                 0.000000000      -0.057871496    0.001328277
## height                -0.057871496       0.000000000   -0.062888809
## bmi                    0.001328277      -0.062888809    0.000000000
## pulse                 -0.050134697      -0.045796183   -0.029030088
## bp_sys1               -0.077285319      -0.143126799   -0.011781648
## bp_dia1                0.076390168       0.150553047    0.006303263
## bp_sys2               -0.051448532      -0.170263879    0.023316848
## bp_dia2                0.060388168       0.114693947    0.007811222
## time_sed              -0.022645056      -0.084010266    0.019739171
## drink_regularly       -0.043369910      -0.101935678   -0.001530571
## days_drinking         -0.159410929      -0.068244854   -0.132470732
## dep1                  -0.194678596      -0.094743601   -0.155514936
## dep2                  -0.123471029      -0.051075768   -0.112364135
## dep3                  -0.043893899       0.038823712   -0.063193912
## dep4                  -0.087426042       0.029924494   -0.109233051
## dep5                  -0.075999885       0.050070172   -0.107567319
## dep6                  -0.014460053      -0.029828753   -0.003475820
## dep7                  -0.029568922       0.053287829   -0.070749636
## dep8                  -0.103046415       0.007415945   -0.100037890
## dep9                  -0.146758543      -0.048414997   -0.137862650
##                  Difference.pulse Difference.bp_sys1 Difference.bp_dia1
## id                   0.0189926738       -0.064377452       -0.007657384
## sex                 -0.0515616596        0.129309891       -0.134764891
## age                 -0.0047221385       -0.013869766       -0.003906610
## ethnicity           -0.1171132483       -0.056850530       -0.067907991
## education           -0.0041545769       -0.154949060       -0.033316916
## marital              0.0645818721        0.026463658       -0.081081162
## household_size      -0.0585133911        0.064254697       -0.019499134
## household_income    -0.1652282563       -0.095637487        0.074544116
## weight              -0.0501346973       -0.077285319        0.076390168
## height              -0.0457961833       -0.143126799        0.150553047
## bmi                 -0.0290300879       -0.011781648        0.006303263
## pulse                0.0000000000        0.016657394        0.008565375
## bp_sys1              0.0166573938        0.000000000       -0.011942043
## bp_dia1              0.0085653753       -0.011942043        0.000000000
## bp_sys2              0.0345080714        0.006466153        0.014560843
## bp_dia2              0.0595681670        0.015928947       -0.016028676
## time_sed             0.0372175627       -0.035006865        0.040496212
## drink_regularly      0.0467910286        0.068199579       -0.092035484
## days_drinking       -0.0326848646       -0.043532193       -0.041181298
## dep1                -0.0121514311        0.050395420        0.059849521
## dep2                 0.0640821613       -0.069000545       -0.058258015
## dep3                -0.0799072237       -0.076932572       -0.127020116
## dep4                 0.0947467759       -0.046934947       -0.032741861
## dep5                 0.0856228410       -0.117661639       -0.103966896
## dep6                 0.0941419664       -0.007601920       -0.004981686
## dep7                -0.0324614771        0.023169032        0.031765071
## dep8                -0.0115420136       -0.047185937       -0.052720699
## dep9                -0.0005080625       -0.031364008       -0.024668887
##                  Difference.bp_sys2 Difference.bp_dia2 Difference.time_sed
## id                     -0.099109504       -0.021763424        -0.094561144
## sex                     0.152349330       -0.124262097         0.087150445
## age                    -0.034822158       -0.009294210         0.094316505
## ethnicity              -0.059923412       -0.042643742         0.045409499
## education              -0.148428906       -0.093959959        -0.039722156
## marital                 0.037438450       -0.064649712        -0.090434780
## household_size          0.017469254        0.002524989        -0.037922447
## household_income       -0.112830267        0.029757828        -0.121988693
## weight                 -0.051448532        0.060388168        -0.022645056
## height                 -0.170263879        0.114693947        -0.084010266
## bmi                     0.023316848        0.007811222         0.019739171
## pulse                   0.034508071        0.059568167         0.037217563
## bp_sys1                 0.006466153        0.015928947        -0.035006865
## bp_dia1                 0.014560843       -0.016028676         0.040496212
## bp_sys2                 0.000000000        0.028740967        -0.024156231
## bp_dia2                 0.028740967        0.000000000        -0.001229762
## time_sed               -0.024156231       -0.001229762         0.000000000
## drink_regularly         0.105673258       -0.030356685         0.162479202
## days_drinking          -0.070042957       -0.019529114        -0.171751786
## dep1                    0.002360078        0.029916897        -0.021429341
## dep2                   -0.043614841       -0.009737730        -0.052451323
## dep3                   -0.074098176       -0.077702049         0.067574669
## dep4                   -0.044507870       -0.029064073         0.062970919
## dep5                   -0.083453502       -0.070206078         0.009010268
## dep6                   -0.009859198        0.053077551         0.079122032
## dep7                   -0.004294235        0.104508502         0.011782471
## dep8                    0.021864353       -0.028299307        -0.120978729
## dep9                    0.002707224        0.018080403        -0.155482973
##                  Difference.drink_regularly Difference.days_drinking
## id                             -0.177871851              0.098557108
## sex                             0.115486049              0.004924517
## age                             0.065332023              0.022344558
## ethnicity                       0.060794231             -0.120194742
## education                      -0.047482311              0.007225192
## marital                        -0.034502807             -0.112569782
## household_size                 -0.064726817              0.025097106
## household_income                0.011701492              0.074035878
## weight                         -0.043369910             -0.159410929
## height                         -0.101935678             -0.068244854
## bmi                            -0.001530571             -0.132470732
## pulse                           0.046791029             -0.032684865
## bp_sys1                         0.068199579             -0.043532193
## bp_dia1                        -0.092035484             -0.041181298
## bp_sys2                         0.105673258             -0.070042957
## bp_dia2                        -0.030356685             -0.019529114
## time_sed                        0.162479202             -0.171751786
## drink_regularly                 0.000000000             -0.008545924
## days_drinking                  -0.008545924              0.000000000
## dep1                            0.181162622             -0.107776916
## dep2                            0.081144107             -0.047234092
## dep3                            0.060103819             -0.028919342
## dep4                            0.201289255             -0.023323698
## dep5                            0.097609590             -0.037984502
## dep6                            0.019781639              0.045275361
## dep7                            0.092422713              0.062023300
## dep8                            0.054426885             -0.126528667
## dep9                           -0.064537194              0.053591159
##                  Difference.dep1 Difference.dep2 Difference.dep3
## id                  -0.088075125     -0.03879339     -0.03631820
## sex                  0.060502005      0.04820889     -0.04277066
## age                  0.107714238      0.07883722     -0.02276130
## ethnicity           -0.026184315     -0.15414968     -0.05468508
## education           -0.157808437     -0.12294149      0.14247195
## marital             -0.052236958     -0.12250289     -0.01908401
## household_size       0.022147236     -0.02276229     -0.08821636
## household_income    -0.061607736     -0.04442668     -0.08886220
## weight              -0.194678596     -0.12347103     -0.04389390
## height              -0.094743601     -0.05107577      0.03882371
## bmi                 -0.155514936     -0.11236414     -0.06319391
## pulse               -0.012151431      0.06408216     -0.07990722
## bp_sys1              0.050395420     -0.06900055     -0.07693257
## bp_dia1              0.059849521     -0.05825802     -0.12702012
## bp_sys2              0.002360078     -0.04361484     -0.07409818
## bp_dia2              0.029916897     -0.00973773     -0.07770205
## time_sed            -0.021429341     -0.05245132      0.06757467
## drink_regularly      0.181162622      0.08114411      0.06010382
## days_drinking       -0.107776916     -0.04723409     -0.02891934
## dep1                 0.000000000      0.08166990      0.04269606
## dep2                 0.081669899      0.00000000      0.11938963
## dep3                 0.042696057      0.11938963      0.00000000
## dep4                 0.077986080      0.13920371      0.01691367
## dep5                -0.026413838     -0.05730106     -0.10188646
## dep6                -0.018346456     -0.04153381      0.15813265
## dep7                 0.019499770      0.04147674      0.10162031
## dep8                -0.002172946      0.14805607     -0.01014629
## dep9                 0.101679062     -0.09189916      0.09254761
##                  Difference.dep4 Difference.dep5 Difference.dep6
## id                  -0.026592618    -0.064325572    -0.067470913
## sex                 -0.035692687    -0.016030206     0.086091426
## age                  0.115597513     0.078230127    -0.043377973
## ethnicity           -0.067103752    -0.044789322    -0.029143955
## education           -0.041502804     0.085695723    -0.003620056
## marital             -0.079327093    -0.086284873    -0.050246811
## household_size      -0.050755636    -0.080828557     0.025483144
## household_income    -0.127828378    -0.021847716     0.014296921
## weight              -0.087426042    -0.075999885    -0.014460053
## height               0.029924494     0.050070172    -0.029828753
## bmi                 -0.109233051    -0.107567319    -0.003475820
## pulse                0.094746776     0.085622841     0.094141966
## bp_sys1             -0.046934947    -0.117661639    -0.007601920
## bp_dia1             -0.032741861    -0.103966896    -0.004981686
## bp_sys2             -0.044507870    -0.083453502    -0.009859198
## bp_dia2             -0.029064073    -0.070206078     0.053077551
## time_sed             0.062970919     0.009010268     0.079122032
## drink_regularly      0.201289255     0.097609590     0.019781639
## days_drinking       -0.023323698    -0.037984502     0.045275361
## dep1                 0.077986080    -0.026413838    -0.018346456
## dep2                 0.139203711    -0.057301064    -0.041533814
## dep3                 0.016913670    -0.101886465     0.158132653
## dep4                 0.000000000    -0.063122427     0.154246099
## dep5                -0.063122427     0.000000000     0.058907982
## dep6                 0.154246099     0.058907982     0.000000000
## dep7                 0.026497433    -0.162929150     0.056672069
## dep8                 0.003407790    -0.222790431     0.107945134
## dep9                -0.002977329    -0.113305432    -0.190572205
##                  Difference.dep7 Difference.dep8 Difference.dep9
## id                  -0.012010655    -0.114928042    0.0136557946
## sex                 -0.088494834    -0.067223398   -0.0435471935
## age                  0.095332027     0.070750866    0.0373926690
## ethnicity            0.078839474    -0.062197530   -0.1341098608
## education           -0.077615871    -0.009398979   -0.1053860223
## marital             -0.044011143    -0.072728617   -0.0670242347
## household_size      -0.119675048     0.015861086    0.0319298315
## household_income    -0.056088964    -0.036491096   -0.0642887097
## weight              -0.029568922    -0.103046415   -0.1467585435
## height               0.053287829     0.007415945   -0.0484149970
## bmi                 -0.070749636    -0.100037890   -0.1378626502
## pulse               -0.032461477    -0.011542014   -0.0005080625
## bp_sys1              0.023169032    -0.047185937   -0.0313640077
## bp_dia1              0.031765071    -0.052720699   -0.0246688869
## bp_sys2             -0.004294235     0.021864353    0.0027072237
## bp_dia2              0.104508502    -0.028299307    0.0180804029
## time_sed             0.011782471    -0.120978729   -0.1554829729
## drink_regularly      0.092422713     0.054426885   -0.0645371937
## days_drinking        0.062023300    -0.126528667    0.0535911589
## dep1                 0.019499770    -0.002172946    0.1016790618
## dep2                 0.041476739     0.148056070   -0.0918991639
## dep3                 0.101620307    -0.010146289    0.0925476095
## dep4                 0.026497433     0.003407790   -0.0029773285
## dep5                -0.162929150    -0.222790431   -0.1133054317
## dep6                 0.056672069     0.107945134   -0.1905722051
## dep7                 0.000000000    -0.026471182   -0.1797376219
## dep8                -0.026471182     0.000000000   -0.0832136036
## dep9                -0.179737622    -0.083213604    0.0000000000