## Import the data into R with the read.table function
DASH_Score <- read.table("C:/Users/fanzh/Dropbox/Project with Zhaohu at UC (Model confidence bounds for variable selection)/National Health and Nutrition Examination Survey/Copy of Copy of Couch DASH Baseline Data6-13-2019.csv", sep=",", header = TRUE)
The dataset contains 188 subjects and 222 variables (222 = 69 + 152 + 1, where the extra 1 is the subject ID). There are 69 demographic variables, such as age, race, sex, and income. The remaining 152 variables of interest are NDSR food subcodes, which are used to compute the DASH score and the total servings of different food groups.
| Variable | Definition |
|---|---|
| Subject | subject id |
| Month | Month of visit |
| Day | Day of visit |
| Year | Year of visit |
| Age | age at visit in years |
| Race | race 1=white; 2=black; 3=hispanic; 4=native american; 5=asian pacific islander; 6=multiracial |
| Sex | gender 1=males; 2=females |
| Income | household income 1=<$20,000; 2=$20,000-50,000; 3=$50,000-80,000; 4=>$80,000 |
| SBP | Systolic blood pressure |
| DBP | diastolic blood pressure |
| HTNST | hypertension status 1=pre; 2=stage 1; 3=normal; 4=stage 2 |
| WTKG | weight in kg |
| HTM | height in meters |
| HTPCT | height percentile |
| BMIcal | EPIC calculated BMI |
| BMIPCT | BMI percentile from clinic |
| DMETS | Daily Metabolic equivalents |
| Sleep | hours of sleep per week |
| lightmin | minutes of light activity per week |
| modmin | minutes of moderate activity per week |
| hardmin | minutes of hard activity per week |
| vhardmin | minutes of very hard activity per week |
| actweek | minutes of mod, hard and very hard activity per week |
| X_D_B1 | TwoD_TwoD_BDimension |
| X_D_B2 | TwoD_TwoD_BDimension_2 |
| X_D_B3 | TwoD_TwoD_BDimension_3 |
| X_D_601 | TwoD_TwoD_60Dimension |
| X_D_602 | TwoD_TwoD_60Dimension_2 |
| X_D_603 | TwoD_TwoD_60Dimension_3 |
| X_D_901 | TwoD_TwoD_90Dimension |
| X_D_902 | TwoD_TwoD_90Dimension_2 |
| X_D_903 | TwoD_TwoD_90Dimension_3 |
| X_D_1201 | TwoD_TwoD_120Dimension |
| X_D_1202 | TwoD_TwoD_120Dimension_2 |
| X_D_1203 | TwoD_TwoD_120Dimension_3 |
| PSV_B_1 | DOPPLER_PSV_BVelocity |
| PSV_B_2 | DOPPLER_PSV_BVelocity_2 |
| PSV_B_3 | DOPPLER_PSV_BVelocity_3 |
| PSV_I_41 | DOPPLER_PSV_I_4Velocity |
| PSV_I_42 | DOPPLER_PSV_I_4Velocity_2 |
| PSV_I_43 | DOPPLER_PSV_I_4Velocity_3 |
| PSV_I_51 | DOPPLER_PSV_I_5Velocity |
| PSV_I_52 | DOPPLER_PSV_I_5Velocity_2 |
| PSV_I_53 | DOPPLER_PSV_I_5Velocity_3 |
| PSV_I_61 | DOPPLER_PSV_I_6Velocity |
| PSV_I_62 | DOPPLER_PSV_I_6Velocity_2 |
| PSV_I_63 | DOPPLER_PSV_I_6Velocity_3 |
| PSV_601 | DOPPLER_PSV_60Velocity |
| PSV_602 | DOPPLER_PSV_60Velocity_2 |
| PSV_603 | DOPPLER_PSV_60Velocity_3 |
| PSV_901 | DOPPLER_PSV_90Velocity |
| PSV_902 | DOPPLER_PSV_90Velocity_2 |
| PSV_903 | DOPPLER_PSV_90Velocity_3 |
| PSV_1201 | DOPPLER_PSV_120Velocity |
| PSV_1202 | DOPPLER_PSV_120Velocity_2 |
| PSV_1203 | DOPPLER_PSV_120Velocity_3 |
| energy | energy (kcal/day) |
| Chol | cholesterol (mg/day) |
| Sucrose | sucrose (grams/day) |
| Fiber | total dietary fiber (grams/day) |
| CA | calcium (mg/day) |
| Mg | magnesium (mg/day) |
| Na | sodium (mg/day) |
| K | potassium (mg/day) |
| Caffeine | caffeine (mg/day) |
| pfat | percent kcal from fat |
| pcarb | percent kcal from carbohydrate |
| ppro | percent kcal from protein |
| psfat | percent kcal from saturated fat |
| trans | trans fatty acids (g/day) |
Fifteen subjects have at least one missing value (recorded as “.”). After removing these subjects, the cleaned dataset contains 173 subjects and 222 variables.
# Recode the "." placeholder as NA, then keep only subjects with complete data
DASH_Score[DASH_Score == "."] <- NA
DASH_Score <- DASH_Score[complete.cases(DASH_Score), ]
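One caveat with the recoding above: a column that contains “.” is read as text, and it stays non-numeric even after the “.” entries become NA. A minimal sketch of an alternative, assuming the file uses “.” only as a missing-value marker, is to pass `na.strings = "."` to `read.table` so that those columns are parsed as numbers from the start. The inline CSV text and its values are invented for illustration and are not taken from the study data.

```r
# Toy CSV standing in for the real file; all values are invented.
csv_text <- "Subject,Age,SBP
1,14,110
2,.,118
3,15,."

# Treat "." as missing while reading, so Age and SBP are parsed as numeric.
toy <- read.table(text = csv_text, sep = ",", header = TRUE, na.strings = ".")

# Keep only subjects with no missing values.
toy_clean <- toy[complete.cases(toy), ]

str(toy_clean)
```

With this approach the later modelling steps receive numeric columns without an extra conversion step.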
Notes:
FRU0600 VEG0800 VEG0900 SWT0600 MSC1100
The SAS code refers to a variable “na”. We use the variable “Na” from the new data set instead.
The SAS code calculates the DASH scores (either Gunther or SuperWIN) using criteria for adults. However, we only have adolescent data. For the sake of the calculation, we extend the original condition “age >19 and age <31” to “age >1 and age <31” for both Gunther and SuperWIN.
head(DASHSC_SuperWIN,10)
## [1] 43.79941 43.03675 47.71267 29.92237 50.15139 29.39656 26.67415
## [8] 52.18893 45.73657 25.25528
Figures of SBP/DBP vs. SuperWIN DASH scores.
par(mfrow=c(1,2))
plot(DASHSC_SuperWIN,SBP)
plot(DASHSC_SuperWIN,DBP)
From Wikipedia:
In statistics, the Pearson correlation coefficient, also referred to as Pearson’s r, the Pearson product-moment correlation coefficient (PPMCC) or the bivariate correlation, is a measure of the linear correlation between two variables X and Y.
In statistics, Spearman’s rank correlation coefficient or Spearman’s rho, is a nonparametric measure of rank correlation (statistical dependence between the rankings of two variables). It assesses how well the relationship between two variables can be described using a monotonic function.
In statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall’s tau coefficient, is a statistic used to measure the ordinal association between two measured quantities. A tau test is a non-parametric hypothesis test for statistical dependence based on the tau coefficient.
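To make the difference between the three definitions concrete, here is a small sketch on invented data (not the study data). The relationship is monotonic but not linear, so the two rank-based coefficients reach 1 while Pearson's r stays below 1.

```r
# Illustrative data: y increases monotonically with x, but not linearly.
x <- 1:10
y <- x^3

pearson  <- cor(x, y, method = "pearson")   # linear association, below 1 here
spearman <- cor(x, y, method = "spearman")  # rank-based, equals 1 for monotone data
kendall  <- cor(x, y, method = "kendall")   # all pairs concordant, equals 1 here

round(c(pearson = pearson, spearman = spearman, kendall = kendall), 3)
```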
data111=data.frame(DASHSC_SuperWIN,SBP,DBP)
cor(data111, use="complete.obs", method="pearson")
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00000000 -0.05779287 -0.14156465
## SBP -0.05779287 1.00000000 0.02158761
## DBP -0.14156465 0.02158761 1.00000000
cor(data111, use="complete.obs", method="spearman")
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.0000000 -0.06729720 -0.16456628
## SBP -0.0672972 1.00000000 0.06712612
## DBP -0.1645663 0.06712612 1.00000000
cor(data111, use="complete.obs", method="kendall")
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00000000 -0.04654998 -0.10827770
## SBP -0.04654998 1.00000000 0.04790639
## DBP -0.10827770 0.04790639 1.00000000
We fit linear regressions with SBP and DBP as the responses and the SuperWIN DASH score as the regressor of interest, adjusting for Age, Sex, HTNST (as a factor), WTKG, HTM, HTPCT, BMIcal, BMIPCT, DMETS, Sleep, lightmin, modmin, hardmin, vhardmin, and actweek.
summary(l21)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + Age + Sex + as.factor(HTNST) +
## WTKG + HTM + HTPCT + BMIcal + BMIPCT + DMETS + Sleep + lightmin +
## modmin + hardmin + vhardmin + actweek, data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.5322 -1.8837 0.0068 2.3681 11.1078
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 110.545833 91.455603 1.209 0.228628
## DASHSC_SuperWIN -0.037551 0.029773 -1.261 0.209131
## Age 0.911027 0.264390 3.446 0.000735 ***
## Sex -2.795948 0.891400 -3.137 0.002050 **
## as.factor(HTNST)2 8.968842 0.658810 13.614 < 2e-16 ***
## as.factor(HTNST)3 -7.911099 1.494539 -5.293 4.1e-07 ***
## as.factor(HTNST)4 9.560653 3.962058 2.413 0.017002 *
## WTKG 0.003392 0.040643 0.083 0.933599
## HTM 19.621541 7.501571 2.616 0.009797 **
## HTPCT 0.032782 0.018863 1.738 0.084241 .
## BMIcal -0.059818 0.101098 -0.592 0.554934
## BMIPCT -0.010170 0.025169 -0.404 0.686720
## DMETS 0.114191 0.139652 0.818 0.414808
## Sleep -0.143328 0.534552 -0.268 0.788963
## lightmin -0.003356 0.008885 -0.378 0.706196
## modmin -0.010960 0.005054 -2.168 0.031679 *
## hardmin -0.009651 0.004770 -2.023 0.044797 *
## vhardmin -0.010988 0.005533 -1.986 0.048824 *
## actweek 0.004915 0.010791 0.456 0.649388
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.795 on 153 degrees of freedom
## Multiple R-squared: 0.7535, Adjusted R-squared: 0.7245
## F-statistic: 25.98 on 18 and 153 DF, p-value: < 2.2e-16
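In the output above, `as.factor(HTNST)` appears as three rows (levels 2, 3, and 4). Each row is the difference from the first level, which serves as the baseline. The sketch below reproduces that behaviour on simulated data; the variable names `group`, `x`, and `y` and all numbers are hypothetical.

```r
set.seed(4)
n     <- 120
group <- sample(1:4, n, replace = TRUE)   # hypothetical 4-level numeric code
x     <- rnorm(n)
# Level-specific shifts relative to level 1, plus noise.
y     <- 100 + 2 * x + c(0, 9, -8, 10)[group] + rnorm(n)
sim   <- data.frame(y, x, group)

# as.factor() turns the numeric code into indicator terms for levels 2 to 4.
fit <- lm(y ~ x + as.factor(group), data = sim)
names(coef(fit))
```

Without `as.factor()`, the code would enter the model as a single numeric slope, which would impose an ordering and equal spacing on the categories.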
We calculate the variance inflation factor (VIF) to check for multicollinearity in the model. As a rule of thumb, a VIF value that exceeds 10 indicates a problematic amount of collinearity (James et al. 2014).
car::vif(l21)
## GVIF Df GVIF^(1/(2*Df))
## DASHSC_SuperWIN 1.186093 1 1.089079
## Age 3.368040 1 1.835222
## Sex 2.155714 1 1.468235
## as.factor(HTNST) 1.387435 3 1.056093
## WTKG 13.706503 1 3.702229
## HTM 7.553481 1 2.748360
## HTPCT 3.208182 1 1.791140
## BMIcal 9.013899 1 3.002316
## BMIPCT 1.897636 1 1.377547
## DMETS 4.056549 1 2.014088
## Sleep 215.798063 1 14.690067
## lightmin 274.337447 1 16.563135
## modmin 11.896568 1 3.449140
## hardmin 9.851141 1 3.138653
## vhardmin 3.583235 1 1.892944
## actweek 147.396835 1 12.140710
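For a term with one degree of freedom, the GVIF reported by `car::vif` coincides with the ordinary VIF, which is 1 / (1 - R²) from regressing that predictor on all the other predictors. The sketch below computes this by hand in base R on simulated data; the helper `manual_vif` and the variables `x1`, `x2`, `x3` are hypothetical names introduced only for illustration.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.1)  # nearly a linear combination of x1 and x2
y  <- 1 + x1 - x2 + rnorm(n)
sim <- data.frame(y, x1, x2, x3)

# VIF of one predictor: 1 / (1 - R^2) from regressing it on the other predictors.
manual_vif <- function(var, data, predictors) {
  others <- setdiff(predictors, var)
  r2 <- summary(lm(reformulate(others, response = var), data = data))$r.squared
  1 / (1 - r2)
}

predictors <- c("x1", "x2", "x3")
vifs <- sapply(predictors, manual_vif, data = sim, predictors = predictors)
round(vifs, 1)
```

Because `x3` is almost determined by `x1` and `x2`, all three VIFs come out large, which is the same pattern as the Sleep, lightmin, and actweek rows in the table above.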
We update the model by removing the predictors with a VIF above 10 (WTKG, Sleep, lightmin, modmin, and actweek).
l21_1 <- lm(SBP~DASHSC_SuperWIN+Age+Sex+as.factor(HTNST)+HTM+HTPCT+BMIcal+BMIPCT+DMETS+hardmin+vhardmin, data=data21)
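The reduced model is written out in full here. As a sketch of an equivalent and less error-prone pattern, `update()` can drop terms from an already fitted model. The example uses simulated data and hypothetical variable names, not the study variables.

```r
set.seed(2)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.1)  # collinear with x1 and x2
y  <- 1 + x1 - x2 + rnorm(n)
sim <- data.frame(y, x1, x2, x3)

full <- lm(y ~ x1 + x2 + x3, data = sim)

# ". ~ . - x3" keeps the response and every term except x3.
reduced <- update(full, . ~ . - x3)

attr(terms(reduced), "term.labels")
```

Several terms can be removed at once in the same way, for example `. ~ . - x2 - x3`.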
The DASH score is not statistically significant at the \(\alpha\) = 0.05 level (p = 0.442).
summary(l21_1)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + Age + Sex + as.factor(HTNST) +
## HTM + HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin,
## data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.0268 -2.4057 0.2345 2.5296 12.2536
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 80.1274935 8.9762590 8.927 1.06e-15 ***
## DASHSC_SuperWIN -0.0224611 0.0291606 -0.770 0.442298
## Age 0.8230130 0.2628176 3.131 0.002073 **
## Sex -2.7225122 0.8814366 -3.089 0.002375 **
## as.factor(HTNST)2 8.7180312 0.6506841 13.398 < 2e-16 ***
## as.factor(HTNST)3 -8.6356525 1.4809147 -5.831 3.01e-08 ***
## as.factor(HTNST)4 9.9527954 3.9656125 2.510 0.013088 *
## HTM 21.0260058 5.9883176 3.511 0.000582 ***
## HTPCT 0.0308345 0.0190235 1.621 0.107040
## BMIcal -0.0628041 0.0458036 -1.371 0.172270
## BMIPCT -0.0090262 0.0240136 -0.376 0.707510
## DMETS 0.0721704 0.1175351 0.614 0.540077
## hardmin -0.0008495 0.0021653 -0.392 0.695351
## vhardmin -0.0015643 0.0039440 -0.397 0.692180
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.837 on 158 degrees of freedom
## Multiple R-squared: 0.7398, Adjusted R-squared: 0.7184
## F-statistic: 34.56 on 13 and 158 DF, p-value: < 2.2e-16
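As a quick check on how to read the coefficient table, the t value and p-value for the DASH score can be recomputed from the reported estimate, standard error, and residual degrees of freedom. The numbers below are copied from the output above; the variable names are illustrative.

```r
# Values copied from the DASHSC_SuperWIN row and the residual df reported above.
estimate  <- -0.0224611
std_error <-  0.0291606
df_resid  <-  158

# t statistic: estimate divided by its standard error.
t_value <- estimate / std_error

# Two-sided p-value from the t distribution with the residual df.
p_value <- 2 * pt(-abs(t_value), df = df_resid)

round(c(t = t_value, p = p_value), 3)
```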
summary(l22)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + Age + Sex + as.factor(HTNST) +
## WTKG + HTM + HTPCT + BMIcal + BMIPCT + DMETS + Sleep + lightmin +
## modmin + hardmin + vhardmin + actweek, data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.3781 -3.6699 0.2789 5.3425 13.7299
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.065e+02 1.877e+02 -2.698 0.00776 **
## DASHSC_SuperWIN -5.130e-02 6.112e-02 -0.839 0.40254
## Age 1.448e+00 5.428e-01 2.668 0.00845 **
## Sex 1.479e+00 1.830e+00 0.808 0.42013
## as.factor(HTNST)2 2.659e+00 1.352e+00 1.966 0.05108 .
## as.factor(HTNST)3 -3.111e+00 3.068e+00 -1.014 0.31214
## as.factor(HTNST)4 2.231e+01 8.134e+00 2.743 0.00682 **
## WTKG -6.053e-02 8.343e-02 -0.725 0.46929
## HTM -1.012e+01 1.540e+01 -0.657 0.51188
## HTPCT -3.736e-03 3.872e-02 -0.096 0.92326
## BMIcal 2.638e-01 2.075e-01 1.271 0.20557
## BMIPCT -2.179e-02 5.167e-02 -0.422 0.67388
## DMETS -2.310e-01 2.867e-01 -0.806 0.42154
## Sleep 3.529e+00 1.097e+00 3.216 0.00159 **
## lightmin 5.722e-02 1.824e-02 3.137 0.00205 **
## modmin 9.522e-04 1.038e-02 0.092 0.92701
## hardmin 9.424e-03 9.792e-03 0.962 0.33736
## vhardmin 1.042e-02 1.136e-02 0.918 0.36030
## actweek 5.662e-02 2.215e-02 2.556 0.01157 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.79 on 153 degrees of freedom
## Multiple R-squared: 0.2493, Adjusted R-squared: 0.161
## F-statistic: 2.823 on 18 and 153 DF, p-value: 0.0002879
We calculate the variance inflation factor (VIF) to check for multicollinearity in the model.
car::vif(l22)
## GVIF Df GVIF^(1/(2*Df))
## DASHSC_SuperWIN 1.186093 1 1.089079
## Age 3.368040 1 1.835222
## Sex 2.155714 1 1.468235
## as.factor(HTNST) 1.387435 3 1.056093
## WTKG 13.706503 1 3.702229
## HTM 7.553481 1 2.748360
## HTPCT 3.208182 1 1.791140
## BMIcal 9.013899 1 3.002316
## BMIPCT 1.897636 1 1.377547
## DMETS 4.056549 1 2.014088
## Sleep 215.798063 1 14.690067
## lightmin 274.337447 1 16.563135
## modmin 11.896568 1 3.449140
## hardmin 9.851141 1 3.138653
## vhardmin 3.583235 1 1.892944
## actweek 147.396835 1 12.140710
We update the model by removing the predictors with a VIF above 10 (WTKG, Sleep, lightmin, modmin, and actweek).
l22_1 <- lm(DBP~DASHSC_SuperWIN+Age+Sex+as.factor(HTNST)+HTM+HTPCT+BMIcal+BMIPCT+DMETS+hardmin+vhardmin, data=data22)
The DASH score is not statistically significant at the \(\alpha\) = 0.05 level (p = 0.163).
summary(l22_1)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + Age + Sex + as.factor(HTNST) +
## HTM + HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin,
## data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -21.7011 -4.3467 0.6443 5.6729 15.6037
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 100.187957 18.622546 5.380 2.64e-07 ***
## DASHSC_SuperWIN -0.084845 0.060498 -1.402 0.1627
## Age 1.372935 0.545253 2.518 0.0128 *
## Sex 0.496042 1.828668 0.271 0.7865
## as.factor(HTNST)2 2.645934 1.349938 1.960 0.0518 .
## as.factor(HTNST)3 -2.567509 3.072372 -0.836 0.4046
## as.factor(HTNST)4 19.211561 8.227236 2.335 0.0208 *
## HTM -18.697273 12.423630 -1.505 0.1343
## HTPCT 0.002761 0.039467 0.070 0.9443
## BMIcal 0.150997 0.095026 1.589 0.1141
## BMIPCT -0.037698 0.049820 -0.757 0.4504
## DMETS -0.441093 0.243844 -1.809 0.0724 .
## hardmin 0.010046 0.004492 2.236 0.0267 *
## vhardmin 0.010255 0.008182 1.253 0.2119
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.96 on 158 degrees of freedom
## Multiple R-squared: 0.1907, Adjusted R-squared: 0.1241
## F-statistic: 2.863 on 13 and 158 DF, p-value: 0.0009901
Figures of SBP/DBP vs. Gunther DASH scores.
par(mfrow=c(1,2))
plot(DASHSC_Gunther,SBP)
plot(DASHSC_Gunther,DBP)
data222=data.frame(DASHSC_Gunther,SBP,DBP)
cor(data222, use="complete.obs", method="pearson")
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00000000 -0.01454611 -0.17323074
## SBP -0.01454611 1.00000000 0.02158761
## DBP -0.17323074 0.02158761 1.00000000
cor(data222, use="complete.obs", method="spearman")
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.000000000 0.006163971 -0.15670933
## SBP 0.006163971 1.000000000 0.06712612
## DBP -0.156709334 0.067126122 1.00000000
cor(data222, use="complete.obs", method="kendall")
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.000000000 0.006313038 -0.10841581
## SBP 0.006313038 1.000000000 0.04790639
## DBP -0.108415807 0.047906389 1.00000000
We fit linear regressions with SBP and DBP as the responses and the Gunther DASH score as the regressor of interest, adjusting for Age, Sex, HTNST (as a factor), WTKG, HTM, HTPCT, BMIcal, BMIPCT, DMETS, Sleep, lightmin, modmin, hardmin, vhardmin, and actweek.
summary(l11)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + Age + Sex + as.factor(HTNST) +
## WTKG + HTM + HTPCT + BMIcal + BMIPCT + DMETS + Sleep + lightmin +
## modmin + hardmin + vhardmin + actweek, data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.578 -1.829 0.168 2.382 10.812
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 110.801184 91.393237 1.212 0.227245
## DASHSC_Gunther -0.053773 0.041624 -1.292 0.198342
## Age 0.898016 0.265274 3.385 0.000904 ***
## Sex -2.877204 0.896707 -3.209 0.001625 **
## as.factor(HTNST)2 8.998069 0.659599 13.642 < 2e-16 ***
## as.factor(HTNST)3 -7.825900 1.495129 -5.234 5.38e-07 ***
## as.factor(HTNST)4 9.174566 4.024209 2.280 0.023999 *
## WTKG 0.005275 0.040628 0.130 0.896860
## HTM 19.667733 7.500383 2.622 0.009619 **
## HTPCT 0.032090 0.018886 1.699 0.091332 .
## BMIcal -0.062143 0.101010 -0.615 0.539325
## BMIPCT -0.012397 0.024982 -0.496 0.620426
## DMETS 0.114823 0.139575 0.823 0.411980
## Sleep -0.143356 0.534047 -0.268 0.788728
## lightmin -0.003311 0.008870 -0.373 0.709468
## modmin -0.010938 0.005050 -2.166 0.031871 *
## hardmin -0.009571 0.004764 -2.009 0.046304 *
## vhardmin -0.010838 0.005529 -1.960 0.051790 .
## actweek 0.004899 0.010787 0.454 0.650338
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.794 on 153 degrees of freedom
## Multiple R-squared: 0.7536, Adjusted R-squared: 0.7247
## F-statistic: 26 on 18 and 153 DF, p-value: < 2.2e-16
We calculate the variance inflation factor (VIF) to check for multicollinearity in the model.
car::vif(l11)
## GVIF Df GVIF^(1/(2*Df))
## DASHSC_Gunther 1.229783 1 1.108956
## Age 3.392319 1 1.841825
## Sex 2.182563 1 1.477350
## as.factor(HTNST) 1.438585 3 1.062484
## WTKG 13.703303 1 3.701797
## HTM 7.554908 1 2.748619
## HTPCT 3.217645 1 1.793779
## BMIcal 9.002901 1 3.000483
## BMIPCT 1.870494 1 1.367660
## DMETS 4.054148 1 2.013492
## Sleep 215.499459 1 14.679900
## lightmin 273.542758 1 16.539128
## modmin 11.882432 1 3.447090
## hardmin 9.832225 1 3.135638
## vhardmin 3.579972 1 1.892082
## actweek 147.357518 1 12.139091
We update the model by removing the predictors with a VIF above 10 (WTKG, Sleep, lightmin, modmin, and actweek).
l11_1 <- lm(SBP~DASHSC_Gunther+Age+Sex+as.factor(HTNST)+HTM+HTPCT+BMIcal+BMIPCT+DMETS+hardmin+vhardmin, data=data11)
The DASH score is not statistically significant at the \(\alpha\) = 0.05 level (p = 0.392).
summary(l11_1)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + Age + Sex + as.factor(HTNST) +
## HTM + HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin,
## data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.0624 -2.5008 0.3359 2.4979 12.0814
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 80.2780970 8.9777814 8.942 9.64e-16 ***
## DASHSC_Gunther -0.0352276 0.0410015 -0.859 0.391544
## Age 0.8128820 0.2637109 3.082 0.002423 **
## Sex -2.7732134 0.8849609 -3.134 0.002058 **
## as.factor(HTNST)2 8.7403262 0.6509503 13.427 < 2e-16 ***
## as.factor(HTNST)3 -8.5789282 1.4787058 -5.802 3.48e-08 ***
## as.factor(HTNST)4 9.6711261 4.0184376 2.407 0.017252 *
## HTM 21.2271174 5.9927833 3.542 0.000522 ***
## HTPCT 0.0303106 0.0190490 1.591 0.113565
## BMIcal -0.0613433 0.0458443 -1.338 0.182793
## BMIPCT -0.0101356 0.0237878 -0.426 0.670626
## DMETS 0.0727404 0.1174268 0.619 0.536510
## hardmin -0.0008406 0.0021638 -0.388 0.698174
## vhardmin -0.0014878 0.0039473 -0.377 0.706740
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.835 on 158 degrees of freedom
## Multiple R-squared: 0.74, Adjusted R-squared: 0.7187
## F-statistic: 34.6 on 13 and 158 DF, p-value: < 2.2e-16
summary(l12)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + Age + Sex + as.factor(HTNST) +
## WTKG + HTM + HTPCT + BMIcal + BMIPCT + DMETS + Sleep + lightmin +
## modmin + hardmin + vhardmin + actweek, data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.3038 -3.8996 0.1987 5.3429 13.7821
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.008e+02 1.875e+02 -2.671 0.00837 **
## DASHSC_Gunther -8.645e-02 8.538e-02 -1.013 0.31289
## Age 1.421e+00 5.441e-01 2.611 0.00993 **
## Sex 1.334e+00 1.839e+00 0.725 0.46953
## as.factor(HTNST)2 2.712e+00 1.353e+00 2.004 0.04683 *
## as.factor(HTNST)3 -2.978e+00 3.067e+00 -0.971 0.33311
## as.factor(HTNST)4 2.146e+01 8.255e+00 2.599 0.01025 *
## WTKG -5.776e-02 8.334e-02 -0.693 0.48929
## HTM -1.001e+01 1.539e+01 -0.650 0.51647
## HTPCT -5.079e-03 3.874e-02 -0.131 0.89587
## BMIcal 2.614e-01 2.072e-01 1.262 0.20904
## BMIPCT -2.433e-02 5.124e-02 -0.475 0.63566
## DMETS -2.327e-01 2.863e-01 -0.813 0.41767
## Sleep 3.501e+00 1.095e+00 3.195 0.00170 **
## lightmin 5.680e-02 1.819e-02 3.122 0.00215 **
## modmin 8.250e-04 1.036e-02 0.080 0.93663
## hardmin 9.446e-03 9.773e-03 0.967 0.33527
## vhardmin 1.061e-02 1.134e-02 0.935 0.35101
## actweek 5.625e-02 2.213e-02 2.542 0.01201 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.782 on 153 degrees of freedom
## Multiple R-squared: 0.2509, Adjusted R-squared: 0.1627
## F-statistic: 2.846 on 18 and 153 DF, p-value: 0.0002573
We calculate the variance inflation factor (VIF) to check for multicollinearity in the model.
car::vif(l12)
## GVIF Df GVIF^(1/(2*Df))
## DASHSC_Gunther 1.229783 1 1.108956
## Age 3.392319 1 1.841825
## Sex 2.182563 1 1.477350
## as.factor(HTNST) 1.438585 3 1.062484
## WTKG 13.703303 1 3.701797
## HTM 7.554908 1 2.748619
## HTPCT 3.217645 1 1.793779
## BMIcal 9.002901 1 3.000483
## BMIPCT 1.870494 1 1.367660
## DMETS 4.054148 1 2.013492
## Sleep 215.499459 1 14.679900
## lightmin 273.542758 1 16.539128
## modmin 11.882432 1 3.447090
## hardmin 9.832225 1 3.135638
## vhardmin 3.579972 1 1.892082
## actweek 147.357518 1 12.139091
We update the model by removing the predictors with a VIF above 10 (WTKG, Sleep, lightmin, modmin, and actweek).
l12_1 <- lm(DBP~DASHSC_Gunther+Age+Sex+as.factor(HTNST)+HTM+HTPCT+BMIcal+BMIPCT+DMETS+hardmin+vhardmin, data=data12)
The DASH score is not statistically significant at the \(\alpha\) = 0.05 level (p = 0.109).
summary(l12_1)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + Age + Sex + as.factor(HTNST) +
## HTM + HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin,
## data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -21.6063 -4.7220 0.5221 5.4868 15.5415
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.009e+02 1.860e+01 5.425 2.14e-07 ***
## DASHSC_Gunther -1.370e-01 8.493e-02 -1.613 0.1088
## Age 1.332e+00 5.463e-01 2.437 0.0159 *
## Sex 2.960e-01 1.833e+00 0.161 0.8720
## as.factor(HTNST)2 2.733e+00 1.348e+00 2.027 0.0444 *
## as.factor(HTNST)3 -2.353e+00 3.063e+00 -0.768 0.4436
## as.factor(HTNST)4 1.805e+01 8.324e+00 2.169 0.0316 *
## HTM -1.791e+01 1.241e+01 -1.443 0.1511
## HTPCT 6.352e-04 3.946e-02 0.016 0.9872
## BMIcal 1.568e-01 9.497e-02 1.651 0.1008
## BMIPCT -4.168e-02 4.928e-02 -0.846 0.3989
## DMETS -4.394e-01 2.432e-01 -1.806 0.0728 .
## hardmin 1.009e-02 4.482e-03 2.252 0.0257 *
## vhardmin 1.058e-02 8.177e-03 1.294 0.1975
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.944 on 158 degrees of freedom
## Multiple R-squared: 0.1939, Adjusted R-squared: 0.1275
## F-statistic: 2.923 on 13 and 158 DF, p-value: 0.0007883
plot(DASHSC_Gunther,DASHSC_SuperWIN)
data333=data.frame(DASHSC_SuperWIN,DASHSC_Gunther,SBP,DBP)
cor(data333, use="complete.obs", method="pearson")
## DASHSC_SuperWIN DASHSC_Gunther SBP DBP
## DASHSC_SuperWIN 1.00000000 0.84715053 -0.05779287 -0.14156465
## DASHSC_Gunther 0.84715053 1.00000000 -0.01454611 -0.17323074
## SBP -0.05779287 -0.01454611 1.00000000 0.02158761
## DBP -0.14156465 -0.17323074 0.02158761 1.00000000
cor(data333, use="complete.obs", method="spearman")
## DASHSC_SuperWIN DASHSC_Gunther SBP DBP
## DASHSC_SuperWIN 1.0000000 0.839423805 -0.067297201 -0.16456628
## DASHSC_Gunther 0.8394238 1.000000000 0.006163971 -0.15670933
## SBP -0.0672972 0.006163971 1.000000000 0.06712612
## DBP -0.1645663 -0.156709334 0.067126122 1.00000000
cor(data333, use="complete.obs", method="kendall")
## DASHSC_SuperWIN DASHSC_Gunther SBP DBP
## DASHSC_SuperWIN 1.00000000 0.651842785 -0.046549983 -0.10827770
## DASHSC_Gunther 0.65184279 1.000000000 0.006313038 -0.10841581
## SBP -0.04654998 0.006313038 1.000000000 0.04790639
## DBP -0.10827770 -0.108415807 0.047906389 1.00000000
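The three correlation matrices above repeat the same call with a different `method`. When only one pair of variables is of interest, such as the two DASH scores, the three coefficients can be collected in one step. This is a sketch on simulated data; `score_a` and `score_b` are hypothetical stand-ins for two related scores.

```r
set.seed(3)
score_a <- rnorm(50)
score_b <- score_a + rnorm(50, sd = 0.5)  # hypothetical second score, correlated with the first

methods <- c("pearson", "spearman", "kendall")

# One correlation per method, returned as a named numeric vector.
cors <- sapply(methods, function(m) cor(score_a, score_b, method = m))
round(cors, 3)
```

Kendall's tau is typically smaller in magnitude than Pearson's r or Spearman's rho on the same data, which is consistent with the pattern in the matrices above.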