
1 Exploratory Data Analysis

## Import the data into R with the read.table function
DASH_Score <- read.table("C:/Users/fanzh/Dropbox/Project with Zhaohu at UC (Model confidence bounds for variable selection)/National Health and Nutrition Examination Survey/Copy of Copy of Couch DASH Baseline Data6-13-2019.csv", sep=",", header = TRUE) 

1.1 Variable description

The dataset contains 188 subjects and 222 variables (222 = 69 + 152 + 1, where the extra variable is the subject ID). There are 69 demographic variables, such as age, race, sex and income. There are also 152 treatments, or variables of interest, which are the NDSR food subcodes used in the DASH score and in calculating the total servings of the different food groups.

Variable Definition
Subject subject id
Month Month of visit
Day Day of visit
Year Year of visit
Age age at visit in years
Race race 1=white; 2=black; 3=hispanic; 4=native american; 5=asian pacific islander; 6=multiracial
Sex gender 1=males; 2=females
Income household income 1=<$20,000; 2=20-50; 3=50-80; 4=>$80,000
SBP Systolic blood pressure
DBP diastolic blood pressure
HTNST hypertension status 1=pre; 2=stage 1; 3=normal; 4=stage 2
WTKG weight in kg
HTM height in meters
HTPCT height percentile
BMIcal EPIC calculated BMI
BMIPCT BMI percentile from clinic
DMETS Daily Metabolic equivalents
Sleep hours of sleep per week
lightmin minutes of light activity per week
modmin minutes of moderate activity per week
hardmin minutes of hard activity per week
vhardmin minutes of very hard activity per week
actweek minutes of mod, hard and very hard activity per week
X_D_B1 TwoD_TwoD_BDimension
X_D_B2 TwoD_TwoD_BDimension_2
X_D_B3 TwoD_TwoD_BDimension_3
X_D_601 TwoD_TwoD_60Dimension
X_D_602 TwoD_TwoD_60Dimension_2
X_D_603 TwoD_TwoD_60Dimension_3
X_D_901 TwoD_TwoD_90Dimension
X_D_902 TwoD_TwoD_90Dimension_2
X_D_903 TwoD_TwoD_90Dimension_3
X_D_1201 TwoD_TwoD_120Dimension
X_D_1202 TwoD_TwoD_120Dimension_2
X_D_1203 TwoD_TwoD_120Dimension_3
PSV_B_1 DOPPLER_PSV_BVelocity
PSV_B_2 DOPPLER_PSV_BVelocity_2
PSV_B_3 DOPPLER_PSV_BVelocity_3
PSV_I_41 DOPPLER_PSV_I_4Velocity
PSV_I_42 DOPPLER_PSV_I_4Velocity_2
PSV_I_43 DOPPLER_PSV_I_4Velocity_3
PSV_I_51 DOPPLER_PSV_I_5Velocity
PSV_I_52 DOPPLER_PSV_I_5Velocity_2
PSV_I_53 DOPPLER_PSV_I_5Velocity_3
PSV_I_61 DOPPLER_PSV_I_6Velocity
PSV_I_62 DOPPLER_PSV_I_6Velocity_2
PSV_I_63 DOPPLER_PSV_I_6Velocity_3
PSV_601 DOPPLER_PSV_60Velocity
PSV_602 DOPPLER_PSV_60Velocity_2
PSV_603 DOPPLER_PSV_60Velocity_3
PSV_901 DOPPLER_PSV_90Velocity
PSV_902 DOPPLER_PSV_90Velocity_2
PSV_903 DOPPLER_PSV_90Velocity_3
PSV_1201 DOPPLER_PSV_120Velocity
PSV_1202 DOPPLER_PSV_120Velocity_2
PSV_1203 DOPPLER_PSV_120Velocity_3
energy energy (kcal/day)
Chol cholesterol (mg/day)
Sucrose sucrose (grams/day)
Fiber total dietary fiber (grams/day)
CA calcium (mg/day)
Mg magnesium (mg/day)
Na sodium (mg/day)
K potassium (mg/day)
Caffeine caffeine (mg/day)
pfat percent kcal from fat
pcarb percent kcal from carbohydrate
ppro percent kcal from protein
psfat percent kcal from saturated fat
trans trans fatty acids (g/day)

1.2 Basic Inspections

Fifteen subjects have at least one value that is missing or marked with “.”. After removing these subjects, the cleaned dataset contains 173 subjects and 222 variables.

# Recode the "." missing-value marker as NA
DASH_Score[DASH_Score=="."] <- NA
# Keep only subjects with no missing values
DASH_Score <- DASH_Score[complete.cases(DASH_Score), ]
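One caveat with recoding “.” after import is that any column containing “.” has already been read as text rather than as numbers. A minimal sketch of an alternative, assuming the marker is always exactly “.”, is to declare it at import time with `na.strings`. The toy CSV and the names `csv_text`, `toy` and `toy_clean` are hypothetical stand-ins for the real file.

```r
# Toy CSV standing in for the real file; "." marks a missing value.
csv_text <- "Subject,Age,SBP
1,14,110
2,.,118
3,16,."

# na.strings = "." turns the marker into NA while parsing, so Age and SBP
# are read as numeric columns instead of character columns.
toy <- read.csv(text = csv_text, na.strings = ".")

# Keep only subjects with no missing values.
toy_clean <- toy[complete.cases(toy), ]

str(toy_clean)
```

With this approach no later type conversion is needed before the columns are used in `cor` or `lm`.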


2 The SuperWIN scoring system

Notes:

FRU0600 VEG0800 VEG0900 SWT0600 MSC1100

head(DASHSC_SuperWIN,10)
##  [1] 43.79941 43.03675 47.71267 29.92237 50.15139 29.39656 26.67415
##  [8] 52.18893 45.73657 25.25528

2.1 Scatterplots of SuperWIN DASH Score vs SBP/DBP, respectively.

Figures of SuperWIN DASH Scores vs SBP/DBP

par(mfrow=c(1,2))
plot(DASHSC_SuperWIN,SBP)
plot(DASHSC_SuperWIN,DBP)

From Wikipedia:

  • In statistics, the Pearson correlation coefficient, also referred to as Pearson’s r, the Pearson product-moment correlation coefficient (PPMCC) or the bivariate correlation, is a measure of the linear correlation between two variables X and Y.

  • In statistics, Spearman’s rank correlation coefficient or Spearman’s rho, is a nonparametric measure of rank correlation (statistical dependence between the rankings of two variables). It assesses how well the relationship between two variables can be described using a monotonic function.

  • In statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall’s tau coefficient, is a statistic used to measure the ordinal association between two measured quantities. A tau test is a non-parametric hypothesis test for statistical dependence based on the tau coefficient.
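The difference between the three coefficients described above is easiest to see on a relationship that is monotonic but not linear. The following is a small sketch on simulated data (the variables `x` and `y` are made up for illustration and are not part of the DASH dataset).

```r
set.seed(1)
x <- runif(50, 0, 3)
y <- exp(2 * x)   # strictly increasing in x, but far from linear

# Pearson measures linear association; Spearman and Kendall use ranks only.
r_pearson  <- cor(x, y, method = "pearson")
r_spearman <- cor(x, y, method = "spearman")
r_kendall  <- cor(x, y, method = "kendall")

c(pearson = r_pearson, spearman = r_spearman, kendall = r_kendall)
```

Because `y` is a strictly increasing function of `x`, the two rank-based coefficients equal 1, while Pearson's r falls below 1.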

2.2 Correlation (Pearson) between SuperWIN DASH Score and SBP/DBP

data111=data.frame(DASHSC_SuperWIN,SBP,DBP)
cor(data111, use="complete.obs", method="pearson") 
##                 DASHSC_SuperWIN         SBP         DBP
## DASHSC_SuperWIN      1.00000000 -0.05779287 -0.14156465
## SBP                 -0.05779287  1.00000000  0.02158761
## DBP                 -0.14156465  0.02158761  1.00000000

2.3 Correlation (Spearman) between SuperWIN DASH Score and SBP/DBP

cor(data111, use="complete.obs", method="spearman") 
##                 DASHSC_SuperWIN         SBP         DBP
## DASHSC_SuperWIN       1.0000000 -0.06729720 -0.16456628
## SBP                  -0.0672972  1.00000000  0.06712612
## DBP                  -0.1645663  0.06712612  1.00000000

2.4 Correlation (Kendall) between SuperWIN DASH Score and SBP/DBP

cor(data111, use="complete.obs", method="kendall") 
##                 DASHSC_SuperWIN         SBP         DBP
## DASHSC_SuperWIN      1.00000000 -0.04654998 -0.10827770
## SBP                 -0.04654998  1.00000000  0.04790639
## DBP                 -0.10827770  0.04790639  1.00000000

2.5 Linear Regression

We fit a linear regression with SBP or DBP as the response and the DASH score as the regressor of interest, adjusting for the covariates Age, Sex, HTNST, WTKG, HTM, HTPCT, BMIcal, BMIPCT, DMETS, Sleep, lightmin, modmin, hardmin, vhardmin and actweek.
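Since HTNST is a coded category rather than a quantity, the models wrap it in `as.factor`. The sketch below, on simulated data with a hypothetical `score` column, shows what that does: R creates one indicator per level other than the first, so level 1 (pre-hypertension in the coding of Section 1.1) is the reference category.

```r
set.seed(2)
n <- 60
toy <- data.frame(
  score = rnorm(n, mean = 40, sd = 10),   # hypothetical DASH-like score
  HTNST = rep(1:4, length.out = n)        # all four status codes present
)
# Simulated response: only status 2 shifts SBP in this toy example.
toy$SBP <- 110 - 0.05 * toy$score + 2 * (toy$HTNST == 2) + rnorm(n)

fit <- lm(SBP ~ score + as.factor(HTNST), data = toy)

# One coefficient per non-reference level of HTNST.
names(coef(fit))
```

Each `as.factor(HTNST)k` coefficient is therefore the expected difference in the response between status k and status 1, with the other covariates held fixed.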

2.5.1 SBP as response

summary(l21)
## 
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + Age + Sex + as.factor(HTNST) + 
##     WTKG + HTM + HTPCT + BMIcal + BMIPCT + DMETS + Sleep + lightmin + 
##     modmin + hardmin + vhardmin + actweek, data = data21)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.5322  -1.8837   0.0068   2.3681  11.1078 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       110.545833  91.455603   1.209 0.228628    
## DASHSC_SuperWIN    -0.037551   0.029773  -1.261 0.209131    
## Age                 0.911027   0.264390   3.446 0.000735 ***
## Sex                -2.795948   0.891400  -3.137 0.002050 ** 
## as.factor(HTNST)2   8.968842   0.658810  13.614  < 2e-16 ***
## as.factor(HTNST)3  -7.911099   1.494539  -5.293  4.1e-07 ***
## as.factor(HTNST)4   9.560653   3.962058   2.413 0.017002 *  
## WTKG                0.003392   0.040643   0.083 0.933599    
## HTM                19.621541   7.501571   2.616 0.009797 ** 
## HTPCT               0.032782   0.018863   1.738 0.084241 .  
## BMIcal             -0.059818   0.101098  -0.592 0.554934    
## BMIPCT             -0.010170   0.025169  -0.404 0.686720    
## DMETS               0.114191   0.139652   0.818 0.414808    
## Sleep              -0.143328   0.534552  -0.268 0.788963    
## lightmin           -0.003356   0.008885  -0.378 0.706196    
## modmin             -0.010960   0.005054  -2.168 0.031679 *  
## hardmin            -0.009651   0.004770  -2.023 0.044797 *  
## vhardmin           -0.010988   0.005533  -1.986 0.048824 *  
## actweek             0.004915   0.010791   0.456 0.649388    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.795 on 153 degrees of freedom
## Multiple R-squared:  0.7535, Adjusted R-squared:  0.7245 
## F-statistic: 25.98 on 18 and 153 DF,  p-value: < 2.2e-16

2.5.2 Detecting multicollinearity

We calculate the variance inflation factor (VIF) to check for multicollinearity in the model. As a rule of thumb, a VIF value that exceeds 5 or 10 indicates a problematic amount of collinearity (James et al. 2014).

car::vif(l21)
##                        GVIF Df GVIF^(1/(2*Df))
## DASHSC_SuperWIN    1.186093  1        1.089079
## Age                3.368040  1        1.835222
## Sex                2.155714  1        1.468235
## as.factor(HTNST)   1.387435  3        1.056093
## WTKG              13.706503  1        3.702229
## HTM                7.553481  1        2.748360
## HTPCT              3.208182  1        1.791140
## BMIcal             9.013899  1        3.002316
## BMIPCT             1.897636  1        1.377547
## DMETS              4.056549  1        2.014088
## Sleep            215.798063  1       14.690067
## lightmin         274.337447  1       16.563135
## modmin            11.896568  1        3.449140
## hardmin            9.851141  1        3.138653
## vhardmin           3.583235  1        1.892944
## actweek          147.396835  1       12.140710
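For a predictor with one degree of freedom, the VIF is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The sketch below computes this by hand on simulated data. The helper `vif_by_hand` and the variables `x1`, `x2`, `x3` are hypothetical names used only for illustration.

```r
set.seed(3)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # nearly a copy of x1, so highly collinear
x3 <- rnorm(n)                  # generated independently of x1 and x2

# VIF of one predictor: regress it on the others, then take 1 / (1 - R^2).
vif_by_hand <- function(target, others) {
  r2 <- summary(lm(target ~ ., data = others))$r.squared
  1 / (1 - r2)
}

vifs <- c(
  x1 = vif_by_hand(x1, data.frame(x2, x3)),
  x2 = vif_by_hand(x2, data.frame(x1, x3)),
  x3 = vif_by_hand(x3, data.frame(x1, x2))
)
vifs
```

In this toy example `x1` and `x2` get large VIFs because each is almost fully explained by the other, while `x3` stays close to 1. For one-degree-of-freedom terms, the GVIF column reported by `car::vif` is this same quantity.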

We update the model by removing the predictor variables WTKG, Sleep, lightmin, modmin and actweek, each of which has a VIF above 10.

l21_1 <- lm(SBP~DASHSC_SuperWIN+Age+Sex+as.factor(HTNST)+HTM+HTPCT+BMIcal+BMIPCT+DMETS+hardmin+vhardmin, data=data21)

The DASH score is not statistically significant at the \(\alpha = 0.05\) level (p = 0.442).

summary(l21_1)
## 
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + Age + Sex + as.factor(HTNST) + 
##     HTM + HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, 
##     data = data21)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.0268  -2.4057   0.2345   2.5296  12.2536 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       80.1274935  8.9762590   8.927 1.06e-15 ***
## DASHSC_SuperWIN   -0.0224611  0.0291606  -0.770 0.442298    
## Age                0.8230130  0.2628176   3.131 0.002073 ** 
## Sex               -2.7225122  0.8814366  -3.089 0.002375 ** 
## as.factor(HTNST)2  8.7180312  0.6506841  13.398  < 2e-16 ***
## as.factor(HTNST)3 -8.6356525  1.4809147  -5.831 3.01e-08 ***
## as.factor(HTNST)4  9.9527954  3.9656125   2.510 0.013088 *  
## HTM               21.0260058  5.9883176   3.511 0.000582 ***
## HTPCT              0.0308345  0.0190235   1.621 0.107040    
## BMIcal            -0.0628041  0.0458036  -1.371 0.172270    
## BMIPCT            -0.0090262  0.0240136  -0.376 0.707510    
## DMETS              0.0721704  0.1175351   0.614 0.540077    
## hardmin           -0.0008495  0.0021653  -0.392 0.695351    
## vhardmin          -0.0015643  0.0039440  -0.397 0.692180    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.837 on 158 degrees of freedom
## Multiple R-squared:  0.7398, Adjusted R-squared:  0.7184 
## F-statistic: 34.56 on 13 and 158 DF,  p-value: < 2.2e-16
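The significance check for a single coefficient can also be done programmatically instead of reading the printed table. The following sketch uses simulated data (the columns `score`, `age` and `SBP` of `toy` are made up) to pull the p-value and a 95% confidence interval for one term from a fitted `lm` object.

```r
set.seed(4)
n <- 100
toy <- data.frame(
  score = rnorm(n, mean = 40, sd = 10),
  age   = rnorm(n, mean = 15, sd = 2)
)
# In this simulation the score has no true effect on the response.
toy$SBP <- 100 + 0.9 * toy$age + rnorm(n, sd = 4)

fit <- lm(SBP ~ score + age, data = toy)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|).
coef_table <- summary(fit)$coefficients
p_score    <- coef_table["score", "Pr(>|t|)"]

# 95% confidence interval for the same coefficient.
ci_score <- confint(fit, "score", level = 0.95)

p_score
ci_score
```

The two views agree by construction: the p-value is below 0.05 exactly when the 95% interval excludes zero.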

2.5.3 DBP as response

summary(l22)
## 
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + Age + Sex + as.factor(HTNST) + 
##     WTKG + HTM + HTPCT + BMIcal + BMIPCT + DMETS + Sleep + lightmin + 
##     modmin + hardmin + vhardmin + actweek, data = data22)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.3781  -3.6699   0.2789   5.3425  13.7299 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)   
## (Intercept)       -5.065e+02  1.877e+02  -2.698  0.00776 **
## DASHSC_SuperWIN   -5.130e-02  6.112e-02  -0.839  0.40254   
## Age                1.448e+00  5.428e-01   2.668  0.00845 **
## Sex                1.479e+00  1.830e+00   0.808  0.42013   
## as.factor(HTNST)2  2.659e+00  1.352e+00   1.966  0.05108 . 
## as.factor(HTNST)3 -3.111e+00  3.068e+00  -1.014  0.31214   
## as.factor(HTNST)4  2.231e+01  8.134e+00   2.743  0.00682 **
## WTKG              -6.053e-02  8.343e-02  -0.725  0.46929   
## HTM               -1.012e+01  1.540e+01  -0.657  0.51188   
## HTPCT             -3.736e-03  3.872e-02  -0.096  0.92326   
## BMIcal             2.638e-01  2.075e-01   1.271  0.20557   
## BMIPCT            -2.179e-02  5.167e-02  -0.422  0.67388   
## DMETS             -2.310e-01  2.867e-01  -0.806  0.42154   
## Sleep              3.529e+00  1.097e+00   3.216  0.00159 **
## lightmin           5.722e-02  1.824e-02   3.137  0.00205 **
## modmin             9.522e-04  1.038e-02   0.092  0.92701   
## hardmin            9.424e-03  9.792e-03   0.962  0.33736   
## vhardmin           1.042e-02  1.136e-02   0.918  0.36030   
## actweek            5.662e-02  2.215e-02   2.556  0.01157 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.79 on 153 degrees of freedom
## Multiple R-squared:  0.2493, Adjusted R-squared:  0.161 
## F-statistic: 2.823 on 18 and 153 DF,  p-value: 0.0002879

2.5.4 Detecting multicollinearity

We calculate the variance inflation factor (VIF) to check for multicollinearity in the model.

car::vif(l22)
##                        GVIF Df GVIF^(1/(2*Df))
## DASHSC_SuperWIN    1.186093  1        1.089079
## Age                3.368040  1        1.835222
## Sex                2.155714  1        1.468235
## as.factor(HTNST)   1.387435  3        1.056093
## WTKG              13.706503  1        3.702229
## HTM                7.553481  1        2.748360
## HTPCT              3.208182  1        1.791140
## BMIcal             9.013899  1        3.002316
## BMIPCT             1.897636  1        1.377547
## DMETS              4.056549  1        2.014088
## Sleep            215.798063  1       14.690067
## lightmin         274.337447  1       16.563135
## modmin            11.896568  1        3.449140
## hardmin            9.851141  1        3.138653
## vhardmin           3.583235  1        1.892944
## actweek          147.396835  1       12.140710

We update the model by removing the predictor variables WTKG, Sleep, lightmin, modmin and actweek, each of which has a VIF above 10.

l22_1 <- lm(DBP~DASHSC_SuperWIN+Age+Sex+as.factor(HTNST)+HTM+HTPCT+BMIcal+BMIPCT+DMETS+hardmin+vhardmin, data=data22)

The DASH score is not statistically significant at the \(\alpha = 0.05\) level (p = 0.163).

summary(l22_1)
## 
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + Age + Sex + as.factor(HTNST) + 
##     HTM + HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, 
##     data = data22)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -21.7011  -4.3467   0.6443   5.6729  15.6037 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       100.187957  18.622546   5.380 2.64e-07 ***
## DASHSC_SuperWIN    -0.084845   0.060498  -1.402   0.1627    
## Age                 1.372935   0.545253   2.518   0.0128 *  
## Sex                 0.496042   1.828668   0.271   0.7865    
## as.factor(HTNST)2   2.645934   1.349938   1.960   0.0518 .  
## as.factor(HTNST)3  -2.567509   3.072372  -0.836   0.4046    
## as.factor(HTNST)4  19.211561   8.227236   2.335   0.0208 *  
## HTM               -18.697273  12.423630  -1.505   0.1343    
## HTPCT               0.002761   0.039467   0.070   0.9443    
## BMIcal              0.150997   0.095026   1.589   0.1141    
## BMIPCT             -0.037698   0.049820  -0.757   0.4504    
## DMETS              -0.441093   0.243844  -1.809   0.0724 .  
## hardmin             0.010046   0.004492   2.236   0.0267 *  
## vhardmin            0.010255   0.008182   1.253   0.2119    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.96 on 158 degrees of freedom
## Multiple R-squared:  0.1907, Adjusted R-squared:  0.1241 
## F-statistic: 2.863 on 13 and 158 DF,  p-value: 0.0009901

3 Gunther’s scoring system

3.1 Scatterplots of Gunther DASH Score vs SBP/DBP, respectively.

Figures of SBP/DBP vs Gunther Dash Scores.

par(mfrow=c(1,2))

plot(DASHSC_Gunther,SBP)
plot(DASHSC_Gunther,DBP)

3.2 Correlation (Pearson) between Gunther DASH Score and SBP/DBP

data222=data.frame(DASHSC_Gunther,SBP,DBP)
cor(data222, use="complete.obs", method="pearson") 
##                DASHSC_Gunther         SBP         DBP
## DASHSC_Gunther     1.00000000 -0.01454611 -0.17323074
## SBP               -0.01454611  1.00000000  0.02158761
## DBP               -0.17323074  0.02158761  1.00000000

3.3 Correlation (Spearman) between Gunther DASH Score and SBP/DBP

cor(data222, use="complete.obs", method="spearman") 
##                DASHSC_Gunther         SBP         DBP
## DASHSC_Gunther    1.000000000 0.006163971 -0.15670933
## SBP               0.006163971 1.000000000  0.06712612
## DBP              -0.156709334 0.067126122  1.00000000

3.4 Correlation (Kendall) between Gunther DASH Score and SBP/DBP

cor(data222, use="complete.obs", method="kendall") 
##                DASHSC_Gunther         SBP         DBP
## DASHSC_Gunther    1.000000000 0.006313038 -0.10841581
## SBP               0.006313038 1.000000000  0.04790639
## DBP              -0.108415807 0.047906389  1.00000000

3.5 Linear Regression

We fit a linear regression with SBP or DBP as the response and the DASH score as the regressor of interest, adjusting for the covariates Age, Sex, HTNST, WTKG, HTM, HTPCT, BMIcal, BMIPCT, DMETS, Sleep, lightmin, modmin, hardmin, vhardmin and actweek.

3.5.1 SBP as response

summary(l11)
## 
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + Age + Sex + as.factor(HTNST) + 
##     WTKG + HTM + HTPCT + BMIcal + BMIPCT + DMETS + Sleep + lightmin + 
##     modmin + hardmin + vhardmin + actweek, data = data11)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.578  -1.829   0.168   2.382  10.812 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       110.801184  91.393237   1.212 0.227245    
## DASHSC_Gunther     -0.053773   0.041624  -1.292 0.198342    
## Age                 0.898016   0.265274   3.385 0.000904 ***
## Sex                -2.877204   0.896707  -3.209 0.001625 ** 
## as.factor(HTNST)2   8.998069   0.659599  13.642  < 2e-16 ***
## as.factor(HTNST)3  -7.825900   1.495129  -5.234 5.38e-07 ***
## as.factor(HTNST)4   9.174566   4.024209   2.280 0.023999 *  
## WTKG                0.005275   0.040628   0.130 0.896860    
## HTM                19.667733   7.500383   2.622 0.009619 ** 
## HTPCT               0.032090   0.018886   1.699 0.091332 .  
## BMIcal             -0.062143   0.101010  -0.615 0.539325    
## BMIPCT             -0.012397   0.024982  -0.496 0.620426    
## DMETS               0.114823   0.139575   0.823 0.411980    
## Sleep              -0.143356   0.534047  -0.268 0.788728    
## lightmin           -0.003311   0.008870  -0.373 0.709468    
## modmin             -0.010938   0.005050  -2.166 0.031871 *  
## hardmin            -0.009571   0.004764  -2.009 0.046304 *  
## vhardmin           -0.010838   0.005529  -1.960 0.051790 .  
## actweek             0.004899   0.010787   0.454 0.650338    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.794 on 153 degrees of freedom
## Multiple R-squared:  0.7536, Adjusted R-squared:  0.7247 
## F-statistic:    26 on 18 and 153 DF,  p-value: < 2.2e-16

3.5.2 Detecting multicollinearity

We calculate the variance inflation factor (VIF) to check for multicollinearity in the model.

car::vif(l11)
##                        GVIF Df GVIF^(1/(2*Df))
## DASHSC_Gunther     1.229783  1        1.108956
## Age                3.392319  1        1.841825
## Sex                2.182563  1        1.477350
## as.factor(HTNST)   1.438585  3        1.062484
## WTKG              13.703303  1        3.701797
## HTM                7.554908  1        2.748619
## HTPCT              3.217645  1        1.793779
## BMIcal             9.002901  1        3.000483
## BMIPCT             1.870494  1        1.367660
## DMETS              4.054148  1        2.013492
## Sleep            215.499459  1       14.679900
## lightmin         273.542758  1       16.539128
## modmin            11.882432  1        3.447090
## hardmin            9.832225  1        3.135638
## vhardmin           3.579972  1        1.892082
## actweek          147.357518  1       12.139091

We update the model by removing the predictor variables WTKG, Sleep, lightmin, modmin and actweek, each of which has a VIF above 10.

l11_1 <- lm(SBP~DASHSC_Gunther+Age+Sex+as.factor(HTNST)+HTM+HTPCT+BMIcal+BMIPCT+DMETS+hardmin+vhardmin, data=data11)

The DASH score is not statistically significant at the \(\alpha = 0.05\) level (p = 0.392).

summary(l11_1)
## 
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + Age + Sex + as.factor(HTNST) + 
##     HTM + HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, 
##     data = data11)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.0624  -2.5008   0.3359   2.4979  12.0814 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       80.2780970  8.9777814   8.942 9.64e-16 ***
## DASHSC_Gunther    -0.0352276  0.0410015  -0.859 0.391544    
## Age                0.8128820  0.2637109   3.082 0.002423 ** 
## Sex               -2.7732134  0.8849609  -3.134 0.002058 ** 
## as.factor(HTNST)2  8.7403262  0.6509503  13.427  < 2e-16 ***
## as.factor(HTNST)3 -8.5789282  1.4787058  -5.802 3.48e-08 ***
## as.factor(HTNST)4  9.6711261  4.0184376   2.407 0.017252 *  
## HTM               21.2271174  5.9927833   3.542 0.000522 ***
## HTPCT              0.0303106  0.0190490   1.591 0.113565    
## BMIcal            -0.0613433  0.0458443  -1.338 0.182793    
## BMIPCT            -0.0101356  0.0237878  -0.426 0.670626    
## DMETS              0.0727404  0.1174268   0.619 0.536510    
## hardmin           -0.0008406  0.0021638  -0.388 0.698174    
## vhardmin          -0.0014878  0.0039473  -0.377 0.706740    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.835 on 158 degrees of freedom
## Multiple R-squared:   0.74,  Adjusted R-squared:  0.7187 
## F-statistic:  34.6 on 13 and 158 DF,  p-value: < 2.2e-16

3.5.3 DBP as response

summary(l12)
## 
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + Age + Sex + as.factor(HTNST) + 
##     WTKG + HTM + HTPCT + BMIcal + BMIPCT + DMETS + Sleep + lightmin + 
##     modmin + hardmin + vhardmin + actweek, data = data12)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.3038  -3.8996   0.1987   5.3429  13.7821 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)   
## (Intercept)       -5.008e+02  1.875e+02  -2.671  0.00837 **
## DASHSC_Gunther    -8.645e-02  8.538e-02  -1.013  0.31289   
## Age                1.421e+00  5.441e-01   2.611  0.00993 **
## Sex                1.334e+00  1.839e+00   0.725  0.46953   
## as.factor(HTNST)2  2.712e+00  1.353e+00   2.004  0.04683 * 
## as.factor(HTNST)3 -2.978e+00  3.067e+00  -0.971  0.33311   
## as.factor(HTNST)4  2.146e+01  8.255e+00   2.599  0.01025 * 
## WTKG              -5.776e-02  8.334e-02  -0.693  0.48929   
## HTM               -1.001e+01  1.539e+01  -0.650  0.51647   
## HTPCT             -5.079e-03  3.874e-02  -0.131  0.89587   
## BMIcal             2.614e-01  2.072e-01   1.262  0.20904   
## BMIPCT            -2.433e-02  5.124e-02  -0.475  0.63566   
## DMETS             -2.327e-01  2.863e-01  -0.813  0.41767   
## Sleep              3.501e+00  1.095e+00   3.195  0.00170 **
## lightmin           5.680e-02  1.819e-02   3.122  0.00215 **
## modmin             8.250e-04  1.036e-02   0.080  0.93663   
## hardmin            9.446e-03  9.773e-03   0.967  0.33527   
## vhardmin           1.061e-02  1.134e-02   0.935  0.35101   
## actweek            5.625e-02  2.213e-02   2.542  0.01201 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.782 on 153 degrees of freedom
## Multiple R-squared:  0.2509, Adjusted R-squared:  0.1627 
## F-statistic: 2.846 on 18 and 153 DF,  p-value: 0.0002573

3.5.4 Detecting multicollinearity

We calculate the variance inflation factor (VIF) to check for multicollinearity in the model.

car::vif(l12)
##                        GVIF Df GVIF^(1/(2*Df))
## DASHSC_Gunther     1.229783  1        1.108956
## Age                3.392319  1        1.841825
## Sex                2.182563  1        1.477350
## as.factor(HTNST)   1.438585  3        1.062484
## WTKG              13.703303  1        3.701797
## HTM                7.554908  1        2.748619
## HTPCT              3.217645  1        1.793779
## BMIcal             9.002901  1        3.000483
## BMIPCT             1.870494  1        1.367660
## DMETS              4.054148  1        2.013492
## Sleep            215.499459  1       14.679900
## lightmin         273.542758  1       16.539128
## modmin            11.882432  1        3.447090
## hardmin            9.832225  1        3.135638
## vhardmin           3.579972  1        1.892082
## actweek          147.357518  1       12.139091

We update the model by removing the predictor variables WTKG, Sleep, lightmin, modmin and actweek, each of which has a VIF above 10.

l12_1 <- lm(DBP~DASHSC_Gunther+Age+Sex+as.factor(HTNST)+HTM+HTPCT+BMIcal+BMIPCT+DMETS+hardmin+vhardmin, data=data12)

The DASH score is not statistically significant at the \(\alpha = 0.05\) level (p = 0.109).

summary(l12_1)
## 
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + Age + Sex + as.factor(HTNST) + 
##     HTM + HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, 
##     data = data12)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -21.6063  -4.7220   0.5221   5.4868  15.5415 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.009e+02  1.860e+01   5.425 2.14e-07 ***
## DASHSC_Gunther    -1.370e-01  8.493e-02  -1.613   0.1088    
## Age                1.332e+00  5.463e-01   2.437   0.0159 *  
## Sex                2.960e-01  1.833e+00   0.161   0.8720    
## as.factor(HTNST)2  2.733e+00  1.348e+00   2.027   0.0444 *  
## as.factor(HTNST)3 -2.353e+00  3.063e+00  -0.768   0.4436    
## as.factor(HTNST)4  1.805e+01  8.324e+00   2.169   0.0316 *  
## HTM               -1.791e+01  1.241e+01  -1.443   0.1511    
## HTPCT              6.352e-04  3.946e-02   0.016   0.9872    
## BMIcal             1.568e-01  9.497e-02   1.651   0.1008    
## BMIPCT            -4.168e-02  4.928e-02  -0.846   0.3989    
## DMETS             -4.394e-01  2.432e-01  -1.806   0.0728 .  
## hardmin            1.009e-02  4.482e-03   2.252   0.0257 *  
## vhardmin           1.058e-02  8.177e-03   1.294   0.1975    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.944 on 158 degrees of freedom
## Multiple R-squared:  0.1939, Adjusted R-squared:  0.1275 
## F-statistic: 2.923 on 13 and 158 DF,  p-value: 0.0007883


4 Correlation between Gunther and SuperWIN

plot(DASHSC_Gunther,DASHSC_SuperWIN)

4.1 Correlation (pearson) between SuperWIN DASH Score, Gunther DASH Score, SBP and DBP

data333=data.frame(DASHSC_SuperWIN,DASHSC_Gunther,SBP,DBP)
cor(data333, use="complete.obs", method="pearson") 
##                 DASHSC_SuperWIN DASHSC_Gunther         SBP         DBP
## DASHSC_SuperWIN      1.00000000     0.84715053 -0.05779287 -0.14156465
## DASHSC_Gunther       0.84715053     1.00000000 -0.01454611 -0.17323074
## SBP                 -0.05779287    -0.01454611  1.00000000  0.02158761
## DBP                 -0.14156465    -0.17323074  0.02158761  1.00000000

4.2 Correlation (spearman) between SuperWIN DASH Score, Gunther DASH Score, SBP and DBP

cor(data333, use="complete.obs", method="spearman") 
##                 DASHSC_SuperWIN DASHSC_Gunther          SBP         DBP
## DASHSC_SuperWIN       1.0000000    0.839423805 -0.067297201 -0.16456628
## DASHSC_Gunther        0.8394238    1.000000000  0.006163971 -0.15670933
## SBP                  -0.0672972    0.006163971  1.000000000  0.06712612
## DBP                  -0.1645663   -0.156709334  0.067126122  1.00000000

4.3 Correlation (kendall) between SuperWIN DASH Score, Gunther DASH Score, SBP and DBP

cor(data333, use="complete.obs", method="kendall") 
##                 DASHSC_SuperWIN DASHSC_Gunther          SBP         DBP
## DASHSC_SuperWIN      1.00000000    0.651842785 -0.046549983 -0.10827770
## DASHSC_Gunther       0.65184279    1.000000000  0.006313038 -0.10841581
## SBP                 -0.04654998    0.006313038  1.000000000  0.04790639
## DBP                 -0.10827770   -0.108415807  0.047906389  1.00000000
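The `cor` matrices report the size of each association but not whether it differs from zero. One possible follow-up, sketched here on simulated data with hypothetical variables `a` and `b`, is base R's `cor.test`, which returns the same estimate together with a p-value.

```r
set.seed(5)
n <- 80
a <- rnorm(n)
b <- 0.3 * a + rnorm(n)   # weakly related to a by construction

# Pearson test: same estimate as cor(), plus a p-value and confidence interval.
ct_pearson <- cor.test(a, b, method = "pearson")

# Spearman test; exact = FALSE requests the large-sample approximation.
ct_spearman <- cor.test(a, b, method = "spearman", exact = FALSE)

c(estimate = unname(ct_pearson$estimate), p_value = ct_pearson$p.value)
c(estimate = unname(ct_spearman$estimate), p_value = ct_spearman$p.value)
```

`cor.test` works on one pair of variables at a time, so it would need to be called once for each pair of interest.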



5 References