We have 188 observations with 227 variables. After removing rows with values marked “.” or otherwise missing, 172 observations with 227 variables remain. Four variables are duplicated: FRU0200, SWT0800, MSC0700, and VEG0500. After removing the index variable and the duplicated variables, we have 172 observations with 222 variables.
DASH_Score[DASH_Score=="."] <- NA
DASH_Score=DASH_Score[complete.cases(DASH_Score), ]
DASH_Score=DASH_Score[,-c(1) ]
drops<-c("FRU0200.1","VEG0500.1","SWT0800.1","MSC0700.1")
DASH_Score=DASH_Score[ , !(names(DASH_Score) %in% drops)]
n=dim(DASH_Score)[1]
n
## [1] 172
p=dim(DASH_Score)[2]
p
## [1] 222
(If we remove only the rows in which Sex or Age is NA, we have 187 observations with 222 variables. However, to ensure a fair comparison between the models that drop high-VIF covariates and the models that use stepwise variable selection, we stick to the approach above.)
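The two row-filtering rules discussed above can be compared on a small toy data frame. This is only a sketch: the data frame `toy` and its values are hypothetical stand-ins for `DASH_Score`, and only base R is used.

```r
# Toy data frame standing in for DASH_Score (all values hypothetical).
toy <- data.frame(
  ID  = 1:5,
  Sex = c("1", "2", ".", "1", "2"),
  Age = c("12", "13", "14", ".", "15"),
  SBP = c("110", ".", "105", "120", "118"),
  stringsAsFactors = FALSE
)

# Recode the "." placeholder as NA, as in the cleaning step above.
toy[toy == "."] <- NA

# Strict rule: keep only rows with no missing value in any column.
strict <- toy[complete.cases(toy), ]

# Looser rule: drop a row only when Sex or Age is missing.
loose <- toy[complete.cases(toy[, c("Sex", "Age")]), ]

nrow(strict)
nrow(loose)
```

The strict rule can never keep more rows than the looser one, which is why the report's sample drops to 172 rather than 187.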
Notes:
Gunther’s scoring system: 1) ‘FRU0600’ is not found in Tfruit, 2) ‘VEG0800’ and ‘VEG0900’ are not found in Tveg, 3) ‘SWT0600’ and ‘MSC1100’ are not found in Dairy and 4) ‘SWT0600’ and ‘MSC1100’ are not found in Lfdairy.
There are four duplicated variables below in the data set.
FRU0200 SWT0800 MSC0700 VEG0500
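One possible way to detect columns that duplicate an earlier column is sketched below. The toy data frame is hypothetical. The `.1` suffix mimics the names that appear in the `drops` vector above, and comparing columns by content is an assumption about how the duplicates were identified.

```r
# Hypothetical data frame with one column repeated under a ".1" suffix.
toy <- data.frame(
  FRU0200   = c(1, 2, 3),
  VEG0500   = c(0, 1, 0),
  FRU0200.1 = c(1, 2, 3)
)

# duplicated() on the list of columns flags any column whose content
# repeats an earlier column.
dup <- duplicated(as.list(toy))
names(toy)[dup]

# Keep only the first copy of each column.
toy_clean <- toy[, !dup, drop = FALSE]
names(toy_clean)
```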
Scatterplots of SuperWIN DASH Score vs SBP/DBP
par(mfrow=c(1,2))
plot(DASHSC_SuperWIN,SBP, pch = 20)
plot(DASHSC_SuperWIN,DBP, pch = 20)
Correlation between SuperWIN DASH Score and SBP/DBP
The following definitions are taken from Wikipedia.
Given paired data \({\displaystyle \left\{(x_{1},y_{1}),\ldots ,(x_{n},y_{n})\right\}}\) consisting of n pairs, the Pearson correlation coefficient \(r_{xy}\) is defined as:
\({\displaystyle r_{xy}={\frac {\sum _{i=1}^{n}(x_{i}-{\bar {x}})(y_{i}-{\bar {y}})}{{\sqrt {\sum _{i=1}^{n}(x_{i}-{\bar {x}})^{2}}}{\sqrt {\sum _{i=1}^{n}(y_{i}-{\bar {y}})^{2}}}}}}\)
where n is the sample size, \(x_{i},y_{i}\) are the individual sample points indexed with i, and \(\bar {x}={\frac {1}{n}}\sum _{i=1}^{n}x_{i}\) is the sample mean (analogously for \(\bar{y}\)).
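As a quick check of the definition, the formula can be evaluated term by term and compared with R's built-in `cor()`. The vectors `x` and `y` below are a hypothetical paired sample, not study data.

```r
# Hypothetical paired sample (not the study data).
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)

# Pearson r from the definition: sum of centred cross-products divided by
# the product of the root sums of squares.
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  (sqrt(sum((x - mean(x))^2)) * sqrt(sum((y - mean(y))^2)))

r_builtin <- cor(x, y, method = "pearson")

c(manual = r_manual, builtin = r_builtin)
```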
For a sample of size n, the n raw scores \(X_{i},Y_{i}\) are converted to ranks \(\operatorname {rg} X_{i}\), \(\operatorname {rg} Y_{i}\), and the Spearman rank correlation coefficient \(r_{s}\) is computed from:
\({\displaystyle r_{s}=\rho _{\operatorname {rg} _{X},\operatorname {rg} _{Y}}={\frac {\operatorname {cov} (\operatorname {rg} _{X},\operatorname {rg} _{Y})}{\sigma _{\operatorname {rg} _{X}}\sigma _{\operatorname {rg} _{Y}}}}}\)
where \(\rho\) denotes the usual Pearson correlation coefficient, but applied to the rank variables. \(\operatorname {cov} (\operatorname {rg}_{X},\operatorname {rg}_{Y})\) is the covariance of the rank variables. \(\sigma_{\operatorname {rg} _{X}}\) and \(\sigma_{\operatorname{rg}_{Y}}\) are the standard deviations of the rank variables.
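The rank-based definition can be illustrated by ranking a small sample and applying the Pearson formula to the ranks. The vectors below are hypothetical and chosen without ties.

```r
# Hypothetical sample without ties (not the study data).
x <- c(10, 20, 30, 40, 50)
y <- c(1, 8, 27, 64, 60)

rx <- rank(x)
ry <- rank(y)

# Spearman r_s as covariance of the ranks over the product of their
# standard deviations, i.e. Pearson correlation of the rank variables.
rs_manual <- cov(rx, ry) / (sd(rx) * sd(ry))

rs_builtin <- cor(x, y, method = "spearman")

c(manual = rs_manual, builtin = rs_builtin)
```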
Let (\(x_1\), \(y_1\)), (\(x_2\), \(y_2\)), …, (\(x_n\), \(y_n\)) be a set of observations of the joint random variables X and Y respectively, such that all the values of ( \({\displaystyle x_{i}}\) ) and ( \({\displaystyle y_{i}}\)) are unique. Any pair of observations \({\displaystyle (x_{i},y_{i})}\) and \({\displaystyle (x_{j},y_{j})}\), where \({\displaystyle i<j}\), are said to be concordant if the ranks for both elements (more precisely, the sort order by x and by y) agree: that is, if both \({\displaystyle x_{i}>x_{j}}\) and \({\displaystyle y_{i}>y_{j}}\); or if both \({\displaystyle x_{i}<x_{j}}\) and \({\displaystyle y_{i}<y_{j}}\) . They are said to be discordant, if \({\displaystyle x_{i}>x_{j}}\) and \({\displaystyle y_{i}<y_{j}}\); or if \({\displaystyle x_{i}<x_{j}}\) and \({\displaystyle y_{i}>y_{j}}\). If \({\displaystyle x_{i}=x_{j}}\) or \({\displaystyle y_{i}=y_{j}}\), the pair is neither concordant nor discordant.
The Kendall \(\tau\) coefficient is defined as:
\({\displaystyle \tau ={\frac {({\text{number of concordant pairs}})-({\text{number of discordant pairs}})}{n(n-1)/2}}.}\)
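The concordant/discordant counting can be sketched directly. The sample below is hypothetical and has no ties, in which case the count-based value should agree with `cor(..., method = "kendall")`.

```r
# Hypothetical sample without ties (not the study data).
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)
n <- length(x)

# All index pairs (i, j) with i < j, one pair per column.
pairs <- combn(n, 2)

# Product of the signs of the x- and y-differences:
# +1 for a concordant pair, -1 for a discordant pair.
s <- sign(x[pairs[1, ]] - x[pairs[2, ]]) * sign(y[pairs[1, ]] - y[pairs[2, ]])

concordant <- sum(s > 0)
discordant <- sum(s < 0)

tau_manual  <- (concordant - discordant) / (n * (n - 1) / 2)
tau_builtin <- cor(x, y, method = "kendall")

c(concordant = concordant, discordant = discordant, tau = tau_manual)
```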
The Pearson correlations are as follows:
data111=data.frame(DASHSC_SuperWIN,SBP,DBP)
round(cor(data111, use="complete.obs", method="pearson"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 -0.06 -0.14
## SBP -0.06 1.00 0.02
## DBP -0.14 0.02 1.00
The Spearman correlations are as follows:
round(cor(data111, use="complete.obs", method="spearman"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 -0.07 -0.16
## SBP -0.07 1.00 0.07
## DBP -0.16 0.07 1.00
The Kendall correlations are as follows:
round(cor(data111, use="complete.obs", method="kendall"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 -0.05 -0.11
## SBP -0.05 1.00 0.05
## DBP -0.11 0.05 1.00
Linear Regression
We fit linear regression models with SBP or DBP as the response and the DASH score as the regressor of interest, together with other covariates: Age, Sex, HTNST, WTKG, HTM, HTPCT, BMIcal, BMIPCT, DMETS, Sleep, lightmin, modmin, hardmin, vhardmin, and actweek.
SBP as response
We calculate the variance inflation factor (VIF) to check for multicollinearity in the model. As a rule of thumb, a VIF value that exceeds 10 indicates a problematic amount of collinearity (James et al. 2014). We then update the model by removing the predictor variables with high VIF values (WTKG, Sleep, lightmin, modmin, actweek). In the final proposed model, the DASH score is not statistically significant at \(\alpha\)=0.1.
VIF value for each covariate
car::vif(l21)
## DASHSC_SuperWIN Age as.factor(Sex) WTKG
## 1.151980 3.145512 2.154770 13.598358
## HTM HTPCT BMIcal BMIPCT
## 7.497614 3.165420 8.993570 1.887895
## DMETS Sleep lightmin modmin
## 4.037340 212.973333 270.427986 11.852353
## hardmin vhardmin actweek
## 9.862809 3.536170 145.894301
Model after removing covariates with high VIF values
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + Age + as.factor(Sex) + HTM +
## HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.4907 -3.6754 -0.4362 3.9725 17.8078
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 78.2374078 13.6155080 5.746 4.45e-08 ***
## DASHSC_SuperWIN -0.0163387 0.0458422 -0.356 0.72200
## Age 0.3437532 0.4143797 0.830 0.40802
## as.factor(Sex)2 -2.4602364 1.4295032 -1.721 0.08716 .
## HTM 27.4558903 9.6880722 2.834 0.00519 **
## HTPCT -0.0002273 0.0305863 -0.007 0.99408
## BMIcal -0.0839217 0.0738199 -1.137 0.25729
## BMIPCT -0.0154961 0.0388138 -0.399 0.69024
## DMETS 0.1054449 0.1902614 0.554 0.58020
## hardmin 0.0008942 0.0035043 0.255 0.79891
## vhardmin -0.0076265 0.0063634 -1.199 0.23248
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.221 on 161 degrees of freedom
## Multiple R-squared: 0.303, Adjusted R-squared: 0.2597
## F-statistic: 7 on 10 and 161 DF, p-value: 4.476e-09
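The significance statement for the DASH score can be checked from the coefficient table alone: the t value is the estimate divided by its standard error, and the two-sided p-value comes from a t distribution with the residual degrees of freedom. The numbers below are copied from the `DASHSC_SuperWIN` row and the residual degrees of freedom printed above. The variable names are illustrative.

```r
# Values taken from the DASHSC_SuperWIN row of the summary above.
est <- -0.0163387   # estimate
se  <-  0.0458422   # standard error
df  <-  161         # residual degrees of freedom

t_val <- est / se
p_val <- 2 * pt(-abs(t_val), df = df)   # two-sided p-value

alpha <- 0.1
round(c(t = t_val, p = p_val), 3)
p_val < alpha   # FALSE: not significant at the 0.1 level
```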
Forward Stepwise: AIC
We use the AICp criterion at each step to select the final model.
Base = lm(SBP~DASHSC_SuperWIN, data=data21)
Full = lm(SBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data21)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM + as.factor(Sex) + WTKG,
## data = data21)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN HTM as.factor(Sex)2
## 74.86837 -0.02739 34.07285 -2.23614
## WTKG
## -0.03270
m_AIC=lm(SBP~DASHSC_SuperWIN+HTM+as.factor(Sex)+WTKG, data=data21)
summary(m_AIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM + as.factor(Sex) + WTKG,
## data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.9231 -3.7808 0.0802 4.1787 17.4712
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 74.86837 9.94654 7.527 3.07e-12 ***
## DASHSC_SuperWIN -0.02739 0.04437 -0.617 0.5379
## HTM 34.07285 6.06869 5.615 8.06e-08 ***
## as.factor(Sex)2 -2.23614 1.13493 -1.970 0.0505 .
## WTKG -0.03270 0.02187 -1.495 0.1368
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.173 on 167 degrees of freedom
## Multiple R-squared: 0.2881, Adjusted R-squared: 0.271
## F-statistic: 16.89 on 4 and 167 DF, p-value: 1.198e-11
Forward Stepwise: BIC
We use the BICp criterion at each step to select the final model.
n=dim(data21)[1]
Base = lm(SBP~DASHSC_SuperWIN, data=data21)
Full = lm(SBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data21)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n),trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM + as.factor(Sex), data = data21)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN HTM as.factor(Sex)2
## 81.1442 -0.0341 28.8761 -2.6066
m_BIC=lm(SBP~DASHSC_SuperWIN+HTM+as.factor(Sex), data=data21)
summary(m_BIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM + as.factor(Sex), data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.5963 -3.6625 -0.3443 4.2700 18.1351
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 81.14422 9.05061 8.966 5.81e-16 ***
## DASHSC_SuperWIN -0.03410 0.04431 -0.770 0.4427
## HTM 28.87610 4.99306 5.783 3.49e-08 ***
## as.factor(Sex)2 -2.60656 1.11163 -2.345 0.0202 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.196 on 168 degrees of freedom
## Multiple R-squared: 0.2785, Adjusted R-squared: 0.2656
## F-statistic: 21.62 on 3 and 168 DF, p-value: 6.859e-12
#m_p=lm(SBP~DASHSC_SuperWIN+as.factor(Sex)+HTM, data=data21)
#summary(m_p)
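The AIC and BIC runs above differ only in the penalty `k` passed to `step()`: the default `k = 2` gives AIC and `k = log(n)` gives BIC. The sketch below, on simulated data with hypothetical variables `x1`, `x2`, `x3`, reproduces the criterion that `step()` uses for `lm` fits, n * log(RSS/n) + k * edf, and then runs both forward selections.

```r
set.seed(3)
n <- 150
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 2 + 1.5 * toy$x1 + rnorm(n)   # only x1 truly affects y

# Criterion for a single candidate model, computed by hand.
fit <- lm(y ~ x1, data = toy)
rss <- sum(residuals(fit)^2)
edf <- length(coef(fit))                      # number of estimated coefficients
aic_manual <- n * log(rss / n) + 2 * edf      # k = 2
bic_manual <- n * log(rss / n) + log(n) * edf # k = log(n)

all.equal(aic_manual, extractAIC(fit, k = 2)[2])
all.equal(bic_manual, extractAIC(fit, k = log(n))[2])

# Forward selection under each penalty.
base <- lm(y ~ 1, data = toy)
scope <- list(lower = ~1, upper = ~ x1 + x2 + x3)
sel_aic <- step(base, scope = scope, direction = "forward", trace = FALSE)
sel_bic <- step(base, scope = scope, direction = "forward", k = log(n),
                trace = FALSE)

formula(sel_aic)
formula(sel_bic)
```

Since log(n) exceeds 2 once n is at least 8, the BIC penalty is heavier, and forward selection under BIC stops no later than under AIC. This matches the pattern in the report, where the BIC models are nested in the AIC models.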
DBP as response
Similarly, we calculate the variance inflation factor to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, Sleep, lightmin, modmin, actweek). The DASH score is statistically significant at \(\alpha\)=0.1.
Model after removing covariates with high VIF values
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + Age + as.factor(Sex) + HTM +
## HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.3757 -4.9172 0.1797 5.6430 17.9812
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 103.080120 17.834544 5.780 3.77e-08 ***
## DASHSC_SuperWIN -0.103086 0.060047 -1.717 0.0880 .
## Age 1.185599 0.542783 2.184 0.0304 *
## as.factor(Sex)2 0.299542 1.872463 0.160 0.8731
## HTM -17.028083 12.690114 -1.342 0.1815
## HTPCT -0.007826 0.040064 -0.195 0.8454
## BMIcal 0.163795 0.096694 1.694 0.0922 .
## BMIPCT -0.040067 0.050841 -0.788 0.4318
## DMETS -0.441494 0.249218 -1.772 0.0784 .
## hardmin 0.010538 0.004590 2.296 0.0230 *
## vhardmin 0.008604 0.008335 1.032 0.3035
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.148 on 161 degrees of freedom
## Multiple R-squared: 0.1358, Adjusted R-squared: 0.08211
## F-statistic: 2.53 on 10 and 161 DF, p-value: 0.007421
Forward Stepwise: AIC
We use the AICp criterion at each step to select the final model.
Base = lm(DBP~DASHSC_SuperWIN, data=data22)
Full = lm(DBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data22)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + HTPCT + Age + HTM, data = data22)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN HTPCT Age
## 87.13563 -0.09209 -0.01265 1.17976
## HTM
## -14.96233
m2_AIC=lm(DBP~DASHSC_SuperWIN+HTPCT+ Age+HTM, data=data22)
summary(m2_AIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + HTPCT + Age + HTM, data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -24.1724 -5.1785 0.0738 5.4363 18.3770
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 87.13563 11.16068 7.807 6.11e-13 ***
## DASHSC_SuperWIN -0.09209 0.05865 -1.570 0.1183
## HTPCT -0.01265 0.03251 -0.389 0.6978
## Age 1.17976 0.47436 2.487 0.0139 *
## HTM -14.96233 9.26544 -1.615 0.1082
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.186 on 167 degrees of freedom
## Multiple R-squared: 0.09529, Adjusted R-squared: 0.07362
## F-statistic: 4.397 on 4 and 167 DF, p-value: 0.002093
Forward Stepwise: BIC
We use the BICp criterion at each step to select the final model.
n22=dim(data22)[1]
Base = lm(DBP~DASHSC_SuperWIN, data=data22)
Full = lm(DBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data22)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n22),trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + HTPCT, data = data22)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN HTPCT
## 83.17976 -0.10955 -0.06238
m2_BIC=lm(DBP~DASHSC_SuperWIN+HTPCT, data=data22)
summary(m2_BIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + HTPCT, data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.471 -5.060 0.462 5.691 18.982
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 83.17976 2.99196 27.801 < 2e-16 ***
## DASHSC_SuperWIN -0.10955 0.05885 -1.861 0.06442 .
## HTPCT -0.06238 0.02301 -2.711 0.00739 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.288 on 169 degrees of freedom
## Multiple R-squared: 0.06144, Adjusted R-squared: 0.05033
## F-statistic: 5.532 on 2 and 169 DF, p-value: 0.00471
#m2_p=lm(DBP~DASHSC_SuperWIN+HTPCT, data=data22)
#summary(m2_p)
Scatterplots of Gunther DASH Score vs SBP/DBP
par(mfrow=c(1,2))
plot(DASHSC_Gunther,SBP,pch=20)
plot(DASHSC_Gunther,DBP,pch=20)
Correlation between Gunther DASH Score and SBP/DBP
The Pearson correlations are as follows:
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 0.04 -0.17
## SBP 0.04 1.00 0.02
## DBP -0.17 0.02 1.00
The Spearman correlations are as follows:
round(cor(data222, use="complete.obs", method="spearman") ,2)
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 0.05 -0.15
## SBP 0.05 1.00 0.07
## DBP -0.15 0.07 1.00
The Kendall correlations are as follows:
round(cor(data222, use="complete.obs", method="kendall"),2)
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 0.03 -0.11
## SBP 0.03 1.00 0.05
## DBP -0.11 0.05 1.00
Linear Regression
We fit linear regression models with SBP or DBP as the response and the DASH score as the regressor of interest, together with other covariates: Age, Sex, HTNST, WTKG, HTM, HTPCT, BMIcal, BMIPCT, DMETS, Sleep, lightmin, modmin, hardmin, vhardmin, and actweek.
Again, we calculate the variance inflation factor to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, Sleep, lightmin, modmin, actweek). With SBP as the response, the DASH score is not statistically significant at \(\alpha\)=0.1.
Model after removing covariates with high VIF values
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + Age + +as.factor(Sex) + HTM +
## HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.4274 -3.6258 -0.3318 3.8793 17.5144
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 77.4323742 13.4958417 5.737 4.64e-08 ***
## DASHSC_Gunther -0.0041570 0.0589016 -0.071 0.94382
## Age 0.3490019 0.4159385 0.839 0.40267
## as.factor(Sex)2 -2.4344770 1.4333787 -1.698 0.09136 .
## HTM 27.6257031 9.6971273 2.849 0.00496 **
## HTPCT -0.0004097 0.0306493 -0.013 0.98935
## BMIcal -0.0842574 0.0739000 -1.140 0.25592
## BMIPCT -0.0173191 0.0384941 -0.450 0.65338
## DMETS 0.1078963 0.1902251 0.567 0.57137
## hardmin 0.0008105 0.0035024 0.231 0.81728
## vhardmin -0.0077755 0.0063737 -1.220 0.22428
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.223 on 161 degrees of freedom
## Multiple R-squared: 0.3025, Adjusted R-squared: 0.2592
## F-statistic: 6.983 on 10 and 161 DF, p-value: 4.726e-09
Forward Stepwise: AIC
We use the AICp criterion at each step to select the final model.
Base = lm(SBP~DASHSC_Gunther, data=data11)
Full = lm(SBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data11)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + as.factor(Sex) + WTKG,
## data = data11)
##
## Coefficients:
## (Intercept) DASHSC_Gunther HTM as.factor(Sex)2
## 73.65718 -0.01602 34.48555 -2.21359
## WTKG
## -0.03363
m_AIC=lm(SBP~DASHSC_Gunther+HTM+as.factor(Sex)+WTKG, data=data11)
summary(m_AIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + as.factor(Sex) + WTKG,
## data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.0669 -3.6721 -0.0413 4.1937 17.0568
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 73.65718 9.80965 7.509 3.41e-12 ***
## DASHSC_Gunther -0.01602 0.05749 -0.279 0.7809
## HTM 34.48555 6.02771 5.721 4.79e-08 ***
## as.factor(Sex)2 -2.21359 1.15124 -1.923 0.0562 .
## WTKG -0.03363 0.02183 -1.540 0.1254
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.179 on 167 degrees of freedom
## Multiple R-squared: 0.2868, Adjusted R-squared: 0.2697
## F-statistic: 16.79 on 4 and 167 DF, p-value: 1.388e-11
Forward Stepwise: BIC
We use the BICp criterion at each step to select the final model.
n=dim(data11)[1]
Base = lm(SBP~DASHSC_Gunther, data=data11)
Full = lm(SBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data11)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n),trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + as.factor(Sex), data = data11)
##
## Coefficients:
## (Intercept) DASHSC_Gunther HTM as.factor(Sex)2
## 79.94475 -0.02235 29.20332 -2.60014
m_BIC=lm(SBP~DASHSC_Gunther+HTM+as.factor(Sex), data=data11)
summary(m_BIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + as.factor(Sex), data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.7676 -3.5669 -0.4077 4.1677 17.6632
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 79.94475 8.95622 8.926 7.4e-16 ***
## DASHSC_Gunther -0.02235 0.05758 -0.388 0.6983
## HTM 29.20332 4.97715 5.867 2.3e-08 ***
## as.factor(Sex)2 -2.60014 1.12813 -2.305 0.0224 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.204 on 168 degrees of freedom
## Multiple R-squared: 0.2766, Adjusted R-squared: 0.2637
## F-statistic: 21.42 on 3 and 168 DF, p-value: 8.521e-12
#m_p=lm(SBP~DASHSC_Gunther+as.factor(Sex)+HTM, data=data11)
#summary(m_p)
Similarly, we calculate the variance inflation factor to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, Sleep, lightmin, modmin, actweek). With DBP as the response, the DASH score is statistically significant at \(\alpha\)=0.1.
Model after removing covariates with high VIF values
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + Age + as.factor(Sex) + HTM +
## HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.1708 -4.3948 0.0457 5.5879 17.9785
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 102.226282 17.601206 5.808 3.28e-08 ***
## DASHSC_Gunther -0.158401 0.076819 -2.062 0.0408 *
## Age 1.126544 0.542465 2.077 0.0394 *
## as.factor(Sex)2 0.148350 1.869405 0.079 0.9368
## HTM -14.840655 12.646943 -1.173 0.2423
## HTPCT -0.012974 0.039973 -0.325 0.7459
## BMIcal 0.168842 0.096380 1.752 0.0817 .
## BMIPCT -0.046498 0.050204 -0.926 0.3557
## DMETS -0.435685 0.248091 -1.756 0.0810 .
## hardmin 0.010544 0.004568 2.308 0.0223 *
## vhardmin 0.009076 0.008313 1.092 0.2765
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.116 on 161 degrees of freedom
## Multiple R-squared: 0.1426, Adjusted R-squared: 0.08935
## F-statistic: 2.678 on 10 and 161 DF, p-value: 0.004678
Forward Stepwise: AIC
We use the AICp criterion at each step to select the final model.
Base = lm(DBP~DASHSC_Gunther, data=data12)
Full = lm(DBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data12)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + HTPCT + Age, data = data12)
##
## Coefficients:
## (Intercept) DASHSC_Gunther HTPCT Age
## 73.95786 -0.16187 -0.04884 0.65041
m_AIC=lm(DBP~DASHSC_Gunther+ HTPCT+Age, data=data12)
summary(m_AIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + HTPCT + Age, data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -24.0950 -4.8093 0.3582 5.3002 17.6977
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 73.95786 6.34978 11.647 <2e-16 ***
## DASHSC_Gunther -0.16187 0.07442 -2.175 0.0310 *
## HTPCT -0.04884 0.02387 -2.046 0.0423 *
## Age 0.65041 0.32724 1.988 0.0485 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.174 on 168 degrees of freedom
## Multiple R-squared: 0.09247, Adjusted R-squared: 0.07627
## F-statistic: 5.706 on 3 and 168 DF, p-value: 0.0009615
Forward Stepwise: BIC
We use the BICp criterion at each step to select the final model.
n111=dim(data12)[1]
Base = lm(DBP~DASHSC_Gunther, data=data12)
Full = lm(DBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data12)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n111),trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + HTPCT, data = data12)
##
## Coefficients:
## (Intercept) DASHSC_Gunther HTPCT
## 84.87401 -0.17185 -0.06364
m_BIC=lm(DBP~DASHSC_Gunther+HTPCT, data=data12)
summary(m_BIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + HTPCT, data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.0877 -4.9903 0.6082 5.8378 18.6922
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 84.87401 3.21445 26.404 < 2e-16 ***
## DASHSC_Gunther -0.17185 0.07490 -2.294 0.02300 *
## HTPCT -0.06364 0.02288 -2.781 0.00603 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.245 on 169 degrees of freedom
## Multiple R-squared: 0.07113, Adjusted R-squared: 0.06014
## F-statistic: 6.471 on 2 and 169 DF, p-value: 0.001959
Correlation between Gunther and SuperWIN
plot(DASHSC_Gunther,DASHSC_SuperWIN,pch=20)
Each HTNST data set is a subset of the Adolescents data set. We separate the patients into four data sets by HTNST (hypertension status = 1, 2, 3, 4). The corresponding sample sizes are as follows:
DASH_Score_HTNST1=subset(DASH_Score, DASH_Score$HTNST==1)
n1=dim(DASH_Score_HTNST1)[1]
n1
## [1] 101
DASH_Score_HTNST2=subset(DASH_Score, DASH_Score$HTNST==2)
n2=dim(DASH_Score_HTNST2)[1]
n2
## [1] 62
DASH_Score_HTNST3=subset(DASH_Score, DASH_Score$HTNST==3)
n3=dim(DASH_Score_HTNST3)[1]
n3
## [1] 8
DASH_Score_HTNST4=subset(DASH_Score, DASH_Score$HTNST==4)
n4=dim(DASH_Score_HTNST4)[1]
n4
## [1] 1
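The four subsets can also be produced in one step with `split()`, which makes it easy to confirm that the group sizes add up to the full sample (here 101 + 62 + 8 + 1 = 172). The sketch below uses a hypothetical toy data frame rather than `DASH_Score`.

```r
# Toy stand-in for DASH_Score (values hypothetical).
toy <- data.frame(
  ID    = 1:8,
  HTNST = c(1, 1, 2, 1, 3, 2, 1, 4)
)

# One data frame per hypertension status.
groups <- split(toy, toy$HTNST)

# Sample size of each stratum, named by HTNST level.
sizes <- sapply(groups, nrow)
sizes

# The strata partition the data, so the sizes sum to the total.
sum(sizes) == nrow(toy)
```

With only 8 observations and 1 observation in strata 3 and 4 of the study data, a regression with ten or more covariates cannot be fitted reliably there.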
Scatterplots of SuperWIN DASH Score vs SBP/DBP
par(mfrow=c(1,2))
plot(DASHSC_SuperWIN,SBP, pch = 20)
plot(DASHSC_SuperWIN,DBP, pch = 20)
Correlation between SuperWIN DASH Score and SBP/DBP
The Pearson correlations are as follows:
data111=data.frame(DASHSC_SuperWIN,SBP,DBP)
round(cor(data111, use="complete.obs", method="pearson"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 -0.18 -0.23
## SBP -0.18 1.00 -0.04
## DBP -0.23 -0.04 1.00
The Spearman correlations are as follows:
round(cor(data111, use="complete.obs", method="spearman"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 -0.16 -0.27
## SBP -0.16 1.00 -0.01
## DBP -0.27 -0.01 1.00
The Kendall correlations are as follows:
round(cor(data111, use="complete.obs", method="kendall"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 -0.12 -0.17
## SBP -0.12 1.00 -0.01
## DBP -0.17 -0.01 1.00
Linear Regression
We fit linear regression models with SBP or DBP as the response and the DASH score as the regressor of interest, together with other covariates: Age, Sex, HTNST, WTKG, HTM, HTPCT, BMIcal, BMIPCT, DMETS, Sleep, lightmin, modmin, hardmin, vhardmin, and actweek.
SBP as response
We calculate the variance inflation factor to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, Sleep, lightmin, modmin, actweek). In the final proposed model, the DASH score is not statistically significant at \(\alpha\)=0.1.
Model after removing covariates with high VIF values
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + Age + as.factor(Sex) + HTM +
## HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.4308 -2.4961 0.6625 2.2984 7.8460
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 79.472944 11.939293 6.656 2.15e-09 ***
## DASHSC_SuperWIN -0.034604 0.038219 -0.905 0.36767
## Age 0.523478 0.331960 1.577 0.11832
## as.factor(Sex)2 -2.258398 1.200701 -1.881 0.06322 .
## HTM 23.334589 7.996010 2.918 0.00445 **
## HTPCT 0.013804 0.025386 0.544 0.58795
## BMIcal -0.101713 0.056679 -1.795 0.07608 .
## BMIPCT -0.008072 0.033191 -0.243 0.80840
## DMETS 0.118250 0.141806 0.834 0.40655
## hardmin -0.001500 0.002865 -0.524 0.60179
## vhardmin -0.008453 0.005825 -1.451 0.15018
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.812 on 90 degrees of freedom
## Multiple R-squared: 0.4772, Adjusted R-squared: 0.4191
## F-statistic: 8.215 on 10 and 90 DF, p-value: 2.559e-09
Forward Stepwise: AIC
We use the AICp criterion at each step to select the final model.
Base = lm(SBP~DASHSC_SuperWIN, data=data21)
Full = lm(SBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data21)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM + as.factor(Sex) + BMIcal +
## Age + Sleep, data = data21)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN HTM as.factor(Sex)2
## 78.08291 -0.04119 23.52970 -2.06893
## BMIcal Age Sleep
## -0.08046 0.48063 0.08075
# Note: step() above also selected Sleep; this refit omits it.
m_AIC=lm(SBP~DASHSC_SuperWIN+HTM+as.factor(Sex)+BMIcal+Age, data=data21)
summary(m_AIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM + as.factor(Sex) + BMIcal +
## Age, data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.2425 -2.1841 0.5032 2.3311 8.0120
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 82.12483 8.21247 10.000 < 2e-16 ***
## DASHSC_SuperWIN -0.03425 0.03765 -0.910 0.3653
## HTM 24.69724 4.86766 5.074 1.93e-06 ***
## as.factor(Sex)2 -1.75985 0.91499 -1.923 0.0574 .
## BMIcal -0.08820 0.04227 -2.087 0.0396 *
## Age 0.40450 0.22590 1.791 0.0765 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.763 on 95 degrees of freedom
## Multiple R-squared: 0.4623, Adjusted R-squared: 0.434
## F-statistic: 16.34 on 5 and 95 DF, p-value: 1.322e-11
Forward Stepwise: BIC
We use the BICp criterion at each step to select the final model.
n222=dim(data21)[1]
Base = lm(SBP~DASHSC_SuperWIN, data=data21)
Full = lm(SBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data21)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n222),trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM, data = data21)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN HTM
## 74.56631 -0.03637 30.72652
m_BIC=lm(SBP~DASHSC_SuperWIN+HTM, data=data21)
summary(m_BIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM, data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.3782 -2.5104 0.3789 2.4151 10.5675
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 74.56631 7.27742 10.246 < 2e-16 ***
## DASHSC_SuperWIN -0.03637 0.03843 -0.946 0.346
## HTM 30.72652 3.96937 7.741 9.02e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.914 on 98 degrees of freedom
## Multiple R-squared: 0.3998, Adjusted R-squared: 0.3876
## F-statistic: 32.64 on 2 and 98 DF, p-value: 1.366e-11
DBP as response
Similarly, we calculate the variance inflation factor to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, Sleep, lightmin, modmin, actweek). The DASH score is statistically significant at \(\alpha\)=0.1.
Model after removing covariates with high VIF values
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + Age + as.factor(Sex) + HTM +
## HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.062 -4.353 1.774 5.249 12.421
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 103.976814 23.700179 4.387 3.11e-05 ***
## DASHSC_SuperWIN -0.168331 0.075867 -2.219 0.0290 *
## Age 0.808458 0.658960 1.227 0.2231
## as.factor(Sex)2 0.309717 2.383460 0.130 0.8969
## HTM -13.337810 15.872538 -0.840 0.4030
## HTPCT 0.006555 0.050392 0.130 0.8968
## BMIcal 0.206489 0.112512 1.835 0.0698 .
## BMIPCT -0.094794 0.065886 -1.439 0.1537
## DMETS -0.347802 0.281492 -1.236 0.2198
## hardmin 0.005303 0.005687 0.932 0.3536
## vhardmin 0.015700 0.011562 1.358 0.1779
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.567 on 90 degrees of freedom
## Multiple R-squared: 0.1382, Adjusted R-squared: 0.04243
## F-statistic: 1.443 on 10 and 90 DF, p-value: 0.1746
Forward Stepwise: AIC
We use the AICp criterion at each step to select the final model.
Base = lm(DBP~DASHSC_SuperWIN, data=data22)
Full = lm(DBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data22)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + Age, data = data22)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN Age
## 69.9287 -0.1536 0.6875
m2_AIC=lm(DBP~DASHSC_SuperWIN+ Age, data=data22)
summary(m2_AIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + Age, data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.875 -4.123 1.421 4.976 12.368
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 69.92870 7.20486 9.706 5.27e-16 ***
## DASHSC_SuperWIN -0.15361 0.07305 -2.103 0.038 *
## Age 0.68746 0.40026 1.718 0.089 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.498 on 98 degrees of freedom
## Multiple R-squared: 0.07858, Adjusted R-squared: 0.05978
## F-statistic: 4.179 on 2 and 98 DF, p-value: 0.01813
Forward Stepwise: BIC
We use the BICp criterion at each step to select the final model.
n22=dim(data22)[1]
Base = lm(DBP~DASHSC_SuperWIN, data=data22)
Full = lm(DBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data22)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n22),trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN, data = data22)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN
## 81.0210 -0.1687
m2_BIC=lm(DBP~DASHSC_SuperWIN, data=data22)
summary(m2_BIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN, data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.277 -4.354 1.088 5.530 12.714
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 81.02096 3.22529 25.121 <2e-16 ***
## DASHSC_SuperWIN -0.16866 0.07323 -2.303 0.0234 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.572 on 99 degrees of freedom
## Multiple R-squared: 0.05085, Adjusted R-squared: 0.04126
## F-statistic: 5.304 on 1 and 99 DF, p-value: 0.02337
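The AIC and BIC runs above differ only in the `k` argument passed to `step()`: the default `k = 2` gives the AIC penalty, and `k = log(n)` gives the BIC penalty. The following is a minimal sketch on simulated data; the data frame `sim` and its columns are hypothetical and are not the study variables.

```r
# Simulated data for illustration only (not the study data)
set.seed(1)
n <- 100
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 1 + 0.5 * sim$x1 + rnorm(n)

base_fit <- lm(y ~ x1, data = sim)
full_fit <- lm(y ~ x1 + x2 + x3, data = sim)
scope <- list(upper = formula(full_fit), lower = ~ 1)

# k = 2 (the default) is the AIC penalty per parameter
sel_aic <- step(base_fit, scope = scope, direction = "forward", trace = FALSE)

# k = log(n) is the BIC penalty per parameter
sel_bic <- step(base_fit, scope = scope, direction = "forward",
                k = log(n), trace = FALSE)

# extractAIC() returns the quantity that step() minimises; for models fitted
# to the same data it differs from AIC()/BIC() only by an additive constant
crit_aic <- extractAIC(sel_aic, k = 2)[2]
crit_bic <- extractAIC(sel_bic, k = log(n))[2]

formula(sel_aic)
formula(sel_bic)
```

Because `log(n)` exceeds 2 once n is 8 or more, the BIC run penalises each extra term more heavily, which is consistent with the smaller models it returns in this report.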
Scatterplots of Gunther DASH Score vs SBP/DBP
par(mfrow=c(1,2))
plot(DASHSC_Gunther,SBP,pch=20)
plot(DASHSC_Gunther,DBP,pch=20)
Correlation between Gunther DASH Score and SBP/DBP
The Pearson correlations are as follows:
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 -0.05 -0.21
## SBP -0.05 1.00 -0.04
## DBP -0.21 -0.04 1.00
The Spearman correlations are as follows:
round(cor(data222, use="complete.obs", method="spearman") ,2)
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 -0.03 -0.23
## SBP -0.03 1.00 -0.01
## DBP -0.23 -0.01 1.00
The Kendall correlations are as follows:
round(cor(data222, use="complete.obs", method="kendall"),2)
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 -0.03 -0.16
## SBP -0.03 1.00 -0.01
## DBP -0.16 -0.01 1.00
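As a quick check on what the three coefficients reported above measure, the sketch below computes them on simulated vectors (`x` and `y` are hypothetical, not study variables). It confirms that Pearson's r matches its defining formula and that Spearman's coefficient is Pearson's r applied to the ranks.

```r
# Simulated paired data for illustration only
set.seed(2)
x <- rnorm(50)
y <- 0.3 * x + rnorm(50)

r_pearson  <- cor(x, y, method = "pearson")
r_spearman <- cor(x, y, method = "spearman")
r_kendall  <- cor(x, y, method = "kendall")

# Pearson's r from its definition: centred cross-products over the
# product of the root sums of squares
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  (sqrt(sum((x - mean(x))^2)) * sqrt(sum((y - mean(y))^2)))

# Spearman's coefficient is Pearson's r computed on the ranks
r_rank <- cor(rank(x), rank(y), method = "pearson")

round(c(pearson = r_pearson, spearman = r_spearman, kendall = r_kendall), 2)
```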
Linear Regression
We fit linear regressions with SBP or DBP as the response and the DASH score as the regressor, together with the following covariates: Age, Sex, HTNST, WTKG, HTM, HTPCT, BMIcal, BMIPCT, DMETS, Sleep, lightmin, modmin, hardmin, vhardmin and actweek.
Again, we calculate the variance inflation factor (VIF) to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, Sleep, lightmin, modmin, actweek). With SBP as the response, the DASH score is not statistically significant at \(\alpha\) = 0.1 (p = 0.252).
Model after removing the high-VIF predictors
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + Age + +as.factor(Sex) + HTM +
## HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.0384 -2.5430 0.7112 2.4047 7.9696
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 79.304925 11.694939 6.781 1.21e-09 ***
## DASHSC_Gunther -0.054848 0.047608 -1.152 0.25234
## Age 0.513405 0.331307 1.550 0.12474
## as.factor(Sex)2 -2.361172 1.205714 -1.958 0.05329 .
## HTM 23.999044 7.911340 3.033 0.00316 **
## HTPCT 0.012641 0.025330 0.499 0.61896
## BMIcal -0.098296 0.056727 -1.733 0.08656 .
## BMIPCT -0.011230 0.033217 -0.338 0.73609
## DMETS 0.117461 0.141407 0.831 0.40837
## hardmin -0.001409 0.002860 -0.493 0.62333
## vhardmin -0.008187 0.005809 -1.409 0.16219
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.801 on 90 degrees of freedom
## Multiple R-squared: 0.4801, Adjusted R-squared: 0.4223
## F-statistic: 8.311 on 10 and 90 DF, p-value: 2.038e-09
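The report does not show how the VIFs were computed, so the following is only a sketch of one way to do it in base R. It uses the definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The data frame `X`, its columns and the helper `vif_one` are hypothetical and simulated, not the study data.

```r
# Simulated predictors for illustration only; bmi is built from weight
# and height, so these three columns are strongly collinear
set.seed(3)
n <- 120
height <- rnorm(n, mean = 1.5, sd = 0.1)
weight <- 40 + 60 * (height - 1.5) + rnorm(n, sd = 3)
bmi    <- weight / height^2
age    <- rnorm(n, mean = 12, sd = 2)
X <- data.frame(height, weight, bmi, age)

# VIF for column j: regress it on all other columns and use 1 / (1 - R^2)
vif_one <- function(j, X) {
  f <- reformulate(names(X)[-j], response = names(X)[j])
  r2 <- summary(lm(f, data = X))$r.squared
  1 / (1 - r2)
}

vifs <- setNames(sapply(seq_along(X), vif_one, X = X), names(X))
round(vifs, 2)
```

For purely numeric predictors the same values appear on the diagonal of the inverse correlation matrix, `diag(solve(cor(X)))`, which gives a convenient cross-check.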
Forward Stepwise: AIC
We use the AICp criterion at each step to select our final model.
Base = lm(SBP~DASHSC_Gunther, data=data11)
Full = lm(SBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data11)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + as.factor(Sex) + BMIcal +
## Age + modmin, data = data11)
##
## Coefficients:
## (Intercept) DASHSC_Gunther HTM as.factor(Sex)2
## 84.34466 -0.06139 23.43791 -2.16650
## BMIcal Age modmin
## -0.07426 0.46796 -0.00316
m_AIC=lm(SBP~DASHSC_Gunther+HTM+as.factor(Sex)+WTKG, data=data11)
summary(m_AIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + as.factor(Sex) + WTKG,
## data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.2426 -2.4636 0.0842 2.5105 9.5517
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 76.97860 8.77375 8.774 6.39e-14 ***
## DASHSC_Gunther -0.05966 0.04728 -1.262 0.2101
## HTM 31.58364 5.23897 6.029 3.07e-08 ***
## as.factor(Sex)2 -1.73678 0.95437 -1.820 0.0719 .
## WTKG -0.02795 0.01697 -1.647 0.1028
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.802 on 96 degrees of freedom
## Multiple R-squared: 0.4452, Adjusted R-squared: 0.4221
## F-statistic: 19.26 on 4 and 96 DF, p-value: 1.167e-11
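In the block above, the model passed to `summary()` (which includes WTKG) was typed by hand and differs from the formula printed by `step()` (which includes BMIcal, Age and modmin). One way to keep the two in agreement is to store the object that `step()` returns and summarise it directly. The sketch below assumes simulated data; `sim` and its columns are hypothetical.

```r
# Simulated data for illustration only (not the study data)
set.seed(4)
n <- 80
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 2 + sim$x1 - 0.8 * sim$x2 + rnorm(n)

base_fit <- lm(y ~ x1, data = sim)

# step() returns the selected lm object, so it can be reused as is
sel <- step(base_fit,
            scope = list(upper = ~ x1 + x2 + x3, lower = ~ 1),
            direction = "forward", trace = FALSE)

formula(sel)   # the formula step() settled on
summary(sel)   # summary of exactly that model, with no retyped formula
```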
Forward Stepwise: BIC
We use the BICp criterion at each step to select our final model.
n=dim(data11)[1]
Base = lm(SBP~DASHSC_Gunther, data=data11)
Full =lm(SBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data11)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n),trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + as.factor(Sex), data = data11)
##
## Coefficients:
## (Intercept) DASHSC_Gunther HTM as.factor(Sex)2
## 83.09022 -0.06945 26.75831 -2.10748
m_BIC=lm(SBP~DASHSC_Gunther+HTM+as.factor(Sex), data=data11)
summary(m_BIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + as.factor(Sex), data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.9646 -2.4279 0.1571 2.3564 9.9947
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 83.09022 8.02042 10.360 < 2e-16 ***
## DASHSC_Gunther -0.06945 0.04732 -1.468 0.1454
## HTM 26.75831 4.38151 6.107 2.1e-08 ***
## as.factor(Sex)2 -2.10748 0.93561 -2.253 0.0265 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.836 on 97 degrees of freedom
## Multiple R-squared: 0.4296, Adjusted R-squared: 0.4119
## F-statistic: 24.35 on 3 and 97 DF, p-value: 7.893e-12
Similarly, with DBP as the response, we calculate the variance inflation factor (VIF) to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, Sleep, lightmin, modmin, actweek). The DASH score is statistically significant at \(\alpha\) = 0.1 (p = 0.0285).
Model after removing the high-VIF predictors
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + Age + as.factor(Sex) + HTM +
## HTPCT + BMIcal + BMIPCT + DMETS + hardmin + vhardmin, data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -21.894 -4.026 1.294 4.944 13.171
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 100.241398 23.275485 4.307 4.21e-05 ***
## DASHSC_Gunther -0.210997 0.094750 -2.227 0.0285 *
## Age 0.779404 0.659374 1.182 0.2403
## as.factor(Sex)2 0.040111 2.399634 0.017 0.9867
## HTM -9.837783 15.745297 -0.625 0.5337
## HTPCT 0.001927 0.050412 0.038 0.9696
## BMIcal 0.213977 0.112898 1.895 0.0613 .
## BMIPCT -0.106869 0.066109 -1.617 0.1095
## DMETS -0.351809 0.281431 -1.250 0.2145
## hardmin 0.005534 0.005692 0.972 0.3335
## vhardmin 0.016833 0.011561 1.456 0.1489
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.565 on 90 degrees of freedom
## Multiple R-squared: 0.1385, Adjusted R-squared: 0.04279
## F-statistic: 1.447 on 10 and 90 DF, p-value: 0.1729
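The significance statements in this report can be read off the coefficient table programmatically rather than by eye. The sketch below assumes simulated data (`sim`, `score`, `age` and `dbp` are hypothetical names). It extracts the p-value for the score term and compares it with \(\alpha\) = 0.1.

```r
# Simulated data for illustration only (not the study data)
set.seed(5)
n <- 100
sim <- data.frame(score = rnorm(n, mean = 40, sd = 8),
                  age   = rnorm(n, mean = 12, sd = 2))
sim$dbp <- 80 - 0.2 * sim$score + 0.5 * sim$age + rnorm(n, sd = 7)

fit <- lm(dbp ~ score + age, data = sim)

# Coefficient matrix with columns Estimate, Std. Error, t value, Pr(>|t|)
coefs <- coef(summary(fit))
p_score <- coefs["score", "Pr(>|t|)"]

alpha <- 0.1
p_score < alpha                          # TRUE means significant at alpha

# The matching (1 - alpha) confidence interval excludes 0 exactly when
# the two-sided t-test rejects at level alpha
ci <- confint(fit, "score", level = 1 - alpha)
ci
```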
Forward Stepwise: AIC
We use the AICp criterion at each step to select our final model.
Base = lm(DBP~DASHSC_Gunther, data=data12)
Full = lm(DBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data12)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + Age, data = data12)
##
## Coefficients:
## (Intercept) DASHSC_Gunther Age
## 69.0397 -0.1895 0.7677
m_AIC=lm(DBP~DASHSC_Gunther+Age, data=data12)
summary(m_AIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + Age, data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.590 -4.070 1.270 5.155 12.980
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 69.03966 6.97991 9.891 <2e-16 ***
## DASHSC_Gunther -0.18947 0.08974 -2.111 0.0373 *
## Age 0.76771 0.39742 1.932 0.0563 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.497 on 98 degrees of freedom
## Multiple R-squared: 0.07891, Adjusted R-squared: 0.06011
## F-statistic: 4.198 on 2 and 98 DF, p-value: 0.01782
Forward Stepwise: BIC
We use the BICp criterion at each step to select our final model.
n111=dim(data12)[1]
Base = lm(DBP~DASHSC_Gunther, data=data12)
Full = lm(DBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data12)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n111),trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther, data = data12)
##
## Coefficients:
## (Intercept) DASHSC_Gunther
## 80.8630 -0.1937
m_BIC=lm(DBP~DASHSC_Gunther, data=data12)
summary(m_BIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther, data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.9558 -4.4050 0.4971 5.5233 13.6274
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 80.86300 3.40107 23.78 <2e-16 ***
## DASHSC_Gunther -0.19374 0.09094 -2.13 0.0356 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.599 on 99 degrees of freedom
## Multiple R-squared: 0.04384, Adjusted R-squared: 0.03418
## F-statistic: 4.539 on 1 and 99 DF, p-value: 0.03562
Correlation between Gunther and SuperWIN
plot(DASHSC_Gunther,DASHSC_SuperWIN,pch=20)
## [1] 62
Scatterplots of SuperWIN DASH Score vs SBP/DBP
par(mfrow=c(1,2))
plot(DASHSC_SuperWIN,SBP, pch = 20)
plot(DASHSC_SuperWIN,DBP, pch = 20)
Correlation between SuperWIN DASH Score and SBP/DBP
The Pearson correlations are as follows:
data111=data.frame(DASHSC_SuperWIN,SBP,DBP)
round(cor(data111, use="complete.obs", method="pearson"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 0.05 0.00
## SBP 0.05 1.00 -0.24
## DBP 0.00 -0.24 1.00
The Spearman correlations are as follows:
round(cor(data111, use="complete.obs", method="spearman"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 0.01 -0.02
## SBP 0.01 1.00 -0.17
## DBP -0.02 -0.17 1.00
The Kendall correlations are as follows:
round(cor(data111, use="complete.obs", method="kendall"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 0.02 0.00
## SBP 0.02 1.00 -0.12
## DBP 0.00 -0.12 1.00
Linear Regression
We fit linear regressions with SBP or DBP as the response and the DASH score as the regressor, together with the following covariates: Age, Sex, HTNST, WTKG, HTM, HTPCT, BMIcal, BMIPCT, DMETS, Sleep, lightmin, modmin, hardmin, vhardmin and actweek.
SBP as response
We calculate the variance inflation factor (VIF) to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, HTM, BMIcal, Sleep, lightmin, modmin, hardmin, actweek). In the resulting model, the DASH score is not statistically significant at \(\alpha\) = 0.1 (p = 0.909).
Model after removing the high-VIF predictors
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + Age + as.factor(Sex) + HTPCT +
## BMIPCT + DMETS + vhardmin, data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.0493 -2.1411 -0.0412 1.9586 12.8326
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 102.186229 7.860327 13.000 < 2e-16 ***
## DASHSC_SuperWIN 0.006038 0.052588 0.115 0.909
## Age 1.973863 0.291005 6.783 9.28e-09 ***
## as.factor(Sex)2 -5.005247 1.175519 -4.258 8.29e-05 ***
## HTPCT 0.094857 0.019525 4.858 1.06e-05 ***
## BMIPCT -0.025821 0.032773 -0.788 0.434
## DMETS 0.003532 0.163121 0.022 0.983
## vhardmin 0.004308 0.006287 0.685 0.496
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.067 on 54 degrees of freedom
## Multiple R-squared: 0.6385, Adjusted R-squared: 0.5917
## F-statistic: 13.63 on 7 and 54 DF, p-value: 5.335e-10
Forward Stepwise: AIC
We use the AICp criterion at each step to select our final model.
Base = lm(SBP~DASHSC_SuperWIN, data=data21)
Full = lm(SBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data21)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM + Age + as.factor(Sex) +
## vhardmin, data = data21)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN HTM Age
## 65.488307 -0.034477 34.922852 0.756531
## as.factor(Sex)2 vhardmin
## -1.996383 0.007127
m_AIC=lm(SBP~DASHSC_SuperWIN+HTM+Age+as.factor(Sex)+BMIcal+vhardmin, data=data21)
summary(m_AIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM + Age + as.factor(Sex) +
## BMIcal + vhardmin, data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.0471 -1.9918 0.0926 1.6405 12.0792
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 65.549139 9.592212 6.834 7.07e-09 ***
## DASHSC_SuperWIN -0.033211 0.048059 -0.691 0.4924
## HTM 35.087626 6.374805 5.504 1.00e-06 ***
## Age 0.756902 0.296071 2.556 0.0134 *
## as.factor(Sex)2 -1.940772 1.274168 -1.523 0.1334
## BMIcal -0.013701 0.070025 -0.196 0.8456
## vhardmin 0.006975 0.005140 1.357 0.1804
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.878 on 55 degrees of freedom
## Multiple R-squared: 0.6654, Adjusted R-squared: 0.6288
## F-statistic: 18.23 on 6 and 55 DF, p-value: 1.626e-11
Forward Stepwise: BIC
We use the BICp criterion at each step to select our final model.
n222=dim(data21)[1]
Base = lm(SBP~DASHSC_SuperWIN, data=data21)
Full = lm(SBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data21)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n222),trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM + Age, data = data21)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN HTM Age
## 56.62157 -0.01469 39.25890 0.78135
m_BIC=lm(SBP~DASHSC_SuperWIN+HTM+Age, data=data21)
summary(m_BIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_SuperWIN + HTM + Age, data = data21)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.7229 -2.3043 -0.1498 1.6117 11.4831
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 56.62157 8.09770 6.992 3.02e-09 ***
## DASHSC_SuperWIN -0.01469 0.04744 -0.310 0.7580
## HTM 39.25890 5.65502 6.942 3.66e-09 ***
## Age 0.78135 0.30051 2.600 0.0118 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.96 on 58 degrees of freedom
## Multiple R-squared: 0.6321, Adjusted R-squared: 0.6131
## F-statistic: 33.22 on 3 and 58 DF, p-value: 1.257e-12
DBP as response
Similarly, we calculate the variance inflation factor (VIF) to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, HTM, BMIcal, Sleep, lightmin, modmin, hardmin, actweek). The DASH score is not statistically significant at \(\alpha\) = 0.1 (p = 0.386).
Model after removing the high-VIF predictors
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + Age + as.factor(Sex) + HTPCT +
## BMIPCT + DMETS + vhardmin, data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -21.9414 -4.6728 0.3295 5.6869 14.9465
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 55.744241 17.563548 3.174 0.00248 **
## DASHSC_SuperWIN 0.102624 0.117505 0.873 0.38633
## Age 1.105120 0.650238 1.700 0.09497 .
## as.factor(Sex)2 4.553616 2.626645 1.734 0.08869 .
## HTPCT -0.092552 0.043628 -2.121 0.03849 *
## BMIPCT -0.001011 0.073230 -0.014 0.98903
## DMETS 0.130928 0.364487 0.359 0.72084
## vhardmin -0.011086 0.014049 -0.789 0.43351
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.089 on 54 degrees of freedom
## Multiple R-squared: 0.1951, Adjusted R-squared: 0.09072
## F-statistic: 1.869 on 7 and 54 DF, p-value: 0.09285
Forward Stepwise: AIC
We use the AICp criterion at each step to select our final model.
Base = lm(DBP~DASHSC_SuperWIN, data=data22)
Full =lm(DBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data22)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + HTPCT + as.factor(Sex) +
## Age + Sleep + HTM, data = data22)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN HTPCT as.factor(Sex)2
## 80.41289 0.07356 0.03682 1.48331
## Age Sleep HTM
## 2.99257 0.24922 -40.87953
m2_AIC=lm(DBP~DASHSC_SuperWIN+HTPCT+ HTPCT+as.factor(Sex)+Age+Sleep+HTM, data=data22)
summary(m2_AIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + HTPCT + HTPCT + as.factor(Sex) +
## Age + Sleep + HTM, data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -21.1633 -4.8504 0.9615 5.3218 15.3054
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 80.41289 36.25141 2.218 0.0307 *
## DASHSC_SuperWIN 0.07356 0.10769 0.683 0.4975
## HTPCT 0.03682 0.08307 0.443 0.6593
## as.factor(Sex)2 1.48331 3.54677 0.418 0.6774
## Age 2.99257 1.14841 2.606 0.0118 *
## Sleep 0.24922 0.15455 1.613 0.1126
## HTM -40.87953 28.61369 -1.429 0.1588
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.626 on 55 degrees of freedom
## Multiple R-squared: 0.2614, Adjusted R-squared: 0.1808
## F-statistic: 3.244 on 6 and 55 DF, p-value: 0.008408
Forward Stepwise: BIC
We use the BICp criterion at each step to select our final model.
n22=dim(data22)[1]
Base = lm(DBP~DASHSC_SuperWIN, data=data22)
Full = lm(DBP~DASHSC_SuperWIN+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data22)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n22),trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + HTPCT, data = data22)
##
## Coefficients:
## (Intercept) DASHSC_SuperWIN HTPCT
## 81.018 0.039 -0.111
m2_BIC=lm(DBP~DASHSC_SuperWIN+HTPCT, data=data22)
summary(m2_BIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_SuperWIN + HTPCT, data = data22)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.0434 -5.8571 -0.6903 6.7117 17.3543
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 81.01782 5.12484 15.809 <2e-16 ***
## DASHSC_SuperWIN 0.03900 0.10658 0.366 0.7158
## HTPCT -0.11099 0.04121 -2.693 0.0092 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.145 on 59 degrees of freedom
## Multiple R-squared: 0.1095, Adjusted R-squared: 0.07932
## F-statistic: 3.628 on 2 and 59 DF, p-value: 0.03267
Scatterplots of Gunther DASH Score vs SBP/DBP
par(mfrow=c(1,2))
plot(DASHSC_Gunther,SBP,pch=20)
plot(DASHSC_Gunther,DBP,pch=20)
Correlation between Gunther DASH Score and SBP/DBP
The Pearson correlations are as follows:
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 0.13 -0.08
## SBP 0.13 1.00 -0.24
## DBP -0.08 -0.24 1.00
The Spearman correlations are as follows:
round(cor(data222, use="complete.obs", method="spearman") ,2)
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 0.10 -0.05
## SBP 0.10 1.00 -0.17
## DBP -0.05 -0.17 1.00
The Kendall correlations are as follows:
round(cor(data222, use="complete.obs", method="kendall"),2)
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 0.07 -0.04
## SBP 0.07 1.00 -0.12
## DBP -0.04 -0.12 1.00
Linear Regression
We fit linear regressions with SBP or DBP as the response and the DASH score as the regressor, together with the following covariates: Age, Sex, HTNST, WTKG, HTM, HTPCT, BMIcal, BMIPCT, DMETS, Sleep, lightmin, modmin, hardmin, vhardmin and actweek.
Again, we calculate the variance inflation factor (VIF) to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, HTM, BMIcal, Sleep, lightmin, modmin, hardmin, actweek). With SBP as the response, the DASH score is not statistically significant at \(\alpha\) = 0.1 (p = 0.402).
Model after removing the high-VIF predictors
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + Age + +as.factor(Sex) + HTPCT +
## BMIPCT + DMETS + vhardmin, data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.5048 -2.2546 -0.0657 2.3132 12.4476
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 100.323910 7.952617 12.615 < 2e-16 ***
## DASHSC_Gunther 0.057252 0.067712 0.846 0.401548
## Age 1.991685 0.288909 6.894 6.13e-09 ***
## as.factor(Sex)2 -4.794268 1.180251 -4.062 0.000158 ***
## HTPCT 0.094008 0.019367 4.854 1.07e-05 ***
## BMIPCT -0.031726 0.031721 -1.000 0.321701
## DMETS 0.011101 0.162016 0.069 0.945626
## vhardmin 0.003336 0.006208 0.537 0.593188
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.041 on 54 degrees of freedom
## Multiple R-squared: 0.6432, Adjusted R-squared: 0.5969
## F-statistic: 13.91 on 7 and 54 DF, p-value: 3.83e-10
Forward Stepwise: AIC
We use the AICp criterion at each step to select our final model.
Base = lm(SBP~DASHSC_Gunther, data=data11)
Full = lm(SBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data11)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + Age + as.factor(Sex),
## data = data11)
##
## Coefficients:
## (Intercept) DASHSC_Gunther HTM Age
## 65.69345 0.01728 32.98609 0.84488
## as.factor(Sex)2
## -2.19982
m_AIC=lm(SBP~DASHSC_Gunther+HTM+Age+as.factor(Sex), data=data11)
summary(m_AIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + Age + as.factor(Sex),
## data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.7073 -2.4764 -0.0375 1.7843 11.2464
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 65.69345 9.61691 6.831 6.08e-09 ***
## DASHSC_Gunther 0.01728 0.06220 0.278 0.78214
## HTM 32.98609 6.25897 5.270 2.17e-06 ***
## Age 0.84488 0.29327 2.881 0.00558 **
## as.factor(Sex)2 -2.19982 1.23370 -1.783 0.07989 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.882 on 57 degrees of freedom
## Multiple R-squared: 0.6524, Adjusted R-squared: 0.628
## F-statistic: 26.75 on 4 and 57 DF, p-value: 1.628e-12
Forward Stepwise: BIC
We use the BICp criterion at each step to select our final model.
n=dim(data11)[1]
Base = lm(SBP~DASHSC_Gunther, data=data11)
Full = lm(SBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data11)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n),trace=FALSE)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + Age, data = data11)
##
## Coefficients:
## (Intercept) DASHSC_Gunther HTM Age
## 56.03493 0.03103 38.09740 0.83279
m_BIC=lm(SBP~DASHSC_Gunther+HTM+Age, data=data11)
summary(m_BIC)
##
## Call:
## lm(formula = SBP ~ DASHSC_Gunther + HTM + Age, data = data11)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.2949 -2.2861 -0.1414 1.9053 11.0517
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 56.03493 8.09427 6.923 3.95e-09 ***
## DASHSC_Gunther 0.03103 0.06287 0.494 0.62348
## HTM 38.09740 5.66755 6.722 8.58e-09 ***
## Age 0.83279 0.29865 2.789 0.00715 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.954 on 58 degrees of freedom
## Multiple R-squared: 0.633, Adjusted R-squared: 0.6141
## F-statistic: 33.35 on 3 and 58 DF, p-value: 1.169e-12
Similarly, with DBP as the response, we calculate the variance inflation factor (VIF) to check for multicollinearity in the model. We then update the model by removing the predictor variables with high VIF values (WTKG, HTM, BMIcal, Sleep, lightmin, modmin, hardmin, actweek). The DASH score is not statistically significant at \(\alpha\) = 0.1 (p = 0.856).
Model after removing the high-VIF predictors
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + Age + as.factor(Sex) + HTPCT +
## BMIPCT + DMETS + vhardmin, data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.0810 -4.3914 0.4871 5.9220 14.3703
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 58.583134 18.005191 3.254 0.00197 **
## DASHSC_Gunther 0.027860 0.153303 0.182 0.85648
## Age 1.061717 0.654107 1.623 0.11038
## as.factor(Sex)2 4.231842 2.672157 1.584 0.11911
## HTPCT -0.089782 0.043847 -2.048 0.04548 *
## BMIPCT 0.017589 0.071819 0.245 0.80746
## DMETS 0.112985 0.366813 0.308 0.75925
## vhardmin -0.008646 0.014055 -0.615 0.54103
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.15 on 54 degrees of freedom
## Multiple R-squared: 0.1842, Adjusted R-squared: 0.07844
## F-statistic: 1.742 on 7 and 54 DF, p-value: 0.1188
Forward Stepwise: AIC
We use the AICp criterion at each step to select our final model.
Base = lm(DBP~DASHSC_Gunther, data=data12)
Full = lm(DBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data12)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + HTPCT, data = data12)
##
## Coefficients:
## (Intercept) DASHSC_Gunther HTPCT
## 84.18861 -0.04575 -0.10721
m_AIC=lm(DBP~DASHSC_Gunther+HTPCT, data=data12)
summary(m_AIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + HTPCT, data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.6821 -5.7588 -0.1127 6.4804 17.6607
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 84.18861 5.76421 14.605 <2e-16 ***
## DASHSC_Gunther -0.04575 0.14079 -0.325 0.7463
## HTPCT -0.10721 0.04103 -2.613 0.0114 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.148 on 59 degrees of freedom
## Multiple R-squared: 0.1091, Adjusted R-squared: 0.07888
## F-statistic: 3.612 on 2 and 59 DF, p-value: 0.03313
Forward Stepwise: BIC
We use the BICp criterion at each step to select our final model.
n111=dim(data12)[1]
Base = lm(DBP~DASHSC_Gunther, data=data12)
Full = lm(DBP~DASHSC_Gunther+Age+as.factor(Sex)+WTKG+HTM+HTPCT+BMIcal+BMIPCT+DMETS+Sleep+lightmin+modmin+hardmin+vhardmin+actweek, data=data12)
step(Base, scope = list( upper=Full, lower=~1 ), direction = "forward", k=log(n111),trace=FALSE)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + HTPCT, data = data12)
##
## Coefficients:
## (Intercept) DASHSC_Gunther HTPCT
## 84.18861 -0.04575 -0.10721
m_BIC=lm(DBP~DASHSC_Gunther+HTPCT, data=data12)
summary(m_BIC)
##
## Call:
## lm(formula = DBP ~ DASHSC_Gunther + HTPCT, data = data12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.6821 -5.7588 -0.1127 6.4804 17.6607
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 84.18861 5.76421 14.605 <2e-16 ***
## DASHSC_Gunther -0.04575 0.14079 -0.325 0.7463
## HTPCT -0.10721 0.04103 -2.613 0.0114 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.148 on 59 degrees of freedom
## Multiple R-squared: 0.1091, Adjusted R-squared: 0.07888
## F-statistic: 3.612 on 2 and 59 DF, p-value: 0.03313
Correlation between Gunther and SuperWIN
plot(DASHSC_Gunther,DASHSC_SuperWIN,pch=20)
## [1] 8
Scatterplots of SuperWIN DASH Score vs SBP/DBP
par(mfrow=c(1,2))
plot(DASHSC_SuperWIN,SBP, pch = 20)
plot(DASHSC_SuperWIN,DBP, pch = 20)
Correlation between SuperWIN DASH Score and SBP/DBP
The Pearson correlations are as follows:
data111=data.frame(DASHSC_SuperWIN,SBP,DBP)
round(cor(data111, use="complete.obs", method="pearson"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 -0.18 0.17
## SBP -0.18 1.00 -0.07
## DBP 0.17 -0.07 1.00
The Spearman correlations are as follows:
round(cor(data111, use="complete.obs", method="spearman"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 -0.11 0.17
## SBP -0.11 1.00 -0.16
## DBP 0.17 -0.16 1.00
The Kendall correlations are as follows:
round(cor(data111, use="complete.obs", method="kendall"),2)
## DASHSC_SuperWIN SBP DBP
## DASHSC_SuperWIN 1.00 -0.04 0.07
## SBP -0.04 1.00 -0.11
## DBP 0.07 -0.11 1.00
Scatterplots of Gunther DASH Score vs SBP/DBP
par(mfrow=c(1,2))
plot(DASHSC_Gunther,SBP,pch=20)
plot(DASHSC_Gunther,DBP,pch=20)
Correlation between Gunther DASH Score and SBP/DBP
The Pearson correlations are as follows:
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 -0.13 0.22
## SBP -0.13 1.00 -0.07
## DBP 0.22 -0.07 1.00
The Spearman correlations are as follows:
round(cor(data222, use="complete.obs", method="spearman") ,2)
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 0.00 0.26
## SBP 0.00 1.00 -0.16
## DBP 0.26 -0.16 1.00
The Kendall correlations are as follows:
round(cor(data222, use="complete.obs", method="kendall"),2)
## DASHSC_Gunther SBP DBP
## DASHSC_Gunther 1.00 0.04 0.14
## SBP 0.04 1.00 -0.11
## DBP 0.14 -0.11 1.00
Correlation between Gunther and SuperWIN
plot(DASHSC_Gunther,DASHSC_SuperWIN,pch=20)
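The scatterplot of the two scores could be complemented with a numerical measure of their agreement. The sketch below assumes simulated scores; the vectors `gunther` and `superwin` are hypothetical stand-ins, not the study variables. It uses `cor.test()` to obtain the Pearson estimate together with a confidence interval and p-value.

```r
# Simulated scores for illustration only (not the study data)
set.seed(6)
gunther  <- rnorm(60, mean = 40, sd = 8)
superwin <- 0.7 * gunther + rnorm(60, mean = 10, sd = 5)

# Pearson correlation with a 95% confidence interval and a two-sided test
ct <- cor.test(gunther, superwin, method = "pearson")
ct$estimate
ct$conf.int
ct$p.value

# Rank-based alternative, less sensitive to outliers
cor(gunther, superwin, method = "spearman")
```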