I. Assess distributions

Do this to identify if any transformations are necessary / justified

plot(density(na.omit(dat$osteocalcin_pg_ml))) # lognormal

plot(density(na.omit(dat$ctx_1_ng_ml))) # lognormal

plot(density(na.omit(dat$ad_step_count_sum_0_24hr))) # normal

plot(density(na.omit(dat$dur_day_total_lig_min_pla))) # normal

plot(density(na.omit(dat$dur_day_total_mod_min_pla))) # normal

plot(density(na.omit(dat$dur_day_total_vig_min_pla))) # lognormal

plot(density(na.omit(dat$functional_status_n_y_0_1))) # binary

plot(density(na.omit(dat$pregnat_or_breastfeeding_0_1))) # binary

plot(density(na.omit(dat_full$left_tibia_sos))) # normal

plot(density(na.omit(dat_full$ad_step_count_sum_0_24hr))) # normal

plot(density(na.omit(dat_full$bmi))) # normal ish

plot(density(na.omit(dat$bmi))) # same as in full cohort

II. Linear models

exposure: average daily steps

A. outcome: osteocalcin [bone formation]

1. Model summary

Step count is not a significant predictor of osteocalcin.

BMI and BMI*steps interaction are not significant predictors.

Sex and age*sex interaction are significant predictors (p<0.05, p<0.005 respectively). Reference is female. Males have slightly higher intercept than females. In males, the relationship between age and osteocalcin is significant, positive, and potentially less steep (flatter) than in females (although age is not a significant predictor of osteocalcin for females).

mod1<- lm(log(osteocalcin_pg_ml) ~ ad_step_count_sum_0_24hr * bmi + age_years * sex + pregnat_or_breastfeeding_0_1 + functional_status_n_y_0_1, data = dat)
summary(mod1)
## 
## Call:
## lm(formula = log(osteocalcin_pg_ml) ~ ad_step_count_sum_0_24hr * 
##     bmi + age_years * sex + pregnat_or_breastfeeding_0_1 + functional_status_n_y_0_1, 
##     data = dat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.22926 -0.35528  0.00867  0.25044  1.77114 
## 
## Coefficients:
##                                   Estimate    Std. Error t value
## (Intercept)                  11.1546265563  0.5611120409  19.879
## ad_step_count_sum_0_24hr     -0.0000255441  0.0000461168  -0.554
## bmi                          -0.0335805075  0.0219717341  -1.528
## age_years                     0.0043785893  0.0030053881   1.457
## sexmale                       0.4171114915  0.1940833136   2.149
## pregnat_or_breastfeeding_0_1 -0.0628435304  0.1004596221  -0.626
## functional_status_n_y_0_1     0.0680567252  0.1136159224   0.599
## ad_step_count_sum_0_24hr:bmi  0.0000004069  0.0000019019   0.214
## age_years:sexmale            -0.0121088339  0.0042652542  -2.839
##                                          Pr(>|t|)    
## (Intercept)                  < 0.0000000000000002 ***
## ad_step_count_sum_0_24hr                  0.58017    
## bmi                                       0.12775    
## age_years                                 0.14646    
## sexmale                                   0.03263 *  
## pregnat_or_breastfeeding_0_1              0.53220    
## functional_status_n_y_0_1                 0.54974    
## ad_step_count_sum_0_24hr:bmi              0.83079    
## age_years:sexmale                         0.00492 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4913 on 238 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.1362, Adjusted R-squared:  0.1071 
## F-statistic: 4.689 on 8 and 238 DF,  p-value: 0.00002392

2. Diagnostics

Assumptions check out with log transformation of outcome (osteocalcin)

plot(mod1)

3. Model prediction with observed data

Line is model predicted marginal effect of steps, averaging over covariates. Grey dots are actual data points.

this method computes the model predicted outcome for each observation (using actual values of covariates for each observation), then averages those predicted outcomes for each exposure value

plot_model(mod1, type='pred', terms=c("ad_step_count_sum_0_24hr"), show.data = TRUE) +
  scale_x_continuous(labels = function(ad_step_count_sum_0_24hr) format_thousands(ad_step_count_sum_0_24hr)) +
  labs(title=NULL, 
       x="Average daily step count", 
       y="Serum Osteocalcin\n(pg/mL)") 
## Model has log-transformed response. Back-transforming predictions to
##   original response scale. Standard errors are still on the transformed
##   scale.
## Data points may overlap. Use the `jitter` argument to add some amount of
##   random variation to the location of data points and avoid overplotting.

B. outcome: ctx-1 [bone resorption]

1. Model summary

Step count is not a significant predictor of CTX-1.

BMI “approaches significance” and BMI*steps interaction term is not significant predictor. [body weight protects against bone resorption?]

Sex and age*sex interaction are significant predictors (p<0.05, p<0.005 respectively), as with osteocalcin. Female is reference. Males have higher intercept than females. Relationship between age and CTX-1 is significant and negative in males, age does not significantly predict CTX-1 in females.

mod1<- lm(log(ctx_1_ng_ml) ~ ad_step_count_sum_0_24hr * bmi + age_years * sex + pregnat_or_breastfeeding_0_1 + functional_status_n_y_0_1, data = dat)
summary(mod1)
## 
## Call:
## lm(formula = log(ctx_1_ng_ml) ~ ad_step_count_sum_0_24hr * bmi + 
##     age_years * sex + pregnat_or_breastfeeding_0_1 + functional_status_n_y_0_1, 
##     data = dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.0179 -0.4413  0.1157  0.4786  1.5920 
## 
## Coefficients:
##                                  Estimate   Std. Error t value Pr(>|t|)   
## (Intercept)                  -0.255939365  0.892942860  -0.287  0.77466   
## ad_step_count_sum_0_24hr     -0.000051394  0.000075069  -0.685  0.49426   
## bmi                          -0.063163410  0.035206521  -1.794  0.07410 . 
## age_years                     0.004069489  0.004779988   0.851  0.39545   
## sexmale                       0.657502074  0.307675977   2.137  0.03364 * 
## pregnat_or_breastfeeding_0_1 -0.181129601  0.164877055  -1.099  0.27309   
## functional_status_n_y_0_1     0.037996675  0.178301866   0.213  0.83143   
## ad_step_count_sum_0_24hr:bmi  0.000000627  0.000003118   0.201  0.84082   
## age_years:sexmale            -0.019290480  0.006735298  -2.864  0.00457 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7696 on 232 degrees of freedom
##   (7 observations deleted due to missingness)
## Multiple R-squared:  0.1794, Adjusted R-squared:  0.1511 
## F-statistic: 6.339 on 8 and 232 DF,  p-value: 0.0000001958

2. Diagnostics

Assumptions check out with log transformation of outcome (ctx-1)

plot(mod1)

3. Model prediction with observed data

Line is model predicted marginal effect of steps, averaging over covariates. Grey dots are actual data points. (compute predicted outcome for each observation (using actual values of covariates), average those predicted outcomes for each exposure value)

plot_model(mod1, type='pred', terms=c("ad_step_count_sum_0_24hr"), show.data = TRUE) +
  scale_x_continuous(labels = function(ad_step_count_sum_0_24hr) format_thousands(ad_step_count_sum_0_24hr)) +
  labs(title=NULL, 
       x="Average daily step count", 
       y="Serum CTX-1\n(ng/mL)") 
## Model has log-transformed response. Back-transforming predictions to
##   original response scale. Standard errors are still on the transformed
##   scale.
## Data points may overlap. Use the `jitter` argument to add some amount of
##   random variation to the location of data points and avoid overplotting.

C. outcome: tibial speed of sound (SOS) [proxy BMD]

1. Model summary

Step count is not a significant predictor of SOS.

BMI and BMI*steps interaction term are not significant predictors.

Age and age*sex interaction are significant predictors (p<0.0001, p<0.005 respectively). SOS significantly (though modestly) decreases with age in women, and this negative relationship is tempered in men.

note- the same model without log transforming the outcome yields equivalent results

mod1<- lm(log(left_tibia_sos) ~ ad_step_count_sum_0_24hr * bmi + age_years * sex + pregnat_or_breastfeeding_0_1 + functional_status_n_y_0_1, data = dat_full)
summary(mod1)
## 
## Call:
## lm(formula = log(left_tibia_sos) ~ ad_step_count_sum_0_24hr * 
##     bmi + age_years * sex + pregnat_or_breastfeeding_0_1 + functional_status_n_y_0_1, 
##     data = dat_full)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.192787 -0.022867  0.002471  0.024391  0.100874 
## 
## Coefficients:
##                                    Estimate     Std. Error t value
## (Intercept)                   8.32336983310  0.03526579824 236.018
## ad_step_count_sum_0_24hr      0.00000170494  0.00000286113   0.596
## bmi                          -0.00147247230  0.00124032398  -1.187
## age_years                    -0.00103875213  0.00021411251  -4.851
## sexmale                      -0.01987973650  0.01292772946  -1.538
## pregnat_or_breastfeeding_0_1  0.00483239328  0.00658061449   0.734
## functional_status_n_y_0_1    -0.00067273773  0.00727611019  -0.092
## ad_step_count_sum_0_24hr:bmi -0.00000008792  0.00000011660  -0.754
## age_years:sexmale             0.00093971080  0.00030336928   3.098
##                                          Pr(>|t|)    
## (Intercept)                  < 0.0000000000000002 ***
## ad_step_count_sum_0_24hr                   0.5516    
## bmi                                        0.2359    
## age_years                              0.00000181 ***
## sexmale                                    0.1250    
## pregnat_or_breastfeeding_0_1               0.4632    
## functional_status_n_y_0_1                  0.9264    
## ad_step_count_sum_0_24hr:bmi               0.4513    
## age_years:sexmale                          0.0021 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03972 on 367 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.2017, Adjusted R-squared:  0.1843 
## F-statistic: 11.59 on 8 and 367 DF,  p-value: 0.00000000000001043

2. Diagnostics

Assumptions mostly check out with log transformation of outcome (SOS)

plot(mod1)

3. Model prediction with observed data

Line is model predicted marginal effect of steps, averaging over covariates. Grey dots are actual data points.

this method computes the model predicted outcome for each observation (using actual values of covariates for each observation), then averages those predicted outcomes for each exposure value

plot_model(mod1, type='pred', terms=c("ad_step_count_sum_0_24hr"), show.data = TRUE) +
  scale_x_continuous(labels = function(ad_step_count_sum_0_24hr) format_thousands(ad_step_count_sum_0_24hr)) +
  labs(title=NULL, 
       x="Average daily step count", 
       y="Tibial SOS\n(m/s)") 
## Model has log-transformed response. Back-transforming predictions to
##   original response scale. Standard errors are still on the transformed
##   scale.
## Data points may overlap. Use the `jitter` argument to add some amount of
##   random variation to the location of data points and avoid overplotting.

vif(mod1)
## there are higher-order terms (interactions) in this model
## consider setting type = 'predictor'; see ?vif
##     ad_step_count_sum_0_24hr                          bmi 
##                    21.484969                    11.771111 
##                    age_years                          sex 
##                     2.176178                     9.671350 
## pregnat_or_breastfeeding_0_1    functional_status_n_y_0_1 
##                     1.346739                     1.065409 
## ad_step_count_sum_0_24hr:bmi                age_years:sex 
##                    25.496874                     9.795259
vif(mod1, type = 'predictor')
## GVIFs computed for predictors
##                                  GVIF Df GVIF^(1/(2*Df))
## ad_step_count_sum_0_24hr     1.170253  3        1.026550
## bmi                          1.170253  3        1.026550
## age_years                    1.496735  3        1.069525
## sex                          1.496735  3        1.069525
## pregnat_or_breastfeeding_0_1 1.346739  1        1.160491
## functional_status_n_y_0_1    1.065409  1        1.032186
##                                        Interacts With
## ad_step_count_sum_0_24hr                          bmi
## bmi                          ad_step_count_sum_0_24hr
## age_years                                         sex
## sex                                         age_years
## pregnat_or_breastfeeding_0_1                     --  
## functional_status_n_y_0_1                        --  
##                                                                                                    Other Predictors
## ad_step_count_sum_0_24hr                    age_years, sex, pregnat_or_breastfeeding_0_1, functional_status_n_y_0_1
## bmi                                         age_years, sex, pregnat_or_breastfeeding_0_1, functional_status_n_y_0_1
## age_years                    ad_step_count_sum_0_24hr, bmi, pregnat_or_breastfeeding_0_1, functional_status_n_y_0_1
## sex                          ad_step_count_sum_0_24hr, bmi, pregnat_or_breastfeeding_0_1, functional_status_n_y_0_1
## pregnat_or_breastfeeding_0_1               ad_step_count_sum_0_24hr, bmi, age_years, sex, functional_status_n_y_0_1
## functional_status_n_y_0_1               ad_step_count_sum_0_24hr, bmi, age_years, sex, pregnat_or_breastfeeding_0_1

Model without interaction terms

mod1<- lm(log(left_tibia_sos) ~ ad_step_count_sum_0_24hr + bmi + age_years + sex + pregnat_or_breastfeeding_0_1 + functional_status_n_y_0_1, data = dat_full)
summary(mod1)
## 
## Call:
## lm(formula = log(left_tibia_sos) ~ ad_step_count_sum_0_24hr + 
##     bmi + age_years + sex + pregnat_or_breastfeeding_0_1 + functional_status_n_y_0_1, 
##     data = dat_full)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.192382 -0.024010  0.003627  0.026067  0.100296 
## 
## Coefficients:
##                                   Estimate    Std. Error t value
## (Intercept)                   8.3224788356  0.0173975500 478.371
## ad_step_count_sum_0_24hr     -0.0000004019  0.0000006549  -0.614
## bmi                          -0.0022155301  0.0003875640  -5.717
## age_years                    -0.0005833428  0.0001551710  -3.759
## sexmale                       0.0175104518  0.0046783971   3.743
## pregnat_or_breastfeeding_0_1  0.0102659639  0.0064229590   1.598
## functional_status_n_y_0_1    -0.0004570714  0.0072940088  -0.063
##                                          Pr(>|t|)    
## (Intercept)                  < 0.0000000000000002 ***
## ad_step_count_sum_0_24hr                 0.539775    
## bmi                                  0.0000000225 ***
## age_years                                0.000198 ***
## sexmale                                  0.000211 ***
## pregnat_or_breastfeeding_0_1             0.110827    
## functional_status_n_y_0_1                0.950068    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04016 on 369 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.1793, Adjusted R-squared:  0.166 
## F-statistic: 13.44 on 6 and 369 DF,  p-value: 0.0000000000000857
plot(mod1)

vif(mod1)
##     ad_step_count_sum_0_24hr                          bmi 
##                     1.100963                     1.124055 
##                    age_years                          sex 
##                     1.117853                     1.238769 
## pregnat_or_breastfeeding_0_1    functional_status_n_y_0_1 
##                     1.254799                     1.047138