Author: Elhakim Ibrahim

Instructor: Corey Sparks, PhD [1]

February 24, 2020



Objective of the exercise

This exercise evaluates whether the risk of obesity in the United States differs substantially by socioeconomic characteristics (age, race/ethnicity, and income). Data for the State of Texas, extracted from the nationally representative 2016 Behavioral Risk Factor Surveillance System (BRFSS) SMART metro area survey data set, are used [2]. Groups are compared using odds ratios from logistic regression models. In addition, predicted probabilities are computed to make the logistic regression estimates easier to interpret.



Set working directory and load packages


Load and process data set

 [1] "dispcode" "statere1" "safetime" "hhadult"  "genhlth"  "physhlth"
 [7] "menthlth" "poorhlth" "hlthpln1" "persdoc2" "medcost"  "checkup1"
[13] "bphigh4"  "bpmeds"   "cholchk1" "toldhi2"  "cholmed1" "cvdinfr4"
[19] "cvdcrhd4" "cvdstrk3" "asthma3"  "asthnow"  "chcscncr" "chcocncr"
[25] "chccopd1" "havarth3" "addepev2" "chckidny" "diabete3" "diabage2"
[31] "lmtjoin3" "arthdis2" "arthsocl" "joinpai1" "sex"      "marital" 
[37] "educa"    "renthom1" "numhhol2" "numphon2" "cpdemo1a" "veteran3"
[43] "employ1"  "children" "income2"  "internet" "weight2"  "height3" 


Operationalize working variables


Health outcome of interest

Obesity status

In the data, the bmi variable is stored with two implied decimal places. Hence, the variable is divided by 100 to recover the actual BMI values. The rescaled BMI is then recoded into a binary indicator: 0 = Not obese for BMI below 30, and 1 = Obese for BMI of 30 and above.
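The recode described above can be sketched in base R as follows. This is only an illustration on made-up values: the raw vector `bmi_raw` and the object names are hypothetical and are not taken from the original analysis.

```r
# Hypothetical raw values: BMI stored as an integer with two implied decimals
bmi_raw <- c(2245, 2999, 3000, 3571, NA)

# Divide by 100 to recover the actual BMI
bmi <- bmi_raw / 100

# Binary indicator: 0 = Not obese (BMI < 30), 1 = Obese (BMI >= 30);
# missing BMI values stay missing
bmigrp <- ifelse(bmi >= 30, 1, 0)

# Labelled version for tabulation
bmigrp_f <- factor(bmigrp, levels = c(0, 1), labels = c("Not obese", "Obese"))
print(table(bmigrp_f, useNA = "ifany"))
```

Note that a BMI of exactly 30 falls in the Obese category, matching the "30 and above" rule.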


Define complex survey design object

I create a survey design object from the complex survey parameters: ids = PSU identifiers, strata = strata identifiers, weights = case weights, and data = the data frame. Respondents with missing case weights are excluded from the analysis, and options(survey.lonely.psu = "adjust") is set so that variance can still be computed for strata that contain a single PSU.
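A minimal sketch of this step is given below, assuming the survey package is installed. The variable names `ststr` and `mmsawt` and the `svydesign` call mirror the survey design printed in the model output of this report; the toy data frame is simulated purely for illustration and does not reproduce the BRFSS data.

```r
library(survey)

# Adjust, rather than fail on, strata that contain a single PSU
options(survey.lonely.psu = "adjust")

# Toy stand-in for the BRFSS extract (all values are made up)
set.seed(1)
mydata <- data.frame(
  ststr  = rep(1:4, each = 25),
  mmsawt = c(runif(98, 50, 500), NA, NA),
  bmigrp = rbinom(100, 1, 0.3)
)

# Exclude respondents with missing case weights before building the design
mydata <- subset(mydata, !is.na(mmsawt))

# ids = ~1 means no cluster identifier: each respondent is its own PSU
design.object <- svydesign(ids = ~1, strata = ~ststr,
                           weights = ~mmsawt, data = mydata)

# Weighted obesity prevalence with a design-based standard error
print(svymean(~bmigrp, design.object))
```

Dropping the rows with missing weights first matters because `svydesign` does not accept missing values in the weights.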



Results


Odds ratios

Call:
svyglm(formula = bmigrp ~ agegrp + racegrp + incomegrp, design = design.object, 
    family = binomial)

Survey design:
svydesign(ids = ~1, strata = ~ststr, weights = ~mmsawt, data = mydata)

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)                -1.9675     0.2561  -7.682 1.77e-14 ***
agegrp(24,39]               1.1495     0.2363   4.866 1.16e-06 ***
agegrp(39,59]               1.6848     0.2342   7.194 6.90e-13 ***
agegrp(59,79]               1.3892     0.2438   5.699 1.25e-08 ***
agegrp(79,99]               0.4582     0.3891   1.178   0.2389    
racegrpNon-Hispanic Black   0.2261     0.1792   1.262   0.2070    
racegrpHispanic             0.2699     0.1310   2.060   0.0394 *  
racegrpOther               -0.4718     0.2594  -1.819   0.0690 .  
incomegrpQuantile 2        -0.3750     0.1812  -2.070   0.0385 *  
incomegrpQuantile 3        -0.2890     0.1326  -2.179   0.0293 *  
incomegrpQuantile 4        -0.1285     0.1885  -0.682   0.4955    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 0.9723225)

Number of Fisher Scoring iterations: 4

The table above applies the summary function to the fitted model to report each coefficient with its standard error, t-statistic, and p-value. However, we are interested in the relative size of the difference between each category and its reference category, not only in the direction of that difference. Hence, the same results are presented below as odds ratios with accompanying confidence intervals, using the stargazer [3] package.
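As a minimal sketch of that conversion (not necessarily the code used for this report), an odds ratio and a Wald-type 95% confidence interval can be obtained by exponentiating a coefficient and its confidence limits. The estimates and standard errors below are copied from the summary output above; the object names are illustrative, and the normal-approximation interval is an assumption.

```r
# Log-odds coefficients and standard errors copied from the summary output
est <- c("40-59" = 1.6848, "60-79" = 1.3892, "Hispanic" = 0.2699)
se  <- c("40-59" = 0.2342, "60-79" = 0.2438, "Hispanic" = 0.1310)

# Normal-approximation critical value for a 95% interval (assumption)
z <- qnorm(0.975)

# Exponentiate the estimate and its limits to move to the odds scale
or_table <- cbind(
  OR    = exp(est),
  lower = exp(est - z * se),
  upper = exp(est + z * se)
)
print(round(or_table, 2))
```

Because the limits are exponentiated, the interval is asymmetric around the odds ratio; for the 40-59 age group this gives roughly 5.39 (3.41, 8.53).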


Odds ratios of Obesity by selected characteristics, BRFSS 2016
==============================================================
                          Outcome: Obesity likelihood         
                  --------------------------------------------
                              Odds ratios (95% CI)            
--------------------------------------------------------------
Intercept                     0.14 (-0.36,0.64)***            
25-39                         3.16 (2.69,3.62)***             
40-59                         5.39 (4.93,5.85)***             
60-79                         4.01 (3.53,4.49)***             
80+                             1.58 (0.82,2.34)              
NH Black                        1.25 (0.90,1.60)              
Hispanic                       1.31 (1.05,1.57)*              
Other                           0.62 (0.12,1.13)              
Quantile 2                     0.69 (0.33,1.04)*              
Quantile 3                     0.75 (0.49,1.01)*              
Quantile 4                      0.88 (0.51,1.25)              
--------------------------------------------------------------
Observations                         7,750                    
Log Likelihood                     -4,265.40                  
Akaike Inf. Crit.                   8,552.80                  
==============================================================
Notes:            *p<0.05; **p<0.01; ***p<0.001               
                  Variables selected for exercise purpose only

Odds ratios interpretation

The odds ratios indicating variations in the likelihood of obesity among Texas residents are presented in the above table. Considering differentials by age group, the model suggests a curvilinear relation between age and risk of obesity, with the highest odds found among those aged 40-59 years (OR: 5.39; 95% CI: 3.41-8.53), followed by those aged 60-79 years (OR: 4.01; 95% CI: 2.49-6.47), compared with those aged 24 years or younger. These intervals are obtained by exponentiating the confidence limits of the log-odds coefficients, so they are asymmetric around the odds ratio and differ from the symmetric intervals printed in the table. Income also appears to be a significant factor: compared with those in the lowest 25% of the income ladder (Quantile 1), the odds of obesity are significantly lower by approximately 31% and 25% for those in Quantile 2 and Quantile 3, respectively. Meanwhile, the odds of being obese are 1.31 times as high (p<0.05) for an average Hispanic as for a Non-Hispanic White. A similar risk gradient is exhibited by an average Non-Hispanic Black, but the estimate is not statistically significant (p = 0.207).
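The percentage differences quoted for income follow from the coefficients through (exp(b) - 1) x 100. The short sketch below uses the two income estimates from the model summary; the object names are illustrative only.

```r
# Income coefficients (log-odds) copied from the model summary
b_income <- c("Quantile 2" = -0.3750, "Quantile 3" = -0.2890)

# Percentage change in the odds of obesity relative to Quantile 1
pct_change <- (exp(b_income) - 1) * 100
print(round(pct_change, 1))
```

Negative values indicate lower odds than the reference group: about 31% lower for Quantile 2 and about 25% lower for Quantile 3.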


Predicted values

Predicted values interpretation

An attempt is made to understand variations in the likelihood of obesity within the diverse groups of the Texas population. Obesity risk disparities are examined between a hypothetical Non-Hispanic White person and a hypothetical Hispanic person by specific age category and income status. As shown in the tables, a low-income (income quantile 1, lower 25%) Non-Hispanic White aged 25-39 years is on average about 6 percentage points less likely to be obese than a Hispanic of similar income and age (estimated probability of being obese: Non-Hispanic White, 30.6% vs. Hispanic, 36.6%). A nearly equal difference is observed between the two hypothetical individuals (again in favor of the Non-Hispanic White) in the same age group but of the highest income status (estimated probability of being obese: Non-Hispanic White, 28.0% vs. Hispanic, 33.7%).
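These probabilities can be reproduced by hand from the reported coefficients with the inverse logit, p = 1 / (1 + exp(-xb)). The sketch below is only an illustration of that arithmetic, not the prediction code used in the report; the coefficient values come from the model summary, and the object names are hypothetical.

```r
# Coefficients (log-odds) copied from the model summary
b <- c(intercept = -1.9675, age_25_39 = 1.1495,
       hispanic = 0.2699, income_q4 = -0.1285)

# Linear predictor for each hypothetical person aged 25-39;
# Non-Hispanic White and income Quantile 1 are the reference categories
xb <- c(
  white_q1    = b[["intercept"]] + b[["age_25_39"]],
  hispanic_q1 = b[["intercept"]] + b[["age_25_39"]] + b[["hispanic"]],
  white_q4    = b[["intercept"]] + b[["age_25_39"]] + b[["income_q4"]],
  hispanic_q4 = b[["intercept"]] + b[["age_25_39"]] + b[["hispanic"]] +
                b[["income_q4"]]
)

# plogis() is the inverse logit, mapping log-odds to probabilities
pred <- plogis(xb)
print(round(100 * pred, 1))
```

The gap between the two groups is about 6 percentage points in both income categories, which is the comparison drawn in the text.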



Acknowledgements

  1. This content adapts steps and examples from instructional materials authored by Dr. Corey Sparks and made available at https://rpubs.com/corey_sparks.

  2. The 2016 BRFSS SMART metro area survey data are open source resources made available by the US Centers for Disease Control and Prevention at https://www.cdc.gov/brfss/smart/smart_2016.html.

  3. The stargazer package is authored by Marek Hlavac; its full documentation can be downloaded from https://cran.r-project.org/web/packages/stargazer/stargazer.pdf