Author: Elhakim Ibrahim

Instructor: Corey Sparks, PhD [1]

March 30, 2020



Objective of the exercise



Data source

The exercise is based on data for the State of Texas extracted from the nationally representative 2016 Behavioral Risk Factor Surveillance System (BRFSS) SMART metro area survey data set [2].



Set working directory and load packages



Load and process data set

 [1] "dispcode" "statere1" "safetime" "hhadult"  "genhlth"  "physhlth"
 [7] "menthlth" "poorhlth" "hlthpln1" "persdoc2" "medcost"  "checkup1"
[13] "bphigh4"  "bpmeds"   "cholchk1" "toldhi2"  "cholmed1" "cvdinfr4"
[19] "cvdcrhd4" "cvdstrk3" "asthma3"  "asthnow"  "chcscncr" "chcocncr"
[25] "chccopd1" "havarth3" "addepev2" "chckidny" "diabete3" "diabage2"
[31] "lmtjoin3" "arthdis2" "arthsocl" "joinpai1" "sex"      "marital" 
[37] "educa"    "renthom1" "numhhol2" "numphon2" "cpdemo1a" "veteran3"
[43] "employ1"  "children" "income2"  "internet" "weight2"  "height3" 


Objective 1: Define health outcome of interest and measurement variable

Health outcome of interest

Obesity status

Our outcome of interest is obesity status. Based on the Centers for Disease Control and Prevention's categorization, the outcome is operationalized as follows:

  • Status is classified as normal if BMI ranges from 18.5 to <25;
  • Status is classified as overweight if BMI ranges from 25.0 to <30;
  • Status is classified as obese if BMI is 30.0 or higher.

In the data, the BMI variable is stored with two implied decimal places. Hence, the variable is divided by 100 to obtain values that represent actual BMI.

The rescaled BMI is then dummy-coded as 0 = Not obese for BMI below 30 and 1 = Obese for BMI of 30 and above.
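The recoding described above can be sketched in base R as follows. This is a minimal illustration on toy values, not the code used for the report: the column name `bmi_raw` is a hypothetical placeholder, and the treatment of BMI below 18.5 (set to missing here) is an assumption.

```r
# Toy data: raw BMI values stored with two implied decimal places.
# `bmi_raw` is a hypothetical column name used for illustration only.
toy <- data.frame(bmi_raw = c(1850, 2210, 2499, 2500, 2999, 3000, 4125, NA))

# Divide by 100 to recover actual BMI values.
toy$bmi <- toy$bmi_raw / 100

# Three-level status with left-closed intervals: [18.5, 25), [25, 30), [30, Inf).
# Values below 18.5 fall outside the breaks and become NA in this sketch.
toy$obstat <- cut(
  toy$bmi,
  breaks = c(18.5, 25, 30, Inf),
  labels = c("Normal", "Overweight", "Obese"),
  right = FALSE,
  ordered_result = TRUE
)

# Binary indicator: 1 = obese (BMI >= 30), 0 = not obese.
toy$obese <- ifelse(toy$bmi >= 30, 1, 0)

table(toy$obstat, useNA = "ifany")
```

Setting the level order explicitly matters for an ordinal model: by default R sorts factor levels alphabetically (Normal, Obese, Overweight), which would place Obese between Normal and Overweight.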

Frequencies  
mydata$obstatDES  
Type: Factor  

                     Freq   % Valid   % Valid Cum.   % Total   % Total Cum.
---------------- -------- --------- -------------- --------- --------------
          Normal    68577      30.2           30.2      29.7           29.7
           Obese    82347      36.2           66.4      35.7           65.4
      Overweight    76477      33.6          100.0      33.1           98.5
            <NA>     3474                                1.5          100.0
           Total   230875     100.0          100.0     100.0          100.0

Frequencies  
mydata$obstatNUM  
Label: COMPUTED BODY MASS INDEX  
Type: Numeric  

                Freq   % Valid   % Valid Cum.   % Total   % Total Cum.
----------- -------- --------- -------------- --------- --------------
          1    68577      30.2           30.2      29.7           29.7
          2    76477      33.6           63.8      33.1           62.8
          3    82347      36.2          100.0      35.7           98.5
       <NA>     3474                                1.5          100.0
      Total   230875     100.0          100.0     100.0          100.0


Objective 2: State working hypothesis and operationalize covariates

Hypothesis

It is hypothesized that obesity status will not differ significantly by selected sociodemographic characteristics. To test the hypothesis, we include age, race/ethnicity, education and income as covariates in each of the empirical models.

Objective 3: Fit ordinal logit regression model

Step 2: Define complex survey design object

I create a survey design object from the complex survey parameters: ids = PSU identifiers, strata = strata identifiers, weights = case weights, and data = data frame. Respondents with missing case weights are excluded from the analysis, and options(survey.lonely.psu = "adjust") is set so that within-stratum variance can be calculated for any stratum with a single PSU.
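A design object of this kind might be set up as sketched below. The toy data and all column names (`psu`, `ststr`, `wt`, `obese`) are hypothetical placeholders rather than the BRFSS variable names, and the call is guarded so the snippet still runs where the survey package is not installed.

```r
# Toy data standing in for the survey extract. All column names here
# (psu, ststr, wt, obese) are hypothetical placeholders.
toy <- data.frame(
  psu   = 1:12,
  ststr = rep(c("A", "B", "C"), times = c(5, 6, 1)),  # stratum C has one PSU
  wt    = c(120, NA, 85, 240, 160, 95, 310, 130, 70, 210, 180, 150),
  obese = c(0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1)
)

# Exclude respondents with missing case weights before building the design.
toy_cc <- toy[!is.na(toy$wt), ]

# For single-PSU strata, center the variance contribution at the overall
# mean instead of stopping with an error.
options(survey.lonely.psu = "adjust")

if (requireNamespace("survey", quietly = TRUE)) {
  des <- survey::svydesign(
    ids = ~psu, strata = ~ststr, weights = ~wt,
    data = toy_cc, nest = TRUE
  )
  # Design-based estimate of the proportion obese with its standard error.
  print(survey::svymean(~obese, des))
}
```

Dropping rows with missing weights before calling `svydesign()` keeps the exclusion explicit, which makes the analytic sample size easy to report.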

Step 3a: Fit the model

                      OR 2.5 % 97.5 %
agegrp(24,39]       1.86  1.72   2.01
agegrp(39,59]       2.40  2.23   2.58
agegrp(59,79]       2.53  2.35   2.72
agegrp(79,99]       1.71  1.54   1.91
racegrpNH Black     1.19  1.14   1.25
racegrpHispanic     1.32  1.25   1.39
racegrpOther        0.74  0.68   0.80
eduHS Grad          1.21  1.16   1.27
eduPrimary          1.15  1.05   1.27
eduSome Coll        1.15  1.10   1.19
eduSome HS          1.08  1.00   1.17
incomegrpQuantile 2 1.12  1.05   1.19
incomegrpQuantile 3 1.09  1.05   1.14
incomegrpQuantile 4 0.93  0.89   0.98
Normal|Obese        1.11  1.02   1.20
Obese|Overweight    5.20  5.13   5.27

Step 3b: Evaluate proportional odds assumption of the model

Approach 1: Fit separate binary logit regression models and examine plots and coefficients of the models
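The idea behind this approach can be illustrated with simulated data. The sketch below is an assumption-laden toy: it uses unweighted `glm()` rather than a survey-weighted fit, a single hypothetical covariate `x`, and data generated so that proportional odds holds by construction.

```r
set.seed(2020)
n <- 5000
x <- rbinom(n, 1, 0.5)  # hypothetical binary covariate

# Simulate a three-level ordered outcome from a latent logistic scale,
# so the true slope (0.8) is the same at both cut points.
latent <- 0.8 * x + rlogis(n)
y <- cut(latent, breaks = c(-Inf, -0.5, 1, Inf),
         labels = c("Normal", "Overweight", "Obese"))

d <- data.frame(
  x = x,
  above_normal = as.integer(y != "Normal"),  # Overweight or Obese vs Normal
  obese        = as.integer(y == "Obese")    # Obese vs Normal or Overweight
)

# One binary logit per cumulative split of the outcome.
fit1 <- glm(above_normal ~ x, family = binomial, data = d)
fit2 <- glm(obese ~ x, family = binomial, data = d)

# Under proportional odds the slope for x should be similar in both fits;
# only the intercepts should differ.
slopes <- c(split1 = unname(coef(fit1)["x"]), split2 = unname(coef(fit2)["x"]))
round(slopes, 2)
```

When the slopes for a covariate differ markedly between the two fits, as the report finds for age and education, the proportional odds assumption is in doubt for that covariate.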

Plot the coefficients

The figure shows that the proportionality assumption is considerably violated. The deviation from the proportional odds assumption is particularly striking in the age and education patterns of the outcome.

Examine models odds ratios

                      OR 2.5 % 97.5 %
(Intercept)         0.61  0.56   0.66
agegrp(24,39]       2.40  2.23   2.59
agegrp(39,59]       3.56  3.31   3.83
agegrp(59,79]       3.34  3.10   3.59
agegrp(79,99]       1.54  1.39   1.70
racegrpNH Black     1.52  1.42   1.62
racegrpHispanic     1.53  1.42   1.64
racegrpOther        0.69  0.63   0.75
eduHS Grad          1.51  1.43   1.59
eduPrimary          1.81  1.55   2.10
eduSome Coll        1.38  1.32   1.45
eduSome HS          1.36  1.23   1.50
incomegrpQuantile 2 1.04  0.96   1.12
incomegrpQuantile 3 0.96  0.91   1.01
incomegrpQuantile 4 1.12  1.05   1.19

Approach 2: Fit comparative proportional odds, nonproportional odds and multinomial regression models and evaluate model fits

Fit proportional odds model

Fit nonproportional odds model

Fit multinomial regression models

Step 3c: Evaluate best fitting model

Compare AIC values of the models to determine the best fitting model

[1] 479964
[1] 469792
[1] 469754

The AIC values indicate that the multinomial model, which has the lowest value, is the best fitting of the three models evaluated.
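The comparison can be made explicit with a few lines of R. The mapping of the three printed values to models is an assumption: it takes the values to be printed in the order in which the models were fitted (proportional odds, nonproportional odds, multinomial).

```r
# AIC values as printed above; the model labels assume the values appear
# in the order the models were fitted.
aic <- c(proportional = 479964, nonproportional = 469792, multinomial = 469754)

# Difference of each model's AIC from the smallest AIC.
delta <- aic - min(aic)
delta

# Name of the model with the smallest AIC.
names(which.min(aic))
```

Under that mapping, the multinomial model improves on the nonproportional odds model by only 38 AIC points, compared with 10,210 points over the proportional odds model, so most of the gain comes from relaxing the proportionality constraint.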



Objective 4: Presentation of results of the most parsimonious/best fitting model

                        OR 2.5 % 97.5 %
(Intercept):1         0.26  0.24   0.27
(Intercept):2         0.36  0.34   0.37
agegrp(24,39]:1       2.87  2.77   2.98
agegrp(24,39]:2       1.99  1.92   2.07
agegrp(39,59]:1       4.29  4.13   4.45
agegrp(39,59]:2       2.93  2.82   3.04
agegrp(59,79]:1       3.63  3.49   3.77
agegrp(59,79]:2       3.04  2.93   3.16
agegrp(79,99]:1       1.20  1.13   1.28
agegrp(79,99]:2       1.86  1.76   1.97
racegrpNH Black:1     1.69  1.63   1.74
racegrpNH Black:2     1.34  1.29   1.39
racegrpHispanic:1     1.54  1.49   1.59
racegrpHispanic:2     1.51  1.46   1.56
racegrpOther:1        0.66  0.64   0.69
racegrpOther:2        0.71  0.69   0.74
eduHS Grad:1          1.74  1.68   1.79
eduHS Grad:2          1.32  1.28   1.36
eduPrimary:1          2.27  2.14   2.41
eduPrimary:2          1.39  1.30   1.48
eduSome Coll:1        1.59  1.55   1.64
eduSome Coll:2        1.21  1.18   1.25
eduSome HS:1          1.63  1.55   1.70
eduSome HS:2          1.12  1.07   1.17
incomegrpQuantile 2:1 0.94  0.90   0.97
incomegrpQuantile 2:2 1.16  1.11   1.20
incomegrpQuantile 3:1 0.82  0.80   0.85
incomegrpQuantile 3:2 1.11  1.08   1.14
incomegrpQuantile 4:1 1.34  1.29   1.38
incomegrpQuantile 4:2 0.89  0.86   0.92

Odds ratios interpretation

The table above presents the results on variation in obesity status among Texas residents. The multinomial estimates compare factors predisposing individuals in Texas to the risks of being overweight and obese relative to normal weight. The results indicate a curvilinear relationship between age and obesity status, with the odds of being overweight (vs. normal) peaking at age 39-59 years and the odds of being obese (vs. normal) highest at age 59-79 years. Compared with NH Whites, NH Blacks and Hispanics were at significantly greater risk of abnormal weight (overweight and obese) versus normal weight, whereas those of Other racial/ethnic groups had significantly lower risk than NH Whites. Attaining a level of education below college graduation exposes an individual to greater risks of being overweight and obese (vs. normal weight). Lastly, those in the second and third income quantiles were less likely to be overweight but more likely to be obese than those in the first quantile. Meanwhile, those in the highest income quantile were more likely to be overweight but less likely to be obese than those in the first quantile.
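As a reading aid, an odds ratio can be restated as a percent difference in odds relative to the reference category using (OR - 1) x 100. The short sketch below applies this to three odds ratios copied from the table above; the selection of rows is purely illustrative.

```r
# Odds ratios copied from the table above, keyed by their row labels.
or <- c("agegrp(39,59]:1" = 4.29, "racegrpOther:1" = 0.66, "eduPrimary:1" = 2.27)

# Percent difference in odds relative to the reference category.
pct <- round((or - 1) * 100)
pct
```

An odds ratio of 4.29 thus corresponds to odds about 329% higher than in the reference age group, while 0.66 corresponds to odds about 34% lower than among NH Whites.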



Acknowledgements

  1. This content adapts steps and examples from instructional materials authored by Dr. Corey Sparks and made available at https://rpubs.com/corey_sparks.

  2. The 2016 BRFSS SMART metro area survey data are open-source resources made available by the US Centers for Disease Control and Prevention at https://www.cdc.gov/brfss/smart/smart_2016.html.

  3. Stargazer is authored by Marek Hlavac; the full documentation can be downloaded from https://cran.r-project.org/web/packages/stargazer/stargazer.pdf.