agenda for today
check-in
recap: generalized linear model
ordinal, poisson, and negative binomial regression in SAS
recap
Hint: some questions may have more than one best answer 😄
Q1. Which term includes more models, general linear model or generalized linear model?
- generalized linear model
- general linear model
- aren’t they the same thing?
Q2. Which model would you choose if your outcome variable is ordinal with equal spacing?
- poisson regression
- multinomial regression
- ordinary linear regression with a quadratic term
- proportional odds regression
Q4. What if you are predicting how many boxes of cookies individuals bought over the past two weeks?
- poisson regression
- negative binomial regression
- ordinary linear regression
- proportional odds regression
Q5. Match the link function with the model:
link functions:
a. identity; b. log; c. logit; d. cumulative logit
models:
I. multinomial regression: [ ]
II. ordinary linear regression: [ ]
III. logistic regression: [ ]
IV. proportional odds regression: [ ]
V. negative binomial regression: [ ]
model count and ordinal variables in SAS
syntax file: HB761_Recitation_Week9.sas
model count variables
PROC GENMOD - Generalized Linear Models
outcome: MUD (range=0-30, count/integer), number of perceived mentally unhealthy days in the past 30 days.
predictors: sex (male=1, female=0); sleep (average daily hours of sleep in the past month).
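The models below use sleep_c rather than sleep. Judging from the intercept interpretation (a female with 7 hours of daily sleep), sleep_c is presumably sleep centered at 7 hours. A minimal sketch of that step, assuming the raw variable is named sleep and writing to a hypothetical new dataset brfss_c so the original is left untouched:

```sas
/* Assumption: sleep_c = sleep - 7, so sleep_c = 0 means 7 hours of sleep. */
/* brfss_c is a hypothetical dataset name used only for this sketch.       */
data brfss_c;
  set brfss;
  sleep_c = sleep - 7;
run;
```

If the syntax file already creates sleep_c, this step is not needed.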
Poisson regression: link=log, dist=poisson
proc genmod data=brfss;
model MUD=sleep_c male/link=log dist=poisson;
output out=pofit pred=yhat_po;
/* with a log link the estimates are log rate ratios; exp reports the rate ratio */
estimate "log rate ratio - male" int 0 sleep_c 0 male 1 / exp;
estimate "log rate ratio - sleep_c" int 0 sleep_c 1 male 0 / exp;
run;


Interpretations
Intercept: the expected MUD for a female with 7 hours of daily sleep is exp(2.9132) = 18.4 days.
Male: the expected MUD for males is 0.77 times that for females (a 23% decrease), controlling for daily sleep hours.
Sleep_c: each one-hour increase in average daily sleep multiplies the expected MUD by exp(-0.2107) = 0.81 (a 19% decrease), controlling for sex.
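The arithmetic behind these interpretations can be checked by hand. The sketch below exponentiates the two coefficients quoted above in a DATA step. The variable names are made up for illustration, and the 8-hour prediction assumes sleep_c is centered at 7 hours.

```sas
/* Sketch: reproduce the interpretation arithmetic from the reported estimates. */
data _null_;
  b0      = 2.9132;    /* intercept, as quoted in the interpretation  */
  b_sleep = -0.2107;   /* sleep_c coefficient, as quoted              */
  mu_ref    = exp(b0);            /* expected MUD: female, sleep_c = 0         */
  rr_sleep  = exp(b_sleep);       /* rate ratio per one-hour increase in sleep */
  mu_plus1h = exp(b0 + b_sleep);  /* expected MUD: female, sleep_c = 1         */
  put mu_ref= 8.2 rr_sleep= 8.2 mu_plus1h= 8.2;   /* written to the SAS log */
run;
```

Because the link is log, effects are multiplicative on the count scale: mu_plus1h equals mu_ref times rr_sleep.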
Negative Binomial regression: link=log, dist=nb
proc genmod data=brfss;
model MUD=sleep_c male/link=log dist=nb;
output out=nbfit pred=yhat_nb;
run;
We would choose negative binomial over Poisson because the estimated dispersion parameter is significantly different from zero (estimate=8, 95% CI: 7.53, 8.52).
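A quick informal check of why the dispersion parameter matters: Poisson assumes the variance equals the mean, while the negative binomial in PROC GENMOD allows variance = mean + k*mean^2, where k is the dispersion parameter. The sketch below compares the raw mean and variance of MUD. It ignores the predictors, so treat it as a rough screen rather than a formal test.

```sas
/* Rough overdispersion screen: a variance far above the mean suggests */
/* that the Poisson assumption (variance = mean) is too restrictive.   */
proc means data=brfss n mean var;
  var MUD;
run;
```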

Can also compare AIC/BIC (a smaller number is preferred)
Poisson: AIC=56676.6212, BIC=56695.8739
Negative binomial: AIC=16384.7893, BIC=16410.4597 ❤️
Interpretations: similar to Poisson regression.
OLS: link=id, dist=normal
proc genmod data=brfss;
model MUD=sleep_c male/link=id dist=normal;
output out=lfit pred=yhat_l;
run;
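As a cross-check, the same linear model can be fit with PROC REG. This is a sketch, not part of the original syntax file. The coefficient estimates should match those from PROC GENMOD. The reported scale can differ slightly, because GENMOD uses the maximum likelihood estimate while PROC REG divides the error sum of squares by the residual degrees of freedom.

```sas
/* Same model as link=id dist=normal in PROC GENMOD, fit by least squares. */
proc reg data=brfss;
  model MUD = sleep_c male;
run; quit;
```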
Logistic regression: link=logit dist=binomial
proc genmod data=temp1 descending;
class male;
model FMD=sleep male/link=logit dist=binomial;
run;
[optional] Some visualizations
proc gplot data=pofit;
plot yhat_po*MUD;
run; quit;

proc sgplot data=pofit;
histogram MUD/ binwidth=1 transparency=0.5
name='o' legendlabel= "observed";
histogram yhat_po/ binwidth=1 transparency=0.5
name='p' legendlabel= "yhat_Poisson";
keylegend 'o' 'p'/ location=inside position=topright across=1 noborder;
yaxis offsetmin=0;
xaxis display=(nolabel);
run;

proc sort data=pofit;
by male sleep_c;
run;
proc sgplot data = pofit;
series x = sleep_c y = yhat_po/group=male;
run;

model ordinal variables
proportional odds assumption:
the effect of each predictor on the cumulative odds of the outcome, b, is the same at every cut point between adjacent response categories
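Written out for a single predictor x and an outcome with J ordered categories, the assumption means that only the intercepts a_j change across cut points, while b carries no subscript j. The notation below is a sketch; SAS models the probability of the lower ordered values by default, which is the form shown here.

```latex
\log \frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = a_j + b\,x, \qquad j = 1, \dots, J-1
```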
outcome variable: levels of function status, ranging from fully active (0) to disabled (4)
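A proportional odds model for an outcome like this can be fit with PROC LOGISTIC, which uses the cumulative logit link by default when the response has more than two levels. The dataset name func, the outcome name status, and the predictors age and male are hypothetical placeholders; substitute the names used in the syntax file.

```sas
/* Hypothetical names: dataset func, outcome status (0-4), predictors age, male. */
proc logistic data=func;
  model status = age male;   /* ordinal response: cumulative logit by default */
run;
```

The output includes a table titled "Score Test for the Proportional Odds Assumption".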
Score test: a non-significant p-value indicates that the proportional odds assumption is reasonable.

For practice purposes, let’s still try to interpret the outputs!
