Surratt Assignment 5

Run a logistic regression model for your outcome of choice with age and two other dummy variables that represent binary concepts. (Be sure to restrict your sample to respondents with valid values on your variables.)

Use a different outcome than the one we used today.
Use a different binary concept than one that we used today.
Interpret the odds ratios for each variable in your model.
Estimate and interpret the marginal effects for a focal binary variable:
MEMs (marginal effects at the means): by R and by hand
MERs (marginal effects at representative values): by R and by hand
AMEs (average marginal effects): by R

I explore the relationship between low (<= $50,000) and high income (> $50,000) and whether the respondent owns or rents. I also include gender. Our IV will be high or low income set to a binomial dummy variable. Our DV will be whether the respondent rents or owns their home. I make a working dataset with the variables INCOME3, SEXVAR, RENTHOM1, and X_AGE80.

library(foreign)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(ggplot2)
library(janitor)

## 
## Attaching package: 'janitor'

## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test

library(margins)
library(emmeans)
library(ggplot2)
library(writexl)
library(reticulate)

brfss21 = read.xport("/Users/briansurratt/Library/CloudStorage/OneDrive-UniversityofTexasatSanAntonio/DEM 7283 Stats II/Assignment 4/LLCP2021.XPT_")

#codebook: https://www.cdc.gov/brfss/annual_data/annual_2021.html

head(brfss21)

##   X_STATE FMONTH    IDATE IMONTH IDAY IYEAR DISPCODE      SEQNO     X_PSU
## 1       1      1 01192021     01   19  2021     1100 2021000001 2.021e+09
## 2       1      1 01212021     01   21  2021     1100 2021000002 2.021e+09
## 3       1      1 01212021     01   21  2021     1100 2021000003 2.021e+09
## 4       1      1 01172021     01   17  2021     1100 2021000004 2.021e+09
## 5       1      1 01152021     01   15  2021     1100 2021000005 2.021e+09
## 6       1      1 01142021     01   14  2021     1100 2021000006 2.021e+09
##   CTELENM1 PVTRESD1 COLGHOUS STATERE1 CELPHON1 LADULT1 COLGSEX NUMADULT LANDSEX
## 1        1        1       NA        1        2       1      NA        2      NA
## 2        1        1       NA        1        2       1      NA        2      NA
## 3        1        1       NA        1        2       1      NA        2      NA
## 4        1        1       NA        1        2       1      NA        2      NA
## 5        1        1       NA        1        2       1      NA        2      NA
## 6        1        1       NA        1        2       1      NA        1       1
##   NUMMEN NUMWOMEN RESPSLCT SAFETIME CTELNUM1 CELLFON5 CADULT1 CELLSEX PVTRESD3
## 1      1        1        2       NA       NA       NA      NA      NA       NA
## 2      1        1        2       NA       NA       NA      NA      NA       NA
## 3      1        1        2       NA       NA       NA      NA      NA       NA
## 4      1        1        2       NA       NA       NA      NA      NA       NA
## 5      1        1        1       NA       NA       NA      NA      NA       NA
## 6     NA       NA       NA       NA       NA       NA      NA      NA       NA
##   CCLGHOUS CSTATE1 LANDLINE HHADULT SEXVAR GENHLTH PHYSHLTH MENTHLTH POORHLTH
## 1       NA      NA       NA      NA      2       5       20       10       88
## 2       NA      NA       NA      NA      2       3       88       88       NA
## 3       NA      NA       NA      NA      2       2       88       88       NA
## 4       NA      NA       NA      NA      2       2       88       10       88
## 5       NA      NA       NA      NA      1       5       30       88       30
## 6       NA      NA       NA      NA      1       3       88       88       NA
##   PRIMINSR PERSDOC3 MEDCOST1 CHECKUP1 EXERANY2 BPHIGH6 BPMEDS CHOLCHK3 TOLDHI3
## 1        3        1        2        2        2       3     NA        2       1
## 2        1        2        2        1        1       1      1        2       1
## 3        2        2        2        1        2       1      1        2       2
## 4        2        1        2        1        1       1      1        2       1
## 5        3        1        2        1        1       4     NA        2       1
## 6        3        1        2        1        2       3     NA        2       2
##   CHOLMED3 CVDINFR4 CVDCRHD4 CVDSTRK3 ASTHMA3 ASTHNOW CHCSCNCR CHCOCNCR
## 1        1        2        2        2       1       1        2        2
## 2        1        2        1        2       2      NA        2        2
## 3        2        2        1        2       2      NA        2        2
## 4        2        2        2        2       2      NA        2        2
## 5        1        1        7        1       2      NA        2        2
## 6        2        2        2        2       2      NA        2        2
##   CHCCOPD3 ADDEPEV3 CHCKDNY2 DIABETE4 DIABAGE3 HAVARTH5 ARTHEXER ARTHEDU
## 1        1        2        2        3       NA        1        2       2
## 2        2        2        1        1       98        1        1       2
## 3        2        2        2        1       98        2       NA      NA
## 4        2        2        2        1       56        2       NA      NA
## 5        2        2        2        1       65        2       NA      NA
## 6        1        2        2        3       NA        2       NA      NA
##   LMTJOIN3 ARTHDIS2 JOINPAI2 MARITAL EDUCA RENTHOM1 NUMHHOL3 NUMPHON3 CPDEMO1B
## 1        2        1        8       1     4        1        1        1        1
## 2        1        1       10       9     6        1        2       NA        1
## 3       NA       NA       NA       3     4        1        2       NA        1
## 4       NA       NA       NA       1     4        1        2       NA        1
## 5       NA       NA       NA       1     3        1        2       NA        8
## 6       NA       NA       NA       1     5        1        2       NA        1
##   VETERAN3 EMPLOY1 CHILDREN INCOME3 PREGNANT WEIGHT2 HEIGHT3 DEAF BLIND DECIDE
## 1        2       7       88       5       NA      72     411    2     2      2
## 2        2       8       88      77       NA    7777     506    2     1      1
## 3        2       7       88       3       NA     170     505    2     2      2
## 4        2       7       88       7       NA     195     504    2     2      2
## 5        2       8       88       4       NA     206     511    1     2      2
## 6        2       7       88       6       NA     195     603    2     2      2
##   DIFFWALK DIFFDRES DIFFALON SMOKE100 SMOKDAY2 USENOW3 ECIGNOW1 ALCDAY5
## 1        2        2        1        1        3       3        3     888
## 2        1        2        1        2       NA       3        3     888
## 3        2        2        2        2       NA       3        3     888
## 4        2        2        2        2       NA       3        3     101
## 5        1        2        2        2       NA       2        3     888
## 6        1        2        2        1        3       2        3     888
##   AVEDRNK3 DRNK3GE5 MAXDRNKS FLUSHOT7 FLSHTMY3 IMFVPLA2 PNEUVAC4 HIVTST7
## 1       NA       NA       NA        1    92020        1        1       2
## 2       NA       NA       NA        2       NA       NA        2       2
## 3       NA       NA       NA        2       NA       NA        2       2
## 4        3        1        6        1   102020        1        2       2
## 5       NA       NA       NA        1    92020        1        1       1
## 6       NA       NA       NA        1   102020        5        1       2
##   HIVTSTD3 FRUIT2 FRUITJU2 FVGREEN1 FRENCHF1 POTATOE1 VEGETAB2 PDIABTST
## 1       NA    101      555      204      203      201      101        2
## 2       NA    101      555      201      555      201      207       NA
## 3       NA    101      555      555      201      201      203       NA
## 4       NA    203      205      303      204      308      205       NA
## 5   777777    101      555      101      202      202      101       NA
## 6       NA    202      555      201      555      201      201        2
##   PREDIAB1 INSULIN1 BLDSUGAR FEETCHK3 DOCTDIAB CHKHEMO3 FEETCHK EYEEXAM1
## 1        3       NA       NA       NA       NA       NA      NA       NA
## 2       NA       NA       NA       NA       NA       NA      NA       NA
## 3       NA       NA       NA       NA       NA       NA      NA       NA
## 4       NA       NA       NA       NA       NA       NA      NA       NA
## 5       NA       NA       NA       NA       NA       NA      NA       NA
## 6        3       NA       NA       NA       NA       NA      NA       NA
##   DIABEYE DIABEDU TOLDCFS HAVECFS WORKCFS TOLDHEPC TRETHEPC PRIRHEPC HAVEHEPC
## 1      NA      NA      NA      NA      NA       NA       NA       NA       NA
## 2      NA      NA      NA      NA      NA       NA       NA       NA       NA
## 3      NA      NA      NA      NA      NA       NA       NA       NA       NA
## 4      NA      NA      NA      NA      NA       NA       NA       NA       NA
## 5      NA      NA      NA      NA      NA       NA       NA       NA       NA
## 6      NA      NA      NA      NA      NA       NA       NA       NA       NA
##   HAVEHEPB MEDSHEPB HPVADVC4 HPVADSHT TETANUS1 SHINGLE2 LCSFIRST LCSLAST
## 1       NA       NA       NA       NA       NA       NA       NA      NA
## 2       NA       NA       NA       NA       NA       NA       NA      NA
## 3       NA       NA       NA       NA       NA       NA       NA      NA
## 4       NA       NA       NA       NA       NA       NA       NA      NA
## 5       NA       NA       NA       NA       NA       NA       NA      NA
## 6       NA       NA       NA       NA       NA       NA       NA      NA
##   LCSNUMCG LCSCTSCN HADMAM HOWLONG CERVSCRN CRVCLCNC CRVCLPAP CRVCLHPV HADHYST2
## 1       NA       NA     NA      NA       NA       NA       NA       NA       NA
## 2       NA       NA     NA      NA       NA       NA       NA       NA       NA
## 3       NA       NA     NA      NA       NA       NA       NA       NA       NA
## 4       NA       NA     NA      NA       NA       NA       NA       NA       NA
## 5       NA       NA     NA      NA       NA       NA       NA       NA       NA
## 6       NA       NA     NA      NA       NA       NA       NA       NA       NA
##   PSATEST1 PSATIME1 PCPSARS2 PCSTALK HADSIGM4 COLNSIGM COLNTES1 SIGMTES1
## 1       NA       NA       NA      NA       NA       NA       NA       NA
## 2       NA       NA       NA      NA       NA       NA       NA       NA
## 3       NA       NA       NA      NA       NA       NA       NA       NA
## 4       NA       NA       NA      NA       NA       NA       NA       NA
## 5       NA       NA       NA      NA       NA       NA       NA       NA
## 6       NA       NA       NA      NA       NA       NA       NA       NA
##   LASTSIG4 COLNCNCR VIRCOLO1 VCLNTES1 SMALSTOL STOLTEST STOOLDN1 BLDSTFIT
## 1       NA       NA       NA       NA       NA       NA       NA       NA
## 2       NA       NA       NA       NA       NA       NA       NA       NA
## 3       NA       NA       NA       NA       NA       NA       NA       NA
## 4       NA       NA       NA       NA       NA       NA       NA       NA
## 5       NA       NA       NA       NA       NA       NA       NA       NA
## 6       NA       NA       NA       NA       NA       NA       NA       NA
##   SDNATES1 CNCRDIFF CNCRAGE CNCRTYP1 CSRVTRT3 CSRVDOC1 CSRVSUM CSRVRTRN
## 1       NA       NA      NA       NA       NA       NA      NA       NA
## 2       NA       NA      NA       NA       NA       NA      NA       NA
## 3       NA       NA      NA       NA       NA       NA      NA       NA
## 4       NA       NA      NA       NA       NA       NA      NA       NA
## 5       NA       NA      NA       NA       NA       NA      NA       NA
## 6       NA       NA      NA       NA       NA       NA      NA       NA
##   CSRVINST CSRVINSR CSRVDEIN CSRVCLIN CSRVPAIN CSRVCTL2 HOMBPCHK HOMRGCHK
## 1       NA       NA       NA       NA       NA       NA       NA       NA
## 2       NA       NA       NA       NA       NA       NA       NA       NA
## 3       NA       NA       NA       NA       NA       NA       NA       NA
## 4       NA       NA       NA       NA       NA       NA       NA       NA
## 5       NA       NA       NA       NA       NA       NA       NA       NA
## 6       NA       NA       NA       NA       NA       NA       NA       NA
##   WHEREBP SHAREBP WTCHSALT DRADVISE CIMEMLOS CDHOUSE CDASSIST CDHELP CDSOCIAL
## 1      NA      NA       NA       NA       NA      NA       NA     NA       NA
## 2      NA      NA       NA       NA       NA      NA       NA     NA       NA
## 3      NA      NA       NA       NA       NA      NA       NA     NA       NA
## 4      NA      NA       NA       NA       NA      NA       NA     NA       NA
## 5      NA      NA       NA       NA       NA      NA       NA     NA       NA
## 6      NA      NA       NA       NA       NA      NA       NA     NA       NA
##   CDDISCUS CAREGIV1 CRGVREL4 CRGVLNG1 CRGVHRS1 CRGVPRB3 CRGVALZD CRGVPER1
## 1       NA        2       NA       NA       NA       NA       NA       NA
## 2       NA        2       NA       NA       NA       NA       NA       NA
## 3       NA        2       NA       NA       NA       NA       NA       NA
## 4       NA        2       NA       NA       NA       NA       NA       NA
## 5       NA        2       NA       NA       NA       NA       NA       NA
## 6       NA        2       NA       NA       NA       NA       NA       NA
##   CRGVHOU1 CRGVEXPT ACEDEPRS ACEDRINK ACEDRUGS ACEPRISN ACEDIVRC ACEPUNCH
## 1       NA        2        2        2        2        2        1        1
## 2       NA        2        2        2        2        2        2        1
## 3       NA        2        2        2        2        2        2        1
## 4       NA        7        1        2        2        2        2        1
## 5       NA        2        2        2        2        2        2        1
## 6       NA        2        2        2        2        2        2        1
##   ACEHURT1 ACESWEAR ACETOUCH ACETTHEM ACEHVSEX ACEADSAF ACEADNED MARIJAN1
## 1        1        1        1        1        1        5        3       NA
## 2        1        1        1        1        1        5        5       NA
## 3        1        1        1        1        1        5        5       NA
## 4        3        1        1        1        1        5        5       NA
## 5        1        1        1        1        1        5        5       NA
## 6        1        1        1        1        1        5        5       NA
##   USEMRJN3 RSNMRJN2 LASTSMK2 STOPSMK2 FIREARM5 GUNLOAD LOADULK2 RCSGENDR
## 1       NA       NA       NA       NA       NA      NA       NA       NA
## 2       NA       NA       NA       NA       NA      NA       NA       NA
## 3       NA       NA       NA       NA       NA      NA       NA       NA
## 4       NA       NA       NA       NA       NA      NA       NA       NA
## 5       NA       NA       NA       NA       NA      NA       NA       NA
## 6       NA       NA       NA       NA       NA      NA       NA       NA
##   RCSRLTN2 CASTHDX2 CASTHNO2 BIRTHSEX SOMALE SOFEMALE TRNSGNDR QSTVER QSTLANG
## 1       NA       NA       NA       NA     NA       NA       NA     10       1
## 2       NA       NA       NA       NA     NA       NA       NA     10       1
## 3       NA       NA       NA       NA     NA       NA       NA     10       1
## 4       NA       NA       NA       NA     NA       NA       NA     10       1
## 5       NA       NA       NA       NA     NA       NA       NA     10       1
## 6       NA       NA       NA       NA     NA       NA       NA     10       1
##   X_METSTAT X_URBSTAT MSCODE X_STSTR  X_STRWT X_RAWRAKE X_WT2RAKE X_IMPRACE
## 1         1         1      1   11011 39.76616         2  79.53232         1
## 2         1         1      2   11011 39.76616         2  79.53232         2
## 3         1         1      1   11011 39.76616         2  79.53232         2
## 4         1         1      3   11011 39.76616         2  79.53232         1
## 5         2         1      2   11011 39.76616         2  79.53232         6
## 6         2         2      5   11011 39.76616         1  39.76616         1
##   X_CHISPNC X_CRACE1 X_CPRACE1 CAGEG X_CLLCPWT X_DUALUSE X_DUALCOR X_LLCPWT2
## 1         9       NA        NA    NA        NA         1 0.5190191  874.2429
## 2         9       NA        NA    NA        NA         1 0.5190191  874.2429
## 3         9       NA        NA    NA        NA         1 0.5190191  874.2429
## 4         9       NA        NA    NA        NA         1 0.5190191  874.2429
## 5         9       NA        NA    NA        NA         9        NA 2144.7690
## 6         9       NA        NA    NA        NA         1 0.5190191  437.1215
##    X_LLCPWT X_RFHLTH X_PHYS14D X_MENT14D X_HLTHPLN X_HCVU652 X_TOTINDA
## 1  744.7455        2         3         2         1         9         2
## 2  299.1374        1         1         1         1         9         1
## 3  587.8630        1         1         1         1         9         2
## 4 1099.6216        1         1         2         1         1         1
## 5 1711.8259        2         3         1         1         9         1
## 6  499.3652        1         1         1         1         9         2
##   X_RFHYPE6 X_CHOLCH3 X_RFCHOL3 X_MICHD X_LTASTH1 X_CASTHM1 X_ASTHMS1 X_DRDXAR3
## 1         1         1         2       2         2         2         1         1
## 2         2         1         2       1         1         1         3         1
## 3         2         1         1       1         1         1         3         2
## 4         2         1         2       2         1         1         3         2
## 5         1         1         2       1         1         1         3         2
## 6         1         1         1       2         1         1         3         2
##   X_LMTACT3 X_LMTWRK3 X_PRACE1 X_MRACE1 X_HISPANC X_RACE X_RACEG21 X_RACEGR3
## 1         2         1        1        1         2      1         1         1
## 2         1         1        2        2         2      2         2         2
## 3         3         3        2        2         2      2         2         2
## 4         3         3        1        1         2      1         1         1
## 5         3         3        1        7         2      7         2         4
## 6         3         3        1        1         2      1         1         1
##   X_RACEPRV X_SEX X_AGEG5YR X_AGE65YR X_AGE80 X_AGE_G HTIN4 HTM4 WTKG3 X_BMI5
## 1         1     2        11         2      70       6    59  150  3266   1454
## 2         2     2        10         2      67       6    66  168    NA     NA
## 3         2     2        11         2      72       6    65  165  7711   2829
## 4         1     2         9         1      62       5    64  163  8845   3347
## 5         7     1        12         2      76       6    71  180  9344   2873
## 6         1     1        13         2      80       6    75  191  8845   2437
##   X_BMI5CAT X_RFBMI5 X_CHLDCNT X_EDUCAG X_INCOMG1 X_SMOKER3 X_RFSMOK3 X_CURECI1
## 1         1        1         1        2         3         3         1         1
## 2        NA        9         1        4         9         4         1         1
## 3         3        2         1        2         2         4         1         1
## 4         4        2         1        2         5         4         1         1
## 5         3        2         1        1         2         4         1         1
## 6         2        1         1        3         4         3         1         1
##   DRNKANY5 DROCDY3_ X_RFBING5 X_DRNKWK1 X_RFDRHV7 X_FLSHOT7 X_PNEUMO3 X_AIDTST4
## 1        2        0         1         0         1         1         1         2
## 2        2        0         1         0         1         2         2         2
## 3        2        0         1         0         1         2         2         2
## 4        1       14         2       300         1        NA        NA         2
## 5        2        0         1         0         1         1         1         1
## 6        2        0         1         0         1         1         1         2
##   FTJUDA2_ FRUTDA2_ GRENDA1_ FRNCHDA_ POTADA1_ VEGEDA2_ X_MISFRT1 X_MISVEG1
## 1        0      100       57       43       14      100         0         0
## 2        0      100       14        0       14      100         0         0
## 3        0      100        0       14       14       43         0         0
## 4       71       43       10       57       27       71         0         0
## 5        0      100      100       29       29      100         0         0
## 6        0       29       14        0       14       14         0         0
##   X_FRTRES1 X_VEGRES1 X_FRUTSU1 X_VEGESU1 X_FRTLT1A X_VEGLT1A X_FRT16A X_VEG23A
## 1         1         1       100       214         1         1        1        1
## 2         1         1       100       128         1         1        1        1
## 3         1         1       100        71         1         2        1        1
## 4         1         1       114       165         1         1        1        1
## 5         1         1       100       258         1         1        1        1
## 6         1         1        29        42         2         2        1        1
##   X_FRUITE1 X_VEGETE1
## 1         0         0
## 2         0         0
## 3         0         0
## 4         0         0
## 5         0         0
## 6         0         0

#checking sample size

nrow(brfss21)

## [1] 438693

dat <- brfss21 %>%
  select(INCOME3, SEXVAR, RENTHOM1, X_AGE80)

nrow(dat)

## [1] 438693

First, let’s clean INCOME3 and make a dummy variable.

tabyl(dat$INCOME3)

##  dat$INCOME3     n    percent valid_percent
##            1 10878 0.02479638    0.02530674
##            2 11530 0.02628262    0.02682356
##            3 14960 0.03410130    0.03480316
##            4 21071 0.04803131    0.04901988
##            5 43893 0.10005402    0.10211332
##            6 48339 0.11018867    0.11245655
##            7 59408 0.13542044    0.13820764
##            8 47838 0.10904665    0.11129102
##            9 47642 0.10859986    0.11083504
##           10 19769 0.04506340    0.04599089
##           11 18952 0.04320105    0.04409021
##           77 36138 0.08237651    0.08407197
##           99 49428 0.11267105    0.11499002
##           NA  8847 0.02016672            NA

Filter out 77 (Don’t know/Not sure), 99 (Refused), NA.

dat <- dat %>%
  filter(INCOME3 != "77" & INCOME3 != "99" & INCOME3 != "NA")

nrow(dat)

## [1] 344280

tabyl(dat$INCOME3)

##  dat$INCOME3     n    percent
##            1 10878 0.03159638
##            2 11530 0.03349018
##            3 14960 0.04345300
##            4 21071 0.06120309
##            5 43893 0.12749216
##            6 48339 0.14040606
##            7 59408 0.17255722
##            8 47838 0.13895085
##            9 47642 0.13838155
##           10 19769 0.05742129
##           11 18952 0.05504822

Recode income with a binary dummy variable.

summary(dat$INCOME3)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   5.000   7.000   6.698   9.000  11.000

$50,000 seems like a reasonable cutoff, which is the 6th bracket, so households below $50,000 will be coded as 0 (low income), and households equal to or above $50,000 will be coded as 1 (high income).

dat <- dat %>%
  mutate(HIINCOME = case_when(INCOME3 <= 6 ~ 0,
                              INCOME3 > 6 ~ 1
                              )
  )

tabyl(dat$HIINCOME)

##  dat$HIINCOME      n   percent
##             0 150671 0.4376409
##             1 193609 0.5623591

Next, let’s recode SEXVAR so male = 1.

dat <- dat %>%
  mutate(male = case_when(SEXVAR == 1 ~ 1,
                          TRUE ~ 0
                              )
  )

nrow(dat)

## [1] 344280

tabyl(dat$male)

##  dat$male      n   percent
##         0 179777 0.5221825
##         1 164503 0.4778175

Next, let’s clean the RENTHOM1 variable. Checking the frequency distribution of RENTHOM1.

tabyl(dat$RENTHOM1)

##  dat$RENTHOM1      n      percent valid_percent
##             1 246493 7.159667e-01   0.715970814
##             2  83357 2.421198e-01   0.242121193
##             3  13340 3.874753e-02   0.038747756
##             7    464 1.347740e-03   0.001347748
##             9    624 1.812478e-03   0.001812489
##            NA      2 5.809225e-06            NA

We need to filter out 3 (Other arrangement), 7 (Don’t know/Not Sure), 9 (Refused), and NAs.

dat <- dat %>%
  filter(RENTHOM1 %in% c("1", "2"))
         
nrow(dat)

## [1] 329850

tabyl(dat$RENTHOM1)

##  dat$RENTHOM1      n   percent
##             1 246493 0.7472882
##             2  83357 0.2527118

Code a dummy variable, RENTOWN, for renting or owning. 1 = OWN and 0 = RENT.

dat <- dat %>%
  mutate(RENTOWN = case_when(RENTHOM1 == 1 ~ 1,
                             RENTHOM1 == 2 ~ 0
                             )
         )

nrow(dat)

## [1] 329850

tabyl(dat$RENTOWN)

##  dat$RENTOWN      n   percent
##            0  83357 0.2527118
##            1 246493 0.7472882

Check the frequency distribtuion of X_AGE80.

tabyl(dat$X_AGE80)

##  dat$X_AGE80     n     percent
##           18   969 0.002937699
##           19  1267 0.003841140
##           20  1568 0.004753676
##           21  2014 0.006105806
##           22  2164 0.006560558
##           23  2632 0.007979385
##           24  2670 0.008094588
##           25  3045 0.009231469
##           26  2956 0.008961649
##           27  3198 0.009695316
##           28  3566 0.010810975
##           29  3583 0.010862513
##           30  4289 0.013002880
##           31  3745 0.011353646
##           32  4318 0.013090799
##           33  4161 0.012614825
##           34  4307 0.013057450
##           35  4708 0.014273154
##           36  4599 0.013942701
##           37  4655 0.014112475
##           38  4955 0.015021980
##           39  4774 0.014473245
##           40  5625 0.017053206
##           41  4623 0.014015462
##           42  5170 0.015673791
##           43  4709 0.014276186
##           44  4448 0.013484917
##           45  4966 0.015055328
##           46  4495 0.013627406
##           47  4808 0.014576323
##           48  4826 0.014630893
##           49  5027 0.015240261
##           50  6359 0.019278460
##           51  5533 0.016774291
##           52  6001 0.018193118
##           53  5643 0.017107776
##           54  5440 0.016492345
##           55  6033 0.018290132
##           56  5862 0.017771714
##           57  6088 0.018456874
##           58  6456 0.019572533
##           59  6279 0.019035925
##           60  7325 0.022207064
##           61  6273 0.019017735
##           62  7237 0.021940276
##           63  6943 0.021048962
##           64  7107 0.021546157
##           65  7686 0.023301501
##           66  7041 0.021346066
##           67  7158 0.021700773
##           68  7052 0.021379415
##           69  6864 0.020809459
##           70  7259 0.022006973
##           71  6101 0.018496286
##           72  6716 0.020360770
##           73  6299 0.019096559
##           74  6183 0.018744884
##           75  5139 0.015579809
##           76  4215 0.012778536
##           77  4240 0.012854328
##           78  4314 0.013078672
##           79  3530 0.010701834
##           80 22634 0.068619069

dat$X_AGE80SQ<- dat$X_AGE80*dat$X_AGE80

dat$X_AGE80CU<- dat$X_AGE80*dat$X_AGE80SQ

mean(dat$RENTOWN)

## [1] 0.7472882

mean(dat$X_AGE80)

## [1] 54.74008

mean(dat$X_AGE80SQ)

## [1] 3279.891

mean(dat$X_AGE80CU)

## [1] 209263.7

mean(dat$male)

## [1] 0.4771108

mean(dat$HIINCOME)

## [1] 0.5699591

# Calculating statistics for specific ages so that I can compare actual values to model-based values.

ageprobs<- dat %>%
    group_by(X_AGE80) %>%
    summarize(p=mean(RENTOWN),n=n())
              
ageprobs$num <- ageprobs$p*(1-ageprobs$p)

ageprobs$sep <- sqrt(ageprobs$num/ageprobs$n)

ageprobs$me <- 2*ageprobs$sep
  
ggplot(data =ageprobs, aes(x = X_AGE80, y = p, ymin=p-me, ymax=p+me)) +
     geom_line() + ylim(0,1) + geom_ribbon(alpha=0.3,aes(color=NULL)) +
labs(title="Home Ownership by Age", 
       subtitle="Respondents in Sample", 
       caption="Source: 2021 BRFSS Data") +
       xlab(label="Age at Survey") +
  ylab(label="Proportion of Respondents who Own their Residence")

# Splitting out results into an Excel file so that we can compare actual values with predicted values with different specifications of age

write_xlsx(ageprobs, "/Users/briansurratt/Library/CloudStorage/OneDrive-UniversityofTexasatSanAntonio/DEM 7283 Stats II/Assignment 5/ageprobsincome.xlsx")

model <- glm(RENTOWN ~ X_AGE80, family = "binomial", data = dat)

summary(model)

## 
## Call:
## glm(formula = RENTOWN ~ X_AGE80, family = "binomial", data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2637  -0.9686   0.5421   0.7558   1.4237  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.4462900  0.0136237  -106.2   <2e-16 ***
## X_AGE80      0.0491021  0.0002657   184.8   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 334137  on 329848  degrees of freedom
## AIC: 334141
## 
## Number of Fisher Scoring iterations: 4

# Exponentiate

modelExp = model
modelExp$coefficients = exp(modelExp$coefficients)
summary(modelExp)

## 
## Call:
## glm(formula = RENTOWN ~ X_AGE80, family = "binomial", data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2637  -0.9686   0.5421   0.7558   1.4237  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) 0.2354422  0.0136237   17.28   <2e-16 ***
## X_AGE80     1.0503276  0.0002657 3952.80   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 334137  on 329848  degrees of freedom
## AIC: 334141
## 
## Number of Fisher Scoring iterations: 4

# Showing predicted values in a spreadsheet

# MERs

# AME = Average Marginal Effects

MERs<-margins(model, variables="X_AGE80", at=list(X_AGE80=c(18,20,30,40,50,60,70,80)))

summary(MERs)

##   factor X_AGE80    AME     SE        z      p  lower  upper
##  X_AGE80 18.0000 0.0114 0.0000 295.5095 0.0000 0.0113 0.0114
##  X_AGE80 20.0000 0.0116 0.0000 263.2208 0.0000 0.0116 0.0117
##  X_AGE80 30.0000 0.0123 0.0001 182.5560 0.0000 0.0121 0.0124
##  X_AGE80 40.0000 0.0115 0.0001 164.1054 0.0000 0.0114 0.0116
##  X_AGE80 50.0000 0.0096 0.0001 178.2028 0.0000 0.0095 0.0097
##  X_AGE80 60.0000 0.0073 0.0000 225.7351 0.0000 0.0073 0.0074
##  X_AGE80 70.0000 0.0052 0.0000 291.8333 0.0000 0.0052 0.0052
##  X_AGE80 80.0000 0.0035 0.0000 252.6118 0.0000 0.0035 0.0035

MERs<-margins(model, variables="X_AGE80", at=list(X_AGE80=c(39,40,41)))

summary(MERs)

##   factor X_AGE80    AME     SE        z      p  lower  upper
##  X_AGE80 39.0000 0.0116 0.0001 164.3223 0.0000 0.0115 0.0118
##  X_AGE80 40.0000 0.0115 0.0001 164.1054 0.0000 0.0114 0.0116
##  X_AGE80 41.0000 0.0113 0.0001 164.2188 0.0000 0.0112 0.0115

# This command is computing these for each one-unit change in x

#AMEs

AMEs<-margins(model,variables=c("X_AGE80"))

summary(AMEs)

##   factor    AME     SE        z      p  lower  upper
##  X_AGE80 0.0082 0.0000 224.1707 0.0000 0.0081 0.0082

# But what if the relationship is not linear?

model <- glm(RENTOWN ~ X_AGE80 + X_AGE80SQ, family = "binomial", data = dat)

summary(model)

## 
## Call:
## glm(formula = RENTOWN ~ X_AGE80 + X_AGE80SQ, family = "binomial", 
##     data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0287  -0.7311   0.5434   0.6757   1.7521  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -3.579e+00  3.936e-02  -90.94   <2e-16 ***
## X_AGE80      1.440e-01  1.648e-03   87.41   <2e-16 ***
## X_AGE80SQ   -9.429e-04  1.596e-05  -59.08   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 330670  on 329847  degrees of freedom
## AIC: 330676
## 
## Number of Fisher Scoring iterations: 4

modelExp = model
modelExp$coefficients = exp(modelExp$coefficients)
summary(modelExp)

## 
## Call:
## glm(formula = RENTOWN ~ X_AGE80 + X_AGE80SQ, family = "binomial", 
##     data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0287  -0.7311   0.5434   0.6757   1.7521  
## 
## Coefficients:
##              Estimate Std. Error   z value Pr(>|z|)    
## (Intercept) 2.789e-02  3.936e-02     0.709    0.478    
## X_AGE80     1.155e+00  1.648e-03   700.845   <2e-16 ***
## X_AGE80SQ   9.991e-01  1.596e-05 62597.613   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 330670  on 329847  degrees of freedom
## AIC: 330676
## 
## Number of Fisher Scoring iterations: 4

model <- glm(RENTOWN ~ X_AGE80 + X_AGE80SQ + X_AGE80CU, family = "binomial", data = dat)

summary(model)

## 
## Call:
## glm(formula = RENTOWN ~ X_AGE80 + X_AGE80SQ + X_AGE80CU, family = "binomial", 
##     data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0782  -0.6625   0.5609   0.6797   1.8674  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -5.080e+00  1.108e-01  -45.86   <2e-16 ***
## X_AGE80      2.479e-01  7.327e-03   33.84   <2e-16 ***
## X_AGE80SQ   -3.139e-03  1.515e-04  -20.72   <2e-16 ***
## X_AGE80CU    1.441e-05  9.879e-07   14.59   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 330455  on 329846  degrees of freedom
## AIC: 330463
## 
## Number of Fisher Scoring iterations: 4

modelExp = model
modelExp$coefficients = exp(modelExp$coefficients)
summary(modelExp)

## 
## Call:
## glm(formula = RENTOWN ~ X_AGE80 + X_AGE80SQ + X_AGE80CU, family = "binomial", 
##     data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0782  -0.6625   0.5609   0.6797   1.8674  
## 
## Coefficients:
##              Estimate Std. Error   z value Pr(>|z|)    
## (Intercept) 6.217e-03  1.108e-01 5.600e-02    0.955    
## X_AGE80     1.281e+00  7.327e-03 1.749e+02   <2e-16 ***
## X_AGE80SQ   9.969e-01  1.515e-04 6.579e+03   <2e-16 ***
## X_AGE80CU   1.000e+00  9.879e-07 1.012e+06   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 330455  on 329846  degrees of freedom
## AIC: 330463
## 
## Number of Fisher Scoring iterations: 4

# We really should be using this last model but it is not clear how we can capture marginal effects for nonlinear specifications in R.

# Adding in the binomial income variable.

ageprobs <- dat %>%
    group_by(HIINCOME, male, X_AGE80) %>%
    summarize(pown=mean(RENTOWN),n=n())

## `summarise()` has grouped output by 'HIINCOME', 'male'. You can override using
## the `.groups` argument.

ggplot(data = ageprobs, aes(x = X_AGE80, y = pown, color = HIINCOME, group = HIINCOME)) +
  geom_line() +
  facet_wrap(~ male)

write_xlsx(ageprobs,"/Users/briansurratt/Library/CloudStorage/OneDrive-UniversityofTexasatSanAntonio/DEM 7283 Stats II/Assignment 5/probsown.xlsx")

model <- glm(RENTOWN ~ HIINCOME, family = "binomial", data = dat)

summary(model)

## 
## Call:
## glm(formula = RENTOWN ~ HIINCOME, family = "binomial", data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.9684  -1.3599   0.5579   1.0053   1.0053  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) 0.419396   0.005427   77.27   <2e-16 ***
## HIINCOME    1.362247   0.008520  159.90   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 345581  on 329848  degrees of freedom
## AIC: 345585
## 
## Number of Fisher Scoring iterations: 4

modelExp = model
modelExp$coefficients = exp(modelExp$coefficients)
summary(modelExp)

## 
## Call:
## glm(formula = RENTOWN ~ HIINCOME, family = "binomial", data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.9684  -1.3599   0.5579   1.0053   1.0053  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) 1.521043   0.005427   280.2   <2e-16 ***
## HIINCOME    3.904960   0.008520   458.3   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 345581  on 329848  degrees of freedom
## AIC: 345585
## 
## Number of Fisher Scoring iterations: 4

If you make more than $50,000, the odds of owning your own residence are 3.9 and the difference is statistically significant. You are almost 4 times as likely to own your own home as a person who makes less than $50,000.

sapply(lapply(dat, unique), length)

##   INCOME3    SEXVAR  RENTHOM1   X_AGE80  HIINCOME      male   RENTOWN X_AGE80SQ 
##        11         2         2        63         2         2         2        63 
## X_AGE80CU 
##        63

model <- glm(RENTOWN ~ male, family = "binomial", data = dat)

summary(model)

## 
## Call:
## glm(formula = RENTOWN ~ male, family = "binomial", data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.6595  -1.6576   0.7626   0.7640   0.7640  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  1.086212   0.005544 195.938   <2e-16 ***
## male        -0.004213   0.008021  -0.525    0.599    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 372925  on 329848  degrees of freedom
## AIC: 372929
## 
## Number of Fisher Scoring iterations: 4

modelExp = model
modelExp$coefficients = exp(modelExp$coefficients)
summary(modelExp)

## 
## Call:
## glm(formula = RENTOWN ~ male, family = "binomial", data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.6595  -1.6576   0.7626   0.7640   0.7640  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) 2.963029   0.005544   534.5   <2e-16 ***
## male        0.995796   0.008021   124.1   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 372925  on 329848  degrees of freedom
## AIC: 372929
## 
## Number of Fisher Scoring iterations: 4

If you are male, the odds of your owning your own home (compared to non-male) is .996 and the difference is statistically significant. In this case, the magnitude of the effect is marginal although the results is statistically significant. The odds of a male owning a home are almost even to a non-male owning a home.

model <- glm(RENTOWN ~ X_AGE80 + HIINCOME + male, family = "binomial", data = dat)

summary(model)

## 
## Call:
## glm(formula = RENTOWN ~ X_AGE80 + HIINCOME + male, family = "binomial", 
##     data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.7119  -0.5857   0.4504   0.6927   1.9731  
## 
## Coefficients:
##               Estimate Std. Error  z value Pr(>|z|)    
## (Intercept) -2.7667854  0.0176591 -156.678  < 2e-16 ***
## X_AGE80      0.0582606  0.0002924  199.231  < 2e-16 ***
## HIINCOME     1.7575253  0.0097446  180.360  < 2e-16 ***
## male        -0.0743643  0.0092136   -8.071 6.96e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 297127  on 329846  degrees of freedom
## AIC: 297135
## 
## Number of Fisher Scoring iterations: 5

modelExp = model
modelExp$coefficients = exp(modelExp$coefficients)
summary(modelExp)

## 
## Call:
## glm(formula = RENTOWN ~ X_AGE80 + HIINCOME + male, family = "binomial", 
##     data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.7119  -0.5857   0.4504   0.6927   1.9731  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) 0.0628638  0.0176591    3.56 0.000371 ***
## X_AGE80     1.0599912  0.0002924 3624.80  < 2e-16 ***
## HIINCOME    5.7980709  0.0097446  595.01  < 2e-16 ***
## male        0.9283334  0.0092136  100.76  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 372925  on 329849  degrees of freedom
## Residual deviance: 297127  on 329846  degrees of freedom
## AIC: 297135
## 
## Number of Fisher Scoring iterations: 5

# MEMs: marginal effects at the mean

emmeans(model, specs = "HIINCOME",
        regrid = "response")

##  HIINCOME   prob        SE  df asymp.LCL asymp.UCL
##         0 0.5951 0.0014813 Inf    0.5922    0.5980
##         1 0.8949 0.0007104 Inf    0.8935    0.8963
## 
## Results are averaged over the levels of: male 
## Confidence level used: 0.95

Interpretation

The marginal effect at the mean age and sex is that a high income person has a 89.5% probability of owning their residence. A low income person has a 60% chance of owning their home.

emmeans(model, specs = "HIINCOME",
        regrid = "response") |>
  contrast(method = "revpairwise") |>
  confint()

##  contrast              estimate      SE  df asymp.LCL asymp.UCL
##  HIINCOME1 - HIINCOME0      0.3 0.00164 Inf     0.297     0.303
## 
## Results are averaged over the levels of: male 
## Confidence level used: 0.95

#MERs: marginal effects at representative values

emmeans(model, specs = "HIINCOME", by = "male", at = list(X_AGE80=20),
        regrid = "response") |>
  contrast(method = "revpairwise") |>
  confint()

## male = 0:
##  contrast              estimate      SE  df asymp.LCL asymp.UCL
##  HIINCOME1 - HIINCOME0    0.371 0.00206 Inf     0.367     0.375
## 
## male = 1:
##  contrast              estimate      SE  df asymp.LCL asymp.UCL
##  HIINCOME1 - HIINCOME0    0.363 0.00199 Inf     0.359     0.367
## 
## Confidence level used: 0.95

Interpretation

At the age of 20, the difference in probability of ownership for a high income male over a low income male is 36.3%. The difference in probability of ownership for a high income non-male over a low income non-male is 37.1%.

emmeans(model, specs = "HIINCOME", by = "male", at = list(X_AGE80=40),
        regrid = "response") |>
  contrast(method = "revpairwise") |>
  confint()

## male = 0:
##  contrast              estimate      SE  df asymp.LCL asymp.UCL
##  HIINCOME1 - HIINCOME0    0.397 0.00203 Inf     0.393     0.401
## 
## male = 1:
##  contrast              estimate      SE  df asymp.LCL asymp.UCL
##  HIINCOME1 - HIINCOME0    0.402 0.00208 Inf     0.398     0.406
## 
## Confidence level used: 0.95

Interpretation

At the age of 40, the difference in probability of ownership for a high income male over a low income male is 40.2%. The difference in probability of ownership for a high income non-male over a low income non-male is 39.7%.

emmeans(model, specs = "HIINCOME", by = "male", at = list(X_AGE80=60),
        regrid = "response") |>
  contrast(method = "revpairwise") |>
  confint()

## male = 0:
##  contrast              estimate      SE  df asymp.LCL asymp.UCL
##  HIINCOME1 - HIINCOME0    0.249 0.00152 Inf     0.246     0.252
## 
## male = 1:
##  contrast              estimate      SE  df asymp.LCL asymp.UCL
##  HIINCOME1 - HIINCOME0    0.260 0.00170 Inf     0.256     0.263
## 
## Confidence level used: 0.95

Interpretation

At the age of 60, the difference in probability of ownership for a high income male over a low income male is 26.0%. The difference in probability of ownership for a high income non-male over a low income non-male is 24.9%.

emmeans(model, specs = "HIINCOME", by = "male", at = list(X_AGE80=80),
        regrid = "response") |>
  contrast(method = "revpairwise") |>
  confint()

## male = 0:
##  contrast              estimate       SE  df asymp.LCL asymp.UCL
##  HIINCOME1 - HIINCOME0    0.105 0.000978 Inf     0.104     0.107
## 
## male = 1:
##  contrast              estimate       SE  df asymp.LCL asymp.UCL
##  HIINCOME1 - HIINCOME0    0.112 0.001108 Inf     0.110     0.114
## 
## Confidence level used: 0.95

Interpretation

At the age of 80, the difference in probability of ownership for a high income male over a low income male is 11.2%. The difference in probability of ownership for a high income non-male over a low income non-male is 10.5%.

#AMEs: average marginal effects

hi_eff <- predict(model, newdata = data.frame(HIINCOME=1, male=dat$male, X_AGE80=dat$X_AGE80), 
type = "response")

low_eff <- predict(model, newdata = data.frame(HIINCOME=0, male=dat$male, X_AGE80=dat$X_AGE80), 
type = "response")

mean(hi_eff - low_eff)

## [1] 0.2778078

Interpretation

The average marginal effect of high income on owning your residence among all ages and sexes is a 27.8% increase in probability.

If I were to do this exercise again, I would choose a different binomial variable than income level. It is a ordinal numerical variable, so there are other statistical models better suited to analyzing this variable.

I initially misinterpreted the instructions for this exercise, and I thought I was supposed to find a numerical continuous variable to replace age. I experimented with number of children in the house and income as a replacement for age but both were problematic. I had to reread the instructions a few times and go through the example to understand I was supposed to use two binary IV variables as well as age.

Surratt Assignment 5

Brian Surratt

2023-02-27

Run a logistic regression model for your outcome of choice with age and two other dummy variables that represent binary concepts. (Be sure to restrict your sample to respondents with valid values on your variables.)

Interpretation

Interpretation

Interpretation

Interpretation

Interpretation

Interpretation