Data

This data comes from a German health survey from 1998. There are 1,127 observations on 3 variables, including: numvisit, the number of visits to doctor during 1998; badh, the patient’s perceived health status (1=patient claims to be in bad health; 0=not in bad health); and age of patient.

I would expect older patients to visit the doctor more frequently. I would also expect patients who claim to be in bad health to visit more than patients who claim to not be in bad health.

Results

library(Zelig)
library(ZeligChoice)
library(faraway)
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.4.4
library(tidyr)
## Warning: package 'tidyr' was built under R version 3.4.4
library(readr)
health <- read_csv("/Users/meredithpowers/Desktop/badhealth.csv")
## Warning: Missing column names filled in: 'X1' [1]
tibble::glimpse(health)
## Observations: 1,127
## Variables: 4
## $ X1       <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16...
## $ numvisit <int> 30, 20, 16, 20, 15, 15, 13, 15, 15, 40, 15, 13, 12, 1...
## $ badh     <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0,...
## $ age      <int> 58, 54, 44, 57, 33, 28, 37, 31, 30, 47, 51, 40, 40, 4...
str(health)
## Classes 'tbl_df', 'tbl' and 'data.frame':    1127 obs. of  4 variables:
##  $ X1      : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ numvisit: int  30 20 16 20 15 15 13 15 15 40 ...
##  $ badh    : int  0 0 0 0 0 0 0 0 0 1 ...
##  $ age     : int  58 54 44 57 33 28 37 31 30 47 ...
##  - attr(*, "spec")=List of 2
##   ..$ cols   :List of 4
##   .. ..$ X1      : list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   .. ..$ numvisit: list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   .. ..$ badh    : list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   .. ..$ age     : list()
##   .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
##   ..$ default: list()
##   .. ..- attr(*, "class")= chr  "collector_guess" "collector"
##   ..- attr(*, "class")= chr "col_spec"
head(health)
h1 <- zelig(numvisit ~ badh + age + age*badh, model = "poisson", data = health, cite = F)
summary(h1)
## Model: 
## 
## Call:
## z5$zelig(formula = numvisit ~ badh + age + age * badh, data = health)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.5892  -1.8947  -0.6480   0.6023   9.9147  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.346885   0.081416   4.261 2.04e-05
## badh         1.568949   0.181533   8.643  < 2e-16
## age          0.008503   0.002086   4.077 4.57e-05
## badh:age    -0.010918   0.004196  -2.602  0.00926
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 4020.3  on 1126  degrees of freedom
## Residual deviance: 3458.6  on 1123  degrees of freedom
## AIC: 5633.8
## 
## Number of Fisher Scoring iterations: 6
## 
## Next step: Use 'setx' method

Clearly, health status greatly increases the number of visits to the doctor – that’s an increase of 1.5 for health alone. Age also has an increase, though not as dramatic.

data(health)
## Warning in data(health): data set 'health' not found
a.range = min(health$age):max(health$age)
h1$setrange(age=a.range)
h1$sim()
h1$graph()
## age chosen as the x-axis variable. Use the var argument to specify a different variable.

As age increases, the number of visits to the doctor increased as well.