library(ggplot2)
library(readr)
library(dplyr) 
library(choroplethr)
library(extrafont)
library(extrafontdb)
library(RColorBrewer)
library(scales)
library(gridExtra)

Read in my data.

hc1 <- read.csv("D:/HealthCare/PlanAttributes.csv", stringsAsFactors = FALSE)

The real test of how good a healthcare plan is can be difficult to assess, but one very crude benchmark is the maximum out of pocket. In layman’s terms, that’s the maximum a subscriber has to pay if the absolute worst happens. In this case, I’m looking at family MOOP. Say a family slid off the road during a snow storm and several people got hurt, they needed expensive surgery and rehab. The max out of pocket is the amount of money that the family would have to pay before the insurance covers everything 100%. I want to take a look at that number.

Here’s a quick glimpse and then some data cleaning.

head(hc1$TEHBInnTier1FamilyMOOP, 50)
##  [1] ""        ""        ""        ""        ""        ""        "$12,700"
##  [8] "$8,000"  "$8,000"  "$12,700" "$0"      "$9,500"  "$12,000" "$0"     
## [15] ""        ""        ""        "$9,500"  "$12,000" "$12,700" "$12,700"
## [22] "$0"      ""        ""        ""        ""        ""        "$12,700"
## [29] "$12,000" "$9,500"  "$12,000" "$12,700" "$10,400" "$2,500"  "$1,000" 
## [36] ""        "$9,500"  "$0"      "$12,700" "$10,400" "$2,500"  "$1,000" 
## [43] "$12,700" ""        ""        ""        ""        ""        ""       
## [50] ""
hc1$TEHBInnTier1FamilyMOOP<- gsub(',', '', hc1$TEHBInnTier1FamilyMOOP)
hc1$TEHBInnTier1FamilyMOOP<- gsub('\\$', '', hc1$TEHBInnTier1FamilyMOOP)
hc1$moop<- as.numeric(hc1$TEHBInnTier1FamilyMOOP)
hc1$moop[is.na(hc1$moop)] <- 0
ggplot(hc1, aes(x = hc1$moop)) + geom_histogram()

There’s a lot of plans in there that have a zero family MOOP. That’s not accurate. I will only stick to plans that actually have a dollar amount.

moop <- subset(hc1, moop > 0)

I’m going to map this to see which states have the worst MOOP on average for a family. I used a function that turns state abbreviations to a format that choropleth can actually use.

df <- aggregate(hc1$moop, list(hc1$StateCode), mean)
df$region<-stateFromLower(df$Group.1)
df$value <- df$x
choro = StateChoropleth$new(df)
choro$title = "Average Max Out of Pocket"
choro$set_num_colors(1)
myPalette <- colorRampPalette(brewer.pal(9, "Reds"))
choro$ggplot_polygon = geom_polygon(aes(fill = value), color = NA)
choro$ggplot_scale = scale_fill_gradientn(name = "MOOP", colours = myPalette(9))
choro$render()

Idaho is easily the worst. Along with Arizona and New Mexico. Things look pretty uniform throughout the rest of the country, however. I want to look how MOOP has changed over time as well. I only have two years for the ACA: 2014 and 2015. I would like to see if MOOP has gotten higher.

moop14 <- subset(moop, BusinessYear == "2014")
dim(moop14)
## [1] 11763   177
moop15 <- subset(moop, BusinessYear == "2015")
dim(moop15)
## [1] 22314   177

One thing to note here: there are LOTS more total plans in 2015. Almost twice as many, actually. That means I need to think about how to display this visually so I don’t mislead.

table(moop14$moop)
## 
##   400   500   600   700   800   900   950  1000  1100  1150  1200  1240 
##     2     1     1     2    40     3     3   178     5     8    39     1 
##  1300  1400  1500  1508  1600  1660  1700  1900  2000  2200  2300  2350 
##    29    44   115    16    12     4     5     1   266    25    14    12 
##  2400  2500  2600  2700  2800  2900  3000  3100  3200  3300  3400  3500 
##    29   126    14     6    65    39   385     2     1     4    20    35 
##  3600  3700  3750  3800  3900  4000  4200  4230  4300  4400  4500  4600 
##    13    14     1    19     1   302    30     8    12     9   533     6 
##  4700  5000  5200  5300  5350  5400  5500  5600  5800  5840  5900  6000 
##     5   212    28     7    10    12    19    13    12     3     2   485 
##  6250  6300  6400  6500  6600  6750  6800  6900  7000  7050  7200  7300 
##     2     3    21    24     2     1     4     2   409    10    26    16 
##  7400  7500  7600  7700  7800  7900  8000  8200  8250  8300  8400  8500 
##     8    22    27     1     2     4   380     5     8     3    30    23 
##  8700  8800  9000  9100  9200  9300  9400  9500  9600  9700  9750  9800 
##     3    16   303     4     4     4     6    56    13     3    42    10 
## 10000 10160 10200 10300 10338 10360 10400 10500 10600 11000 11200 11400 
##   741     4    18     2     4     4   402    83     7   181     2     2 
## 11500 11600 12000 12200 12400 12500 12600 12650 12675 12700 
##    10    31   530     4     2   332   341     3     2  4253
table(moop15$moop)
## 
##   300   400   500   700   800   850   900   950  1000  1050  1100  1150 
##     1     4     1     3    51     2    17     6   469    10    43     8 
##  1200  1240  1250  1300  1350  1400  1450  1500  1508  1520  1600  1650 
##   100     1     1    46     4    64     3   263    23     2    15     2 
##  1660  1700  1750  1800  1900  2000  2100  2200  2300  2400  2500  2600 
##     2    10     2     1     1   387     1    40    21    52   143    36 
##  2700  2800  2900  2950  3000  3050  3100  3200  3250  3300  3400  3450 
##    24    82   174     1   688     1     5    23     3    10    24     1 
##  3500  3600  3650  3700  3750  3800  3850  3900  4000  4100  4150  4200 
##    62    47     1    47     9    12     2    11   691     6     4    88 
##  4250  4300  4400  4500  4600  4800  5000  5100  5200  5300  5350  5400 
##     6    12    23   668     9    15   232     5    29    20     2     9 
##  5500  5600  5800  5840  5900  6000  6150  6200  6250  6350  6400  6500 
##    98    50    25     6     4   611     3     9     4     3    16    33 
##  6600  6700  6750  6800  6900  6950  7000  7050  7100  7200  7300  7400 
##    17     6     2     7     9     2   835     6     1    50    33     9 
##  7450  7500  7600  7650  7700  7800  7850  7900  8000  8100  8150  8200 
##     4    15    53     2     1    10     2    10   593     2     1    16 
##  8250  8300  8400  8500  8600  8700  8800  8850  8900  9000  9050  9100 
##    13     7    39    99     1     5    17     2     1   451     1     6 
##  9150  9200  9300  9400  9450  9500  9600  9650  9700  9750  9800  9850 
##     1    64    12    27     1    79    31     2    42   229    25     5 
##  9900  9950 10000 10050 10100 10200 10300 10338 10360 10400 10500 10600 
##     9     3  1053     2     2    46     6     4     4   668   283    19 
## 10700 10800 10850 10950 11000 11050 11100 11150 11200 11250 11300 11400 
##     3     9     3     8   435     3     6     2    12     2   126    18 
## 11450 11500 11550 11600 11700 11750 11800 11900 12000 12050 12200 12400 
##     5    46     3    40     2     3    18     6   963     3     7    17 
## 12450 12500 12550 12600 12650 12700 12800 12900 13000 13100 13200 
##     3   304     3   558     3  4558   108   559   391     9  3390

Now, this is where things get very interesting. The max MOOP in 2014 was $12,700 and there are many plans with that MOOP. 4253 in total. That’s about a third of all plans at the max MOOP. But then in 2015 things change. The max MOOP goes up to $13,200. And now many plans have higher MOOPs. Now over 9000 plans have family MOOPs of $12,700+. That’s a huge increase. As the MOOP ceiling has gone up, health insurers have moved their MOOP up as well. That’s a worrying trend.

Remebering that the difference between the total count of plans in 2014 and 2015 is large, I don’t want to use raw numbers. Instead, I want to use percentages to display the information in a way that makes sense.

g1<-ggplot(moop14, aes(x = moop14$moop)) +
geom_histogram(aes(y = (..count..)/sum(..count..)), binwidth = 500, col= "red", fill= "blue") +
## version 3.0.9
# scale_y_continuous(labels = percent_format())
## version 3.1.0
scale_y_continuous(labels=percent) + labs(title = "MOOP in 2014") + labs(x="Family MOOP", y= "Percent of Total Plans") +  theme(text=element_text(size=16, family="Georgia"))
g2 <- ggplot(moop15, aes(x = moop15$moop)) +
geom_histogram(aes(y = (..count..)/sum(..count..)), binwidth = 500, col= "red", fill= "blue") +
## version 3.0.9
# scale_y_continuous(labels = percent_format())
## version 3.1.0
scale_y_continuous(labels=percent) + labs(title = "MOOP in 2015") + labs(x="Family MOOP", y= "Percent of Total Plans") +  theme(text=element_text(size=16, family="Georgia"))
grid.arrange(g1, g2, ncol = 2)

While the ACA has obviously been a huge benefit to families who need it, it’s a little scary to note that that many plans offer the poorest coverage possible under the ACA. It will be interesting to see what happens over the next five years. Will HHS keep allowing MOOP to rise or will they push back? If this data is any indication, health insurers will continue to raise MOOP if they are allowed.