RegressionAnalysis

Health insurance companies that manage Medicaid and Medicare plans are required to report quality measures to a government agency. These quality measures are important for a company's rating, and millions of dollars are spent to improve them. Also, the higher the quality rating, the more likely a consumer is to choose a particular health insurance plan. This has a visible impact on the revenue of health insurance companies.

The following analysis examines the relationship between a health plan's quality rating and its membership count. The data source is open government data from this URL (https://www.healthdata.gov/dataset/managed-care-regional-consumer-guide).

Two separate files were downloaded, merged, and aggregated to produce a final CSV file.

Step 1. Read in the raw csv data file.

filepath <- "https://github.com/angus001/Data605/raw/master/HealthPlanQuality_Raw_20171108.csv"

healthquality <- read.csv(filepath, header = TRUE, sep = ",")

Step 2. Check a sample of the data read in.

head(healthquality)

##   Plan.ID             Plan.Name Avg..Domain.Rating Plan.MemberCount
## 1 1040678    Univera Healthcare              3.094           971664
## 2 1050178    HIP (EmblemHealth)              2.593         79269965
## 3 1070680    Independent Health              3.344          3180056
## 4 1080383       MVP Health Care              3.306         26087568
## 5 1090384                 CDPHP              3.903         27137232
## 6 1130185 MetroPlus Health Plan              3.525         16864460
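The doubled dot in Avg..Domain.Rating most likely comes from read.csv() converting the raw header into a syntactically valid R name. The sketch below shows that conversion with make.names(); the raw header strings used here are assumptions, since the original CSV header is not shown in this report.

```r
# Hypothetical raw headers (assumed; the actual CSV header is not shown above)
raw_headers <- c("Plan ID", "Plan Name", "Avg. Domain Rating", "Plan MemberCount")

# read.csv() with the default check.names = TRUE applies make.names(),
# which replaces each invalid character (such as a space) with a dot
clean_headers <- make.names(raw_headers)
print(clean_headers)
```

If the original headers are preferred, read.csv() accepts check.names = FALSE, at the cost of needing backticks around names that contain spaces.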

Step 3. Plot a histogram of the member counts across health plans.

hist(healthquality$Plan.MemberCount)
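The member counts in the sample above span roughly two orders of magnitude, so a histogram on the raw scale may bunch most plans into the first bin. One possible alternative is to plot the counts on a log10 scale. The sketch below uses only the six rows printed in Step 2, not the full data set, and the variable names are illustrative.

```r
# Member counts copied from the six sample rows shown in Step 2 (not the full data)
member_counts <- c(971664, 79269965, 3180056, 26087568, 27137232, 16864460)

# A log10 transform compresses the long right tail
log_counts <- log10(member_counts)

hist(log_counts,
     main = "Member counts (log10 scale)",
     xlab = "log10(member count)")
```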

Step 4. Produce a linear regression model.

lmmodel <- lm(healthquality$Avg..Domain.Rating ~ healthquality$Plan.MemberCount)
summary(lmmodel)

## 
## Call:
## lm(formula = healthquality$Avg..Domain.Rating ~ healthquality$Plan.MemberCount)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3226 -0.2747  0.0831  0.3929  1.1086 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    2.753e+00  1.768e-01  15.571 1.21e-12 ***
## healthquality$Plan.MemberCount 1.528e-09  1.702e-09   0.898     0.38    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6243 on 20 degrees of freedom
## Multiple R-squared:  0.03873,    Adjusted R-squared:  -0.00933 
## F-statistic: 0.8059 on 1 and 20 DF,  p-value: 0.38
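As a cross-check on how the numbers in this summary relate to each other, the t value, p-value, F-statistic, and R-squared can be recomputed from the slope estimate, its standard error, and the residual degrees of freedom. The sketch below hard-codes the rounded values printed above, so the results agree with the summary only up to rounding; the variable names are illustrative.

```r
# Rounded values copied from the summary output above
estimate  <- 1.528e-09   # slope for Plan.MemberCount
std_error <- 1.702e-09   # standard error of the slope
df_resid  <- 20          # residual degrees of freedom (n - 2 for one predictor)

# t value is the estimate divided by its standard error
t_value <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_value), df = df_resid)

# With a single predictor, the F-statistic equals the squared t value
f_stat <- t_value^2

# R-squared and adjusted R-squared follow from F and the degrees of freedom
r_squared     <- f_stat / (f_stat + df_resid)
adj_r_squared <- 1 - (1 - r_squared) * (df_resid + 1) / df_resid

print(c(t = t_value, p = p_value, F = f_stat,
        R2 = r_squared, adjR2 = adj_r_squared))
```

With a p-value of about 0.38 and an R-squared below 0.04, the model gives no statistically significant evidence that member count predicts the quality rating in this data set.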

Step 5. Create a scatter plot and draw the regression line based on the model in Step 4. Note the curly braces, which group the plot() and abline() calls so that they run together as one expression.

{plot(healthquality$Plan.MemberCount, healthquality$Avg..Domain.Rating, col = 'navyblue', pch = 16, # cex =1.3,
     xlab = "Member Counts", ylab = "Quality Rating")
abline(lmmodel, col="red")}
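An alternative way to write the model and plot is to pass the data frame through the data argument instead of repeating healthquality$ in the formula. One practical benefit is that predict() can then be given new member counts. The sketch below is an illustration only: it rebuilds a small data frame from the six rows printed in Step 2, so its fitted line will differ from the model fitted on the full data, and names such as sample_plans and new_plans are hypothetical.

```r
# Six sample rows copied from the head() output in Step 2 (not the full data)
sample_plans <- data.frame(
  Plan.MemberCount   = c(971664, 79269965, 3180056, 26087568, 27137232, 16864460),
  Avg..Domain.Rating = c(3.094, 2.593, 3.344, 3.306, 3.903, 3.525)
)

# Column names in the formula are looked up in the data frame
sample_model <- lm(Avg..Domain.Rating ~ Plan.MemberCount, data = sample_plans)

# predict() matches newdata columns by name, which works because of data =
new_plans <- data.frame(Plan.MemberCount = c(1e6, 5e7))
predicted <- predict(sample_model, newdata = new_plans)
print(predicted)

# The same formula interface works for the scatter plot
plot(Avg..Domain.Rating ~ Plan.MemberCount, data = sample_plans,
     col = "navyblue", pch = 16,
     xlab = "Member Counts", ylab = "Quality Rating")
abline(sample_model, col = "red")
```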

Notes. The following lists the built-in color names that contain "blue"; any of them can be passed to the col argument of plot().

colors()[grep("blue",colors())]

##  [1] "aliceblue"       "blue"            "blue1"          
##  [4] "blue2"           "blue3"           "blue4"          
##  [7] "blueviolet"      "cadetblue"       "cadetblue1"     
## [10] "cadetblue2"      "cadetblue3"      "cadetblue4"     
## [13] "cornflowerblue"  "darkblue"        "darkslateblue"  
## [16] "deepskyblue"     "deepskyblue1"    "deepskyblue2"   
## [19] "deepskyblue3"    "deepskyblue4"    "dodgerblue"     
## [22] "dodgerblue1"     "dodgerblue2"     "dodgerblue3"    
## [25] "dodgerblue4"     "lightblue"       "lightblue1"     
## [28] "lightblue2"      "lightblue3"      "lightblue4"     
## [31] "lightskyblue"    "lightskyblue1"   "lightskyblue2"  
## [34] "lightskyblue3"   "lightskyblue4"   "lightslateblue" 
## [37] "lightsteelblue"  "lightsteelblue1" "lightsteelblue2"
## [40] "lightsteelblue3" "lightsteelblue4" "mediumblue"     
## [43] "mediumslateblue" "midnightblue"    "navyblue"       
## [46] "powderblue"      "royalblue"       "royalblue1"     
## [49] "royalblue2"      "royalblue3"      "royalblue4"     
## [52] "skyblue"         "skyblue1"        "skyblue2"       
## [55] "skyblue3"        "skyblue4"        "slateblue"      
## [58] "slateblue1"      "slateblue2"      "slateblue3"     
## [61] "slateblue4"      "steelblue"       "steelblue1"     
## [64] "steelblue2"      "steelblue3"      "steelblue4"
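The same lookup can be written more compactly with the value argument of grep(), which returns the matching strings directly instead of their positions. The sketch below also narrows the search with a regular-expression anchor; the variable names are illustrative.

```r
# Original approach: index colors() by the positions of the matches
blue_by_index <- colors()[grep("blue", colors())]

# Equivalent: ask grep() to return the matching names themselves
blue_by_value <- grep("blue", colors(), value = TRUE)
print(identical(blue_by_index, blue_by_value))

# "^" anchors the pattern to the start of the name
navy_names <- grep("^navy", colors(), value = TRUE)
print(navy_names)
```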

Huang, Angus

November 8, 2017