Pain Points Analysis

Carlos Mercado

August 24, 2017

Synopsis

Data was provided by the Centers for Medicare & Medicaid Services (CMS) from the Open Enrollment Period (9 million Americans enrolled).

Outline

Downloaded Data and General Summaries

csv <- "Business Analyst Data Analysis Presentation - Open Enrollment Help Page Comments - Comments.csv"

medicare <- read.csv(csv, stringsAsFactors = FALSE)
#reads as 2,179 observations of 2 columns (URL and comment)
## [1] 51
## [1] 2175
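The calls that printed the two numbers above are not shown. A minimal sketch of how counts like these could be obtained, assuming the first is the number of distinct help pages and the second is the number of comment rows. The data frame `medicare_toy` and its contents are made up for illustration; only the column names `URL` and `Comment` follow the output shown in this deck.

```r
# Toy stand-in for the real CSV, which is not reproduced here.
medicare_toy <- data.frame(
  URL = c("help/add-other-income/", "help/add-other-income/",
          "help/deduction-questions/", "help/automatic-enrollment/"),
  Comment = c("made-up comment 1", "made-up comment 2",
              "made-up comment 3", "made-up comment 4"),
  stringsAsFactors = FALSE
)

n_pages <- length(unique(medicare_toy$URL))  # distinct help pages
n_comments <- nrow(medicare_toy)             # one row per comment

print(n_pages)
print(n_comments)
```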

Approach Reasoning - NLP

The reasons I chose Natural Language Processing in R:

Demographics and other valuable information would be helpful in the future for making actual product recommendations for the system.

Common Feedback Categories

## # A tibble: 51 x 2
##                                                               URL Comment
##                                                             <chr>   <int>
##  1                           help/what-health-coverage-do-i-have/     213
##  2                  help/parent-and-caretaker-relative-questions/     162
##  3                                         help/add-other-income/     128
##  4 help/i-am-having-trouble-logging-in-to-my-marketplace-account/     114
##  5                                      help/deduction-questions/     108
##  6                                     help/automatic-enrollment/     107
##  7                                     help/disability-questions/     103
##  8                          help/found-not-eligible-for-medicaid/     101
##  9                                   help/losing-health-coverage/     101
## 10                                  help/information-on-medicare/      95
## # ... with 41 more rows
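A table of this shape (one row per URL, with the number of comments in a `Comment` column) could be produced along the following lines. This is a sketch using dplyr on made-up rows, not necessarily the code behind the slide; the object names `toy` and `per_url` are hypothetical.

```r
library(dplyr)

# Made-up comments attached to two help pages.
toy <- data.frame(
  URL = c("help/add-other-income/", "help/add-other-income/",
          "help/deduction-questions/"),
  Comment = c("first comment", "second comment", "third comment"),
  stringsAsFactors = FALSE
)

# Count comments per URL and list the busiest pages first.
per_url <- toy %>%
  group_by(URL) %>%
  summarise(Comment = n()) %>%
  arrange(desc(Comment))

print(per_url)
```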

Common Feedback Stats

For feasibility, it may be prudent to address only the most common pain points (for example, those with 80 or more comments).

Questions

80/20 Specific Analysis of Top 11 Categories

## # A tibble: 11 x 2
##                                                               URL Comment
##                                                             <chr>   <int>
##  1                           help/what-health-coverage-do-i-have/     213
##  2                  help/parent-and-caretaker-relative-questions/     162
##  3                                         help/add-other-income/     128
##  4 help/i-am-having-trouble-logging-in-to-my-marketplace-account/     114
##  5                                      help/deduction-questions/     108
##  6                                     help/automatic-enrollment/     107
##  7                                     help/disability-questions/     103
##  8                          help/found-not-eligible-for-medicaid/     101
##  9                                   help/losing-health-coverage/     101
## 10                                  help/information-on-medicare/      95
## 11                              help/reconciling-your-tax-credit/      89
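To gauge how much of the feedback these 11 categories cover, their counts can be summed and compared with the total. The sketch below uses the counts from the table above and assumes that the 2175 printed earlier is the total number of comments; the variable names are hypothetical.

```r
# Comment counts for the top 11 URLs, copied from the table above.
top_counts <- c(213, 162, 128, 114, 108, 107, 103, 101, 101, 95, 89)

# Assumption: 2175 (printed earlier in the deck) is the total comment count.
total_comments <- 2175

# Keep only categories at or above the "80 or more comments" threshold.
threshold <- 80
kept <- top_counts[top_counts >= threshold]

# Share of all comments covered by the kept categories, overall and cumulatively.
share_kept <- sum(kept) / total_comments
cumulative_share <- cumsum(kept) / total_comments

print(round(share_kept, 2))
print(round(cumulative_share, 2))
```

Under that assumption, the 11 categories hold 1321 of the 2175 comments, or roughly 61%.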

Specific Analysis, Within Groups

count3[6,]
## # A tibble: 1 x 3
## # Groups:   URL [1]
##                          URL       trigram     n
##                        <chr>         <chr> <int>
## 1 help/automatic-enrollment/ how to cancel    15
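How `count3` was built is not shown here. One way a per-URL trigram table like it could be produced is with tidytext's `unnest_tokens()` followed by a grouped count. The sketch below runs on made-up comments; `toy` and `trigram_counts` are hypothetical names, and the tokenizer settings are an assumption rather than the author's confirmed choice.

```r
library(dplyr)
library(tidytext)

# Made-up comments; only the column names URL and Comment follow the source.
toy <- data.frame(
  URL = c("help/automatic-enrollment/", "help/automatic-enrollment/",
          "help/add-other-income/"),
  Comment = c("How to cancel my plan", "how to cancel this",
              "it is not clear"),
  stringsAsFactors = FALSE
)

trigram_counts <- toy %>%
  # Split each comment into overlapping three-word sequences (lowercased).
  unnest_tokens(trigram, Comment, token = "ngrams", n = 3) %>%
  # Comments shorter than three words can yield NA; drop those rows.
  filter(!is.na(trigram)) %>%
  # Count each trigram within its URL, most frequent first.
  count(URL, trigram, sort = TRUE) %>%
  group_by(URL)

print(head(trigram_counts, 1))
```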

Common Word Groupings

## # A tibble: 35,181 x 2
##                          quadgram     n
##                             <chr> <int>
##  1 individual insurance non group    26
##  2   insurance non group coverage    25
##  3                  what to do if    19
##  4             how to answer this    17
##  5                it is not clear    15
##  6                 the end of the    14
##  7                   to do if you    14
##  8                i don't know if    13
##  9            it would be helpful    13
## 10        to answer this question    13
## # ... with 35,171 more rows
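To make explicit what a quadgram count measures, here is a small base R sketch that slides a four-word window over each comment and tallies the resulting phrases. The helper `make_ngrams` and the two sample comments are hypothetical, and the simple tokenization (lowercase, split on anything other than letters, digits, and apostrophes) is an assumption that may differ from the tokenizer used for the table above.

```r
# Return all n-word sequences in a single text, lowercased.
make_ngrams <- function(text, n) {
  words <- unlist(strsplit(tolower(text), "[^a-z0-9']+"))
  words <- words[nzchar(words)]            # drop empty tokens
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1) # one window per start position
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Made-up comments for illustration.
comments <- c("It is not clear what to do", "It is not clear to me")

# Tally quadgrams across all comments, most frequent first.
all_quadgrams <- unlist(lapply(comments, make_ngrams, n = 4))
quadgram_counts <- sort(table(all_quadgrams), decreasing = TRUE)

print(quadgram_counts)
```

Because the windows overlap, one long phrase contributes several neighboring n-grams, which is why entries such as "individual insurance non group" and "insurance non group coverage" appear side by side in the table.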

Specific Analysis, Within Groups

Looking at the broadest tested case: 4- and 5-grams

For example, in the parent and caretaker questions, several comments indicate that a feature is missing (“there is no option for”) or seek extra advice (“19 but…”).

Specific Analysis, Within Groups

Most common 5-word groups in the parent, caretaker, relative category:

Specific Analysis of Top 11

Looking at the most common problems, based on counts of the different word pairs (or triplets, or quadruplets), we see a few things:

This is different from feeling that a feature should exist but doesn’t, or that an interface is too difficult to use.

Final Words

Things to consider with more time:

Final Words

Due to time constraints, I decided against removing words or engaging in sentiment analysis, for example, asking whether users give more “negative” feedback in certain categories than in others.

AUTHOR’S NOTE: A detailed presentation with full annotations and code is available.