Problem Statement: In the absence of Universal Health Care Coverage in the United States, Medicare and Medicaid play a critical role in providing access to medical services for elderly, disabled and low-income patients (1). Financial and budget considerations nudge policymakers to limit these services further restricting Medicare eligibility age (2,3) or tightening general requirements for Medicaid coverage. According to a study by Wilper et al., in the United States inadequate health coverage will increase the risk of mortality by 40% as it will limit access to primary and preventive care and delay seeking treatment (4). In the study by Pandey et al, increasing the Medicare coverage age from 65 to 67 with the rate of 2 month per year would result in an estimated extra 17,244 death between 2023 and 2028.(1).

The current political atmosphere concerns public health experts and medical community. It demands reevaluation of Medicare and Medicaid role in public health, prediction and implications of policy changes on health loss, especially mortality and morbidity (1). Mortality and morbidity for various conditions can vary significantly in the U.S. based on the state and funding coverage for federal health programs. This variation provides an opportunity to investigate current coverage of Medicare and Medicaid across different states in elderly (age > 65 years old) and evaluate the trend of mortality based on level of access, coverage and cost among current CMS beneficiaries. In this project we evaluated how access to and funding of CMS programs affected mortality among the most fragile adult U.S. population – senior citizens.

Research Question: Group Problem Statement: State -level variations in elderly total mortality rate and cancer mortality rate in elderly (age > 65 years old) is associated with Medicaid/ Medicare coverage, utilization and spending.

Methods and Material:

Description of dataset: The Kaiser Family Foundation (KFF) provides data for programs covered by Centers for Medicare & Medicaid Services (CMS), we can source these data from Medicare & Medicaid. We have 2 datasets as follows:

1- Medicare enrollment, payments and utilization in total for total Medicare and 2- different Medicare plans across states 2- Medicaid spending and enrollment in total and for different Medicaid categories (seniors, children, newly eligible adults, and adults) across states

CDC wonder is publicly available provided by CDC’s National Center for Health Statistics (NCHS).

We have 2 datasets from this source:

1-cdc_wonder_population: Age stratified population across states 2-cdc_wonder_mortality: crude mortality rates across states stratified by age and ICD 10 cause of death.

Analytical software and Methods.

We used R studio software and related software packages. We imported population, mortality, Medicare and Medicaid datasets and applied different data cleaning techniques  summarizing and filtering seniors > 65 years as one group,  selecting total mortality and cancer mortality as outcome of interest,  decoding variable state and using residence state as key variable across tables,  handling missing data,  converting payment and spending to numerical data type for analysis,  renaming variables in “snake” format

Crucially, we joined different tables and calculated total mortality rate, cancer mortality rate, Medicare and Medicaid coverage rate as well as spending rate per member.

We then graphed mortality and coverage across states which clearly showed variation.

Finally, we performed a trend visualization for mortality as well as cancer mortality vs Medicare/Medicaid coverage and spending across the U.S.

Results:

Total and malignancy related mortality among seniors > 65 yo as well as mean, SD, median of this population distribution demonstrated in the table (tab. Mortality Summary). Data on enrollment, utilization, and utilization per enrollee is presented in separate tables for Medicare and Medicaid populations respectively.

Table “ Mortality Counts and Rates for Seniors (States Compared to US)” highlights combined deaths, deaths per 100K among seniors both for malignancy and benign neoplasm for all states. As evident from the bar graph “mortality rate per 100K seniors”, variability in malignant-related mortality rate across states is less than variability in total mortality. Overall tendencies on Medicare enrollment can be viewed in “Medicare and Medicaid Coverage and spending for Seniors (States Compared to US). As an example, three states (Florida, Hawaii, Minnesota) have Medicare enrollment rate lower than 15% below U.S. average. These “outlier” states however are not among the ones with worst cancer or total mortality rates per 100k. Finally, our trend analysis can be viewed in the last section. Medicare enrollment has a slight positive trend to be associated with lower mortality. Medicaid enrollement has a stronger positive trend to have a lower mortality. However, increased spending per enrollee has a tendency to be assosiated with higher mortality in Medicare beneficiary as opposed to Medicaid beneficiary. Slightly different trends observed with cancer-related mortality. Respective graphic for Medicaid enrollment/spending per Medicaid enrollee as independent variables and cancer mortality as dependent variable showed no trend (parallel to the x-axis). Same graphic for Medicare enrollment/spending per Medicare enrollee as independent variables and cancer-related mortality as dependent variable showed positive correlation.

Discussion:

We demonstrated that enrollment in CMS programs and spending per person enrolled can differ substatially in various states. This however does not necessarily lead to improved cancer-related or total mortality. It might be that various factors in the states (geographic, financial, political, cultural, ecocomic) that are independent of CMS enrollment or spending can significantly affect or contribute to mortality rates.

What can be concluded from the trend analysis? There is a trend for better total mortality in the states with better enrollment for Medicare and especially Medicaid beneficiaries. The latter likely represent more socially disadvantageous economic group for which health coverage is essential for a better outcome. Spending vs total mortality trend is different direction in Medicare vs Medicaid and likly reflect some additional state-wide factors.

As for cancer-related mortality, interpretation must account for the fact that cancer by its nature can be very aggressive disease and remains one of the leading causes of death overall. Increased spending per Medicare enrollee may reflect high overall morbidity and subsequent higher mortality from cancer. In other words, sicker Medicare individuals are spent more money on and outcomes are worse. No apparent direction is evident in trend analysis of Medicaid beneficiaries and cancer related mortality. Finally, “outlier” states with significant less spending or enrollment than the U.S. average don’t have a clear direction in analysed associations either.

In summary, states with better Medicare and Medicaid enrollment and spending per enrollee have better overall mortality rates however this is not reflected in cancer-related mortality.

reference:

1- Pandey A, Fitzpatrick MC, Singer BH, Galvani AP. Mortality and morbidity ramifications of proposed retractions in healthcare coverage for the United States. Proc Natl Acad Sci U S A. 2024 Apr 30;121(18) 2- 1.Van de Water P. N., Raising Medicare’s eligibility age would increase overall health spending and shift costs to seniors, states, and employers. Center on Budget and Policy Priorities (2011) (16 January 2024).  3- 2.Waidmann T. A., Potential effects of raising Medicare’s eligibility age. Health Aff. 17, 156–164 (1998).  4- Wilper A. P., et al. , Health insurance and mortality in US adults. Am. J. Public Health 99, 2289–2295 (2009)

rpub link - https://rpubs.com/dzmitryberkeley/milestone4DmitrySanaz

Milestone #5 Data analysis and Dissemination

1- we selected > 65 years old age group as age group of interest. This is compatible between seniors in medicaid and medicare population. We will look into total mortality as well as mortality for malignancies and unknown neoplasm. 2- Variable state is decoded to residence_state_code and residence state to separate sate from state code, then new variables trimmed and formatted to title format we are going to use residence_state as a key across all data-sets 3- some state codes skipped. We are re-assigning residence_state_code 01-51 including DC with state name in alphabetical order 4- we define data type for all variables to align with rest of tables

5- we create a new variable called population_65 for population >65 and summarize to just show 1 row per state for >65 population by sum 65-74 years, 75-84 years and>85 years for each state 6- All variables names renamed based on snake format 7- We created descriptive ststisrtics tsable

We define data types for all varaibles to align with other tables

cdc_population_cleaned <- cdc_population_controled %>%
 filter(ten_year_age_groups %in% c("85+ years", "75-84 years", "65-74 years")) %>%
  group_by(residence_state_code, residence_state) %>%
  summarize(

    population_65 = sum(population, na.rm = TRUE)
  ) %>%
  select(
    residence_state_code, residence_state, population_65
  ) %>%
  ungroup() %>%
  
  arrange(residence_state_code)
## `summarise()` has grouped output by 'residence_state_code'. You can override
## using the `.groups` argument.
head(cdc_population_cleaned)
## # A tibble: 6 × 3
##   residence_state_code residence_state population_65
##   <chr>                <chr>                   <int>
## 1 01                   Alabama                888817
## 2 02                   Alaska                  97663
## 3 03                   Arizona               1333046
## 4 04                   Arkansas               528101
## 5 05                   California            5957092
## 6 06                   Colorado               879653
cdc_population_cleaned
## # A tibble: 51 × 3
##    residence_state_code residence_state      population_65
##    <chr>                <chr>                        <int>
##  1 01                   Alabama                     888817
##  2 02                   Alaska                       97663
##  3 03                   Arizona                    1333046
##  4 04                   Arkansas                    528101
##  5 05                   California                 5957092
##  6 06                   Colorado                    879653
##  7 07                   Connecticut                 649235
##  8 08                   Delaware                    201646
##  9 09                   District Of Columbia         85838
## 10 10                   Florida                    4598386
## # ℹ 41 more rows

1- residence_state_code we already renamed columns with snake format, removed NA, separated residence_state and residence_state and residence_state_code and ordered by residence_state_code and ten_year_age_groups some state codes skipped, we reasigned codes as characters after ordering state names alphabetically

2- residence_state we already renamed columns with snake format, trimmed, removed NA Moreover, in medicare and medicaid dataset we converted them to title format (first letter capital, rest lower case) to align with population and mortality datasets

3- population we already renamed columns with snake format, removed NA, ordered by residence_state_code and ten_year_age_groups

# Description for population
population_65_summary <- cdc_population_cleaned %>%
  
  summarise(
   
    mean_population_65 = round(mean(population_65, na.rm = TRUE), digits = 2),
    sd_population_65 = round(sd(population_65, na.rm = TRUE), digits = 2),
    median_population_65 = median(population_65, na.rm = TRUE),
    min_population_65 = min(population_65, na.rm = TRUE),
    max_population_65 = max(population_65, na.rm = TRUE),
    
  ) 
population_65_summary <- population_65_summary %>%
  pivot_longer(
    cols = everything(),         
    names_to = "statistic",       
    values_to = "Population_65",
    names_transform = function(x) str_remove(x, "_population_65")
  )

population_65_summary
## # A tibble: 5 × 2
##   statistic Population_65
##   <chr>             <dbl>
## 1 mean           1095058.
## 2 sd             1185180.
## 3 median          770260 
## 4 min              85838 
## 5 max            5957092

cleaning mortality data set You have the freedom to define the age category and mortality grouping of interest. The age group must be comparable between chosen Medicare/Medicaid and mortality metrics (i.e. adult, child, senior, or a combination). The mortality grouping must include at least two ICD sub-chapters, but may not include all. Mortality :Mortality Recode common variable(s) to be combined with other datasets Limit dataset to rows with age group and mortality counts of interest (recommend creating an intermediate variable to indicate which rows to keep) Create a new variable that includes a state level sum and summarize to only include one row per state

1- we renamed all variable with snake format with name and type of variables in alignment across all tables for example_residence_state_code changed to character 2- we selected > 65 years old age group as age group of interest. This is compatible between seniors in medicaid and medicare population. We will look into total mortality as well as mortality for neoplasms unknown and malignant 3- residence_state_code and residence state trimmed we are going to use residence_state as a key variable across all data-sets 4- some state codes skipped. We are re-assigning residence_state_code 01-51 including DC with state name in alphabetical order

5- we created an intermediate variable called keep_raw as logic variable which becomes true when age>65 and we filtered for keep_raw =TRUE 6- we create new variables called total_death_count_65, malignant_death_count_65, unknown_neoplasm_death_count_65 for total mortality and 2 main sub_chapter of mortality in >65 years old population and summarize to just show 1 row per state for >65 population by sum 65-74 years, 75-84 years and>85 years for each state we summarized total mortality>65 years old as well mortality for malignant and unknown neoplasm sub chapters
7- we created descriptive statistics table

Age_65 <- c("85+ years", "75-84 years", "65-74 years")
cdc_mortality_cleaned <- cdc_mortality_controled %>%
  mutate(
    keep_raw = ten_year_age_groups %in% Age_65
  )%>%
 filter(keep_raw == TRUE) %>%
  group_by(residence_state_code, residence_state) %>%
  summarize(

    total_death_count_65 = sum(death_count, na.rm = TRUE),
    malignant_death_count_65 = sum (death_count 
                            [str_detect(ucd_icd_sub_chapter_code, "C00-C97")
                            & str_detect(ucd_icd_sub_chapter, "Malignant Neoplasms")
                                    ],
                                  na.rm = TRUE
    ), 
    unknown_neoplasm_death_count_65 = sum( death_count
                                  [str_detect(ucd_icd_sub_chapter_code, "D37-D48")
                                    & str_detect(ucd_icd_sub_chapter, "Neoplasms Of Uncertain Or Unknown Behaviour")
                                    ],
                                  na.rm = TRUE
    )
  ) %>%
  select(
    residence_state_code, residence_state, total_death_count_65,
    malignant_death_count_65, unknown_neoplasm_death_count_65
  ) %>%
  ungroup() %>%
  
  arrange(residence_state_code) %>%
  select(-residence_state_code)
## `summarise()` has grouped output by 'residence_state_code'. You can override
## using the `.groups` argument.
head(cdc_mortality_cleaned)
## # A tibble: 6 × 4
##   residence_state total_death_count_65 malignant_death_count_65
##   <chr>                          <int>                    <int>
## 1 Alabama                        46667                     7440
## 2 Alaska                          3365                      807
## 3 Arizona                        57512                     9723
## 4 Arkansas                       27283                     4697
## 5 California                    238984                    44095
## 6 Colorado                       33294                     6035
## # ℹ 1 more variable: unknown_neoplasm_death_count_65 <int>
cdc_mortality_cleaned
## # A tibble: 51 × 4
##    residence_state      total_death_count_65 malignant_death_count_65
##    <chr>                               <int>                    <int>
##  1 Alabama                             46667                     7440
##  2 Alaska                               3365                      807
##  3 Arizona                             57512                     9723
##  4 Arkansas                            27283                     4697
##  5 California                         238984                    44095
##  6 Colorado                            33294                     6035
##  7 Connecticut                         25397                     4986
##  8 Delaware                             7863                     1641
##  9 District Of Columbia                 3353                      710
## 10 Florida                            192675                    35521
## # ℹ 41 more rows
## # ℹ 1 more variable: unknown_neoplasm_death_count_65 <int>

5- ttotal_death_65”, “malignant_mortality_65”, and “unknown_neoplasm_mortality_65” already renamed columns with snake format, removed NA, we summed 65-74, 75-84 and 85+ groups to create > 65 ordered by residence_state_code total death count and death count for malignant an unknown neoplasm for each state already calculated

custom_order <- c("total_death_count_65", 
                  "malignant_death_count_65", 
                  "unknown_neoplasm_death_count_65")
mortality_summary <- cdc_mortality_cleaned %>%
  pivot_longer(
    cols = c("total_death_count_65", 
             "malignant_death_count_65", 
             "unknown_neoplasm_death_count_65"), 
    names_to = "death_chapter", 
    values_to = "death_count_65" 
  ) %>%
  mutate(
    
    death_chapter = fct_relevel(death_chapter, custom_order)
  )  %>%
  arrange(residence_state, death_chapter) %>%
  group_by(death_chapter)%>%

  summarise(
    mean_death_count_65 = round(mean(death_count_65, na.rm = TRUE), digits = 2),
    sd_death_count_65 = round(sd(death_count_65, na.rm = TRUE), digits = 2),
    median_death_count_65 = median(death_count_65, na.rm = TRUE),
    min_death_count_65 = min(death_count_65, na.rm = TRUE),
    max_death_count_65 = max(death_count_65, na.rm = TRUE),
    .groups = 'drop'
  )
mortality_summary
## # A tibble: 3 × 6
##   death_chapter      mean_death_count_65 sd_death_count_65 median_death_count_65
##   <fct>                            <dbl>             <dbl>                 <int>
## 1 total_death_count…              48405.            50340.                 34881
## 2 malignant_death_c…               8752.             8997.                  6505
## 3 unknown_neoplasm_…                240.              235.                   159
## # ℹ 2 more variables: min_death_count_65 <int>, max_death_count_65 <int>
mortality_summary <- mortality_summary %>%
  pivot_longer(
    cols = c("mean_death_count_65",
             "sd_death_count_65",
             "median_death_count_65",
             "min_death_count_65",
             "max_death_count_65"),
    names_to = "statistic",
    values_to = "Count",
    names_transform = function(x) str_remove(x, "_death_count_65")
  )
mortality_summary <- mortality_summary %>%
pivot_wider(
    names_from = death_chapter, 
    values_from = Count,
   
  ) 

mortality_summary
## # A tibble: 5 × 4
##   statistic total_death_count_65 malignant_death_count_65 unknown_neoplasm_dea…¹
##   <chr>                    <dbl>                    <dbl>                  <dbl>
## 1 mean                    48405.                    8752.                   240.
## 2 sd                      50340.                    8997.                   235.
## 3 median                  34881                     6505                    159 
## 4 min                      3353                      710                      0 
## 5 max                    238984                    44095                   1022 
## # ℹ abbreviated name: ¹​unknown_neoplasm_death_count_65

cleaning medicare data set

1- We will proceed with both Medicaid for seniors and Medicare 2- we renamed all variable with snake format with name and type of variables in alignment across all tables for example spending changes to numeric 2- we selected > 65 years old age group as age group of interest. This is compatible between seniors in medicaid and medicare population.

3- residence state trimmed and formatted as title format, United State filtered we are going to use residence_state as a key variable across all data-sets

4- All payments changed to numeric in US dollars 5- we changed different forms of n/a to NA 6- we create new variables total_per_enrollee_payments by dividing seniors_medicaid_spending/seniors_medicaid_enrollment and total_medicare_per_enrollee_utilization 7- we created descriptive statistics table

cdc_medicare_controled <- cdc_medicare %>%
  rename(
    
residence_state = Location,
total_medicare_enrollees = 
  Total.Traditional.Medicare.Part.A.and.or.Part.B.Enrollees,
total_medicare_utilization = 
  Total.Medicare.Part.A.and.or.Part.B.Persons.With.Utilization,
total_medicare_payments = Total.Medicare.Part.A.and.or.Part.B.Program.Payments,
parta_medicare_enrollees = Total.Traditional.Medicare.Part.A.Enrollees,
parta_medicare_utilization = Total.Medicare.Part.A.Persons.With.Utilization,
parta_medicare_payments = Total.Medicare.Part.A.Program.Payments,
partb_medicare_enrollees = Total.Traditional.Medicare.Part.B.Enrollees,
partb_medicare_utilization = Total.Medicare.Part.B.Persons.With.Utilization,
partb_medicare_payments = Total.Medicare.Part.B.Program.Payments,

    ) %>%
  mutate(residence_state = str_to_title(str_trim(residence_state))) %>%
  
      mutate(across(where(is.character), ~ na_if(., "N/A"))) %>%
      mutate(across(where(is.character), ~ na_if(., "NSD"))) %>%
      mutate(across(where(is.character), ~ na_if(., "$0"))) %>%
  filter(residence_state != "United States") %>%
  mutate(
    across(
      .cols = ends_with(c("payments")),
      .fns = ~ str_remove_all(., pattern = "\\$|,") %>% as.numeric()
    )
  )  %>%
  mutate(
    total_medicare_payments_per_enrollee = round(total_medicare_payments /
                                        total_medicare_enrollees, digits = 2),
    total_medicare_utilization_per_enrollee = round(total_medicare_utilization /
                                        total_medicare_enrollees, digits = 2),
    total_medicare_payment_per_utilization = round(total_medicare_payments /
                                        total_medicare_utilization, digits = 2),
    
  )
str(cdc_medicare_controled)
## 'data.frame':    51 obs. of  13 variables:
##  $ residence_state                        : chr  "Alabama" "Alaska" "Arizona" "Arkansas" ...
##  $ total_medicare_enrollees               : int  528983 105831 767513 430724 3436742 518796 363399 171134 70224 2359900 ...
##  $ total_medicare_utilization             : int  512627 90717 720762 409441 3009051 467294 326323 162925 57112 2272148 ...
##  $ total_medicare_payments                : num  5.65e+09 1.05e+09 7.75e+09 4.27e+09 4.33e+10 ...
##  $ parta_medicare_enrollees               : int  522820 105399 757650 429009 3311257 510771 361543 170321 69203 2353861 ...
##  $ parta_medicare_utilization             : int  116144 13055 135153 81568 588308 81335 70774 29004 11547 506165 ...
##  $ parta_medicare_payments                : num  2.68e+09 4.68e+08 3.35e+09 1.95e+09 2.18e+10 ...
##  $ partb_medicare_enrollees               : int  456357 93234 662898 384774 2876785 440280 297814 154839 55171 2064027 ...
##  $ partb_medicare_utilization             : int  499789 89885 703776 403599 2933568 458267 320526 161216 55774 2226320 ...
##  $ partb_medicare_payments                : num  2.97e+09 5.84e+08 4.39e+09 2.32e+09 2.14e+10 ...
##  $ total_medicare_payments_per_enrollee   : num  10688 9939 10092 9919 12586 ...
##  $ total_medicare_utilization_per_enrollee: num  0.97 0.86 0.94 0.95 0.88 0.9 0.9 0.95 0.81 0.96 ...
##  $ total_medicare_payment_per_utilization : num  11029 11594 10747 10434 14375 ...
head(cdc_medicare_controled)
##   residence_state total_medicare_enrollees total_medicare_utilization
## 1         Alabama                   528983                     512627
## 2          Alaska                   105831                      90717
## 3         Arizona                   767513                     720762
## 4        Arkansas                   430724                     409441
## 5      California                  3436742                    3009051
## 6        Colorado                   518796                     467294
##   total_medicare_payments parta_medicare_enrollees parta_medicare_utilization
## 1              5653598984                   522820                     116144
## 2              1051813241                   105399                      13055
## 3              7745898078                   757650                     135153
## 4              4272207750                   429009                      81568
## 5             43255054191                  3311257                     588308
## 6              4904130297                   510771                      81335
##   parta_medicare_payments partb_medicare_enrollees partb_medicare_utilization
## 1              2684356995                   456357                     499789
## 2               467738024                    93234                      89885
## 3              3352520020                   662898                     703776
## 4              1954272166                   384774                     403599
## 5             21825304810                  2876785                    2933568
## 6              2113982177                   440280                     458267
##   partb_medicare_payments total_medicare_payments_per_enrollee
## 1              2969241988                             10687.68
## 2               584075217                              9938.61
## 3              4393378059                             10092.20
## 4              2317935584                              9918.67
## 5             21429749381                             12586.06
## 6              2790148120                              9452.91
##   total_medicare_utilization_per_enrollee
## 1                                    0.97
## 2                                    0.86
## 3                                    0.94
## 4                                    0.95
## 5                                    0.88
## 6                                    0.90
##   total_medicare_payment_per_utilization
## 1                               11028.68
## 2                               11594.44
## 3                               10746.82
## 4                               10434.25
## 5                               14374.98
## 6                               10494.74
cdc_medicare_cleaned <- cdc_medicare_controled  %>%
  select(residence_state, 
        total_medicare_enrollees,
        total_medicare_utilization,
        total_medicare_utilization_per_enrollee,
        total_medicare_payments,
        total_medicare_payments_per_enrollee,
        total_medicare_payment_per_utilization)
cdc_medicare_cleaned
##         residence_state total_medicare_enrollees total_medicare_utilization
## 1               Alabama                   528983                     512627
## 2                Alaska                   105831                      90717
## 3               Arizona                   767513                     720762
## 4              Arkansas                   430724                     409441
## 5            California                  3436742                    3009051
## 6              Colorado                   518796                     467294
## 7           Connecticut                   363399                     326323
## 8              Delaware                   171134                     162925
## 9  District Of Columbia                    70224                      57112
## 10              Florida                  2359900                    2272148
## 11              Georgia                   953980                     900795
## 12               Hawaii                   145637                     116113
## 13                Idaho                   215741                     201173
## 14             Illinois                  1507029                    1395073
## 15              Indiana                   775334                     745458
## 16                 Iowa                   465570                     456550
## 17               Kansas                   414903                     390372
## 18             Kentucky                   536961                     529922
## 19            Louisiana                   473575                     459953
## 20                Maine                   187420                     171388
## 21             Maryland                   898929                     779476
## 22        Massachusetts                   973092                     881664
## 23             Michigan                  1024125                     984680
## 24            Minnesota                   509904                     532333
## 25          Mississippi                   431483                     421242
## 26             Missouri                   707230                     661529
## 27              Montana                   188073                     172143
## 28             Nebraska                   272226                     259753
## 29               Nevada                   316990                     273357
## 30        New Hampshire                   234830                     208051
## 31           New Jersey                  1075739                     959117
## 32           New Mexico                   252475                     229007
## 33             New York                  1996240                    1817771
## 34       North Carolina                  1148967                    1106895
## 35         North Dakota                   105943                     118795
## 36                 Ohio                  1227265                    1180834
## 37             Oklahoma                   528377                     492453
## 38               Oregon                   459062                     412891
## 39         Pennsylvania                  1493883                    1393506
## 40         Rhode Island                   111983                     101231
## 41       South Carolina                   712041                     690388
## 42         South Dakota                   136475                     151688
## 43            Tennessee                   766844                     732420
## 44                Texas                  2347931                    2119870
## 45                 Utah                   242369                     220690
## 46              Vermont                   124384                     114169
## 47             Virginia                  1112731                    1016486
## 48           Washington                   865220                     775317
## 49        West Virginia                   254195                     255668
## 50            Wisconsin                   622437                     633184
## 51              Wyoming                   110505                     102034
##    total_medicare_utilization_per_enrollee total_medicare_payments
## 1                                     0.97              5653598984
## 2                                     0.86              1051813241
## 3                                     0.94              7745898078
## 4                                     0.95              4272207750
## 5                                     0.88             43255054191
## 6                                     0.90              4904130297
## 7                                     0.90              4433665245
## 8                                     0.95              1954121238
## 9                                     0.81               794142024
## 10                                    0.96             28720918314
## 11                                    0.94             10195174819
## 12                                    0.80              1088203917
## 13                                    0.93              1926353213
## 14                                    0.93             17004301474
## 15                                    0.96              8340501710
## 16                                    0.98              4587577906
## 17                                    0.94              4370255665
## 18                                    0.99              5434363793
## 19                                    0.97              5517151179
## 20                                    0.91              1716550370
## 21                                    0.87             10696224308
## 22                                    0.91             11347566027
## 23                                    0.96             11141909988
## 24                                    1.04              5858445932
## 25                                    0.98              4931158545
## 26                                    0.94              7409955196
## 27                                    0.92              1626583833
## 28                                    0.95              2904540192
## 29                                    0.86              3525162484
## 30                                    0.89              2200212231
## 31                                    0.89             13333090391
## 32                                    0.91              2187724333
## 33                                    0.91             26228103828
## 34                                    0.96             11611543435
## 35                                    1.12              1197587193
## 36                                    0.96             13045771320
## 37                                    0.93              5988040719
## 38                                    0.90              3952432859
## 39                                    0.93             15821015593
## 40                                    0.90              1090410347
## 41                                    0.97              7303580438
## 42                                    1.11              1555704395
## 43                                    0.96              7583702280
## 44                                    0.90             28089834378
## 45                                    0.91              2369071426
## 46                                    0.92              1145092454
## 47                                    0.91             10542973901
## 48                                    0.90              7559019903
## 49                                    1.01              2671150063
## 50                                    1.02              6356566138
## 51                                    0.92              1102331666
##    total_medicare_payments_per_enrollee total_medicare_payment_per_utilization
## 1                              10687.68                               11028.68
## 2                               9938.61                               11594.44
## 3                              10092.20                               10746.82
## 4                               9918.67                               10434.25
## 5                              12586.06                               14374.98
## 6                               9452.91                               10494.74
## 7                              12200.54                               13586.74
## 8                              11418.66                               11993.99
## 9                              11308.70                               13904.99
## 10                             12170.40                               12640.43
## 11                             10686.99                               11317.97
## 12                              7472.03                                9371.94
## 13                              8929.01                                9575.61
## 14                             11283.33                               12188.83
## 15                             10757.30                               11188.43
## 16                              9853.68                               10048.36
## 17                             10533.20                               11195.11
## 18                             10120.59                               10255.03
## 19                             11650.01                               11995.03
## 20                              9158.84                               10015.58
## 21                             11898.85                               13722.33
## 22                             11661.35                               12870.62
## 23                             10879.44                               11315.26
## 24                             11489.31                               11005.23
## 25                             11428.40                               11706.24
## 26                             10477.43                               11201.26
## 27                              8648.68                                9449.03
## 28                             10669.59                               11181.93
## 29                             11120.74                               12895.82
## 30                              9369.38                               10575.35
## 31                             12394.35                               13901.42
## 32                              8665.11                                9553.09
## 33                             13138.75                               14428.72
## 34                             10106.07                               10490.19
## 35                             11304.07                               10081.12
## 36                             10629.95                               11047.93
## 37                             11332.89                               12159.62
## 38                              8609.80                                9572.58
## 39                             10590.53                               11353.39
## 40                              9737.28                               10771.51
## 41                             10257.25                               10578.95
## 42                             11399.19                               10255.95
## 43                              9889.50                               10354.31
## 44                             11963.65                               13250.73
## 45                              9774.65                               10734.84
## 46                              9206.11                               10029.80
## 47                              9474.86                               10371.98
## 48                              8736.53                                9749.59
## 49                             10508.27                               10447.73
## 50                             10212.38                               10039.05
## 51                              9975.40                               10803.57

6- # Description for medicare

medicare_summary <- cdc_medicare_cleaned %>%
  summarise(
    across(
      .cols = everything() & !residence_state, 
      .fns = list(
        mean = ~ mean(.x, na.rm = TRUE),
        sd = ~ sd(.x, na.rm = TRUE),
        median = ~ median(.x, na.rm = TRUE),
        min = ~ min(.x, na.rm = TRUE),
        max = ~ max(.x, na.rm = TRUE)
      ) 

    )
  )

medicare_summary <- medicare_summary %>%
  pivot_longer(
    cols = everything(), 
     names_to = c("Name", "statistic"),
    names_pattern = "(.*)_(.*)", 
    values_to = "Value"
  )%>%
pivot_wider(
    names_from = Name,
    values_from = Value
  )
medicare_summary
## # A tibble: 5 × 7
##   statistic total_medicare_enrol…¹ total_medicare_utili…² total_medicare_utili…³
##   <chr>                      <dbl>                  <dbl>                  <dbl>
## 1 mean                     699634.                650860.                 0.936 
## 2 sd                       673035.                610270.                 0.0587
## 3 median                   509904                 467294                  0.93  
## 4 min                       70224                  57112                  0.8   
## 5 max                     3436742                3009051                  1.12  
## # ℹ abbreviated names: ¹​total_medicare_enrollees, ²​total_medicare_utilization,
## #   ³​total_medicare_utilization_per_enrollee
## # ℹ 3 more variables: total_medicare_payments <dbl>,
## #   total_medicare_payments_per_enrollee <dbl>,
## #   total_medicare_payment_per_utilization <dbl>

Description for medicaid senior enrollment

we already renamed columns with snake format, replaced all different forms of 0 or NA/NSD with NA

1- We will proceed with both Medicaid for seniors and Medicare 2- we renamed all variable with snake format with name and type of variables in alignment across all tables for example spending changes to numeric 2- we selected > 65 years old age group as age group of interest. This is compatible between seniors in medicaid and medicare population.

3- residence state trimmed and formatted as title format, United State filtered out we are going to use residence_state as a key variable across all data-sets

4- All spending and eligible adult enrollment changed to numeric 5- we change different forms of n/a, 0 to NA, as missing data 6- we create new variables total_per_senior_enrollee_spending by dividing seniors_medicaid_spending/seniors_medicaid_enrollement 7- we created descriptive statistics table

cdc_medicaid_controled <- cdc_medicaid %>%
  rename(
    
    residence_state = Location,
    seniors_medicaid_spending = Seniors.Spending,
    disability_medicaid_spending = Individuals.with.Disabilities.Spending,
    adult_medicaid_spending = Adult.Spending,
    children_medicaid_spending = Children.Spending,
    new_eligible_adult_medicaid_spending = Newly.Eligible.Adult.Spending,
    total_medicaide_spending = Total.Spending,
    
    seniors_medicaid_enrollement = Seniors.Enrollment,
    disability_medicaid_enrollement = Individuals.with.Disabilities.Enrollment,
    adult_medicaide_enrollement = Adult.Enrollment,
    children_medicaide_enrollement = Children.Enrollment,
    new_eligible_adult_medicaide_enrollement = Newly.Eligible.Adult.Enrollment,
    total_medicaid_enrollement = Total.Enrollment,

    ) %>%
      mutate(
        residence_state = str_to_title(str_trim(residence_state)),
        new_eligible_adult_medicaide_enrollement = 
          as.integer(new_eligible_adult_medicaide_enrollement)) %>%
      mutate(across(where(is.character), ~ na_if(., "N/A"))) %>%
      mutate(across(where(is.character), ~ na_if(., "NSD"))) %>%
      mutate(across(where(is.character), ~ na_if(., "$0"))) %>%
  filter(residence_state != "United States") %>%
 mutate(
    across(
      .cols = ends_with("spending"),
      .fns = ~ str_remove_all(., pattern = "\\$|,") %>% as.numeric()
    )
  ) %>%
  mutate(
    total_medicaid_spending_per_senior_enrollee = 
    round(seniors_medicaid_spending/seniors_medicaid_enrollement, digits = 2) 
    
  )
## Warning: There was 1 warning in `mutate()`.
## ℹ In argument: `new_eligible_adult_medicaide_enrollement =
##   as.integer(new_eligible_adult_medicaide_enrollement)`.
## Caused by warning:
## ! NAs introduced by coercion
str(cdc_medicaid_controled)
## 'data.frame':    51 obs. of  14 variables:
##  $ residence_state                            : chr  "Alabama" "Alaska" "Arizona" "Arkansas" ...
##  $ seniors_medicaid_spending                  : num  1.47e+09 2.59e+08 1.35e+09 1.38e+09 1.90e+10 ...
##  $ disability_medicaid_spending               : num  2.70e+09 5.32e+08 3.84e+09 2.81e+09 2.52e+10 ...
##  $ adult_medicaid_spending                    : num  2.73e+08 3.11e+08 1.78e+09 4.83e+08 8.67e+09 ...
##  $ children_medicaid_spending                 : num  1.04e+09 5.33e+08 2.18e+09 1.56e+09 1.10e+10 ...
##  $ new_eligible_adult_medicaid_spending       : num  NA 4.92e+08 3.40e+09 4.52e+08 2.09e+10 ...
##  $ total_medicaide_spending                   : num  5.48e+09 2.13e+09 1.26e+10 6.69e+09 8.47e+10 ...
##  $ seniors_medicaid_enrollement               : int  134000 14200 184900 80800 1492400 91500 154900 20600 28800 716900 ...
##  $ disability_medicaid_enrollement            : int  220900 17300 183700 151700 978400 117400 72600 27100 35800 655200 ...
##  $ adult_medicaide_enrollement                : int  209400 52300 465800 77900 3606700 229200 192800 50700 57300 945300 ...
##  $ children_medicaide_enrollement             : int  582500 101200 834000 380200 4507000 543000 361500 105800 80600 2425800 ...
##  $ new_eligible_adult_medicaide_enrollement   : int  NA 61200 566400 341700 4584700 500400 318700 80500 72800 17700 ...
##  $ total_medicaid_enrollement                 : int  1146700 246200 2234800 1032200 15169200 1481600 1100500 284800 275300 4760900 ...
##  $ total_medicaid_spending_per_senior_enrollee: num  10945 18271 7275 17110 12698 ...
head(cdc_medicaid_controled)
##   residence_state seniors_medicaid_spending disability_medicaid_spending
## 1         Alabama                1466624200                   2701743400
## 2          Alaska                 259441700                    532187200
## 3         Arizona                1345177900                   3838947200
## 4        Arkansas                1382463900                   2811310300
## 5      California               18951080000                  25165191200
## 6        Colorado                1617023900                   3162528300
##   adult_medicaid_spending children_medicaid_spending
## 1               272943800                 1036320400
## 2               311076400                  532837000
## 3              1784350200                 2176630700
## 4               482659300                 1560364400
## 5              8674119900                10984482900
## 6              1027262500                 1404759800
##   new_eligible_adult_medicaid_spending total_medicaide_spending
## 1                                   NA               5477631800
## 2                            491576900               2127119200
## 3                           3404972100              12550078100
## 4                            452025900               6688823800
## 5                          20881686900              84656561000
## 6                           2292057200               9503631700
##   seniors_medicaid_enrollement disability_medicaid_enrollement
## 1                       134000                          220900
## 2                        14200                           17300
## 3                       184900                          183700
## 4                        80800                          151700
## 5                      1492400                          978400
## 6                        91500                          117400
##   adult_medicaide_enrollement children_medicaide_enrollement
## 1                      209400                         582500
## 2                       52300                         101200
## 3                      465800                         834000
## 4                       77900                         380200
## 5                     3606700                        4507000
## 6                      229200                         543000
##   new_eligible_adult_medicaide_enrollement total_medicaid_enrollement
## 1                                       NA                    1146700
## 2                                    61200                     246200
## 3                                   566400                    2234800
## 4                                   341700                    1032200
## 5                                  4584700                   15169200
## 6                                   500400                    1481600
##   total_medicaid_spending_per_senior_enrollee
## 1                                    10944.96
## 2                                    18270.54
## 3                                     7275.16
## 4                                    17109.70
## 5                                    12698.39
## 6                                    17672.39
cdc_medicaid_cleaned <- cdc_medicaid_controled %>%
  select(residence_state, seniors_medicaid_enrollement, 
         seniors_medicaid_spending, total_medicaid_spending_per_senior_enrollee)
cdc_medicaid_cleaned
##         residence_state seniors_medicaid_enrollement seniors_medicaid_spending
## 1               Alabama                       134000                1466624200
## 2                Alaska                        14200                 259441700
## 3               Arizona                       184900                1345177900
## 4              Arkansas                        80800                1382463900
## 5            California                      1492400               18951080000
## 6              Colorado                        91500                1617023900
## 7           Connecticut                       154900                2195407000
## 8              Delaware                        20600                 349472800
## 9  District Of Columbia                        28800                 562522900
## 10              Florida                       716900                5978750800
## 11              Georgia                       249600                1927935100
## 12               Hawaii                        38000                 445085900
## 13                Idaho                        31000                 348395800
## 14             Illinois                       300900                3969177900
## 15              Indiana                       135800                1567340100
## 16                 Iowa                        49400                 822322400
## 17               Kansas                        43100                 743754600
## 18             Kentucky                       108900                 819633200
## 19            Louisiana                       153200                1242447900
## 20                Maine                        56700                 612090700
## 21             Maryland                       110100                1620141000
## 22        Massachusetts                       232900                4764951600
## 23             Michigan                       194800                3270458200
## 24            Minnesota                        92800                2596565200
## 25          Mississippi                       102000                1101635400
## 26             Missouri                       108700                1783013300
## 27              Montana                        18400                 271941400
## 28             Nebraska                        25900                 494097000
## 29               Nevada                        56900                 331559600
## 30        New Hampshire                        18900                 378080800
## 31           New Jersey                       171400                3320161500
## 32           New Mexico                        70100                 569619400
## 33             New York                       875000               18328779300
## 34       North Carolina                       217700                2577071900
## 35         North Dakota                        10400                 393545600
## 36                 Ohio                       246500                4079519900
## 37             Oklahoma                        74900                 929641700
## 38               Oregon                       101600                1499082000
## 39         Pennsylvania                       307600                8553622200
## 40         Rhode Island                        34600                 502304100
## 41       South Carolina                       112000                 951213300
## 42         South Dakota                        13600                 188888800
## 43            Tennessee                       159100                1710915900
## 44                Texas                       553900                7530567600
## 45                 Utah                        21800                 308655800
## 46              Vermont                        22600                 340713700
## 47             Virginia                       135800                2439919900
## 48           Washington                       142300                2345867400
## 49        West Virginia                        51400                 971740500
## 50            Wisconsin                       148600                1417385000
## 51              Wyoming                         9200                 127008300
##    total_medicaid_spending_per_senior_enrollee
## 1                                     10944.96
## 2                                     18270.54
## 3                                      7275.16
## 4                                     17109.70
## 5                                     12698.39
## 6                                     17672.39
## 7                                     14173.06
## 8                                     16964.70
## 9                                     19532.05
## 10                                     8339.73
## 11                                     7724.10
## 12                                    11712.79
## 13                                    11238.57
## 14                                    13191.02
## 15                                    11541.53
## 16                                    16646.20
## 17                                    17256.49
## 18                                     7526.48
## 19                                     8109.97
## 20                                    10795.25
## 21                                    14715.18
## 22                                    20459.22
## 23                                    16788.80
## 24                                    27980.23
## 25                                    10800.35
## 26                                    16403.07
## 27                                    14779.42
## 28                                    19077.10
## 29                                     5827.06
## 30                                    20004.28
## 31                                    19370.84
## 32                                     8125.81
## 33                                    20947.18
## 34                                    11837.72
## 35                                    37840.92
## 36                                    16549.78
## 37                                    12411.77
## 38                                    14754.74
## 39                                    27807.61
## 40                                    14517.46
## 41                                     8492.98
## 42                                    13888.88
## 43                                    10753.71
## 44                                    13595.54
## 45                                    14158.52
## 46                                    15075.83
## 47                                    17967.01
## 48                                    16485.36
## 49                                    18905.46
## 50                                     9538.26
## 51                                    13805.25
medicaid_senior_summary <- cdc_medicaid_cleaned %>%
  summarise(
    across(
      .cols = everything() & !residence_state, 
      .fns = list(
        mean = ~ round(mean(.x, na.rm = TRUE), digits =2),
        sd = ~ round(sd(.x, na.rm = TRUE), digits = 2),
        median = ~ median(.x, na.rm = TRUE),
        min = ~ min(.x, na.rm = TRUE),
        max = ~ max(.x, na.rm = TRUE)
      ) 

    )
  )

medicaid_senior_summary <- medicaid_senior_summary %>%
  pivot_longer(
    cols = everything(), 
     names_to = c("Name", "statistic"),
    names_pattern = "(.*)_(.*)", 
    values_to = "Value"
  )%>%
pivot_wider(
    names_from = Name,
    values_from = Value
  )
medicaid_senior_summary
## # A tibble: 5 × 4
##   statistic seniors_medicaid_enr…¹ seniors_medicaid_spe…² total_medicaid_spend…³
##   <chr>                      <dbl>                  <dbl>                  <dbl>
## 1 mean                     167198.            2398133647.                 14949.
## 2 sd                       252726.            3772715680.                  5780.
## 3 median                   102000             1345177900                  14517.
## 4 min                        9200              127008300                   5827.
## 5 max                     1492400            18951080000                  37841.
## # ℹ abbreviated names: ¹​seniors_medicaid_enrollement,
## #   ²​seniors_medicaid_spending, ³​total_medicaid_spending_per_senior_enrollee

********************Milestone #4,*********************

1- we joined cdc_population_cleaned, cdc_mortality_cleaned, cdc_medicare_cleaned, cdc_medicaid_cleaned

2- we calculated medicare utilization per enrollee and medicare per utilization payment as well as senior death rate per 100000, senior malignancy death rate per 100000, total medicare enrollment rate as well as senior medicaid enrollment rate

3- we reordered data in data set

4- we integrated and added US summaries for comparison purposes in our visualization 5- we combined descriptive statistics

combine_state_data <- cdc_population_cleaned  %>%
  left_join(cdc_mortality_cleaned, by = "residence_state") %>%
  
  left_join(cdc_medicare_cleaned, by = "residence_state") %>%
  
 
  left_join(cdc_medicaid_cleaned, by = "residence_state") %>%
 
  arrange(residence_state) %>%
  mutate( 
   
    
    total_senior_death_rate_per_100k = 
      round((total_death_count_65/ population_65) *100000, digits= 2),
    total_senior_malignant_death_rate_per_100k = 
      round((malignant_death_count_65/ population_65) *100000, digits= 2),
    total_senior_unknow_neoplasm_death_rate_per_100k = 
      round((unknown_neoplasm_death_count_65/ population_65) *100000, digits= 2),
   total_medicare_enrollement_rate = 
      round((total_medicare_enrollees/ population_65) *100000, digits= 2),
    seniors_medicaid_enrollement_rate = 
      round((seniors_medicaid_enrollement/ population_65) *100000, digits= 2),
  )  %>%


  
  relocate(
    total_senior_death_rate_per_100k, .after = total_death_count_65
    )  %>%
  relocate(
    total_senior_malignant_death_rate_per_100k, .after = malignant_death_count_65
    ) %>%
  relocate(
    total_senior_unknow_neoplasm_death_rate_per_100k, .after = 
      unknown_neoplasm_death_count_65
    )%>%
  relocate(
    total_medicare_enrollement_rate, .after =  total_medicare_enrollees
    )%>%
  relocate(
    seniors_medicaid_enrollement_rate, .after =  seniors_medicaid_enrollement
    )

combine_state_data
## # A tibble: 51 × 20
##    residence_state_code residence_state      population_65 total_death_count_65
##    <chr>                <chr>                        <int>                <int>
##  1 01                   Alabama                     888817                46667
##  2 02                   Alaska                       97663                 3365
##  3 03                   Arizona                    1333046                57512
##  4 04                   Arkansas                    528101                27283
##  5 05                   California                 5957092               238984
##  6 06                   Colorado                    879653                33294
##  7 07                   Connecticut                 649235                25397
##  8 08                   Delaware                    201646                 7863
##  9 09                   District Of Columbia         85838                 3353
## 10 10                   Florida                    4598386               192675
## # ℹ 41 more rows
## # ℹ 16 more variables: total_senior_death_rate_per_100k <dbl>,
## #   malignant_death_count_65 <int>,
## #   total_senior_malignant_death_rate_per_100k <dbl>,
## #   unknown_neoplasm_death_count_65 <int>,
## #   total_senior_unknow_neoplasm_death_rate_per_100k <dbl>,
## #   total_medicare_enrollees <int>, total_medicare_enrollement_rate <dbl>, …
str(combine_state_data)
## tibble [51 × 20] (S3: tbl_df/tbl/data.frame)
##  $ residence_state_code                            : chr [1:51] "01" "02" "03" "04" ...
##  $ residence_state                                 : chr [1:51] "Alabama" "Alaska" "Arizona" "Arkansas" ...
##  $ population_65                                   : int [1:51] 888817 97663 1333046 528101 5957092 879653 649235 201646 85838 4598386 ...
##  $ total_death_count_65                            : int [1:51] 46667 3365 57512 27283 238984 33294 25397 7863 3353 192675 ...
##  $ total_senior_death_rate_per_100k                : num [1:51] 5250 3446 4314 5166 4012 ...
##  $ malignant_death_count_65                        : int [1:51] 7440 807 9723 4697 44095 6035 4986 1641 710 35521 ...
##  $ total_senior_malignant_death_rate_per_100k      : num [1:51] 837 826 729 889 740 ...
##  $ unknown_neoplasm_death_count_65                 : int [1:51] 159 0 341 111 1022 159 179 36 0 1003 ...
##  $ total_senior_unknow_neoplasm_death_rate_per_100k: num [1:51] 17.9 0 25.6 21 17.2 ...
##  $ total_medicare_enrollees                        : int [1:51] 528983 105831 767513 430724 3436742 518796 363399 171134 70224 2359900 ...
##  $ total_medicare_enrollement_rate                 : num [1:51] 59515 108363 57576 81561 57692 ...
##  $ total_medicare_utilization                      : int [1:51] 512627 90717 720762 409441 3009051 467294 326323 162925 57112 2272148 ...
##  $ total_medicare_utilization_per_enrollee         : num [1:51] 0.97 0.86 0.94 0.95 0.88 0.9 0.9 0.95 0.81 0.96 ...
##  $ total_medicare_payments                         : num [1:51] 5.65e+09 1.05e+09 7.75e+09 4.27e+09 4.33e+10 ...
##  $ total_medicare_payments_per_enrollee            : num [1:51] 10688 9939 10092 9919 12586 ...
##  $ total_medicare_payment_per_utilization          : num [1:51] 11029 11594 10747 10434 14375 ...
##  $ seniors_medicaid_enrollement                    : int [1:51] 134000 14200 184900 80800 1492400 91500 154900 20600 28800 716900 ...
##  $ seniors_medicaid_enrollement_rate               : num [1:51] 15076 14540 13870 15300 25052 ...
##  $ seniors_medicaid_spending                       : num [1:51] 1.47e+09 2.59e+08 1.35e+09 1.38e+09 1.90e+10 ...
##  $ total_medicaid_spending_per_senior_enrollee     : num [1:51] 10945 18271 7275 17110 12698 ...
us_data <- combine_state_data %>%
  summarise(
    residence_state_code = "52", 
    residence_state = "United State Sum/Average",
    population_65 = sum(population_65, na.rm = TRUE),
    total_death_count_65 = sum(total_death_count_65, na.rm = TRUE),
    total_senior_death_rate_per_100k = 
      round((total_death_count_65 / population_65) * 100000, digits = 2),
    malignant_death_count_65 = sum(malignant_death_count_65, na.rm = TRUE),
    total_senior_malignant_death_rate_per_100k = 
      round((malignant_death_count_65 / population_65) * 100000, digits = 2),
    unknown_neoplasm_death_count_65 = sum(unknown_neoplasm_death_count_65, na.rm = TRUE),
    total_senior_unknow_neoplasm_death_rate_per_100k = 
      round((unknown_neoplasm_death_count_65 / population_65) * 100000, digits = 2),
    total_medicare_enrollees = sum(total_medicare_enrollees, na.rm = TRUE),
      total_medicare_enrollement_rate = 
      round((total_medicare_enrollees / population_65) * 100000, digits = 2),
    total_medicare_utilization = sum(total_medicare_utilization, na.rm = TRUE),
    total_medicare_utilization_per_enrollee = 
      round((total_medicare_enrollees / total_medicare_utilization), digits = 2),
    total_medicare_payments = sum(total_medicare_payments, na.rm = TRUE),
    total_medicare_payments_per_enrollee = 
      round((total_medicare_payments / total_medicare_enrollees), digits = 2),
    total_medicare_payment_per_utilization = 
      round((total_medicare_payments / total_medicare_utilization), digits = 2),
    seniors_medicaid_enrollement = sum(seniors_medicaid_enrollement, na.rm = TRUE),
    seniors_medicaid_enrollement_rate = 
      round((seniors_medicaid_enrollement / population_65) * 100000, digits = 2),
    seniors_medicaid_spending = sum(seniors_medicaid_spending, na.rm = TRUE),
    total_medicaid_spending_per_senior_enrollee = 
      round((seniors_medicaid_spending / seniors_medicaid_enrollement), digits = 2),
      
  )
us_data
## # A tibble: 1 × 20
##   residence_state_code residence_state        population_65 total_death_count_65
##   <chr>                <chr>                          <int>                <int>
## 1 52                   United State Sum/Aver…      55847953              2468669
## # ℹ 16 more variables: total_senior_death_rate_per_100k <dbl>,
## #   malignant_death_count_65 <int>,
## #   total_senior_malignant_death_rate_per_100k <dbl>,
## #   unknown_neoplasm_death_count_65 <int>,
## #   total_senior_unknow_neoplasm_death_rate_per_100k <dbl>,
## #   total_medicare_enrollees <int>, total_medicare_enrollement_rate <dbl>,
## #   total_medicare_utilization <int>, …
combine_state_data_us <- bind_rows(combine_state_data, us_data)
combine_state_data_us
## # A tibble: 52 × 20
##    residence_state_code residence_state      population_65 total_death_count_65
##    <chr>                <chr>                        <int>                <int>
##  1 01                   Alabama                     888817                46667
##  2 02                   Alaska                       97663                 3365
##  3 03                   Arizona                    1333046                57512
##  4 04                   Arkansas                    528101                27283
##  5 05                   California                 5957092               238984
##  6 06                   Colorado                    879653                33294
##  7 07                   Connecticut                 649235                25397
##  8 08                   Delaware                    201646                 7863
##  9 09                   District Of Columbia         85838                 3353
## 10 10                   Florida                    4598386               192675
## # ℹ 42 more rows
## # ℹ 16 more variables: total_senior_death_rate_per_100k <dbl>,
## #   malignant_death_count_65 <int>,
## #   total_senior_malignant_death_rate_per_100k <dbl>,
## #   unknown_neoplasm_death_count_65 <int>,
## #   total_senior_unknow_neoplasm_death_rate_per_100k <dbl>,
## #   total_medicare_enrollees <int>, total_medicare_enrollement_rate <dbl>, …
head(combine_state_data_us)
## # A tibble: 6 × 20
##   residence_state_code residence_state population_65 total_death_count_65
##   <chr>                <chr>                   <int>                <int>
## 1 01                   Alabama                888817                46667
## 2 02                   Alaska                  97663                 3365
## 3 03                   Arizona               1333046                57512
## 4 04                   Arkansas               528101                27283
## 5 05                   California            5957092               238984
## 6 06                   Colorado               879653                33294
## # ℹ 16 more variables: total_senior_death_rate_per_100k <dbl>,
## #   malignant_death_count_65 <int>,
## #   total_senior_malignant_death_rate_per_100k <dbl>,
## #   unknown_neoplasm_death_count_65 <int>,
## #   total_senior_unknow_neoplasm_death_rate_per_100k <dbl>,
## #   total_medicare_enrollees <int>, total_medicare_enrollement_rate <dbl>,
## #   total_medicare_utilization <int>, …
str(combine_state_data_us)
## tibble [52 × 20] (S3: tbl_df/tbl/data.frame)
##  $ residence_state_code                            : chr [1:52] "01" "02" "03" "04" ...
##  $ residence_state                                 : chr [1:52] "Alabama" "Alaska" "Arizona" "Arkansas" ...
##  $ population_65                                   : int [1:52] 888817 97663 1333046 528101 5957092 879653 649235 201646 85838 4598386 ...
##  $ total_death_count_65                            : int [1:52] 46667 3365 57512 27283 238984 33294 25397 7863 3353 192675 ...
##  $ total_senior_death_rate_per_100k                : num [1:52] 5250 3446 4314 5166 4012 ...
##  $ malignant_death_count_65                        : int [1:52] 7440 807 9723 4697 44095 6035 4986 1641 710 35521 ...
##  $ total_senior_malignant_death_rate_per_100k      : num [1:52] 837 826 729 889 740 ...
##  $ unknown_neoplasm_death_count_65                 : int [1:52] 159 0 341 111 1022 159 179 36 0 1003 ...
##  $ total_senior_unknow_neoplasm_death_rate_per_100k: num [1:52] 17.9 0 25.6 21 17.2 ...
##  $ total_medicare_enrollees                        : int [1:52] 528983 105831 767513 430724 3436742 518796 363399 171134 70224 2359900 ...
##  $ total_medicare_enrollement_rate                 : num [1:52] 59515 108363 57576 81561 57692 ...
##  $ total_medicare_utilization                      : int [1:52] 512627 90717 720762 409441 3009051 467294 326323 162925 57112 2272148 ...
##  $ total_medicare_utilization_per_enrollee         : num [1:52] 0.97 0.86 0.94 0.95 0.88 0.9 0.9 0.95 0.81 0.96 ...
##  $ total_medicare_payments                         : num [1:52] 5.65e+09 1.05e+09 7.75e+09 4.27e+09 4.33e+10 ...
##  $ total_medicare_payments_per_enrollee            : num [1:52] 10688 9939 10092 9919 12586 ...
##  $ total_medicare_payment_per_utilization          : num [1:52] 11029 11594 10747 10434 14375 ...
##  $ seniors_medicaid_enrollement                    : int [1:52] 134000 14200 184900 80800 1492400 91500 154900 20600 28800 716900 ...
##  $ seniors_medicaid_enrollement_rate               : num [1:52] 15076 14540 13870 15300 25052 ...
##  $ seniors_medicaid_spending                       : num [1:52] 1.47e+09 2.59e+08 1.35e+09 1.38e+09 1.90e+10 ...
##  $ total_medicaid_spending_per_senior_enrollee     : num [1:52] 10945 18271 7275 17110 12698 ...
combine_statistics <- population_65_summary  %>%
  left_join(mortality_summary, by = "statistic") %>%
  
  left_join(medicare_summary, by = "statistic") %>%
  
 
  left_join(medicaid_senior_summary, by = "statistic")
 
  

combine_statistics
## # A tibble: 5 × 14
##   statistic Population_65 total_death_count_65 malignant_death_count_65
##   <chr>             <dbl>                <dbl>                    <dbl>
## 1 mean           1095058.               48405.                    8752.
## 2 sd             1185180.               50340.                    8997.
## 3 median          770260                34881                     6505 
## 4 min              85838                 3353                      710 
## 5 max            5957092               238984                    44095 
## # ℹ 10 more variables: unknown_neoplasm_death_count_65 <dbl>,
## #   total_medicare_enrollees <dbl>, total_medicare_utilization <dbl>,
## #   total_medicare_utilization_per_enrollee <dbl>,
## #   total_medicare_payments <dbl>, total_medicare_payments_per_enrollee <dbl>,
## #   total_medicare_payment_per_utilization <dbl>,
## #   seniors_medicaid_enrollement <dbl>, seniors_medicaid_spending <dbl>,
## #   total_medicaid_spending_per_senior_enrollee <dbl>

1- We generated a table to show mortality rate per 100,000 among seniors in all 50 states+ DC and compare it with average U.S. mortality rates. Our table includes both total mortality rate and malignancy-related mortality rate. We highlighted total mortality state rates between 10% and 20% above U.S. average in yellow and more than 20 % above U.S. average in red.

2 - We created a bar graph with overlay to show both total and cancer mortality rates per 100,000 among seniors in all 50 states+ DC and compare it with average U.S. mortality rates.

coord_flip() was used to swap x and y axis for better visualization.

We added interpretation both to the table and graph.

3 - We created a table to show Medicare and Medicaid coverage rate among seniors in all 50 states + DC and compare it with average U.S. rate. We differentially highlighted Medicare coverage rate below 85% of U.S. coverage in red, between 85%-90% of U.S. coverage in yellow. We added interpretation both to the table and the graph.

library(pacman)
p_load(tidyverse, DT, formattable)
library (scales)
# total mortality pull
total_mortality_rate_standard_us <- combine_state_data_us%>%
  filter(residence_state == "United State Sum/Average") %>% 
  pull(total_senior_death_rate_per_100k)

total_mortality_yellow_limit <- total_mortality_rate_standard_us * 1.1 
total_mortality_red_limit <- total_mortality_rate_standard_us * 1.2

# cancer mortality pull
cancer_mortality_rate_standard_us <- combine_state_data_us%>%
  filter(residence_state == "United State Sum/Average") %>% 
  pull(total_senior_malignant_death_rate_per_100k)

cancer_mortality_yellow_limit <- cancer_mortality_rate_standard_us * 1.1 
cancer_mortality_red_limit <- cancer_mortality_rate_standard_us * 1.2

# benign mortality pull
benign_mortality_rate_standard_us <- combine_state_data_us%>%
  filter(residence_state == "United State Sum/Average") %>% 
  pull(total_senior_unknow_neoplasm_death_rate_per_100k)

benign_mortality_yellow_limit <- benign_mortality_rate_standard_us * 1.1 
benign_mortality_red_limit <- benign_mortality_rate_standard_us * 1.2


table_mortality <- combine_state_data_us %>%
  mutate(is_us = ifelse(residence_state == "United State Sum/Average", TRUE, FALSE)) %>%
  select(
    "State Name" = residence_state,
    "Total Deaths Count (65+)" = total_death_count_65,
    "Death Rate per 100k (65+)" = total_senior_death_rate_per_100k,  
    "Malignancy Deaths Count (65+)" = malignant_death_count_65,
    "Malignancy Death Rate per 100k (65+)" = total_senior_malignant_death_rate_per_100k, 
    "Benign neoplasm Deaths Count (65+)" = unknown_neoplasm_death_count_65,
    "Benign Neoplasm Death Rate per 100k (65+)" = total_senior_unknow_neoplasm_death_rate_per_100k,
    is_us
) 
   

mortality_output <- datatable(table_mortality,
          options = list(
            pageLength = 52, 
            autoWidth = TRUE),
          rownames = FALSE,
          caption = "Mortality Counts and Rates for Seniors (States Compared to US)") %>%
# Highlight  total mortality rate
  formatStyle(
    columns = "Death Rate per 100k (65+)",
    valueColumns = "Death Rate per 100k (65+)",
    backgroundColor = styleInterval(
      cuts = c(total_mortality_rate_standard_us, 
               total_mortality_yellow_limit, total_mortality_red_limit),
      values = c("#CCFFCC", "white", "#FFFFCC", "#FFCCCC")
    )
  )%>%

# Highlight  cancer mortality rate
  formatStyle(
    columns = "Malignancy Death Rate per 100k (65+)",
    valueColumns = "Malignancy Death Rate per 100k (65+)",
    backgroundColor = styleInterval(
      cuts = c(cancer_mortality_rate_standard_us, 
               cancer_mortality_yellow_limit, cancer_mortality_red_limit),
      values = c("#CCFFCC", "white", "#FFFFCC", "#FFCCCC")
    )
  )%>%
  
  
# Highlight  benign mortality rate
  formatStyle(
    columns = "Benign Neoplasm Death Rate per 100k (65+)",
    valueColumns = "Benign Neoplasm Death Rate per 100k (65+)",
    backgroundColor = styleInterval(
      cuts = c(benign_mortality_rate_standard_us, 
               benign_mortality_yellow_limit, benign_mortality_red_limit),
      values = c("#CCFFCC", "white", "#FFFFCC", "#FFCCCC")
    )
  )
print(mortality_output)
paste0(cat("Colors indicate rates 10% (yellow) and 20% (red) above the average."))
## Colors indicate rates 10% (yellow) and 20% (red) above the average.
## character(0)

#Green is equal or below U.S. average; white is up to 10% above U.S. average; #yellow is 10% to 20% above U.S. average; red is more than 20% above U.S. # average

#Three states (Mississipi, Ohlahoma, West Virginia) have the mortality rate #20% above U.S. average mortality rate. #Seven states (Alabama, Arkansas,Indiana, Kentucky, Louisiana, Ohio, Tennessee) #have the mortality rate 10% to 20% above than U.S. average mortality rate.

library(pacman)
p_load(tidyverse, plotly)
library (scales)
library(ggplot2)
library(dplyr)
library(forcats)

graph_mortality <- combine_state_data_us %>%
  select(
    state_name= residence_state,
   total_death_rate = total_senior_death_rate_per_100k,  
   malignancy_death_rate = total_senior_malignant_death_rate_per_100k 
    
)%>%
   mutate(
      state_name = factor(state_name),
     state_name = fct_rev(state_name),
    state_name = fct_relevel(state_name, "United State Sum/Average", after = Inf)
  )
   
plot_mortality <- ggplot(graph_mortality, aes(x =  state_name)) +
                           geom_bar(aes(y= malignancy_death_rate ), 
                                    stat = "identity", 
                                    fill = "#FFCCCC", 
                                    alpha = 0.8) +
  geom_bar(aes(y= total_death_rate),
           stat =  "identity", fill = "#CCFFCC", alpha = 0.4) + 
  coord_flip() +
  labs(
    title = "Total vs. Malignancy Mortality Rates per 100k Seniors",
    x = "State",
    y = "Mortality Rate per 100k Seniors",
   caption= "light bars show total death rate and darker bars show malignancy death rate. 
   Variability in malignancy mortality rates in seniors across states is less than 
   variabiliy in total mortality rates in seniors across states. "
  ) +
  theme_minimal() +
    theme(axis.text.y = element_text(size = 8),
          plot.margin = margin(t = 10, r = 10, b = 20, l = 10, unit = "pt"),
          plot.caption = element_text(hjust = 0.5))

print(plot_mortality) 

##

library(pacman)
p_load(tidyverse, DT, formattable)
library (scales)

# total medicare enrollement pull
medicare_coverage_rate_standard_us <- combine_state_data_us%>%
  filter(residence_state == "United State Sum/Average") %>% 
  pull(total_medicare_enrollement_rate)

medicare_emrollement_yellow_limit<- medicare_coverage_rate_standard_us * 0.90
medicare_emrollement_red_limit <- medicare_coverage_rate_standard_us * 0.85


#  medicare payment/ enrollee  pull
medicare_payment_per_enrollee_standard_us  <- combine_state_data_us%>% 
 filter(residence_state =="United State Sum/Average") %>% 
  pull(total_medicare_payments_per_enrollee)

medicare_payment_per_enrollee_yellow_limit<- medicare_payment_per_enrollee_standard_us * 0.90
medicare_payment_per_enrollee_red_limit <- medicare_payment_per_enrollee_standard_us * 0.85

# medicaid coverage rate for seniors pull
seniors_medicaid_enrollement_rate_standard_us <- combine_state_data_us%>%
  filter(residence_state == "United State Sum/Average") %>% 
  pull(seniors_medicaid_enrollement_rate)

seniors_medicaid_enrollement_rate_yellow_limit <- seniors_medicaid_enrollement_rate_standard_us * 0.90
seniors_medicaid_enrollement_rate_red_limit <- seniors_medicaid_enrollement_rate_standard_us * 0.85

# medicaid spending pull
total_medicaid_spending_per_senior_enrollee_standard_us <- combine_state_data_us%>%
  filter(residence_state == "United State Sum/Average") %>% 
  pull(total_medicaid_spending_per_senior_enrollee)

total_medicaid_spending_per_senior_enrollee_yellow_limit <- total_medicaid_spending_per_senior_enrollee_standard_us * 0.90
total_medicaid_spending_per_senior_enrollee_red_limit <- total_medicaid_spending_per_senior_enrollee_standard_us *  0.85

table_coverage <- combine_state_data_us %>%
  select(
    "State Name" = residence_state,
    "TotalMedicare Enrollees" = total_medicare_enrollees,
    "Total Medicare Enrollment Rate" = total_medicare_enrollement_rate,
    "Total_Medicare_Payments/Enrollee" = total_medicare_payments_per_enrollee, 
    "Total Senior Medicaid Enrollees" = seniors_medicaid_enrollement,
    "Senior Medicaid Enrollment Rate" = seniors_medicaid_enrollement_rate,
    "Medicaid Spending / Senior Enrollee" = total_medicaid_spending_per_senior_enrollee
  )
   

coverage_output <- datatable(table_coverage,
          options = list(pageLength = 52, autoWidth = TRUE),
          rownames = FALSE,
          caption = "Medicare and Medicaid Coverage and spending for Seniors (States Compared to US)") %>%
  
  formatStyle(
    columns =  "Total Medicare Enrollment Rate",
    valueColumns =  "Total Medicare Enrollment Rate",
    backgroundColor = styleInterval(
      cuts = c(medicare_emrollement_red_limit,
               medicare_emrollement_yellow_limit,
               medicare_coverage_rate_standard_us),
      values = c("#FFCCCC", "#FFFFCC", "white", "#CCFFCC")
    )
  )%>%

formatStyle(
    columns =  "Total_Medicare_Payments/Enrollee",
    valueColumns =  "Total_Medicare_Payments/Enrollee",
    backgroundColor = styleInterval(
      cuts = c(medicare_payment_per_enrollee_red_limit, 
               medicare_payment_per_enrollee_yellow_limit,
               medicare_payment_per_enrollee_standard_us),
      values = c("#FFCCCC", "#FFFFCC", "white", "#CCFFCC")
    )
  )%>%
  
  formatStyle(
    columns =  "Senior Medicaid Enrollment Rate",
    valueColumns =  "Senior Medicaid Enrollment Rate",
    backgroundColor = styleInterval(
      cuts = c(seniors_medicaid_enrollement_rate_red_limit, 
               seniors_medicaid_enrollement_rate_yellow_limit,
               seniors_medicaid_enrollement_rate_standard_us),
      values = c("#FFCCCC", "#FFFFCC", "white", "#CCFFCC")
    )
  ) %>%
  
  formatStyle(
    columns =  "Medicaid Spending / Senior Enrollee",
    valueColumns =  "Medicaid Spending / Senior Enrollee",
    backgroundColor = styleInterval(
      cuts = c(total_medicaid_spending_per_senior_enrollee_red_limit,
               total_medicaid_spending_per_senior_enrollee_yellow_limit,
               total_medicaid_spending_per_senior_enrollee_standard_us),
      values = c("#FFCCCC", "#FFFFCC", "white", "#CCFFCC")
    )
  )
print(coverage_output)
paste0(cat("Colors indicate rates 15% (red) and 10% (yellow) above the average US."))
## Colors indicate rates 15% (red) and 10% (yellow) above the average US.
## character(0)

# Green color highlights rate equals or more than U.S. average White color highlights rate up to 10% less than U.S. average Yellow color highlights rate 10% to 15 % less than U.S. average Red color highlights rate lower than 15% below U.S. average

Notably, Three states (Florida, Hawaii, Minnesota) have Medicare enrollment rate lower than 15% below U.S. average.

library(pacman)
p_load(tidyverse, plotly)
library (scales)
library(ggplot2)
library(dplyr)
library(forcats)

graph_coverage <- combine_state_data_us %>%
  select(
    state_name= residence_state,
   medicare_coverage_rate = total_medicare_enrollement_rate,  
   medicaid_seniors_coverage_rate= seniors_medicaid_enrollement_rate
    
)%>%
   mutate(
     state_name = factor(state_name),
     state_name = fct_rev(state_name),
    state_name = fct_relevel(state_name, "United State Sum/Average", after = Inf)
  )
   
plot_coverage <- ggplot(graph_coverage, aes(x =  state_name)) +
                           geom_bar(aes(y= medicare_coverage_rate), 
                                    stat = "identity", 
                                    fill = "#FFCCCC", 
                                    alpha = 0.8) +
  geom_bar(aes(y= medicaid_seniors_coverage_rate),
           stat =  "identity", fill = "#CCFFCC", alpha = 0.4) + 
  coord_flip() +
  labs(
    title = "Medicare vs Medicaid Coverage Rates per 100k Seniors",
    x = "State",
    y = "Rate per 100k Seniors",
   caption= "Pink bars show Medicare coverage rate and green bars show 
   medicaid coverage rate for seniors
   Notable variability exists in emrollment both for Medicare and Medicaid
   across different states."
  ) +
  theme_minimal() +
    theme(axis.text.y = element_text(size = 8),
          plot.margin = margin(t = 10, r = 10, b = 20, l = 10, unit = "pt"),
          plot.caption = element_text(hjust = 0.5))

print(plot_coverage) 

library(pacman)
p_load(tidyverse, plotly, DT, formattable)
library (scales)
library(ggplot2)
library(dplyr)
library(forcats)

graph_spending <- combine_state_data_us %>%
  select(
    state_name= residence_state,
   medicare_payment_rate = total_medicare_payments_per_enrollee,  
   medicaid_seniors_spending_rate= total_medicaid_spending_per_senior_enrollee
    
)%>%
   mutate(
     state_name = factor(state_name),
     state_name = fct_rev(state_name),
    state_name = fct_relevel(state_name, "United State Sum/Average", after = Inf)
  )
   
plot_spending <- ggplot(graph_spending, aes(x =  state_name)) +
                           geom_bar(aes(y= medicare_payment_rate), 
                                    stat = "identity", 
                                    fill = "#FFCCCC", 
                                    alpha = 0.8) +
  geom_bar(aes(y= medicaid_seniors_spending_rate),
           stat =  "identity", fill = "#CCFFCC", alpha = 0.4) + 
  coord_flip() +
  labs(
    title = "Medicare vs Medicaid Spending Rates per Enrollee",
    x = "State",
    y = "Rate per Senior Enrollee",
   caption= "Pink bars show Medicare payment rate and green bars show 
   medicaid spending rate for seniors
   Notable variability exists in emrollment both for Medicare and Medicaid
   across different states."
  ) +
  theme_minimal() +
    theme(axis.text.y = element_text(size = 8),
          plot.margin = margin(t = 10, r = 10, b = 20, l = 10, unit = "pt"),
          plot.caption = element_text(hjust = 0.5))

print(plot_spending) 

library(pacman)
p_load(tidyverse, DT, formattable)
library (scales)



table_trend <- combine_state_data_us %>%
  select(
    "State Name" = residence_state,
    "Death Rate per 100k (65+)" = total_senior_death_rate_per_100k,  
    "Malignancy Death Rate per 100k (65+)" = total_senior_malignant_death_rate_per_100k, 
    "Benign Neoplasm Death Rate per 100k (65+)" = total_senior_unknow_neoplasm_death_rate_per_100k,
    "Total Medicare Enrollment Rate" = total_medicare_enrollement_rate,
    "Total_Medicare_Payments/Enrollee" = total_medicare_payments_per_enrollee, 
    "Senior Medicaid Enrollment Rate" = seniors_medicaid_enrollement_rate,
    "Medicaid Spending / Senior Enrollee" = total_medicaid_spending_per_senior_enrollee
  )
   

trend_output <- datatable(table_trend,
          options = list(pageLength = 52, autoWidth = TRUE),
          rownames = FALSE,
          caption = "Mortality trend vs Medicare and Medicaid Coverage and spending for Seniors (States Compared to US)"
          )
print(trend_output)
paste0(cat("Colors indicate rates 15% (red) and 10% (yellow) above the average US."))
## Colors indicate rates 15% (red) and 10% (yellow) above the average US.
## character(0)
library(pacman)
p_load(tidyverse, plotly, DT, formattable)
library (scales)
library(ggplot2)
library(dplyr)
library(forcats)
library(patchwork)
states_only_data <- combine_state_data_us %>%
  filter(residence_state != "United State Sum/Average")

us_only_data <- combine_state_data_us %>%
  filter(residence_state == "United State Sum/Average")


medicare_enrollment_trend <- ggplot(states_only_data, 
       aes(x = total_medicare_enrollement_rate, 
           y = total_senior_death_rate_per_100k)) +
  geom_point(aes(color = "States"), alpha = 0.6, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "blue", linetype = "dashed") + 
  geom_point(data = us_only_data, aes(color = "US"), size = 5, shape = 18) + 
  labs(
    title = "A: Medicare Enrollement Rate",

    x = "Total Medicare Enrollment Rate",
    y = "Moratality Rate per 100k (65+)",
  
  ) + scale_color_manual(values = c("States" = "black", "US" = "green")) +
  theme_minimal() +
  theme(plot.title = element_text(size = 12, hjust = 0.5, face = "bold"))

medicare_payment_trend <- ggplot(states_only_data, 
       aes(x = total_medicare_payments_per_enrollee, 
           y = total_senior_death_rate_per_100k)) +
  geom_point(aes(color = "States"), alpha = 0.6, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "blue", linetype = "dashed") + 
  geom_point(data = us_only_data, aes(color = "US"), size = 5, shape = 18) + 
  labs(
    title = "B: Medicare Payment Rate",

    x = "Total_Medicare_Payments/Enrollee",
    y = "Moratality Rate per 100k (65+)",
  
  ) + scale_color_manual(values = c("States" = "black", "US" = "green")) +
  theme_minimal() +
  theme(plot.title = element_text(size = 12, hjust = 0.5, face = "bold")) 


seniors_medicaid_enrollment_trend <- ggplot(states_only_data, 
       aes(x = seniors_medicaid_enrollement_rate, 
           y = total_senior_death_rate_per_100k)) +
  geom_point(aes(color = "States"), alpha = 0.6, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "blue", linetype = "dashed") + 
  geom_point(data = us_only_data, aes(color = "US"), size = 5, shape = 18) + 
  labs(
    title = "C: Medicaid Seniors Enrollement Rate",

    x = "Senior Medicaid Enrollment Rate",
    y = "Moratality Rate per 100k (65+)",
  
  ) + scale_color_manual(values = c("States" = "black", "US" = "green")) +
  theme_minimal() +
  theme(plot.title = element_text(size = 12, hjust = 0.5, face = "bold"))

seniors_medicaid_spending_trend <- ggplot(states_only_data, 
       aes(x = total_medicaid_spending_per_senior_enrollee, 
           y = total_senior_death_rate_per_100k)) +
  geom_point(aes(color = "States"), alpha = 0.6, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "blue", linetype = "dashed") + 
  geom_point(data = us_only_data, aes(color = "US"), size = 5, shape = 18) + 
  labs(
    title = "D:  Medicaid Seniors Spending Rate",

    x = "Total_Medicaid_Payments/Enrollee",
    y = "Moratality Rate per 100k (65+)",
  
  ) + scale_color_manual(values = c("States" = "black", "US" = "green")) +
  theme_minimal() +
  theme(plot.title = element_text(size = 12, hjust = 0.5, face = "bold")) 

(medicare_enrollment_trend | medicare_payment_trend) / (seniors_medicaid_enrollment_trend | seniors_medicaid_spending_trend) +
  plot_annotation(
    title = "Mortality Trend Analysis vs Public Health Coverage Metrics across US States"
  
  ) &
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 8, margin = margin(t = 5)), 
    axis.title.y = element_text(size = 12),
    axis.text.y = element_text(size = 12))

library(pacman)
p_load(tidyverse, plotly, DT, formattable)
library (scales)
library(ggplot2)
library(dplyr)
library(forcats)
library(patchwork)
states_only_data <- combine_state_data_us %>%
  filter(residence_state != "United State Sum/Average")

us_only_data <- combine_state_data_us %>%
  filter(residence_state == "United State Sum/Average")


medicare_enrollment_cancer_trend <- ggplot(states_only_data, 
       aes(x = total_medicare_enrollement_rate, 
           y = total_senior_malignant_death_rate_per_100k)) +
  geom_point(aes(color = "States"), alpha = 0.6, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "blue", linetype = "dashed") + 
  geom_point(data = us_only_data, aes(color = "US"), size = 5, shape = 18) + 
  labs(
    title = "A: Medicare Enrollement Rate",

    x = "Total Medicare Enrollment Rate",
    y = "Cancer Moratality Rate per 100k (65+)",
  
  ) + scale_color_manual(values = c("States" = "black", "US" = "green")) +
  theme_minimal() +
  theme(plot.title = element_text(size = 12, hjust = 0.5, face = "bold"))

medicare_payment_cancer_trend <- ggplot(states_only_data, 
       aes(x = total_medicare_payments_per_enrollee, 
           y = total_senior_malignant_death_rate_per_100k)) +
  geom_point(aes(color = "States"), alpha = 0.6, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "blue", linetype = "dashed") + 
  geom_point(data = us_only_data, aes(color = "US"), size = 5, shape = 18) + 
  labs(
    title = "B: Medicare Payment Rate",

    x = "Total_Medicare_Payments/Enrollee",
    y = "Cancer Moratality Rate per 100k (65+)",
  
  ) + scale_color_manual(values = c("States" = "black", "US" = "green")) +
  theme_minimal() +
  theme(plot.title = element_text(size = 12, hjust = 0.5, face = "bold")) 


seniors_medicaid_enrollment_cancer_trend <- ggplot(states_only_data, 
       aes(x = seniors_medicaid_enrollement_rate, 
           y = total_senior_malignant_death_rate_per_100k)) +
  geom_point(aes(color = "States"), alpha = 0.6, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "blue", linetype = "dashed") + 
  geom_point(data = us_only_data, aes(color = "US"), size = 5, shape = 18) + 
  labs(
    title = "C: Medicaid Seniors Enrollement Rate",

    x = "Senior Medicaid Enrollment Rate",
    y = "Cancer Moratality Rate per 100k (65+)",
  
  ) + scale_color_manual(values = c("States" = "black", "US" = "green")) +
  theme_minimal() +
  theme(plot.title = element_text(size = 12, hjust = 0.5, face = "bold"))

seniors_medicaid_spending_cancer_trend <- ggplot(states_only_data, 
       aes(x = total_medicaid_spending_per_senior_enrollee, 
           y = total_senior_malignant_death_rate_per_100k)) +
  geom_point(aes(color = "States"), alpha = 0.6, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "blue", linetype = "dashed") + 
  geom_point(data = us_only_data, aes(color = "US"), size = 5, shape = 18) + 
  labs(
    title = "D:  Medicaid Seniors Spending Rate",

    x = "Total_Medicaid_Payments/Enrollee",
    y = "Cancer Moratality Rate per 100k (65+)",
  
  ) + scale_color_manual(values = c("States" = "black", "US" = "green")) +
  theme_minimal() +
  theme(plot.title = element_text(size = 12, hjust = 0.5, face = "bold")) 

(medicare_enrollment_cancer_trend | medicare_payment_cancer_trend) / (seniors_medicaid_enrollment_cancer_trend | seniors_medicaid_spending_cancer_trend) +
  plot_annotation(
    title = "Cancer Mortality Trend Analysis vs Public Health Coverage Metrics across US States"
  
  
  )&
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 8, margin = margin(t = 5)), 
    plot.title = element_text(size = 12, hjust = 0.5, face = "bold"),
    axis.title.y = element_text(size = 10),
    axis.text.y = element_text(size = 10))

library(pacman)
p_load(tidyverse, DT, formattable)
library (scales)



table_statistics <- combine_statistics %>%
  select(
    everything()
  )
   

statistics_output <- datatable(table_statistics,
          options = list(pageLength = 52, autoWidth = TRUE),
          rownames = FALSE,
          caption = "Statistics Metrics for Mortlaity an Public Healthcare Coverage for Seniors in US"
          )
print(statistics_output)