__________________________________________________________________________

This is a Sample Analysis of the Top 25 Inpatient Providers Who charged the most money to Medicare in 2020

My client represented Medicare which provides medical benefits under the United States government. With over 165,000 entries in the 2020 data set, it was all but impossible to figure out or comb through that much data in order to reveal certain trends of billions of dollars paid for medical care. The job was to pull out some sample information from the data set to help the stakeholders have a better picture of their spending.

They wanted to know five trends from their data and I would look for more trends during my analysis, if possible, to give additional insights and help them make some logical and informed decisions.

  1. What were the Top 25 providers who charged the most Money for Medicare inpatient in 2020?
  2. Did the number of discharged patients validate the average submitted charges?
  3. How much of the submitted charges did the company pay?
  4. What were the states where the most payments were made?
  5. What are the Diagnosis Related Groups (DRGs) that are the most expensive?

Medicare had recently observed a steady decline in the amount of data that they collected from the medical providers for the past five years which, presumably, should coincide with a decline in submitted charges and payments to those medical providers. The company wanted to investigate the charges trends in order to make payment adjustment decisions.

The data needed for this analysis was the Medicare Inpatient Hospitals payments by providers from 2016 to 2020 medical data. However, my work description was to analyze the 2020 dataset while other team members worked on the previous years. Up to that point, it didn’t seem that the organization had received updated data for the fiscal year 2021 while this analysis was being done in 2022.

The data was collected from the Medicare data.cms.gov website. Click Here.

The audience for this data analysis case study included the following stakeholders: the United States government, the Executive staff members and board of directors of Medicare and the Managing Staff. For this analysis to be properly conducted and effectively presented to the stakeholders, I had to use the Spreadsheets to merge and clean the data, and RStudio to analyze, visualize, and summarize it.

__________________________________________________________________________

Here are the plots:

ggplot(data = mdcr_pmt_2020_4)+
  geom_line(mapping=aes(x=Prvdr_Rank, y=Avg_Chrg),color="orange", size=2)+
  geom_point(mapping=aes(x=Prvdr_Rank, y=Avg_Chrg, color = Prvdr_Name, size=2))+
  labs(title = "Average Submitted Charges in Descending Order", 
       subtitle = "Providers Who Charged the Most Amount to Medicare in 2020", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")+
  annotate("text", x=15, y=3580000, label="Providers who charged the most amount",
           color="blue")

I ran the first analysis to get an insight of who were the inpatient medical providers who charged the most amount, to Medicare, in 2020. It turned out that some providers made the list more than twice. It could be because they have multiple locations or because the charges were for different Diagnosis Related Group (DRG) codes.

ggplot(data = mdcr_pmt_2020_4)+
  geom_density_2d(mapping=aes(x=Prvdr_Rank, y=Avg_Chrg),color="orange", size=2)+
  geom_point(mapping=aes(x=Prvdr_Rank, y=Avg_Chrg, color = Prvdr_Name, size=2))+
  labs(title = "Density of Average Submitted Charges", 
       subtitle = "Density of Most Amount Charged to Medicare in 2020", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")+
  annotate("text", x=15, y=3580000, label="Most amount charged by density",
           color="blue")

This graph shows the density mapping of the top providers who charged the most amount of money to Medicare in 2020. There were more providers charging less money at the mid to lower range than at the top range. Which means that fewer top providers charged more to Medicare, in 2020, on average.

ggplot(data = mdcr_pmt_2020_4)+
  geom_line(mapping=aes(x=Tot_Dschrgs, y=Avg_Chrg),color="orange", size=2)+
  geom_point(mapping=aes(x=Tot_Dschrgs, y=Avg_Chrg, color = Tot_Dschrgs, size=2))+
  labs(title = "Average Submitted Charges vs. Total Discharges", 
       subtitle = "Do the High Charges to Medicare in 2020 Correspond to the Number of Discharged Patients?", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")+
  annotate("text", x=30, y=3580000, label="High charges does not imply high discharges", color="blue")

ggplot(data = mdcr_pmt_2020_4)+
  geom_step(mapping=aes(x=Tot_Dschrgs, y=Avg_Chrg),color="orange", size=2)+
  geom_point(mapping=aes(x=Tot_Dschrgs, y=Avg_Chrg, color = Tot_Dschrgs), size = 4)+
  labs(title = "Average Submitted Charges vs. Total Discharges", 
       subtitle = "Do the High Charges to Medicare in 2020 Correspond to the Number of Discharged Patients?", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")+
  annotate("text", x=30, y=3580000, label="High charges does not imply high discharges", color="blue")

I also investigated the number of patients discharged by those providers to see if there were some type of correlation between the number of discharged patients and the average amount they charged for those patients. Meaning that the provider, who charged the most, could have had and discharged the most patients, and the next one would have discharged less, and so on. But it was not the case. In fact, out of the group, one of the providers who discharged the most patients was one who of those who submitted the least amount for remittance.

ggplot(data = mdcr_pmt_2020_4)+
  geom_line(mapping=aes(x=DRG_Cd, y=Avg_Chrg),color="orange", size=2)+
  geom_point(mapping=aes(x=DRG_Cd, y=Avg_Chrg, color = DRG_Cd, size=4))+
  labs(title = "Average Submitted Charges Based on Diagnosis Related Group (DRG) codes", 
       subtitle = "Sample of Highest Paid Diagnosis Related Group (DRG) codes by Medicare in 2020", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")+
  annotate("text", x=300, y=3580000, label="The highest paid DRG codes are between 1 and 20", color="blue")

And then I went on to investigate the Diagnosis Related Group (DRG) codes. Based on my analysis, it seemed most likely that the diagnostic codes had the most impact on the average charges submitted for payments. Moreover, I also realized that, from that group of providers, the DRGs that have the highest prices are the codes between 1 and 20.

ggplot(data = mdcr_pmt_2020_4)+
  geom_line(mapping=aes(x=Avg_Pymt, y=Avg_Chrg),color="orange", size=2)+
  geom_point(mapping=aes(x=Avg_Pymt, y=Avg_Chrg, color = Avg_Pymt, size=2))+
  labs(title = "Average Submitted Charges vs. Average Medicare Payment Amounts", 
       subtitle = "Comparing the Average Submitted Charges with the Average Payment Amounts by Medicare in 2020", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")+
  annotate("text", x=250000, y=3580000, label="Average payments are less than average charges", color="blue")

Moreover, I analyzed how much of those charges submitted to Medicare did they really pay to get some insights on the gap between the amounts charged and the amounts paid, and if I could give any recommendations based on that analysis. In general, based on the result from the analysis, Medicare paid much less than the average submitted charges to the top 25 providers.

ggplot(data = mdcr_pmt_2020_4)+
  geom_smooth(mapping=aes(x=Avg_Pymt, y=Avg_Chrg),color="orange", size=2, fill="green", weight="25")+
  geom_point(mapping=aes(x=Avg_Pymt, y=Avg_Chrg, color = Avg_Pymt, size=2))+
  labs(title = "Average Submitted Charges vs. Average Medicare Payment Amounts Curved", 
       subtitle = "Curved Graph the Average Submitted Charges compared to the Average Payment Amounts", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")+
  annotate("text", x=250000, y=3580000, label="Higher payments correspond to higher charges", color="blue")

The analysis also showed that the average payments curved up just as the average amount charged to Medicare was also going up. This means that there was a correlation between the two.

ggplot(data = mdcr_pmt_2020_4)+
  geom_line(mapping=aes(x=Tot_Pymt, y=Avg_Chrg),color="orange", size = 2)+
  geom_point(mapping=aes(x=Tot_Pymt, y=Avg_Chrg, color = Tot_Pymt, size = 2))+
  labs(title = "Average Submitted Charges vs. Average Medicare Total Payment Amounts", 
       subtitle = "Comparing the Average Submitted Charges with the Average Total Payment Amounts by Medicare in 2020", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")+
  annotate("text", x=280000, y=3580000, label="Total payments does not equal to average charges", color="blue")

In addition, the data showed that Medicare paid even less when I analyzed the average total payments that they made to those providers. In some cases, they would pay just a fraction of the charges submitted to them. But they might want to improve there too, or figure out a rapprochement between the average charges and the total payments.

ggplot(data = mdcr_pmt_2020_4)+
  geom_smooth(mapping=aes(x=Tot_Pymt, y=Avg_Chrg),color="orange", size=2, fill="green", weight="25")+
  geom_point(mapping=aes(x=Tot_Pymt, y=Avg_Chrg, color = Tot_Pymt, size=2))+
  labs(title = "Average Submitted Charges vs. Average Total Payment Amounts Curved", 
       subtitle = "Curved Graph of the Average Submitted Charges compared to the Total Payment Amounts", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")+
  annotate("text", x=260000, y=3580000, label="Higher total payments correspond to higher charges", color="blue")

Similar to the average payments, the total payments were also curved up, meaning that as the charges were higher, the payments were higher as well.

ggplot(data = mdcr_pmt_2020_4)+
  geom_col(mapping=aes(x=Prvdr_State, y=Avg_Chrg, fill = Prvdr_State))+
  labs(title = "Locations Where the Highest Medical Charges came From", 
       subtitle = "Bar Chart of the States from Which the Highest Charges Were Submitted to Medicare in 2020", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")

Instead of looking into the payments and/or total payments, Medicare might want to review the DRG codes per state because, apparently, there was a small group of states where the highest charges stemmed from. At the very top was California, according to the next graph.

ggplot(data = mdcr_pmt_2020_4)+
  geom_line(mapping=aes(x=Prvdr_State, y=Avg_Chrg),color="orange", size=2)+
  geom_point(mapping=aes(x=Prvdr_State, y=Avg_Chrg, color = Prvdr_State, size=2))+
  labs(title = "Locations Where the Highest Medical Charges are From", 
       subtitle = "The States from Which the Highest Charges Were Submitted to Medicare in 2020", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")

The evidence from the analysis further showed that California had not only the highest charges out of the top inpatient providers in 2020, but also it is where most of the highest charges originated from as well.

ggplot(data = new_providers)+
  geom_point(mapping=aes(x=DRG_Cd, y=Tot_Dschrgs, color = Prvdr_State))+
  facet_grid(~Prvdr_State)+
  labs(title = "Overall DRG Discharge Concentration of the Top Five States", 
       subtitle = "The Overall DRG discharge Concentration of the Five States With the Highest Charges", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")

However, New York had the most Overall DRG discharge Concentration of the five states with the highest charges, and Pennsylvania had the least. The analysis also revealed that the most expensive DRGs of this group (the first 20), don’t have a high volume of discharges in none of those states.

ggplot(data = new_providers)+
  geom_point(mapping=aes(x=Avg_Submtd_Chrg, y=Avg_Tot_Pymt_Amt, color = Prvdr_State))+
  facet_grid(~Prvdr_State)+
  labs(title = "Overall Average Total Payments of the Top Five States", 
       subtitle = "The Overall Average Total Payments Concentration of the Five States With the Highest Charges", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")

On the other hand, California lead the way with the average total payments versus Average submitted charges while Georgia recorded the least, out ot the top five providers.

ggplot(data = new_providers)+
  geom_point(mapping=aes(x=Avg_Submtd_Chrg, y=Avg_Mdcr_Pymt_Amt, color = Prvdr_State))+
  facet_grid(~Prvdr_State)+
  labs(title = "Overall Average Medicare Payment Amounts of the Top Five States", 
       subtitle = "The Overall Average Medicare Payments Concentration of the Five States With the Highest Charges", 
       caption = "Medicare data collected by Fritz Tardieu from cms.gov in September 2022")

Similarly, California had the most concentrated average Medicare payments, and Georgia, once again, had the least, out of the top five providers.

__________________________________________________________________________

In conclusion:

This analysis was by no means a full conclusion of the complete inner works of the Medicare system. What it revealed, however, was very intriguing. While I sampled a very small subset of inpatient providers, I was able to analyze, nonetheless, the highest charges that were submitted to Medicare in 2020. The analysis revealed that those charges were associated with the Diagnosis Related Groups (DRG) codes and that those codes were the ones between 1 and 20 out of hundreds. Likewise, those high charges came from a very small group of states; five to be exact, with California leading the way.

I recommended that the stakeholders focus on negotiating in two key areas in order to lower the providers’ charges.

__________________________________________________________________________

Thank You!