Applying Appropriate Packages

library(ggplot2)
library(dplyr)

## Warning: package 'dplyr' was built under R version 3.4.2

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(tidyr)

Introduction

We are looking at information on Spokane medical providers and want to see the relationships between different characteristics of the providers and the services they provide.

Terms to Define

Average Medicare Allowed= Average of the Medicare allowed amount for the service; this figure is the sum of the amount Medicare pays, the deductible and coinsurance amounts that the beneficiary is responsible for paying, and any amounts that a third party is responsible for paying.

Average Medicare Charged= Average of the charges that the provider submitted for the service.

Average Medicare Payment= Average amount that Medicare paid after deductible and coinsurance amounts have been deducted for the line item service.

Average Medicare Standardized Amount= Average amount that Medicare paid after beneficiary deductible and coinsurance amounts have been deducted for the line item service and after standardization of the Medicare payment has been applied. Standardization removes geographic differences in payment rates for individual services, such as those that account for local wages or input prices and makes Medicare payments across geographic areas comparable, so that differences reflect variation in factors such as physicians’ practice patterns and beneficiaries’ ability and willingness to obtain care.

Number of Distinct Beneficiary Per Day Services= Since a given beneficiary may receive multiple services of the same type (e.g., single vs. multiple cardiac stents) on a single day, this metric removes double-counting from the line service count to identify whether a unique service occurred.
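To make the relationship between these amounts concrete, here is a small arithmetic sketch. All dollar values are hypothetical and chosen only for illustration; they are not taken from the Spokane data.

```r
# Hypothetical amounts for a single line item (illustrative only).
medicare_paid <- 80   # amount Medicare pays
deductible    <- 0    # beneficiary deductible
coinsurance   <- 20   # beneficiary coinsurance
third_party   <- 0    # amount a third party is responsible for

# The allowed amount is the sum of all four components.
allowed <- medicare_paid + deductible + coinsurance + third_party

# The submitted charge is set by the provider and can differ from the allowed amount.
submitted_charge <- 150

allowed
submitted_charge - allowed
```

In this made-up case the payment amount (80) is what remains of the allowed amount after the beneficiary's share is removed.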

1. Do the number of services, distinct beneficiary per day services, average Medicare allowed, charged, and paid amounts, and Medicare standardized amount differ as a function of Gender, Provider Type, and Place of Service?

load("providerspokane.Rda")

Create a dataset that compares number of services, distinct beneficiary per day services, average Medicare allowed, charged, and paid amounts, and Medicare standardized amount by gender.

gender.mean = providerspokane %>%
  group_by(Gender.of.the.Provider) %>%
  summarize(numberofservices = mean(Number.of.Services),
            numberofdailydistinctservices = mean(Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services),
            medicareallowed = mean(Average.Medicare.Allowed.Amount),
            medicarecharged = mean(Average.Submitted.Charge.Amount),
            medicarepayment = mean(Average.Medicare.Payment.Amount),
            medicarestandarizedamount = mean(Average.Medicare.Standardized.Amount))

The group with no recorded gender has a mean number of services of 1,000 or more, so we keep only the rows with a mean below 1,000 in order to drop it.

gender.mean.filter=filter(gender.mean, numberofservices<1000)
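The threshold works only as long as the no-gender group happens to have the largest mean. A more direct alternative is to drop rows whose gender field is blank or missing. The sketch below uses a small made-up data frame (toy.gender.mean is a hypothetical name), and it assumes that a missing gender is coded as an empty string or NA, which should be checked against providerspokane before relying on it.

```r
library(dplyr)

# Toy summary table (made-up values) standing in for gender.mean.
toy.gender.mean <- data.frame(
  Gender.of.the.Provider = c("", "F", "M"),
  numberofservices = c(1500, 200, 250),
  stringsAsFactors = FALSE
)

# Keep only rows with a recorded gender (assumes blank or NA marks missing).
toy.gender.filter <- toy.gender.mean %>%
  filter(!is.na(Gender.of.the.Provider), Gender.of.the.Provider != "")

toy.gender.filter
```

Filtering on the gender column itself keeps the intent visible in the code and does not depend on the size of the means.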

gendermeanfiltergather=gather(gender.mean.filter, "Service", "mean", c(2, 3, 4, 5, 6, 7))

## Warning in if (!is.finite(x)) return(FALSE): the condition has length > 1
## and only the first element will be used
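Selecting columns by position, as in c(2, 3, 4, 5, 6, 7), silently breaks if the summary columns are reordered. As a sketch on made-up data, gather can instead take every column except the grouping column by name. The toy.means and toy.long names are hypothetical.

```r
library(tidyr)

# Toy wide table (made-up values) with one grouping column and two measures.
toy.means <- data.frame(
  Gender.of.the.Provider = c("F", "M"),
  numberofservices = c(200, 250),
  medicarecharged = c(300, 350),
  stringsAsFactors = FALSE
)

# Gather every column except the grouping column into key/value pairs.
toy.long <- gather(toy.means, "Service", "mean", -Gender.of.the.Provider)

toy.long
```

Each measure becomes one row per gender, which is the shape facet_wrap(~Service) expects.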

Here we created a series of bar graphs that compare gender against the various statistical measures of the provider.

ggplot(gendermeanfiltergather,aes(Gender.of.the.Provider,mean, fill=Gender.of.the.Provider))+geom_bar(stat="identity")+
facet_wrap(~Service)
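The six measures mix counts and dollar amounts, so a shared y-axis can flatten the smaller panels. One option, shown here as a sketch on made-up data rather than the plot we actually produced, is to give each facet its own y-axis through the scales argument of facet_wrap.

```r
library(ggplot2)

# Toy long-format table (made-up values) standing in for gendermeanfiltergather.
toy.long <- data.frame(
  Gender.of.the.Provider = rep(c("F", "M"), times = 2),
  Service = rep(c("numberofservices", "medicarecharged"), each = 2),
  mean = c(200, 250, 300, 350),
  stringsAsFactors = FALSE
)

# scales = "free_y" lets each panel choose its own y-axis range.
p <- ggplot(toy.long, aes(Gender.of.the.Provider, mean, fill = Gender.of.the.Provider)) +
  geom_bar(stat = "identity") +
  facet_wrap(~Service, scales = "free_y")

p
```

The trade-off is that bar heights are no longer comparable across panels, only within each one.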

Based on this analysis, we found that gender does not affect these measures much. For the most part, they remain relatively constant across genders; the one exception is the average Medicare charged amount.

Next we looked at the same statistical factors, but this time against the Provider Type.

With the extensive number of provider types, we decided it would be better to look at each statistical factor individually rather than in one figure.

Here we created a data set comparing provider type against the number of services.

providertype.mean.numberofservices=providerspokane%>%group_by(Provider.Type)%>%summarize(numberofservices=mean(Number.of.Services))

The provider type data is still very extensive, so we decided to group the provider types into tiers of low (mean of 100 services or fewer), medium, and high (mean above 300 services).

providertype.mean.numberofservices$numberofservices.factor =
  ifelse(providertype.mean.numberofservices$numberofservices <= 100, "low",
         ifelse(providertype.mean.numberofservices$numberofservices > 300, "high", "medium"))

providertype.mean.numberofservices.tier=providertype.mean.numberofservices%>%group_by(numberofservices.factor)%>%summarize(numberofservices=mean(numberofservices))
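The nested ifelse above can also be written with base R's cut, which bins a numeric vector at given break points. This is an alternative sketch on made-up values, not the code we ran; toy.means and toy.tier are hypothetical names.

```r
# Made-up mean service counts for five provider types.
toy.means <- c(40, 100, 101, 300, 450)

# right = TRUE (the default) closes each interval on the right:
# (-Inf, 100] -> low, (100, 300] -> medium, (300, Inf] -> high,
# which matches the <= 100 and > 300 rules of the nested ifelse.
toy.tier <- cut(toy.means,
                breaks = c(-Inf, 100, 300, Inf),
                labels = c("low", "medium", "high"))

as.character(toy.tier)
```

Because cut returns a factor whose levels run low, medium, high, ggplot would draw the bars in that order instead of alphabetically.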

We then plotted the mean number of services for each tier of provider type.

ggplot(providertype.mean.numberofservices.tier,aes(numberofservices.factor,numberofservices, fill=numberofservices.factor))+geom_bar(stat="identity")

We then followed the same procedure for the number of distinct Medicare beneficiary per day services, average Medicare allowed, average Medicare charged, average Medicare paid, and average Medicare standardized amount. The low and high cutoffs differ from measure to measure.

providertype.mean.distinctdailyservices=providerspokane%>%group_by(Provider.Type)%>%summarize(distinctdailyservices=mean(Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services))

providertype.mean.distinctdailyservices$distinctdailyservices.factor=ifelse(providertype.mean.distinctdailyservices$distinctdailyservices<=100,"low",ifelse(providertype.mean.distinctdailyservices$distinctdailyservices>300,"high","medium"))

providertype.mean.distinctdailyservices.tier=providertype.mean.distinctdailyservices%>%group_by(distinctdailyservices.factor)%>%summarize(distinctdailyservices=mean(distinctdailyservices))

ggplot(providertype.mean.distinctdailyservices.tier,aes(distinctdailyservices.factor,distinctdailyservices, fill=distinctdailyservices.factor))+geom_bar(stat="identity")

providertype.mean.averagemedicareallowed=providerspokane%>%group_by(Provider.Type)%>%summarize(averagemedicareallowed=mean(Average.Medicare.Allowed.Amount))

providertype.mean.averagemedicareallowed$averagemedicareallowed.factor=ifelse(providertype.mean.averagemedicareallowed$averagemedicareallowed<=100,"low",ifelse(providertype.mean.averagemedicareallowed$averagemedicareallowed>500,"high","medium"))

providertype.mean.averagemedicareallowed.tier=providertype.mean.averagemedicareallowed%>%group_by(averagemedicareallowed.factor)%>%summarize(averagemedicareallowed=mean(averagemedicareallowed))

ggplot(providertype.mean.averagemedicareallowed.tier,aes(averagemedicareallowed.factor,averagemedicareallowed, fill=averagemedicareallowed.factor))+geom_bar(stat="identity")

providertype.mean.averagemedicarecharged=providerspokane%>%group_by(Provider.Type)%>%summarize(averagemedicarecharged=mean(Average.Submitted.Charge.Amount))

providertype.mean.averagemedicarecharged$averagemedicarecharged.factor=ifelse(providertype.mean.averagemedicarecharged$averagemedicarecharged<=200,"low",ifelse(providertype.mean.averagemedicarecharged$averagemedicarecharged>400,"high","medium"))

providertype.mean.averagemedicarecharged.tier=providertype.mean.averagemedicarecharged%>%group_by(averagemedicarecharged.factor)%>%summarize(averagemedicarecharged=mean(averagemedicarecharged))

ggplot(providertype.mean.averagemedicarecharged.tier,aes(averagemedicarecharged.factor,averagemedicarecharged, fill=averagemedicarecharged.factor))+geom_bar(stat="identity")

providertype.mean.averagemedicarepaid=providerspokane%>%group_by(Provider.Type)%>%summarize(averagemedicarepaid=mean(Average.Medicare.Payment.Amount))

providertype.mean.averagemedicarepaid$averagemedicarepaid.factor=ifelse(providertype.mean.averagemedicarepaid$averagemedicarepaid<=50,"low",ifelse(providertype.mean.averagemedicarepaid$averagemedicarepaid>100,"high","medium"))

providertype.mean.averagemedicarepaid.tier=providertype.mean.averagemedicarepaid%>%group_by(averagemedicarepaid.factor)%>%summarize(averagemedicarepaid=mean(averagemedicarepaid))

ggplot(providertype.mean.averagemedicarepaid.tier,aes(averagemedicarepaid.factor,averagemedicarepaid, fill=averagemedicarepaid.factor))+geom_bar(stat="identity")

providertype.mean.averagestandardizedamount=providerspokane%>%group_by(Provider.Type)%>%summarize(averagestandardizedamount=mean(Average.Medicare.Standardized.Amount))

providertype.mean.averagestandardizedamount$averagestandardizedamount.factor=ifelse(providertype.mean.averagestandardizedamount$averagestandardizedamount<=100,"low",ifelse(providertype.mean.averagestandardizedamount$averagestandardizedamount>200,"high","medium"))

providertype.mean.averagestandardizedamount.tier=providertype.mean.averagestandardizedamount%>%group_by(averagestandardizedamount.factor)%>%summarize(averagestandardizedamount=mean(averagestandardizedamount))

ggplot(providertype.mean.averagestandardizedamount.tier,aes(averagestandardizedamount.factor,averagestandardizedamount, fill=averagestandardizedamount.factor))+geom_bar(stat="identity")
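The six blocks above repeat the same three steps with different columns and cutoffs. As a sketch of how that repetition could be factored out, the hypothetical helper tier_means below averages a measure by group, bins the group means into tiers, and averages within each tier. It is written in base R and run on made-up data, so it illustrates the procedure rather than reproducing our results.

```r
# Hypothetical helper: average a measure by group, bin the group means into
# tiers, then average the group means within each tier.
tier_means <- function(values, groups, low, high) {
  group.means <- tapply(values, groups, mean)
  tier <- ifelse(group.means <= low, "low",
                 ifelse(group.means > high, "high", "medium"))
  tapply(group.means, tier, mean)
}

# Toy data (made-up values), not the Spokane data.
toy <- data.frame(
  Provider.Type = c("A", "A", "B", "B", "C"),
  Number.of.Services = c(50, 70, 150, 250, 400)
)

result <- tier_means(toy$Number.of.Services, toy$Provider.Type,
                     low = 100, high = 300)
result
```

Passing low and high as arguments makes the measure-specific cutoffs explicit at each call.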

Next we looked to see the relationship between the place of service and the various statistical measures.

In this data, place of service falls under two categories: facility-based and non-facility. Facility-based place of service is abbreviated with an F and non-facility place of service is abbreviated with an O. Non-facility is generally an office setting.

Place.of.Service.mean = providerspokane %>%
  group_by(Place.of.Service) %>%
  summarize(numberofservices = mean(Number.of.Services),
            numberofdailydistinctservices = mean(Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services),
            medicareallowed = mean(Average.Medicare.Allowed.Amount),
            medicarecharged = mean(Average.Submitted.Charge.Amount),
            medicarepayment = mean(Average.Medicare.Payment.Amount),
            medicarestandarizedamount = mean(Average.Medicare.Standardized.Amount))

Place.of.Service.mean.gather=gather(Place.of.Service.mean, "Service", "mean", c(2, 3, 4, 5, 6, 7))

## Warning in if (!is.finite(x)) return(FALSE): the condition has length > 1
## and only the first element will be used

Here we created a series of bar graphs that compare place of service against the various statistical measures of the provider.

ggplot(Place.of.Service.mean.gather,aes(Place.of.Service,mean, fill=Place.of.Service))+geom_bar(stat="identity")+
facet_wrap(~Service)

Based on this analysis, we found that place of service does affect these statistical factors. In general, the average dollar amounts are higher in a facility setting, while non-facility settings show a higher average number of services.

Type.of.Service.mean = providerspokane %>%
  group_by(HCPCS.Description) %>%
  summarize(numberofservices = mean(Number.of.Services),
            numberofdailydistinctservices = mean(Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services),
            medicareallowed = mean(Average.Medicare.Allowed.Amount),
            medicarecharged = mean(Average.Submitted.Charge.Amount),
            medicarepayment = mean(Average.Medicare.Payment.Amount),
            medicarestandarizedamount = mean(Average.Medicare.Standardized.Amount))

Type.of.Service.mean.gather=gather(Type.of.Service.mean, "Service", "mean", c(2, 3, 4, 5, 6, 7))

## Warning in if (!is.finite(x)) return(FALSE): the condition has length > 1
## and only the first element will be used

ggplot(Type.of.Service.mean.gather,aes(HCPCS.Description,mean))+geom_bar(stat="identity")+
facet_wrap(~Service)

Limitations

The biggest limitation we have is the usefulness and reliability of our graphs comparing provider type against the various statistical factors. By switching from the name of the provider type to tiers of provider types, we lose the uniqueness of each provider type. The figures in effect say nothing about individual provider types and only show the range of values within each tier.
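One way this limitation might be reduced is to keep a record of which provider types fall into each tier. The sketch below does this with base R's split on a made-up table; the provider type names and values are placeholders, not results from the Spokane data.

```r
# Made-up provider-type means standing in for providertype.mean.numberofservices.
toy <- data.frame(
  Provider.Type = c("Cardiology", "Dermatology", "Family Practice"),
  numberofservices = c(60, 200, 400),
  stringsAsFactors = FALSE
)

# Same tier rule as in the analysis: <= 100 low, > 300 high, otherwise medium.
toy$tier <- ifelse(toy$numberofservices <= 100, "low",
                   ifelse(toy$numberofservices > 300, "high", "medium"))

# List the provider types that fall into each tier.
types.by.tier <- split(toy$Provider.Type, toy$tier)

types.by.tier
```

A lookup like this could be reported next to each tier graph so that the tiers can be traced back to named provider types.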

Conclusion

Based on this exploration, we found that gender does not noticeably affect the various statistical factors we examined; they remain relatively constant. On the other hand, place of service is strongly related to the statistical factors examined. Costs are higher for facility-based places of service, whereas the number of services is higher for non-facility places of service. Because of how we grouped the data, the effect of provider type is still unknown.

ProjectOne

Brian Grob and Ryan Strong

October 10, 2017