According to the American Medical Association, nearly 50% of physicians age 55 and older have reported being sued, and in 2022, 31% of physicians reported having been sued. Given the prevalence of malpractice suits, I was interested to see whether the amount received by the patient is associated with having a private attorney, the type of insurance (private, uninsured, or government), the severity of damage, and the age and gender of the patient.
The data are from Kaggle. The data set contains information on 79,210 claim payments.
Some of the visualization was completed in a dashboard and story on Tableau. The remaining visualization and the analysis were completed in R.
Load packages
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(janitor)
##
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
Load data and make any adjustments
mm <- read.csv("/Users/elsieyi/Documents/Projects/Kaggle/Medical Malpractice/medicalmalpractice.csv")
mm$severe_range <- cut(mm$Severity, breaks = c(min(mm$Severity),3,6,max(mm$Severity)), include.lowest= TRUE)
mm$age_range <- cut(mm$Age, breaks = c(min(mm$Age), 18, 25, 35, 45, 55, 65, max(mm$Age)), include.lowest=TRUE)
# Flag payments above the median payment of 98,131 (TRUE/FALSE)
mm$median <- mm$Amount > 98131
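The binning step above relies on how `cut()` treats interval endpoints. The following is a minimal sketch on made-up severity scores (not values from the dataset), assuming the same break pattern as above; it shows that intervals are right-closed and that `include.lowest = TRUE` is what keeps the minimum value from becoming `NA`.

```r
# Toy severity scores (made up for illustration, not from the dataset)
severity <- c(1, 3, 4, 6, 7, 9)

# Same break pattern as above: lowest value, 3, 6, highest value
bins <- cut(severity,
            breaks = c(min(severity), 3, 6, max(severity)),
            include.lowest = TRUE)

# Intervals are right-closed, so 3 falls in [1,3] and 6 falls in (3,6]
print(table(bins))

# Without include.lowest = TRUE the minimum value is left out of every bin
no_lowest <- cut(severity, breaks = c(min(severity), 3, 6, max(severity)))
print(sum(is.na(no_lowest)))
```

This is why the first level of `severe_range` and `age_range` is printed with a square bracket on the left, while the remaining levels start with a round bracket.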
There are 79,210 entries in the dataset. The lowest malpractice payment is $1,576, made to a 75-year-old patient with severity listed as level 3.
min(mm$Amount)
## [1] 1576
mm[mm$Amount == 1576, ]
## Amount Severity Age Private.Attorney Marital.Status Specialty
## 5181 1576 3 75 0 4 Anesthesiology
## Insurance Gender severe_range age_range median
## 5181 Unknown Male [1,3] (65,87] FALSE
The highest malpractice payment is $926,411, made to a 50-year-old patient with severity listed as level 6.
max(mm$Amount)
## [1] 926411
mm[mm$Amount == 926411, ]
## Amount Severity Age Private.Attorney Marital.Status Specialty
## 36699 926411 6 50 1 0 Family Practice
## Insurance Gender severe_range age_range median
## 36699 Private Male (3,6] (45,55] TRUE
52,349 of the entries had a private attorney.
nrow(mm[mm$Private.Attorney == 1,])
## [1] 52349
summary(mm)
## Amount Severity Age Private.Attorney Marital.Status
## Min. : 1576 Min. :1.0 Min. : 0.0 Min. :0.0000 Min. :0.00
## 1st Qu.: 43670 1st Qu.:3.0 1st Qu.:28.0 1st Qu.:0.0000 1st Qu.:1.00
## Median : 98131 Median :4.0 Median :43.0 Median :1.0000 Median :2.00
## Mean :157485 Mean :4.8 Mean :42.7 Mean :0.6609 Mean :1.89
## 3rd Qu.:154675 3rd Qu.:7.0 3rd Qu.:58.0 3rd Qu.:1.0000 3rd Qu.:2.00
## Max. :926411 Max. :9.0 Max. :87.0 Max. :1.0000 Max. :4.00
##
## Specialty Insurance Gender severe_range
## Length:79210 Length:79210 Length:79210 [1,3]:30256
## Class :character Class :character Class :character (3,6]:28699
## Mode :character Mode :character Mode :character (6,9]:20255
##
##
##
##
## age_range median
## [0,18] :10357 Mode :logical
## (18,25]: 7036 FALSE:39605
## (25,35]:11722 TRUE :39605
## (35,45]:13624
## (45,55]:13671
## (55,65]:11343
## (65,87]:11457
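Next, I looked at how having a private attorney relates to whether the claim payment was above or below $98,131. The value 98,131 was chosen because it is the median of all payments.
The table below shows joint proportions, so each cell is a share of all claims. 11.7% of all claims were payments above $98,131 to patients without a private attorney, whereas 38.3% were payments above $98,131 to patients with one. Because each payment group holds half of the claims, this means that among payments above $98,131 about 23.5% went to patients without a private attorney and about 76.5% to patients with one.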
prop.table(table(mm$median,mm$Private.Attorney))
##
## 0 1
## FALSE 0.2218533 0.2781467
## TRUE 0.1172579 0.3827421
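The output above holds joint proportions, where each cell is divided by the total number of claims. To read the table within a payment group or within an attorney group, `prop.table()` can be given a `margin`. The sketch below re-enters the four printed proportions by hand (the dimension names `above_median` and `private_attorney` are labels chosen for this example) rather than reading the data file.

```r
# Joint proportions copied from the output above
# rows: payment above the median (FALSE/TRUE); columns: private attorney (0/1)
joint <- matrix(c(0.2218533, 0.2781467,
                  0.1172579, 0.3827421),
                nrow = 2, byrow = TRUE,
                dimnames = list(above_median = c("FALSE", "TRUE"),
                                private_attorney = c("0", "1")))

# margin = 1 rescales each row to sum to 1 (shares within a payment group)
by_payment <- prop.table(joint, margin = 1)
print(round(by_payment, 3))

# margin = 2 rescales each column to sum to 1 (shares within an attorney group)
by_attorney <- prop.table(joint, margin = 2)
print(round(by_attorney, 3))
```

On the full data, the same row-wise view should come from `prop.table(table(mm$median, mm$Private.Attorney), margin = 1)`.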
As the severity of damage to the patient increases, so does the average claim payment.
as_set <- aggregate(Amount ~ severe_range, mm, mean)
ggplot(data=as_set, aes(x = severe_range, y=Amount)) +
geom_bar(stat="identity", width=0.5) +
scale_y_continuous(labels= scales::comma) + ggtitle("Average Claim Payment Amount vs Severity of Damage")
I also looked at the proportion of patients in each age group that falls into each damage severity group. A majority of those in the (18,25], (25,35], (55,65], and (65,87] age groups experienced damage in the [1,3] severity range, whereas a majority in the [0,18], (35,45], and (45,55] age groups had a severity range of (3,6].
The following boxplot shows a breakdown of the claim payment amount by patient age and severity rating. The range in amount is greater for those with damage severity in the (6,9] group, especially among those in the [0,18], (35,45], (45,55], and (55,65] age groups.
ggplot(mm, aes(x = age_range, y = Amount, fill = severe_range)) +
  geom_boxplot() +
  labs(fill = 'Severity Range', x = 'Age Range', y = 'Claim Payment Amount') +
  scale_y_continuous(labels = scales::comma) +
  ggtitle("Claim Payment among Age and Severity Groups")
The three specialties with the most malpractice claims are Family Practice, General Surgery, and OBGYN. The three specialties with the fewest malpractice claims are Pathology, Thoracic Surgery, and Physical Medicine.
mm%>%
tabyl(Specialty) %>%
arrange(desc(n))
## Specialty n percent
## Family Practice 11436 0.144375710
## General Surgery 9412 0.118823381
## OBGYN 8876 0.112056559
## Anesthesiology 8732 0.110238606
## Orthopedic Surgery 7272 0.091806590
## Internal Medicine 5223 0.065938644
## Neurology/Neurosurgery 4737 0.059803055
## Emergency Medicine 4676 0.059032950
## Ophthamology 3289 0.041522535
## Cardiology 2659 0.033568994
## Urological Surgery 2027 0.025590203
## Resident 1983 0.025034718
## Radiology 1979 0.024984219
## Pediatrics 1416 0.017876531
## Dermatology 1384 0.017472541
## Plastic Surgeon 1364 0.017220048
## Occupational Medicine 725 0.009152885
## Pathology 714 0.009014013
## Thoracic Surgery 664 0.008382780
## Physical Medicine 642 0.008105037
Family Practice physicians saw the greatest sum of claim payments but also had the most claims, at 11,436. The specialty with the second-highest sum is OBGYN, which had a total of 8,876 claims.
# Total payment amount per specialty
sp_am <- mm %>%
  group_by(Specialty) %>%
  summarise(Amount = sum(Amount))
ggplot(sp_am, aes(x = Specialty, y = Amount)) +
  scale_y_continuous(labels = scales::comma) +
  geom_bar(stat = 'identity') + labs(x = 'Physician Specialty', y = 'Total Payment Amount') +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
Looking at average payment amount, the two highest specialties are no longer Family Practice and OBGYN but Dermatology and Pediatrics. This is interesting because the AMA reported pediatrics and dermatology to be among the three specialties least likely to be sued, whereas OBGYN was listed as being at high risk of being sued. The AMA report did not mention payment amounts.
# Average payment amount per specialty
sp_am01 <- mm %>%
  group_by(Specialty) %>%
  summarise(Amount = mean(Amount))
ggplot(sp_am01, aes(x = Specialty, y = Amount)) +
  scale_y_continuous(labels = scales::comma) +
  geom_bar(stat = 'identity') + labs(x = 'Physician Specialty', y = 'Average Payment Amount') +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
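The change in ranking between total and average payment can happen whenever one group has many modest payments and another has a few large ones. The sketch below uses a made-up two-specialty data set (the names `A` and `B` and all amounts are hypothetical) and base R's `aggregate()` to show the effect.

```r
# Toy claims (made up): specialty A has many small payments, B has few large ones
toy <- data.frame(
  specialty = c("A", "A", "A", "A", "B", "B"),
  amount    = c(10, 20, 30, 40, 60, 30)
)

# Total and average payment per specialty
totals   <- aggregate(amount ~ specialty, toy, sum)
averages <- aggregate(amount ~ specialty, toy, mean)

# The specialty leading on total payment need not lead on average payment
top_total <- totals$specialty[which.max(totals$amount)]
top_mean  <- averages$specialty[which.max(averages$amount)]
print(c(top_total = top_total, top_mean = top_mean))
```

Here `A` leads on total payment because it has more claims, while `B` leads on average payment.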
Without accounting for other variables, having a private attorney is statistically significant and is associated with an estimated $106,848 higher payment in the malpractice suit.
pa <- lm(Amount ~ Private.Attorney, data = mm)
summary(pa)
##
## Call:
## lm(formula = Amount ~ Private.Attorney, data = mm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -183213 -103752 -59154 13949 837359
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 86870 1137 76.38 <2e-16 ***
## Private.Attorney 106848 1399 76.38 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 186400 on 79208 degrees of freedom
## Multiple R-squared: 0.06859, Adjusted R-squared: 0.06858
## F-statistic: 5833 on 1 and 79208 DF, p-value: < 2.2e-16
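With a single 0/1 predictor, the fitted coefficients have a direct reading: the intercept is the mean of the group coded 0 and the slope is the difference between the two group means. Applied to the output above, that gives an average payment of 86,870 without a private attorney and 86,870 + 106,848 = 193,718 with one. The sketch below checks this property on made-up numbers (the data frame `toy` and its columns are hypothetical).

```r
# Toy data (made up for illustration): payments and an attorney flag
toy <- data.frame(
  amount   = c(50, 70, 90, 150, 170, 190, 210),
  attorney = c(0, 0, 0, 1, 1, 1, 1)
)

fit <- lm(amount ~ attorney, data = toy)

# Group means computed directly, for comparison with the coefficients
group_means <- tapply(toy$amount, toy$attorney, mean)

# Intercept = mean of the attorney == 0 group;
# slope = mean of the attorney == 1 group minus mean of the attorney == 0 group
print(coef(fit))
print(group_means)
```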
The following model adjusts for the severity of damage, the type of insurance, and the age of the patient. After adjusting for these potential confounders, having a private attorney is still associated with a higher expected payment. However, the estimated increase is $58,386.75, a little over half (about 55%) of the $106,848 seen in the unadjusted model.
multiple_model <- lm(Amount ~ severe_range + Private.Attorney + Insurance + Age, data = mm)
summary(multiple_model)
##
## Call:
## lm(formula = Amount ~ severe_range + Private.Attorney + Insurance +
## Age, data = mm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -316033 -98902 -27502 36410 847054
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 35259.91 2373.91 14.853 < 2e-16 ***
## severe_range(3,6] 38296.16 1438.76 26.617 < 2e-16 ***
## severe_range(6,9] 131951.47 1646.46 80.143 < 2e-16 ***
## Private.Attorney 58386.75 1382.09 42.245 < 2e-16 ***
## InsuranceNo Insurance 52506.23 2519.38 20.841 < 2e-16 ***
## InsurancePrivate 113315.43 1895.56 59.779 < 2e-16 ***
## InsuranceUnknown 18276.72 1975.18 9.253 < 2e-16 ***
## InsuranceWorkers Compensation -27301.63 4193.26 -6.511 7.52e-11 ***
## Age -543.35 30.91 -17.580 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 170600 on 79201 degrees of freedom
## Multiple R-squared: 0.2197, Adjusted R-squared: 0.2197
## F-statistic: 2788 on 8 and 79201 DF, p-value: < 2.2e-16
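To make the adjusted coefficients concrete, the sketch below plugs the printed estimates into the model equation for one hypothetical patient profile (severity in (6,9], private insurance, age 50, chosen only for illustration) and compares the adjusted attorney coefficient with the unadjusted one. The coefficients are typed in from the output above, so the results are only as precise as the printed values.

```r
# Coefficients copied from the adjusted model output above
b <- c(intercept         = 35259.91,
       severity_6_9      = 131951.47,
       private_attorney  = 58386.75,
       insurance_private = 113315.43,
       age               = -543.35)

# Hypothetical patient profile: severity in (6,9], private insurance, age 50
age <- 50
without_attorney <- b[["intercept"]] + b[["severity_6_9"]] +
  b[["insurance_private"]] + b[["age"]] * age
with_attorney <- without_attorney + b[["private_attorney"]]
print(c(without = without_attorney, with = with_attorney))

# Adjusted attorney coefficient as a share of the unadjusted estimate (106,848)
ratio <- b[["private_attorney"]] / 106848
print(round(ratio, 3))
```

Because the model has no interaction terms, the gap between the two predictions equals the attorney coefficient for any profile.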
I would like to look further into geographic location if the data are available, since those who file their suit in urban versus rural areas may see a difference in payment amount. I would also like to look at suits in the dental field across different dental specialties.