According to the American Medical Association, nearly 50% of physicians age 55 and older have reported being sued, and as of 2022, 31% of physicians had been sued. Given how common these malpractice suits are, I wanted to see whether the amount received by the patient is associated with having a private attorney, as well as with the type of insurance the patient had (private, uninsured, or government), the severity of damage, and the age and gender of the patient.

Does having a private attorney lead to a higher claim payment?

Dataset

The data come from Kaggle. The dataset contains information about 79,210 claim payments.

Some of the visualization was completed in a Tableau dashboard and story. The remaining visualization and the analysis were completed in R.

Load packages

library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(janitor)
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test

Load data and make any adjustments

mm <- read.csv("/Users/elsieyi/Documents/Projects/Kaggle/Medical Malpractice/medicalmalpractice.csv")

mm$severe_range <- cut(mm$Severity, breaks = c(min(mm$Severity),3,6,max(mm$Severity)), include.lowest= TRUE)
mm$age_range <- cut(mm$Age, breaks = c(min(mm$Age), 18, 25, 35, 45, 55, 65, max(mm$Age)), include.lowest=TRUE)
# Flag payments above the median payment of $98,131 (TRUE/FALSE)
mm$median <- mm$Amount > 98131
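
The binning step can be illustrated on a few made-up values. This is a minimal sketch, not the Kaggle data: the `severity` and `amount` vectors are hypothetical. It shows why `include.lowest = TRUE` matters when the first break is the minimum of the data, and how a median flag can be computed directly from the column.

```r
# Toy vectors (hypothetical, not the Kaggle data) on the same 1-9 severity scale.
severity <- c(1, 3, 4, 6, 7, 9)
amount   <- c(10, 20, 30, 40, 50, 60)

# include.lowest = TRUE closes the first interval on the left, so the minimum
# value (1) lands in [1,3] instead of becoming NA.
severe_range <- cut(severity,
                    breaks = c(min(severity), 3, 6, max(severity)),
                    include.lowest = TRUE)
table(severe_range)

# Without include.lowest, the smallest value falls outside (1,3] and is NA.
n_dropped <- sum(is.na(cut(severity, breaks = c(1, 3, 6, 9))))
n_dropped

# Logical flag for "above the median", computed from the data.
above_median <- amount > median(amount)
table(above_median)
```

With real data, writing `median(mm$Amount)` in place of a typed-in number keeps the flag consistent if the data are ever refreshed.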

Initial data insights

Data variables

  • Amount: claim payment in dollars
  • Severity: rating of damage to patient, from 1 (emotional trauma) to 9 (death); binned into ranges (severe_range) for the analysis
  • Age: age of patient in years
  • Private Attorney: 0 = not represented by a private attorney, 1 = represented by a private attorney
  • Specialty: physician specialty
  • Insurance: patient’s medical insurance
  • Gender: patient’s gender

There are 79,210 entries in the dataset. The lowest malpractice payment is $1,576, made to a 75-year-old patient with a listed severity of 3.

min(mm$Amount)
## [1] 1576
mm[mm$Amount == min(mm$Amount), ]
##      Amount Severity Age Private.Attorney Marital.Status      Specialty
## 5181   1576        3  75                0              4 Anesthesiology
##      Insurance Gender severe_range age_range median
## 5181   Unknown   Male        [1,3]   (65,87]  FALSE

The highest malpractice payment is $926,411, made to a 50-year-old patient with a listed severity of 6.

max(mm$Amount)
## [1] 926411
mm[mm$Amount == max(mm$Amount), ]
##       Amount Severity Age Private.Attorney Marital.Status       Specialty
## 36699 926411        6  50                1              0 Family Practice
##       Insurance Gender severe_range age_range median
## 36699   Private   Male        (3,6]   (45,55]   TRUE

52,349 of the entries had a private attorney.

nrow(mm[mm$Private.Attorney == 1,])
## [1] 52349
summary(mm)
##      Amount          Severity        Age       Private.Attorney Marital.Status
##  Min.   :  1576   Min.   :1.0   Min.   : 0.0   Min.   :0.0000   Min.   :0.00  
##  1st Qu.: 43670   1st Qu.:3.0   1st Qu.:28.0   1st Qu.:0.0000   1st Qu.:1.00  
##  Median : 98131   Median :4.0   Median :43.0   Median :1.0000   Median :2.00  
##  Mean   :157485   Mean   :4.8   Mean   :42.7   Mean   :0.6609   Mean   :1.89  
##  3rd Qu.:154675   3rd Qu.:7.0   3rd Qu.:58.0   3rd Qu.:1.0000   3rd Qu.:2.00  
##  Max.   :926411   Max.   :9.0   Max.   :87.0   Max.   :1.0000   Max.   :4.00  
##                                                                               
##   Specialty          Insurance            Gender          severe_range 
##  Length:79210       Length:79210       Length:79210       [1,3]:30256  
##  Class :character   Class :character   Class :character   (3,6]:28699  
##  Mode  :character   Mode  :character   Mode  :character   (6,9]:20255  
##                                                                        
##                                                                        
##                                                                        
##                                                                        
##    age_range       median       
##  [0,18] :10357   Mode :logical  
##  (18,25]: 7036   FALSE:39605    
##  (25,35]:11722   TRUE :39605    
##  (35,45]:13624                  
##  (45,55]:13671                  
##  (55,65]:11343                  
##  (65,87]:11457

Exploratory Data Analysis

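This section compares how often patients who received a claim payment greater than $98,131, and those who received $98,131 or less, had a private attorney. The value $98,131 was chosen because it is the median of all payments.

The table gives shares of all claims: 11.7% of all claims were paid above the median to patients without a private attorney, and 38.3% were paid above the median to patients with one. Since half of all claims fall above the median, this means about 23.5% of those who received more than $98,131 did not have a private attorney, whereas about 76.5% did.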

prop.table(table(mm$median,mm$Private.Attorney))
##        
##                 0         1
##   FALSE 0.2218533 0.2781467
##   TRUE  0.1172579 0.3827421
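
Because `prop.table()` without a `margin` argument divides by the grand total, the cells above are shares of all claims. As a minimal sketch, the printed proportions can be re-entered by hand and conditioned on each row with `margin = 1`. The matrix `joint` and its dimension names are illustrative and are not objects from the original analysis.

```r
# Joint proportions copied from the prop.table() output above.
joint <- matrix(c(0.2218533, 0.2781467,
                  0.1172579, 0.3827421),
                nrow = 2, byrow = TRUE,
                dimnames = list(above_median     = c("FALSE", "TRUE"),
                                private_attorney = c("0", "1")))

# margin = 1 rescales each row to sum to 1, giving the share with and without
# a private attorney among claims below and above the median.
row_share <- prop.table(joint, margin = 1)
round(row_share, 3)
```

On the full data frame, the same row-wise view would come from `prop.table(table(mm$median, mm$Private.Attorney), margin = 1)`.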

As severity of damage to the patient increases, so does the amount of claim payment.

as_set <- aggregate(Amount ~ severe_range, mm, mean)
ggplot(data=as_set, aes(x = severe_range, y=Amount)) +
  geom_bar(stat="identity", width=0.5) +
  scale_y_continuous(labels= scales::comma) + ggtitle("Average Claim Payment Amount vs Severity of Damage")

The next comparison is the proportion of each age group that falls in each damage severity group. A majority of those in the (18,25], (25,35], (55,65], and (65,87] age groups experienced damage in the [1,3] severity range, whereas a majority in the [0,18], (35,45], and (45,55] age groups fell in the (3,6] severity range.

The following boxplot shows a breakdown of the claim payment amount by patient age group and severity range. The spread in amount is greatest for those in the (6,9] severity group, especially among those in the [0,18], (35,45], (45,55], and (55,65] age groups.

ggplot(mm, aes(x=age_range, y=Amount, fill=severe_range)) + geom_boxplot() +
  labs(fill='Severity Range', x='Age Range', y='Claim Payment Amount') +
  scale_y_continuous(labels= scales::comma) +
  ggtitle("Claim Payment among Age and Severity Groups")

The three specialties with the most malpractice claims are Family Practice, General Surgery, and OBGYN. The three with the fewest are Pathology, Thoracic Surgery, and Physical Medicine.

mm%>%
  tabyl(Specialty) %>%
  arrange(desc(n))
##               Specialty     n     percent
##         Family Practice 11436 0.144375710
##         General Surgery  9412 0.118823381
##                   OBGYN  8876 0.112056559
##          Anesthesiology  8732 0.110238606
##      Orthopedic Surgery  7272 0.091806590
##       Internal Medicine  5223 0.065938644
##  Neurology/Neurosurgery  4737 0.059803055
##      Emergency Medicine  4676 0.059032950
##            Ophthamology  3289 0.041522535
##              Cardiology  2659 0.033568994
##      Urological Surgery  2027 0.025590203
##                Resident  1983 0.025034718
##               Radiology  1979 0.024984219
##              Pediatrics  1416 0.017876531
##             Dermatology  1384 0.017472541
##         Plastic Surgeon  1364 0.017220048
##   Occupational Medicine   725 0.009152885
##               Pathology   714 0.009014013
##        Thoracic Surgery   664 0.008382780
##       Physical Medicine   642 0.008105037

Family Practice physicians saw the greatest sum of claim payments, but they also saw the most suits (11,436). The specialty with the second-highest total is OBGYN, which saw 8,876 suits.

# Total claim payments per specialty
sp_am <- mm %>%
  group_by(Specialty) %>%
  summarise(Amount = sum(Amount))

ggplot(sp_am, aes(x = Specialty, y = Amount)) +
  geom_col() +
  scale_y_continuous(labels = scales::comma) +
  labs(x = 'Physician Specialty', y = 'Total Payment Amount') +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))

Looking at average payment amount, the two highest specialties are no longer Family Practice and OBGYN but Dermatology and Pediatrics. This is interesting because the AMA reported pediatrics and dermatology to be among the three specialties least likely to be sued, whereas OBGYN was listed as being at high risk of being sued. The AMA report did not mention payment amounts.

# Average claim payment per specialty
sp_am01 <- mm %>%
  group_by(Specialty) %>%
  summarise(Amount = mean(Amount))

ggplot(sp_am01, aes(x = Specialty, y = Amount)) +
  geom_col() +
  scale_y_continuous(labels = scales::comma) +
  labs(x = 'Physician Specialty', y = 'Average Payment Amount') +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
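
The difference between ranking by total and ranking by average comes down to claim counts, since total = number of claims × average. A minimal base R sketch on made-up numbers shows how the two rankings can disagree; the specialties "A" and "B" and their amounts are hypothetical.

```r
# Hypothetical claims: specialty A has many small claims, B has one large one.
toy <- data.frame(
  Specialty = c("A", "A", "A", "A", "B"),
  Amount    = c(100, 120, 80, 100, 300)
)

# Count, total, and average per specialty.
n     <- tapply(toy$Amount, toy$Specialty, length)
total <- tapply(toy$Amount, toy$Specialty, sum)
avg   <- tapply(toy$Amount, toy$Specialty, mean)
cbind(n, total, avg)

# The specialty with the largest total need not have the largest average.
top_by_total <- names(which.max(total))
top_by_avg   <- names(which.max(avg))
c(top_by_total, top_by_avg)
```

This is why a specialty with many suits can top the total chart while a specialty with few but large payments tops the average chart.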

Modeling

Unadjusted Model

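Without adjusting for other variables, having a private attorney is statistically significant and is associated with an estimated $106,848 higher payment: an expected payment of about $86,870 without a private attorney versus about $193,718 with one. The model explains only about 7% of the variation in payment amount (R-squared of 0.069).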

pa <- lm(Amount ~ Private.Attorney, data = mm)
summary(pa)
## 
## Call:
## lm(formula = Amount ~ Private.Attorney, data = mm)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -183213 -103752  -59154   13949  837359 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         86870       1137   76.38   <2e-16 ***
## Private.Attorney   106848       1399   76.38   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 186400 on 79208 degrees of freedom
## Multiple R-squared:  0.06859,    Adjusted R-squared:  0.06858 
## F-statistic:  5833 on 1 and 79208 DF,  p-value: < 2.2e-16
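
With a single 0/1 predictor, the intercept is the mean of the 0 group and the slope is the difference between the two group means. The sketch below checks this on a small made-up data frame (`toy` is hypothetical, not the Kaggle data) and then applies the same reading to the estimates reported above.

```r
# Hypothetical data: three claims without and three with a private attorney.
toy <- data.frame(
  Private.Attorney = c(0, 0, 0, 1, 1, 1),
  Amount           = c(50, 70, 90, 150, 170, 200)
)

fit         <- lm(Amount ~ Private.Attorney, data = toy)
group_means <- tapply(toy$Amount, toy$Private.Attorney, mean)

# Intercept = mean of group 0; slope = mean(group 1) - mean(group 0).
intercept <- coef(fit)[["(Intercept)"]]
slope     <- coef(fit)[["Private.Attorney"]]
c(intercept = intercept, slope = slope)
group_means

# Same reading applied to the reported estimates: expected payment
# for a patient represented by a private attorney.
expected_with_attorney <- 86870 + 106848
expected_with_attorney
```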

Adjusted Model

The following model adjusts for the severity of damage perceived by the patient, the type of insurance, and the age of the patient. After adjusting for these potential confounders, having a private attorney is still associated with a higher expected payment. However, the estimated increase is $58,386.75, a little over half (about 55%) of what was seen in the unadjusted model.

multiple_model <- lm(Amount ~ severe_range + Private.Attorney + Insurance + Age, data = mm)
summary(multiple_model)
## 
## Call:
## lm(formula = Amount ~ severe_range + Private.Attorney + Insurance + 
##     Age, data = mm)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -316033  -98902  -27502   36410  847054 
## 
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    35259.91    2373.91  14.853  < 2e-16 ***
## severe_range(3,6]              38296.16    1438.76  26.617  < 2e-16 ***
## severe_range(6,9]             131951.47    1646.46  80.143  < 2e-16 ***
## Private.Attorney               58386.75    1382.09  42.245  < 2e-16 ***
## InsuranceNo Insurance          52506.23    2519.38  20.841  < 2e-16 ***
## InsurancePrivate              113315.43    1895.56  59.779  < 2e-16 ***
## InsuranceUnknown               18276.72    1975.18   9.253  < 2e-16 ***
## InsuranceWorkers Compensation -27301.63    4193.26  -6.511 7.52e-11 ***
## Age                             -543.35      30.91 -17.580  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 170600 on 79201 degrees of freedom
## Multiple R-squared:  0.2197, Adjusted R-squared:  0.2197 
## F-statistic:  2788 on 8 and 79201 DF,  p-value: < 2.2e-16
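
To make the coefficients concrete, the fitted equation can be evaluated by hand for one patient profile. This is a sketch only: the profile (severity in (6,9], private insurance, age 50) is a hypothetical example, and the vector `b` simply re-enters estimates from the output above under illustrative names.

```r
# Coefficients copied from the summary(multiple_model) output above.
b <- c(intercept   = 35259.91,
       sev_6_9     = 131951.47,
       attorney    = 58386.75,
       ins_private = 113315.43,
       age         = -543.35)

# Hypothetical patient: severity in (6,9], private insurance, age 50.
without_attorney <- b[["intercept"]] + b[["sev_6_9"]] +
  b[["ins_private"]] + b[["age"]] * 50
with_attorney    <- without_attorney + b[["attorney"]]
c(without = without_attorney, with = with_attorney)

# Adjusted attorney estimate as a share of the unadjusted estimate (106,848).
ratio <- b[["attorney"]] / 106848
round(ratio, 3)
```

Because the model has no interaction terms, the gap between the two predictions equals the `Private.Attorney` coefficient for any profile.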

Future directions

I would like to look further into geographic location if such data are available, since payment amounts may differ between suits filed in urban and rural areas. I would also like to look at suits in the dental field across the different dental specialties.