library(scatterplot3d)
## Warning: package 'scatterplot3d' was built under R version 3.4.4
AR is modeled using two simple difference equations, with ARI representing the portion of AR that is out to insurance companies, and ARP representing the portion of AR that is out to patients. Patients without insurance are assumed to pay at the time of service and do not contribute to AR in this model.
ARI is the value of daily production that is attributed to patients with insurance, \(\alpha production[t]\), plus the ARI from the previous time step, \(ARI[t]\), minus the value of claims for which EOBs are received, \(\beta ARI[t]\). The value of \(\alpha\) is a point estimate while the value of \(\beta\) is a distribution.
\[ ARI[t+1] = \alpha production[t] + ARI[t] - \beta ARI[t] \]
ARP is the value of claims for which EOBs are received, \(\beta ARI[t]\), which causes bills to be sent to patients, minus the payments received from patients, \(\delta ARP[t]\). The value of \(\gamma\) is a point estimate for the proportion of insurance production that is written off for PPO adjustments, and the value of \(\delta\) is a distribution.
\[ ARP[t+1] = ARP[t] + (\beta ARI[t])(1 - \gamma) - \delta ARP[t] \]
Total AR is the sum of these values.
The number of business days to receive an EOB after a procedure date and the number of days to receive a patient payment after receiving an EOB were recorded for all claims from random samples of days.
setwd('C:/Users/Jack/Documents/MS Stats/568 Industrial')
myData <- read.csv('DistData.csv')
head(myData)
eob <- na.omit(myData$EOBdays)
pat <- na.omit(myData$PATdays)
length(eob)
## [1] 38
length(pat)
## [1] 23
Four days were sampled for EOB times from 01/01/2018-04/30/2018 to generate 38 sample units. Eight days were sample for patient payment times from 09/01/2018-12/31/2018 to generate 23 sample units.
For time until payment, the data are naturally positive and skewed, making the log-normal distribution a natural choice. The parameters for log-normal distributions of days for each process are estimated using the maximum likelihood estimators of the mean and the standard deviation of the log-transformed data.
(mean.eob <- mean(log(eob)))
## [1] 3.119234
(sd.eob <- sd(log(eob)))
## [1] 0.360731
mean(eob)
## [1] 24.15789
mean(eob) + 1.96*sqrt(var(eob)/length(eob))
## [1] 27.14837
mean(eob) - 1.96*sqrt(var(eob)/length(eob))
## [1] 21.16742
REMOVE LATER
set.seed(1)
eob <- rlnorm(38, log(35), .6)
(mean.eob <- mean(log(eob)))
## [1] 3.584051
(sd.eob <- sd(log(eob)))
## [1] 0.5322896
For EOB times, the mean of the ln(number of days) is 3.1192, and the standard deviation of the log(number of days) is 0.3607.
myBreaks <- seq(from = 0, to = 365, by = 10)
x <- seq(0, 365, by = 0.01)
hist(eob,
breaks = myBreaks,
freq = FALSE,
ylim = c(0, 0.06),
col = 'grey',
xlab = 'Number of days',
main = 'Time to receive EOB'
)
lines(x,
dlnorm(x, mean.eob, sd.eob),
col = 'deepskyblue3',
lwd = 2
)
The line shows the probability density function for a log-normal distribution parameteerized using the estimates of the mean and standard deviation from the data above. The fit seems adequate.
The probability of payment is just the inverse of the number of days to get paid:
beta <- 1/eob
mean(beta)
## [1] 0.03232886
This is the distribution that will be sampled at each time step for a value of \(\beta\).
(mean.pat <- mean(log(pat)))
## [1] 3.231312
(sd.pat <- sd(log(pat)))
## [1] 0.8276991
REMOVE LATER
set.seed(1)
pat <- rlnorm(38, log(60), .6)
(mean.pat <- mean(log(pat)))
## [1] 4.123047
(sd.pat <- sd(log(pat)))
## [1] 0.5322896
For patient payment times, the mean of the ln(number of days) is 3.2313, and the standard deviation of the log(number of days) is 0.8277. The two means are very similar, probably times are subject to similar processes of sending correspondence and receiving mail. The standard deviation for the patient payments, however, is almost three times larger than for the EOBs.
myBreaks <- seq(from = 0, to = 365, by = 10)
x <- seq(0, 365, 0.01)
hist(pat,
breaks = myBreaks,
freq = FALSE,
ylim = c(0, 0.06),
col = 'grey',
xlab = 'Number of days',
main = 'Time to receive patient payment'
)
lines(x,
dlnorm(x, mean.pat, sd.pat),
col = 'deepskyblue3',
lwd = 2
)
The fit here is not as good, but the estimates are used to parameterize the model anyway because of the logical connections between the process and the log-normal distribution.
The probability of payment is just the inverse of the number of days to get paid:
delta <- 1/pat
mean(delta)
## [1] 0.0188585
mean(pat)
## [1] 69.94152
mean(pat) + 1.96*sqrt(var(pat)/length(pat))
## [1] 80.69921
mean(pat) - 1.96*sqrt(var(pat)/length(pat))
## [1] 59.18383
This is the distribution that will be sampled at each time step for a value of \(\delta\).
hist(delta)
The collection ratio, \(\alpha\) was estimated by comparing gross production for patients with insurance to total gross production for the period 01/01/2018-04/30/2018.
\[ \hat\alpha = \frac{148904.06}{369328.69} = 0.4032 \] The proportion of gross insurance production that is written off for PPO price contracts, \(\gamma\) was estimated by comparing the total PPO writeoffs to the gross production for patients with insurance for the period 01/01/2018-04/30/2018.
\[ \hat\gamma = \frac{58402.24}{148904.06} = 0.3922 \]
Total daily gross production data was retrieved for the period 01/01/2017-04/30/2018.
daily.prod <- read.csv('DailyProd.csv', header = TRUE)
head(daily.prod)
prod <- as.numeric(daily.prod$Production)
hist(prod, col = 'grey', xlab = 'Dollars', freq = FALSE, main = 'Daily gross production')
There are many small production days because the office typically only sees emergencies on Fridays.
By ignoring production days lower than $1000, the distribution appears to be fit well by a log-normal distribution.
hist(prod[prod > 1000],
col = 'grey',
xlab = 'Dollars',
freq = FALSE,
main = 'Daily gross production',
ylim = c(0, 0.0003)
)
x <- seq(1000, 14000, 1)
lines(x,
dlnorm(x, mean(log(prod[prod > 1000])), sd(log(prod[prod > 1000]))),
col = 'deepskyblue3',
lwd = 2
)
The model is run using the historical production time series with the parameters estimated above.
set.seed(1)
days <- length(prod) # model will run over the historical data for production
alpha <- 0.4032
beta.draws <- sample(beta, size = days, replace = TRUE)
gamma <- 0.3922
delta.draws <- sample(delta, size = days, replace = TRUE)
ARI <- rep(NA, days)
ARP <- rep(NA, days)
ARI[1] <- 50000
ARP[1] <- 50000
for(i in 1:days){ARI[i + 1] <- prod[i] * alpha + ARI[i] - beta.draws[i] * ARI[i]}
for(i in 1:days){ARP[i + 1] <- ARP[i] + ((beta.draws[i] * ARI[i])*(1 - gamma)) - delta.draws[i] * ARP[i]}
AR <- ARI + ARP
plot(1:(days + 1), AR, xlab = 'Day')
This gives a picture of the current situation. The actual AR is close to the model results, so the model seems adequate.
round(
c(quantile(AR[50:298], probs = 0.05, na.rm = TRUE),
Mean = mean(AR[50:298]),
quantile(AR[50:298], probs = 0.95, na.rm = TRUE)
)
)
## 5% Mean 95%
## 88908 99011 105545
Next the model is run with target parameters, to get a distribution for AR with improved processes.
set.seed(1)
newprod <- sample(as.numeric(daily.prod$Production), 2600, replace = TRUE)
days <- length(newprod)
alpha <- 0.4032 # this cannot change without a change in the dental practice overall strategy
beta.draws <- 1 / rlnorm(days, log(35), .3607)
gamma <- 0.3922 # this cannot change without a change in the dental practice overall strategy
delta.draws <- 1 / rlnorm(days, log(60), .8277)
ARI <- rep(NA, days)
ARP <- rep(NA, days)
ARI[1] <- 40000
ARP[1] <- 40000
for(i in 1:days){ARI[i + 1] <- newprod[i] * alpha + ARI[i] - beta.draws[i] * ARI[i]}
for(i in 1:days){ARP[i + 1] <- ARP[i] + ((beta.draws[i] * ARI[i]) * (1 - gamma)) - delta.draws[i] * ARP[i]}
AR1 <- ARI + ARP
for(i in 1:days){ARI[i + 1] <- newprod[i] * alpha + ARI[i] - beta.draws[i] * ARI[i]}
AR2 <- ARI
plot(1:(days + 1), AR1, type = 'l', ylim = c(0, 120000), xlab = 'Day', ylab = 'AR', col = 'red')
lines(1:(days + 1), AR2, col = 'darkgreen')
legend('bottomright', legend = list('Status quo', 'Autopay'), fill = c('red', 'darkgreen'))
round(
c(quantile(AR, probs = 0.05, na.rm = TRUE),
Mean = mean(AR),
quantile(AR, probs = 0.95, na.rm = TRUE)
)
)
## 5% Mean 95%
## 89515 101000 109925
ARsim <- function(n, mn.eob, sd.eob, mn.pat, sd.pat){
newprod <- sample(as.numeric(daily.prod$Production[daily.prod$Production > 0]), n, replace = TRUE)
days <- length(newprod)
alpha <- 0.4032 # this cannot change without a change in the dental practice overall strategy
beta.draws <- 1 / rlnorm(days, mn.eob, sd.eob)
gamma <- 0.3922 # this cannot change without a change in the dental practice overall strategy
delta.draws <- 1 / rlnorm(days, mn.pat, sd.pat)
ARI <- rep(NA, days)
ARP <- rep(NA, days)
ARI[1] <- 40000
ARP[1] <- 40000
for(i in 1:days){ARI[i + 1] <- newprod[i] * alpha + ARI[i] - beta.draws[i] * ARI[i]}
for(i in 1:days){ARP[i + 1] <- ARP[i] + ((beta.draws[i] * ARI[i]) * (1 - gamma)) - delta.draws[i] * ARP[i]}
AR <- ARI + ARP
round(
c(quantile(AR, probs = 0.05, na.rm = TRUE),
Mean = mean(AR),
quantile(AR, probs = 0.95, na.rm = TRUE)
)
)
}
set.seed(1)
ARsim(n = 2600, mn.eob = log(35), sd.eob = .3607, mn.pat = log(60), sd.pat = .8277)
## 5% Mean 95%
## 88313 101731 113794
ARsim(n = 2600, mn.eob = log(35), sd.eob = .3607, mn.pat = 0, sd.pat = .8277)
## 5% Mean 95%
## 50249 58782 68170
ARsim(n = 2600, mn.eob = log(35), sd.eob = .3607, mn.pat = log(60), sd.pat = .8277)[2] -
ARsim(n = 2600, mn.eob = log(35), sd.eob = .3607, mn.pat = 0, sd.pat = .8277)[2]
## Mean
## 44874
mn.eob <- ((7*2):(7*8))
mn.pat <- ((7*2):(7*8))
sim.pars <- expand.grid(list(mn.eob, mn.pat))
nrow(sim.pars)
## [1] 1849
for(i in 1:nrow(sim.pars)){
sim.pars[i, 3] <- ARsim(n = 2600, mn.eob = log(sim.pars[i, 1]), sd.eob = .3607, mn.pat = log(sim.pars[i, 2]), sd.pat = .8277)[2]
}
s3d <-scatterplot3d(sim.pars, pch = 16, highlight.3d = TRUE,
type = "p", main = "",
xlab = 'Mean days EOB',
ylab = 'Mean days patient',
zlab = 'AR x 1000',
angle = 70,
z.ticklabs = seq(0, 120, by = 20)
)
fit <- lm(sim.pars[, 3] ~ sim.pars[, 1] + sim.pars[, 2])
s3d$plane3d(fit)
summary(lm(sim.pars[, 3] ~ sim.pars[, 1] + sim.pars[, 2] + sim.pars[, 1]:sim.pars[, 2]))
##
## Call:
## lm(formula = sim.pars[, 3] ~ sim.pars[, 1] + sim.pars[, 2] +
## sim.pars[, 1]:sim.pars[, 2])
##
## Residuals:
## Min 1Q Median 3Q Max
## -4324.2 -730.0 0.0 721.7 5297.7
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1117.87183 240.05892 4.657 3.44e-06 ***
## sim.pars[, 1] 1615.04464 6.46451 249.832 < 2e-16 ***
## sim.pars[, 2] 755.93738 6.46451 116.937 < 2e-16 ***
## sim.pars[, 1]:sim.pars[, 2] -0.05868 0.17408 -0.337 0.736
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1153 on 1845 degrees of freedom
## Multiple R-squared: 0.9973, Adjusted R-squared: 0.9973
## F-statistic: 2.264e+05 on 3 and 1845 DF, p-value: < 2.2e-16
Every increase in mean time to receive an insurance payment by one day increases the AR by $2,361. Every increase in mean time to receive a patient payment by one day increases the AR by $747.
ARsim(n = 2600, mn.eob = 3.1192, sd.eob = .3607, mn.pat = 0, sd.pat = 0)
## 5% Mean 95%
## 32162 38364 44931