FYI: If a caller was told to go to the ED, we set their business days until appointment to 0.
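A minimal sketch of this recoding rule in Python, assuming each call is a dict keyed by the `told_to_go_to_the_emergency_department` and `business_days_until_appointment` columns that appear later in the data. The function name `recode_ed_referrals` and the two sample rows are hypothetical and are not study data.

```python
def recode_ed_referrals(calls):
    """Return a copy of calls in which ED referrals have a wait of 0 business days."""
    recoded = []
    for call in calls:
        row = dict(call)  # copy so the input rows are left untouched
        if row.get("told_to_go_to_the_emergency_department") == "Yes":
            row["business_days_until_appointment"] = 0
        recoded.append(row)
    return recoded


# Hypothetical rows for illustration only; these are not study data.
example_calls = [
    {"told_to_go_to_the_emergency_department": "Yes", "business_days_until_appointment": None},
    {"told_to_go_to_the_emergency_department": "No", "business_days_until_appointment": 12},
]
recoded_calls = recode_ed_referrals(example_calls)
```

Because ED referrals are counted as zero-day waits, they can pull the lower quartile of the wait-time distribution toward 0.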
NPI | record_id | N |
---|---|---|
NA | NA | NA |
NPI | reason_for_exclusions | insurance | business_days_until_appointment |
---|---|---|---|
NA | NA | NA | NA |
NPI | calls_count |
---|---|
NA | NA |
NPI | id_number | reason_for_exclusions | business_days_until_appointment |
---|---|---|---|
NA | NA | NA | NA |
NPI | id_number | reason_for_exclusions | business_days_until_appointment |
---|---|---|---|
NA | NA | NA | NA |
NPI | id_number | reason_for_exclusions | business_days_until_appointment |
---|---|---|---|
NA | NA | NA | NA |
The data are not normally distributed, and they are count data. A t-test assumes normally distributed data, and comparing the means of count data is not appropriate. Instead, we can use Poisson regression and compare business_days_until_appointment across the categories of insurance with the incidence rate ratio.
This Q-Q plot displays the distribution of the business_days_until_appointment variable against a theoretical normal distribution. Here’s an interpretation based on the plot’s characteristics:
Heavy Right Tail (Positive Skew): The data points deviate upward from the reference line on the right side, indicating that the business_days_until_appointment distribution has a heavy right tail or positive skew. This suggests that while most appointments are scheduled within a typical range, there are a few cases where the wait time is significantly longer.
Departure from Normality: The points deviate from the reference line at both ends, especially at the upper end (right tail). This indicates that the data does not follow a normal distribution closely. Instead, it appears to have a skewed, possibly exponential or log-normal distribution, given the pattern of points rising sharply at higher values.
Outliers: The data point at the top right, well above the line, is likely an outlier with a much longer wait time than the majority. This extreme value contributes to the non-normality and might need consideration, depending on the analysis goals.
In summary, the business_days_until_appointment variable is not normally distributed and shows positive skewness with some outliers, especially toward longer wait times.
## Starting normality check and summary calculation for variable: business_days_until_appointment
## Data extracted for variable: business_days_until_appointment
## Shapiro-Wilk normality test completed with p-value: 0.0000000000000000000000000000023401989208789
## The p-value is less than or equal to 0.05, indicating that the data is not normally distributed.
## Histogram with Density Plot created.
## Q-Q Plot created.
## Data is NOT normally distributed. Use non-parametric measures like median: 8, IQR: 26
## $median
## [1] 8
##
## $iqr
## [1] 26
## Summary calculation completed for variable: business_days_until_appointment
In interpreting this output:
For count data (business_days_until_appointment), Poisson regression is indeed more suitable than a Kruskal-Wallis test. The Kruskal-Wallis test would only indicate whether there is a statistically significant difference across groups in insurance; it would not provide specific information on the effect size or direction of differences, which the Poisson model offers.
The coefficient for Medicaid, -0.008725, suggests a slight (but statistically insignificant) reduction in the log count of days until the appointment for Medicaid patients compared to the baseline insurance group. The p-value of 0.659 shows this effect is not statistically significant, meaning we don’t have enough evidence to conclude that Medicaid influences wait time compared to the baseline insurance category.
The null and residual deviances are nearly identical (both 16741), which indicates that adding insurance as a predictor does not improve the model’s fit substantially. This suggests that insurance may not be a strong predictor of business_days_until_appointment.
Overall, insurance type does not significantly influence the wait time for an appointment (business_days_until_appointment) based on the p-value and the similarity in deviance values.
In summary, while Poisson regression provides more detailed insights than a Kruskal-Wallis test, this model suggests that insurance type does not significantly affect the wait time for an appointment.
##
## Call:
## glm(formula = business_days_until_appointment ~ as.factor(insurance),
## family = "poisson", data = df)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.893030 0.012880 224.615 <0.0000000000000002
## as.factor(insurance)Medicaid -0.008725 0.019781 -0.441 0.659
##
## (Intercept) ***
## as.factor(insurance)Medicaid
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 16741 on 581 degrees of freedom
## Residual deviance: 16741 on 580 degrees of freedom
## (558 observations deleted due to missingness)
## AIC: 18609
##
## Number of Fisher Scoring iterations: 6
## The expected business_days_until_appointment for the reference category (BCBS), obtained by exponentiating the intercept, is 18.05 days, with a 95% confidence interval ranging from 17.6 to 18.51. For Medicaid compared to the reference category (BCBS), the incidence rate ratio is approximately 0.99, with a 95% confidence interval between 0.95 and 1.03. The estimated waiting time for an appointment is therefore about 1% shorter for Medicaid patients than for those with BCBS insurance, a difference that is not statistically significant (p = 0.659).
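The figures quoted from this model can be reproduced from the coefficient table by exponentiating each estimate and its Wald interval. The sketch below does this in Python with the printed estimates and standard errors. The helper name `rate_ratio_with_ci` is hypothetical, and the 1.96 multiplier assumes a normal-approximation 95% interval.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% Wald interval (assumption)


def rate_ratio_with_ci(estimate, std_error, z=Z_95):
    """Exponentiate a log-scale Poisson coefficient and its Wald interval."""
    point = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return point, lower, upper


# Estimates and standard errors copied from the glm summary above.
baseline = rate_ratio_with_ci(2.893030, 0.012880)   # expected days for the reference group
medicaid = rate_ratio_with_ci(-0.008725, 0.019781)  # Medicaid vs. reference rate ratio

print("Baseline days: %.2f (%.2f to %.2f)" % baseline)
print("Medicaid IRR:  %.2f (%.2f to %.2f)" % medicaid)
```

An interval for the rate ratio that contains 1 is consistent with the non-significant p-value in the summary.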
## # A tibble: 139 × 6
## state income_quartile income_range physicians_in_quartile total_physicians
## <chr> <chr> <chr> <int> <int>
## 1 AK Q3 $61,107 - $81… 8 8
## 2 AL Q1 (Lowest) $7,609 - $42,… 6 13
## 3 AL Q4 (Highest) $66,830 - $17… 5 13
## 4 AR Q2 $40,891 - $50… 2 8
## 5 AR Q4 (Highest) $61,128 - $25… 2 8
## 6 AZ Q2 $46,851 - $62… 6 35
## 7 AZ Q3 $62,108 - $80… 12 35
## 8 AZ Q4 (Highest) $80,324 - $17… 13 35
## 9 CA Q1 (Lowest) $2,499 - $63,… 24 68
## 10 CA Q2 $63,980 - $86… 10 68
## 11 CA Q3 $86,216 - $11… 16 68
## 12 CA Q4 (Highest) $114,510 - $2… 12 68
## 13 CO Q1 (Lowest) $18,125 - $57… 8 29
## 14 CO Q2 $57,934 - $75… 3 29
## 15 CO Q3 $75,458 - $97… 10 29
## 16 CO Q4 (Highest) $97,288 - $17… 6 29
## 17 CT Q1 (Lowest) $14,852 - $79… 6 38
## 18 CT Q2 $79,811 - $10… 8 38
## 19 CT Q3 $101,458 - $1… 14 38
## 20 CT Q4 (Highest) $121,560 - $2… 10 38
## 21 DE Q3 $77,454 - $90… 2 4
## 22 FL Q1 (Lowest) $12,894 - $53… 26 70
## 23 FL Q2 $53,856 - $66… 9 70
## 24 FL Q3 $66,302 - $82… 13 70
## 25 FL Q4 (Highest) $82,948 - $25… 18 70
## 26 GA Q1 (Lowest) $14,306 - $46… 2 35
## 27 GA Q2 $46,696 - $59… 4 35
## 28 GA Q3 $59,036 - $75… 3 35
## 29 GA Q4 (Highest) $75,180 - $25… 13 35
## 30 HI Q1 (Lowest) $35,221 - $73… 10 18
## 31 HI Q2 $73,118 - $87… 2 18
## 32 HI Q3 $87,926 - $10… 2 18
## 33 IA Q1 (Lowest) $17,452 - $60… 8 16
## 34 IA Q3 $69,375 - $81… 6 16
## 35 IA Q4 (Highest) $81,250 - $16… 2 16
## 36 ID Q4 (Highest) $76,010 - $14… 2 6
## 37 IL Q1 (Lowest) $12,663 - $57… 6 49
## 38 IL Q2 $57,692 - $70… 6 49
## 39 IL Q3 $70,304 - $87… 20 49
## 40 IL Q4 (Highest) $87,349 - $25… 15 49
## 41 IN Q1 (Lowest) $22,677 - $56… 2 21
## 42 IN Q2 $56,954 - $66… 2 21
## 43 IN Q3 $66,519 - $78… 8 21
## 44 IN Q4 (Highest) $78,594 - $15… 5 21
## 45 KS Q2 $53,314 - $63… 4 11
## 46 KS Q3 $63,327 - $76… 4 11
## 47 KS Q4 (Highest) $76,673 - $20… 3 11
## 48 KY Q1 (Lowest) $2,499 - $41,… 3 17
## 49 KY Q2 $41,636 - $52… 2 17
## 50 KY Q3 $52,804 - $65… 6 17
## 51 KY Q4 (Highest) $65,042 - $25… 6 17
## 52 LA Q4 (Highest) $67,596 - $16… 10 22
## 53 MA Q1 (Lowest) $20,202 - $78… 12 39
## 54 MA Q2 $78,849 - $10… 13 39
## 55 MA Q4 (Highest) $126,844 - $2… 12 39
## 56 MD Q1 (Lowest) $19,722 - $77… 2 30
## 57 MD Q2 $77,875 - $10… 12 30
## 58 MD Q3 $100,573 - $1… 6 30
## 59 MD Q4 (Highest) $130,008 - $2… 6 30
## 60 ME Q1 (Lowest) $21,161 - $52… 2 6
## 61 ME Q2 $52,258 - $61… 2 6
## 62 ME Q3 $61,470 - $77… 2 6
## 63 MI Q2 $53,727 - $63… 8 39
## 64 MI Q3 $63,472 - $78… 8 39
## 65 MI Q4 (Highest) $78,063 - $18… 13 39
## 66 MN Q1 (Lowest) $14,107 - $62… 2 20
## 67 MN Q2 $62,500 - $72… 8 20
## 68 MN Q3 $72,469 - $87… 4 20
## 69 MN Q4 (Highest) $87,688 - $22… 6 20
## 70 MO Q1 (Lowest) $2,499 - $48,… 2 41
## 71 MO Q2 $48,556 - $58… 2 41
## 72 MO Q3 $58,333 - $71… 22 41
## 73 MO Q4 (Highest) $71,942 - $25… 11 41
## 74 MS Q1 (Lowest) $2,499 - $36,… 4 18
## 75 MS Q2 $36,698 - $46… 2 18
## 76 MS Q3 $46,736 - $58… 4 18
## 77 MS Q4 (Highest) $58,120 - $16… 6 18
## 78 MT Q3 $61,250 - $74… 2 4
## 79 NC Q1 (Lowest) $2,499 - $49,… 6 38
## 80 NC Q2 $49,157 - $59… 4 38
## 81 NC Q3 $59,413 - $72… 8 38
## 82 NC Q4 (Highest) $72,752 - $21… 18 38
## 83 NE Q1 (Lowest) $23,393 - $58… 1 21
## 84 NE Q2 $58,438 - $68… 6 21
## 85 NE Q3 $68,125 - $80… 2 21
## 86 NE Q4 (Highest) $80,078 - $17… 12 21
## 87 NH Q1 (Lowest) $31,750 - $72… 2 2
## 88 NJ Q1 (Lowest) $23,780 - $84… 8 30
## 89 NJ Q2 $84,466 - $10… 2 30
## 90 NJ Q3 $106,339 - $1… 12 30
## 91 NJ Q4 (Highest) $138,235 - $2… 8 30
## 92 NM Q1 (Lowest) $16,096 - $37… 2 14
## 93 NM Q4 (Highest) $63,839 - $25… 10 14
## 94 NV Q2 $55,935 - $75… 6 28
## 95 NV Q3 $75,089 - $93… 6 28
## 96 NY Q1 (Lowest) $2,499 - $61,… 3 40
## 97 NY Q3 $76,046 - $10… 13 40
## 98 NY Q4 (Highest) $101,250 - $2… 22 40
## 99 OH Q2 $53,634 - $65… 6 37
## 100 OH Q3 $65,619 - $79… 9 37
## 101 OH Q4 (Highest) $79,724 - $25… 18 37
## 102 OK Q2 $46,944 - $55… 2 14
## 103 OK Q3 $55,341 - $67… 8 14
## 104 OK Q4 (Highest) $67,915 - $17… 4 14
## 105 OR Q1 (Lowest) $2,499 - $55,… 5 13
## 106 OR Q3 $66,506 - $85… 4 13
## 107 OR Q4 (Highest) $85,440 - $15… 4 13
## 108 PA Q1 (Lowest) $14,319 - $55… 6 42
## 109 PA Q2 $55,979 - $67… 6 42
## 110 PA Q3 $67,750 - $82… 10 42
## 111 PA Q4 (Highest) $82,341 - $25… 18 42
## 112 SC Q2 $45,084 - $55… 4 8
## 113 SC Q3 $55,095 - $69… 2 8
## 114 SC Q4 (Highest) $69,025 - $17… 2 8
## 115 SD Q1 (Lowest) $2,499 - $55,… 2 2
## 116 TN Q1 (Lowest) $2,499 - $47,… 6 35
## 117 TN Q2 $47,208 - $56… 9 35
## 118 TN Q3 $56,804 - $69… 10 35
## 119 TN Q4 (Highest) $69,372 - $18… 10 35
## 120 TX Q1 (Lowest) $2,499 - $52,… 2 58
## 121 TX Q2 $52,264 - $64… 8 58
## 122 TX Q3 $64,792 - $82… 19 58
## 123 TX Q4 (Highest) $82,312 - $25… 19 58
## 124 UT Q1 (Lowest) $16,685 - $62… 4 6
## 125 UT Q2 $62,464 - $74… 2 6
## 126 VA Q1 (Lowest) $4,016 - $53,… 2 19
## 127 VA Q2 $53,246 - $70… 4 19
## 128 VA Q4 (Highest) $96,605 - $25… 11 19
## 129 VT Q4 (Highest) $87,078 - $14… 4 4
## 130 WA Q1 (Lowest) $26,823 - $62… 8 26
## 131 WA Q2 $62,500 - $76… 2 26
## 132 WA Q3 $76,683 - $96… 4 26
## 133 WA Q4 (Highest) $96,627 - $25… 4 26
## 134 WI Q1 (Lowest) $17,746 - $61… 12 22
## 135 WI Q4 (Highest) $82,467 - $14… 10 22
## 136 WV Q4 (Highest) $62,942 - $25… 8 8
## 137 WY Q1 (Lowest) $25,809 - $57… 2 4
## 138 WY Q2 $57,412 - $69… 2 4
## 139 US Q4 (Highest) $85,313 - $25… 38 1140
## # ℹ 1 more variable: percent_in_quartile <dbl>
These acceptance rates reflect the proportion of physicians who were successfully contacted, accepted the respective insurance, and provided an appointment to the patient.
Medicaid Acceptance Rate: Of the physicians assigned Medicaid insurance (573), 179 accepted Medicaid and provided an appointment. The reported acceptance rate is 75.2% (n = 179 / N = 238).
Blue Cross/Blue Shield Acceptance Rate: Of the physicians assigned Blue Cross/Blue Shield insurance (567), 238 accepted this insurance and provided an appointment. The reported acceptance rate is 73% (n = 238 / N = 326).
scenario_type | n | percent |
---|---|---|
Emergent | 70 | 43.47826 |
Urgent | 91 | 56.52174 |
## For the 161 patients who were told to go to the Emergency Department, 43.5% were in the Emergent scenario type (n = 70 / N = 161) and 56.5% were in the Urgent scenario type (n = 91 / N = 161).
## Our sample included 1140 calls to physician offices from 49 states excluding North Dakota and Rhode Island. We made calls to 567 unique physicians that accepted Blue Cross/Blue Shield. One hundred seventy-nine physician offices accepted Medicaid, giving a 75.2% Medicaid acceptance rate for OBGYN practices (n = 179 / N = 238). Physician offices accepted Blue Cross/Blue Shield at a rate of 73% (n = 238 / N = 326).
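As a quick arithmetic check, the quoted percentages follow from the n / N pairs in the sentence above. A small Python sketch, in which the helper `percent` is hypothetical:

```python
def percent(n, total, digits=1):
    """Return n / total as a percentage rounded to `digits` decimals."""
    return round(100 * n / total, digits)


medicaid_rate = percent(179, 238)  # Medicaid: n = 179, N = 238
bcbs_rate = percent(238, 326)      # Blue Cross/Blue Shield: n = 238, N = 326

print(f"Medicaid acceptance: {medicaid_rate}%")
print(f"Blue Cross/Blue Shield acceptance: {bcbs_rate}%")
```

The denominators 238 and 326 are smaller than the 573 and 567 calls assigned to each insurance type, so these rates are computed over a subset of the assigned calls.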
## # A tibble: 6 × 34
## NPI age age_category gender Med_sch Grd_yr academic ACOG_District
## <dbl> <dbl> <ord> <fct> <fct> <dbl> <fct> <fct>
## 1 1265759062 53 50 to 59 years … Female US Sen… 2010 Private… District V
## 2 1265759062 53 50 to 59 years … Female US Sen… 2010 Private… District V
## 3 1083000731 36 Less than 40 ye… Female Intern… 2015 Private… District II
## 4 1083000731 36 Less than 40 ye… Female Intern… 2015 Private… District II
## 5 1144207358 51 50 to 59 years … Female US Sen… 1998 Private… District V
## 6 1144207358 51 50 to 59 years … Female US Sen… 1998 Private… District V
## # ℹ 26 more variables: cbsatype10 <fct>, scenario <fct>, scenario_type <fct>,
## # insurance <fct>, including_this_physician_in_the_study <fct>,
## # told_to_go_to_the_emergency_department <fct>,
## # offered_a_clinic_appointment_to_be_seen <fct>, reason_for_exclusions <fct>,
## # central_number <fct>, number_of_transfers <fct>, call_time_minutes <dbl>,
## # hold_time_minutes <dbl>, Provider.Enumeration.Date <dbl>,
## # day_of_the_week <ord>, business_days_until_appointment <dbl>, …
## # A tibble: 2 × 3
## insurance n percent
## <fct> <int> <dbl>
## 1 Blue Cross/Blue Shield 567 49.7
## 2 Medicaid 573 50.3
## # A tibble: 2 × 3
## insurance n percent
## <fct> <dbl> <dbl>
## 1 Blue Cross/Blue Shield 567 49.7
## 2 Medicaid 573 50.3
The median physician age was 53 (IQR: 25th percentile 44 to 75th percentile 61).
Wait Time with single predictor
Median_business_days_until_appointment | Q1 | Q3 |
---|---|---|
8 | 0 | 26 |
The median wait time across all insurance types was 8 business days, with an interquartile range (IQR) of 0 to 26.
Use tyler::generate_latex_equation functions.
\[ \begin{align*} P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\ \sqrt{\lambda} &= \beta_0 \\ & + \beta_1 \cdot \underline{\mathbf{\large{\text{Patient Scenario}}}} \\ & + ( 1 | \text{Physician NPI}) \end{align*} \]
## Logging inputs...
## Model Object: glm lm
## Specs: ~scenario | scenario
## Variable of Interest: scenario
## Color By: scenario
## Output Directory: Melanie/Figures
## Y-Axis Min: 12
## Y-Axis Max: 24
## Using existing output directory: Melanie/Figures
## Computing estimated marginal means...
## Logging estimated marginal means data...
## # A tibble: 4 × 6
## scenario rate SE df asymp.LCL asymp.UCL
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Prior trip to ED and was found to have … 19.2 0.366 Inf 18.5 19.9
## 2 Positive pregnancy test after a tubal l… 17.0 0.341 Inf 16.4 17.7
## 3 Acute cystitis 13.4 0.308 Inf 12.8 14.0
## 4 Recurrent/Treatment resistant vaginitis 22.0 0.382 Inf 21.3 22.8
## Range of estimated marginal means with CIs: 12.7995 22.80813
## Creating the plot...
## Plot created successfully.
## Saving plot to: Melanie/Figures/interaction_scenario_comparison_plot_20241111_203055.png
## Plot saved successfully to: Melanie/Figures/interaction_scenario_comparison_plot_20241111_203055.png
## Returning the estimated data and plot object.
## There were 1140 calls made, with scenarios covering 284 Positive pregnancy test after a tubal ligation, 287 Prior trip to ED and was found to have a 6 cm TOA, 282 Acute cystitis, and 287 Recurrent/Treatment resistant vaginitis.
scenario | Median_business_days_until_appointment | Q1 | Q3 |
---|---|---|---|
Prior trip to ED and was found to have a 6 cm TOA | 9 | 0 | 26 |
Positive pregnancy test after a tubal ligation | 9 | 1 | 22 |
Acute cystitis | 2 | 0 | 20 |
Recurrent/Treatment resistant vaginitis | 12 | 1 | 34 |
business_days_until_appointment ~ scenario
\[ \begin{align*} P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\ \sqrt{\lambda} &= \beta_0 \\ & + \beta_1 \cdot \underline{\mathbf{\large{\text{Number of Offices Contacted}}}} \\ & + ( 1 | \text{Physician NPI}) \end{align*} \]
scenario | count |
---|---|
Prior trip to ED and was found to have a 6 cm TOA | 137 |
Positive pregnancy test after a tubal ligation | 144 |
Acute cystitis | 138 |
Recurrent/Treatment resistant vaginitis | 145 |
##
## Call:
## glm(formula = business_days_until_appointment ~ as.factor(scenario),
## family = "poisson", data = df)
##
## Coefficients:
## Estimate
## (Intercept) 2.95360
## as.factor(scenario)Positive pregnancy test after a tubal ligation -0.11759
## as.factor(scenario)Acute cystitis -0.35908
## as.factor(scenario)Recurrent/Treatment resistant vaginitis 0.13955
## Std. Error
## (Intercept) 0.01910
## as.factor(scenario)Positive pregnancy test after a tubal ligation 0.02764
## as.factor(scenario)Acute cystitis 0.02991
## as.factor(scenario)Recurrent/Treatment resistant vaginitis 0.02579
## z value
## (Intercept) 154.663
## as.factor(scenario)Positive pregnancy test after a tubal ligation -4.255
## as.factor(scenario)Acute cystitis -12.007
## as.factor(scenario)Recurrent/Treatment resistant vaginitis 5.411
## Pr(>|z|)
## (Intercept) < 0.0000000000000002
## as.factor(scenario)Positive pregnancy test after a tubal ligation 0.0000209144
## as.factor(scenario)Acute cystitis < 0.0000000000000002
## as.factor(scenario)Recurrent/Treatment resistant vaginitis 0.0000000626
##
## (Intercept) ***
## as.factor(scenario)Positive pregnancy test after a tubal ligation ***
## as.factor(scenario)Acute cystitis ***
## as.factor(scenario)Recurrent/Treatment resistant vaginitis ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 16741 on 581 degrees of freedom
## Residual deviance: 16412 on 578 degrees of freedom
## (558 observations deleted due to missingness)
## AIC: 18284
##
## Number of Fisher Scoring iterations: 6
## The median wait time across all scenarios was 8 business days, with an interquartile range (IQR) of 0 to 26 days. Specifically, the median wait time was 9 days (IQR: 0 to 26) for 'Prior trip to ED and was found to have a 6 cm TOA', 9 days (IQR: 1 to 22) for 'Positive pregnancy test after a tubal ligation', 2 days (IQR: 0 to 20) for 'Acute cystitis', and 12 days (IQR: 1 to 34) for 'Recurrent/Treatment resistant vaginitis'. The p-value for the difference between 'Positive pregnancy test after a tubal ligation' and 'Prior trip to ED and was found to have a 6 cm TOA' scenarios was <0.01, for 'Acute cystitis' and 'Prior trip to ED and was found to have a 6 cm TOA', it was <0.01, and for 'Recurrent/Treatment resistant vaginitis' and 'Prior trip to ED and was found to have a 6 cm TOA', it was <0.01.
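The coefficients in the scenario model are on the log scale, so the expected wait for each scenario is the exponential of the intercept plus that scenario's coefficient. The Python sketch below applies this to the printed estimates, under the assumption that 'Prior trip to ED and was found to have a 6 cm TOA' is the reference level (it is the scenario without a coefficient row). The variable names are hypothetical.

```python
import math

intercept = 2.95360  # log expected days for the reference scenario (6 cm TOA)
offsets = {
    "Prior trip to ED and was found to have a 6 cm TOA": 0.0,
    "Positive pregnancy test after a tubal ligation": -0.11759,
    "Acute cystitis": -0.35908,
    "Recurrent/Treatment resistant vaginitis": 0.13955,
}

# Expected business days per scenario: exp(intercept + scenario coefficient).
expected_days = {name: math.exp(intercept + b) for name, b in offsets.items()}
# Incidence rate ratio of each scenario relative to the reference scenario.
rate_ratios = {name: math.exp(b) for name, b in offsets.items()}

for name in offsets:
    print(f"{name}: {expected_days[name]:.1f} days, IRR {rate_ratios[name]:.2f}")
```

With a single categorical predictor these fitted means coincide with the estimated marginal means logged earlier for the scenario plot (19.2, 17.0, 13.4 and 22.0 days).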
\[ \begin{align*} P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\ \sqrt{\lambda} &= \beta_0 \\ & + \beta_1 \cdot \underline{\mathbf{\large{\text{Patient Insurance}}}} \\ & + ( 1 | \text{Physician NPI}) \end{align*} \]
insurance | Median_business_days_until_appointment | Q1 | Q3 |
---|---|---|---|
Blue Cross/Blue Shield | 9.0 | 0 | 26 |
Medicaid | 7.5 | 0 | 26 |
## Medicaid patients experienced a 0.87% shorter wait for a new patient appointment compared to patients with BCBS (Incidence Rate Ratio: 0.991; 95% CI: 0.95 - 1.03; p = 0.66), a difference that was not statistically significant, with median wait times of 7.5 business days (IQR: 25th percentile 0 - 75th percentile 26) and 9 business days (IQR: 25th percentile 0 - 75th percentile 26), respectively.
## Of the total 1140 phone calls made, 871 (76%) successfully reached a representative, while 269 calls (24%) did not yield a connection even after two attempts. For the unsuccessful connections, 73 (27%) were redirected to voicemail, 138 (51%) listed an incorrect telephone number, and 58 (22%) reached a busy signal. For successful connections, the reasons for exclusion were as follows: 39 (4%) required a prior referral, 63 (7%) reported that they were not currently accepting new patients, and 179 physician offices (21%) put the caller on hold for more than five minutes.
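The call-flow counts above can be checked for internal consistency with a few lines of Python. This is only an arithmetic sketch over the numbers in the sentence. The helper `share` and the dictionary keys are hypothetical labels.

```python
total_calls = 1140
connected = 871
not_connected = 269

# Reasons among calls that never reached a representative.
unsuccessful = {"voicemail": 73, "incorrect number": 138, "busy signal": 58}


def share(n, total):
    """Percentage of total, rounded to the nearest whole number."""
    return round(100 * n / total)


connected_pct = share(connected, total_calls)
unsuccessful_pct = {reason: share(n, not_connected) for reason, n in unsuccessful.items()}

print(connected_pct, unsuccessful_pct)
```

The connected and unconnected calls sum to the 1140 total, and the three unsuccessful categories sum to 269.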
Graph each variable
## Plots saved to: output/density_plot_20241111_203058.tiff and output/density_plot_20241111_203058.png
Variable | Blue Cross/Blue Shield (N=568) | Medicaid (N=15) | Total (N=583) | p value |
---|---|---|---|---|
Age (years) | | | | 0.06 |
- Less than 50 years old | 215 (39.2%) | 4 (26.7%) | 219 (38.9%) | |
- 50 to 55 years old | 88 (16.1%) | 4 (26.7%) | 92 (16.3%) | |
- 56 to 60 years old | 88 (16.1%) | 0 (0.0%) | 88 (15.6%) | |
- 61 to 65 years old | 69 (12.6%) | 5 (33.3%) | 74 (13.1%) | |
- Greater than 65 years old | 88 (16.1%) | 2 (13.3%) | 90 (16.0%) | |
Gender | | | | 0.43 |
- Female | 360 (63.4%) | 8 (53.3%) | 368 (63.1%) | |
- Male | 208 (36.6%) | 7 (46.7%) | 215 (36.9%) | |
Medical School Training | | | | 0.92 |
- Allopathic training | 522 (93.5%) | 13 (92.9%) | 535 (93.5%) | |
- Osteopathic training | 36 (6.5%) | 1 (7.1%) | 37 (6.5%) | |
Medical School Location | | | | 0.40 |
- US Senior Medical Student | 414 (81.8%) | 11 (73.3%) | 425 (81.6%) | |
- International Medical Graduate | 92 (18.2%) | 4 (26.7%) | 96 (18.4%) | |
Academic Affiliation | | | | 0.20 |
- Private Practice | 511 (90.0%) | 15 (100.0%) | 526 (90.2%) | |
- University | 57 (10.0%) | 0 (0.0%) | 57 (9.8%) | |
Rurality | | | | 0.60 |
- Metropolitan area | 515 (90.7%) | 13 (86.7%) | 528 (90.6%) | |
- Rural area | 53 (9.3%) | 2 (13.3%) | 55 (9.4%) | |
Number of Phone Transfers | | | | 0.60 |
- No transfers | 358 (63.5%) | 11 (78.6%) | 369 (63.8%) | |
- One transfer | 158 (28.0%) | 3 (21.4%) | 161 (27.9%) | |
- Two transfers | 37 (6.6%) | 0 (0.0%) | 37 (6.4%) | |
- More than two transfers | 11 (2.0%) | 0 (0.0%) | 11 (1.9%) | |
age_category | | | | 0.20 |
- N-Miss | 20 | 0 | 20 | |
- Less than 40 years old | 76 (13.9%) | 2 (13.3%) | 78 (13.9%) | |
- 40. to 49 years old | 139 (25.4%) | 2 (13.3%) | 141 (25.0%) | |
- 50 to 59 years old | 176 (32.1%) | 4 (26.7%) | 180 (32.0%) | |
- 60 to 69 years old | 121 (22.1%) | 5 (33.3%) | 126 (22.4%) | |
- 70 years and greater | 36 (6.6%) | 2 (13.3%) | 38 (6.7%) | |
American College of OBGYNs Districts | | | | 0.28 |
- District I | 35 (6.2%) | 2 (13.3%) | 37 (6.3%) | |
- District II | 33 (5.8%) | 0 (0.0%) | 33 (5.7%) | |
- District III | 38 (6.7%) | 1 (6.7%) | 39 (6.7%) | |
- District IV | 65 (11.4%) | 3 (20.0%) | 68 (11.7%) | |
- District V | 55 (9.7%) | 2 (13.3%) | 57 (9.8%) | |
- District VI | 65 (11.4%) | 0 (0.0%) | 65 (11.1%) | |
- District VII | 96 (16.9%) | 1 (6.7%) | 97 (16.6%) | |
- District VIII | 95 (16.7%) | 4 (26.7%) | 99 (17.0%) | |
- District IX | 31 (5.5%) | 0 (0.0%) | 31 (5.3%) | |
- District XI | 20 (3.5%) | 2 (13.3%) | 22 (3.8%) | |
- District XII | 35 (6.2%) | 0 (0.0%) | 35 (6.0%) | |
scenario | | | | 0.01 |
- Acute cystitis | 137 (24.1%) | 9 (60.0%) | 146 (25.0%) | |
- Positive pregnancy test after a tubal ligation | 143 (25.2%) | 2 (13.3%) | 145 (24.9%) | |
- Prior trip to ED and was found to have a 6 cm TOA | 143 (25.2%) | 3 (20.0%) | 146 (25.0%) | |
- Recurrent/Treatment resistant vaginitis | 145 (25.5%) | 1 (6.7%) | 146 (25.0%) | |
scenario_type | | | | 0.19 |
- Emergent | 286 (50.4%) | 5 (33.3%) | 291 (49.9%) | |
- Urgent | 282 (49.6%) | 10 (66.7%) | 292 (50.1%) | |
told_to_go_to_the_emergency_department | | | | 0.10 |
- Yes | 91 (16.1%) | 0 (0.0%) | 91 (15.7%) | |
- No | 473 (83.9%) | 14 (100.0%) | 487 (84.3%) | |
Central scheduling | | | | 0.21 |
- Yes, central scheduling number | 202 (35.6%) | 3 (20.0%) | 205 (35.2%) | |
- No | 366 (64.4%) | 12 (80.0%) | 378 (64.8%) | |
call_time_minutes | | | | 0.01 |
- n | 451 | 9 | 460 | |
- Median (Q1, Q3) | 2.0 (1.0, 3.5) | 0.9 (0.8, 1.7) | 2.0 (1.0, 3.5) | |
hold_time_minutes | | | | 0.12 |
- n | 423 | 8 | 431 | |
- Median (Q1, Q3) | 0.4 (0.0, 1.8) | 0.0 (0.0, 0.4) | 0.4 (0.0, 1.7) | |
Day of the week Called | | | | 0.76 |
- N-Miss | 1 | 0 | 1 | |
- Thursday | 133 (23.5%) | 3 (20.0%) | 136 (23.4%) | |
- Tuesday | 114 (20.1%) | 3 (20.0%) | 117 (20.1%) | |
- Wednesday | 174 (30.7%) | 4 (26.7%) | 178 (30.6%) | |
- Friday | 84 (14.8%) | 4 (26.7%) | 88 (15.1%) | |
- Monday | 61 (10.8%) | 1 (6.7%) | 62 (10.7%) | |
- Saturday | 1 (0.2%) | 0 (0.0%) | 1 (0.2%) | |
business_days_until_appointment | | | | 0.57 |
- n | 333 | 3 | 336 | |
- Median (Q1, Q3) | 14.0 (2.0, 32.0) | 13.0 (12.5, 28.0) | 13.5 (2.0, 32.0) | |
Waiting time in days (log scale) for Blue Cross/Blue Shield versus Medicaid. The scatter plot shows the insurance variable on the x-axis and the days variable on the y-axis, with lines connecting points that share the same npi value.
## Plots saved to: Melanie/Figures/urgent_GYN_vs_insurance_20241111_203102.tiff and Melanie/Figures/urgent_GYN_vs_insurance_20241111_203102.png
Here we show a scatterplot that compares the Private and Medicaid times. Notice that the graph is in logarithmic scale. Points above the diagonal line are providers for whom the Medicaid waiting time was longer than the private insurance waiting time.
We also see a strong linear association, indicating that providers with longer waiting time for private insurance tend to also have longer waiting times for Medicaid.
## Plots saved to: Melanie/Figures/urgent_gyn_vs_insurance_none_20241111_203103.tiff and Melanie/Figures/urgent_gyn_vs_insurance_none_20241111_203103.png
## Plots saved to: Melanie/Figures/urgent_GYN_vs_insurance_density_20241111_203104.tiff and Melanie/Figures/urgent_GYN_vs_insurance_density_20241111_203104.png
Waiting time in days (log scale) for Blue Cross/Blue Shield versus Medicaid, by scenario. The scatter plot shows the scenario variable on the x-axis and the days variable on the y-axis, with lines connecting points that share the same NPI value.
## Plots saved to: Melanie/Figures/urgent_GYN_vs_scenario_20241111_203105.tiff and Melanie/Figures/urgent_GYN_vs_scenario_20241111_203105.png
Here we show a scatterplot that compares the waiting times across the scenarios. Notice that the graph is in logarithmic scale.
## Plots saved to: Melanie/Figures/urgent_GYN_vs_scenario_none_20241111_203106.tiff and Melanie/Figures/urgent_GYN_vs_scenario_none_20241111_203106.png
A density plot is a smoothed version of a histogram that shows the distribution of a continuous variable. It represents the relative frequency of data points in different ranges of values, with areas under the curve corresponding to proportions of the data.
How to Read the Density Plot:
1. Shape of the Distribution:
   - The shape of each curve tells you about the distribution of waiting times within each insurance group.
   - A peak indicates the most common waiting times for that group.
   - A wider curve indicates a more spread-out distribution, meaning the waiting times vary more within that group.
   - A narrower curve indicates that waiting times are more concentrated around the peak.
## Plots saved to: Melanie/Figures/urgent_GYN_vs_scenario_density_20241111_203106.tiff and Melanie/Figures/urgent_GYN_vs_scenario_density_20241111_203106.png
Consider the following scenario:
When fitting a regression model with waiting time as the dependent variable and insurance type as one of the predictors (along with other factors like age and medical condition), the EMMs would represent the average waiting time for each insurance type, adjusted for the effects of age and medical condition. This adjustment helps isolate the effect of insurance type on waiting time, ensuring the comparison between insurance types is fair.
Interpretation: In the plot of Estimated Marginal Means, the value for each scenario represents the average predicted waiting time for an appointment, adjusted for other factors in the model. This gives a clearer, model-based comparison of the expected waiting times across different medical scenarios, taking into account variability in other factors.
This image is a plot of Estimated Marginal Means (also known as least-squares means) for different scenarios. Each point represents the estimated marginal mean waiting time (in days) for a different medical scenario, and the error bars represent the 95% confidence intervals (CI) around these estimates.
Here’s a breakdown of the different components of the plot:
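Y-axis: The estimated marginal mean waiting time, in days.
X-axis: The x-axis labels are rotated for readability, showing the different medical conditions (scenarios) being compared.
Estimated Marginal Means (Points on the Plot): Each point is the model-based mean waiting time for one scenario.
Confidence Intervals (Error Bars): Each error bar spans the 95% confidence interval around the estimate.
Interpretation of the Estimated Marginal Means: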
A simple rule of thumb is that if error bars for 95% confidence intervals overlap by less than about half the length of one error bar, the difference between the two groups might still be statistically significant. If the error bars overlap considerably, it’s more likely (but not guaranteed) that the difference between the groups is not statistically significant.
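The rule of thumb can be sketched in Python as an overlap ratio: the length of the overlap between two intervals divided by the average arm (half-width) of the two error bars. The function name `overlap_ratio` is hypothetical. Treating "one error bar" as the average half-width is an assumption, and it is only approximate for the back-transformed, asymmetric intervals in this report. The intervals themselves are copied from the estimated marginal means output.

```python
def overlap_ratio(ci_a, ci_b):
    """Overlap of two intervals divided by their average arm (half-width)."""
    (low_a, high_a), (low_b, high_b) = ci_a, ci_b
    overlap = max(0.0, min(high_a, high_b) - max(low_a, low_b))
    average_arm = ((high_a - low_a) + (high_b - low_b)) / 4
    return overlap / average_arm


# 95% intervals from the scenario-only model (business days).
toa = (18.5, 19.9)
cystitis = (12.8, 14.0)
scenario_ratio = overlap_ratio(toa, cystitis)

# 95% intervals for the TOA scenario by insurance, from the interaction output.
toa_bcbs = (3.188173, 6.982870)
toa_medicaid = (4.342760, 9.546902)
insurance_ratio = overlap_ratio(toa_bcbs, toa_medicaid)

print(f"TOA vs. acute cystitis overlap ratio: {scenario_ratio:.2f}")
print(f"TOA, BCBS vs. Medicaid overlap ratio: {insurance_ratio:.2f}")
```

A ratio below about 0.5 is the case the rule flags as possibly significant, and a larger ratio corresponds to considerable overlap. The ratio is a visual heuristic and does not replace the model's own tests.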
## Extracted interaction data:
## scenario insurance rate SE df asymp.LCL
## TOA Blue Cross/Blue Shield 4.718326 0.9436969 Inf 3.188173
## Pregnancy after tubal Blue Cross/Blue Shield 6.408744 1.2848820 Inf 4.326298
## UTI Blue Cross/Blue Shield 3.459394 0.7314328 Inf 2.285743
## Vaginitis Blue Cross/Blue Shield 8.426966 1.6220978 Inf 5.778623
## TOA Medicaid 6.438937 1.2938995 Inf 4.342760
## Pregnancy after tubal Medicaid 5.607467 1.1338873 Inf 3.772637
## UTI Medicaid 3.114619 0.6627911 Inf 2.052433
## Vaginitis Medicaid 6.909619 1.3393328 Inf 4.725639
## asymp.UCL
## 6.982870
## 9.493566
## 5.235677
## 12.289045
## 9.546902
## 8.334670
## 4.726511
## 10.102936
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Scenario: TOA
## Filtered data for scenario:
## scenario insurance rate SE df asymp.LCL asymp.UCL
## TOA Blue Cross/Blue Shield 4.718326 0.9436969 Inf 3.188173 6.982870
## TOA Medicaid 6.438937 1.2938995 Inf 4.342760 9.546902
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Blue Cross/Blue Shield data:
## scenario insurance rate SE df asymp.LCL asymp.UCL
## TOA Blue Cross/Blue Shield 4.718326 0.9436969 Inf 3.188173 6.98287
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Medicaid data:
## scenario insurance rate SE df asymp.LCL asymp.UCL
## TOA Medicaid 6.438937 1.293899 Inf 4.34276 9.546902
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Interaction p-value for scenario TOA : NA
## Wait times for Medicaid are longer compared to Blue Cross/Blue Shield.
##
## Scenario: Pregnancy after tubal
## Filtered data for scenario:
## scenario insurance rate SE df asymp.LCL
## Pregnancy after tubal Blue Cross/Blue Shield 6.408744 1.284882 Inf 4.326298
## Pregnancy after tubal Medicaid 5.607467 1.133887 Inf 3.772637
## asymp.UCL
## 9.493566
## 8.334670
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Blue Cross/Blue Shield data:
## scenario insurance rate SE df asymp.LCL
## Pregnancy after tubal Blue Cross/Blue Shield 6.408744 1.284882 Inf 4.326298
## asymp.UCL
## 9.493566
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Medicaid data:
## scenario insurance rate SE df asymp.LCL asymp.UCL
## Pregnancy after tubal Medicaid 5.607467 1.133887 Inf 3.772637 8.33467
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Interaction p-value for scenario Pregnancy after tubal : <0.01
## Wait times for Medicaid are shorter compared to Blue Cross/Blue Shield.
##
## Scenario: UTI
## Filtered data for scenario:
## scenario insurance rate SE df asymp.LCL asymp.UCL
## UTI Blue Cross/Blue Shield 3.459394 0.7314328 Inf 2.285743 5.235677
## UTI Medicaid 3.114619 0.6627911 Inf 2.052433 4.726511
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Blue Cross/Blue Shield data:
## scenario insurance rate SE df asymp.LCL asymp.UCL
## UTI Blue Cross/Blue Shield 3.459394 0.7314328 Inf 2.285743 5.235677
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Medicaid data:
## scenario insurance rate SE df asymp.LCL asymp.UCL
## UTI Medicaid 3.114619 0.6627911 Inf 2.052433 4.726511
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Interaction p-value for scenario UTI : <0.01
## Wait times for Medicaid are shorter compared to Blue Cross/Blue Shield.
##
## Scenario: Vaginitis
## Filtered data for scenario:
## scenario insurance rate SE df asymp.LCL asymp.UCL
## Vaginitis Blue Cross/Blue Shield 8.426966 1.622098 Inf 5.778623 12.28904
## Vaginitis Medicaid 6.909619 1.339333 Inf 4.725639 10.10294
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Blue Cross/Blue Shield data:
## scenario insurance rate SE df asymp.LCL asymp.UCL
## Vaginitis Blue Cross/Blue Shield 8.426966 1.622098 Inf 5.778623 12.28904
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Medicaid data:
## scenario insurance rate SE df asymp.LCL asymp.UCL
## Vaginitis Medicaid 6.909619 1.339333 Inf 4.725639 10.10294
##
## Confidence level used: 0.95
## Intervals are back-transformed from the log scale
##
## Interaction p-value for scenario Vaginitis : <0.01
## Wait times for Medicaid are shorter compared to Blue Cross/Blue Shield.
##
## Generated sentences:
## TOA: Patients with Blue Cross/Blue Shield insurance wait 4.7 days, with a 95% confidence interval (CI) ranging from 3.2 to 7.0 days. Medicaid recipients in this scenario experience longer waits, at 6.4 days with a CI of 4.3 to 9.5 days (p-value = NA).
##
## Pregnancy after tubal: Patients with Blue Cross/Blue Shield insurance wait 6.4 days, with a 95% confidence interval (CI) ranging from 4.3 to 9.5 days. Medicaid recipients in this scenario experience shorter waits, at 5.6 days with a CI of 3.8 to 8.3 days (p-value = <0.01).
##
## UTI: Patients with Blue Cross/Blue Shield insurance wait 3.5 days, with a 95% confidence interval (CI) ranging from 2.3 to 5.2 days. Medicaid recipients in this scenario experience shorter waits, at 3.1 days with a CI of 2.1 to 4.7 days (p-value = <0.01).
##
## Vaginitis: Patients with Blue Cross/Blue Shield insurance wait 8.4 days, with a 95% confidence interval (CI) ranging from 5.8 to 12.3 days. Medicaid recipients in this scenario experience shorter waits, at 6.9 days with a CI of 4.7 to 10.1 days (p-value = <0.01).
Poisson Model
The models need to handle NA values in the business_days_until_appointment outcome variable (558) as well as data that are not normally distributed. business_days_until_appointment can be transformed with a square root function, which keeps zero-day waits defined (sqrt(0) = 0), whereas log(0) is negative infinity.
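The point about zeros can be checked directly. This is a minimal sketch: `wait_days` is made-up illustrative data, not values from the study.

```r
# Hypothetical wait times in business days; 0 represents a same-day visit.
wait_days <- c(0, 1, 4, 9, 25)

log_days  <- log(wait_days)   # log(0) evaluates to -Inf
sqrt_days <- sqrt(wait_days)  # sqrt(0) evaluates to 0, so every value stays finite

print(data.frame(wait_days, log_days, sqrt_days))
```

Only the log column contains a non-finite entry, which is why a zero-day wait needs special handling before a log transform.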
poisson_full_model
\[
\begin{align*}
P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\
\sqrt{\lambda} &= \beta_0 \\
& + \beta_1 \cdot \text{Patient Insurance} \\
& + \beta_2 \cdot \text{US Census Bureau Subdivision} \\
& + \beta_3 \cdot \text{Physician Academic Affiliation} \\
& + \beta_4 \cdot \text{Physician Age} \\
& + \beta_5 \cdot \text{Physician Gender} \\
& + \beta_6 \cdot \text{Physician Honorific} \\
& + \beta_7 \cdot \text{Physician US Census Bureau} \\
& + \beta_8 \cdot \text{UTI, TOA, Vaginitis, Ectopic Scenario} \\
& + \beta_9 \cdot \text{Date that the call was made} \\
& + \beta_{10} \cdot \text{Appointment Central Number} \\
& + \beta_{11} \cdot \text{Number of Phone Transfers} \\
& + \beta_{12} \cdot \text{Minutes on the phone} \\
& + \beta_{13} \cdot \text{Minutes on hold} \\
& + \beta_{14} \cdot \text{Rurality} \\
& + (1 \mid \text{Physician NPI})
\end{align*}
\]
poisson_full_model
What variables are significant in poisson_full_model?
\[
\begin{align*}
\log(\lambda) &= \beta_0 \\
& + \beta_1 \cdot \text{Individual Predictor} \\
& + (1 \mid \text{Physician NPI})
\end{align*}
\]
This analysis explores the significance of various predictors on the outcome variable business_days_until_appointment, accounting for the random effects associated with physicians. The goal is to identify which variables significantly influence the time to appointment while controlling for variability across individual physicians.
The step-by-step approach demonstrates how individual predictors are assessed for their significance in influencing the response variable while accounting for the random effects associated with repeated measures on physicians. Significant variables will be used in the final multivariate model to better understand their impact on appointment wait times.
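The one-predictor-at-a-time screening can be sketched as follows. This is an illustration under stated assumptions, not the code used for the report: the data frame `toy` is simulated, the object names `toy` and `screen` are hypothetical, and a plain `glm` is used so the example runs in base R, which means the random intercept `(1 | NPI)` of the actual models is omitted.

```r
set.seed(7)
n <- 300

# Simulated stand-in data (hypothetical values, not the study data).
toy <- data.frame(
  hold_time_minutes = rexp(n, rate = 1 / 3),
  age               = rnorm(n, mean = 50, sd = 10),
  gender            = factor(sample(c("Female", "Male"), n, replace = TRUE))
)
toy$business_days_until_appointment <-
  rpois(n, lambda = exp(1.5 + 0.05 * toy$hold_time_minutes))

predictors <- c("hold_time_minutes", "age", "gender")

# Fit one Poisson model per predictor and keep its IRR and p-value.
screen <- do.call(rbind, lapply(predictors, function(p) {
  fit <- glm(reformulate(p, response = "business_days_until_appointment"),
             family = poisson, data = toy)
  co <- summary(fit)$coefficients[2, ]  # row 2 is the predictor (row 1 is the intercept)
  data.frame(Predictor = p,
             IRR       = exp(co[["Estimate"]]),
             P_Value   = co[["Pr(>|z|)"]])
}))

# Sort so the strongest candidates for the multivariable model come first.
screen <- screen[order(screen$P_Value), ]
print(screen)
```

In the mixed-model setting the same loop would wrap a mixed-effects fit with the physician random intercept instead of `glm`.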
## Predictor P_Value IRR
## 1 gender 0.0004052041 0.0001794285
## 2 academic 0.0022440459 578923.1830971744
## 3 hold_time_minutes 0.0403542942 5.1010672992
## 4 age 0.0631738778 0.8078885085
## 5 Medicaid_to_Medicare_Fee_Index 0.1296461022 0.8919231411
## 6 Med_sch 0.1789727708 0.0121337674
## 7 Grd_yr_category 0.1855833254 122.8246619928
## CI_Lower CI_Upper Wait_Time_Effect
## 1 0.000001577188 0.02041266 shorter wait time
## 2 123.402476751912 2715926460.70752954 longer wait time
## 3 1.078788441501 24.12047311 longer wait time
## 4 0.645570901010 1.01101806 shorter wait time
## 5 0.769558824876 1.03374409 shorter wait time
## 6 0.000019773598 7.44570167 shorter wait time
## 7 0.100535232272 150055.82871536 longer wait time
## Predictor P_Value IRR CI_Lower CI_Upper
## 1 gender <0.01 0.00 0.00 0.02
## 2 academic <0.01 578923.18 123.40 2715926460.71
## 3 hold_time_minutes 0.040 5.10 1.08 24.12
## 4 age 0.063 0.81 0.65 1.01
## 5 Medicaid_to_Medicare_Fee_Index 0.130 0.89 0.77 1.03
## 6 Med_sch 0.179 0.01 0.00 7.45
## 7 Grd_yr_category 0.186 122.82 0.10 150055.83
## Wait_Time_Effect
## 1 shorter wait time
## 2 longer wait time
## 3 longer wait time
## 4 shorter wait time
## 5 shorter wait time
## 6 shorter wait time
## 7 longer wait time
Predictor | P_Value | IRR | CI_Lower | CI_Upper | Wait_Time_Effect |
---|---|---|---|---|---|
gender | <0.01 | 0.00 | 0.00 | 0.02 | shorter wait time |
academic | <0.01 | 578923.18 | 123.40 | 2715926460.71 | longer wait time |
hold_time_minutes | 0.040 | 5.10 | 1.08 | 24.12 | longer wait time |
age | 0.063 | 0.81 | 0.65 | 1.01 | shorter wait time |
Medicaid_to_Medicare_Fee_Index | 0.130 | 0.89 | 0.77 | 1.03 | shorter wait time |
Med_sch | 0.179 | 0.01 | 0.00 | 7.45 | shorter wait time |
Grd_yr_category | 0.186 | 122.82 | 0.10 | 150055.83 | longer wait time |
academic
The analysis and boxplot help explain the very high IRR for academic.
Key Insights:
1. Sample Imbalance: There is a major imbalance in the number of observations between Private Practice (556 cases) and University (47 cases). This discrepancy could lead to inflated coefficients, especially if the smaller group (University) has greater variability in wait times, and it could explain why the estimate for academicUniversity is so large and significant.
Recommendations to Address the IRR Issue:
poisson_full_model by removing academic
## Predictor P_Value IRR CI_Lower
## 1 gender 0.0004052041 0.0001794285 0.000001577188
## 2 hold_time_minutes 0.0403542942 5.1010672992 1.078788441501
## 3 age 0.0631738778 0.8078885085 0.645570901010
## 4 Medicaid_to_Medicare_Fee_Index 0.1296461022 0.8919231411 0.769558824876
## 5 Med_sch 0.1789727708 0.0121337674 0.000019773598
## 6 Grd_yr_category 0.1855833254 122.8246619928 0.100535232272
## CI_Upper Wait_Time_Effect
## 1 0.02041266 shorter wait time
## 2 24.12047311 longer wait time
## 3 1.01101806 shorter wait time
## 4 1.03374409 shorter wait time
## 5 7.44570167 shorter wait time
## 6 150055.82871536 longer wait time
## Predictor P_Value IRR CI_Lower CI_Upper
## 1 gender <0.01 0.00 0.00 0.02
## 2 hold_time_minutes 0.040 5.10 1.08 24.12
## 3 age 0.063 0.81 0.65 1.01
## 4 Medicaid_to_Medicare_Fee_Index 0.130 0.89 0.77 1.03
## 5 Med_sch 0.179 0.01 0.00 7.45
## 6 Grd_yr_category 0.186 122.82 0.10 150055.83
## Wait_Time_Effect
## 1 shorter wait time
## 2 longer wait time
## 3 shorter wait time
## 4 shorter wait time
## 5 shorter wait time
## 6 longer wait time
Predictor | P_Value | IRR | CI_Lower | CI_Upper | Wait_Time_Effect |
---|---|---|---|---|---|
gender | <0.01 | 0.00 | 0.00 | 0.02 | shorter wait time |
hold_time_minutes | 0.040 | 5.10 | 1.08 | 24.12 | longer wait time |
age | 0.063 | 0.81 | 0.65 | 1.01 | shorter wait time |
Medicaid_to_Medicare_Fee_Index | 0.130 | 0.89 | 0.77 | 1.03 | shorter wait time |
Med_sch | 0.179 | 0.01 | 0.00 | 7.45 | shorter wait time |
Grd_yr_category | 0.186 | 122.82 | 0.10 | 150055.83 | longer wait time |
log_business_days_until_appointments with academic
##
## Private Practice University
## 537 45
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: formula_simple
## Data: df3_filtered
##
## AIC BIC logLik deviance df.resid
## 5388.4 5405.9 -2690.2 5380.4 578
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.3249 -0.4584 -0.2239 0.2451 6.3292
##
## Random effects:
## Groups Name Variance Std.Dev.
## NPI (Intercept) 296.6 17.22
## Residual 354.8 18.84
## Number of obs: 582, groups: NPI, 401
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 16.851 1.230 348.861 13.701 < 0.0000000000000002 ***
## academicUniversity 13.269 4.313 385.358 3.076 0.00224 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## acdmcUnvrst -0.285
## Robust linear mixed model fit by DAStau
## Formula: formula_simple
## Data: df3_filtered
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.0400 -0.7605 -0.2985 0.7410 13.3397
##
## Random effects:
## Groups Name Variance Std.Dev.
## NPI (Intercept) 0.0 0.00
## Residual 299.8 17.32
## Number of obs: 582, groups: NPI, 401
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 13.1685 0.7664 17.182
## academicUniversity 4.8409 2.7562 1.756
##
## Correlation of Fixed Effects:
## (Intr)
## acdmcUnvrst -0.278
##
## Robustness weights for the residuals:
## 483 weights are ~= 1. The remaining 99 ones are summarized as
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.101 0.389 0.570 0.583 0.754 0.995
##
## Robustness weights for the random effects:
## All 401 weights are ~= 1.
##
## Rho functions used for fitting:
## Residuals:
## eff: smoothed Huber (k = 1.345, s = 10)
## sig: smoothed Huber, Proposal 2 (k = 1.345, s = 10)
## Random Effects, variance component 1 (NPI):
## eff: smoothed Huber (k = 1.345, s = 10)
## vcp: smoothed Huber, Proposal 2 (k = 1.345, s = 10)
Robust LMM with log_business_days_until_appointments
\[
\begin{align*}
P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\
\sqrt{\lambda} &= \beta_0 \\
& + \beta_1 \cdot \underline{\mathbf{\large{\text{SINGLE PREDICTOR}}}} \\
& + (1 \mid \text{Physician NPI})
\end{align*}
\]
## The following predictors were found to be significant predicting business days until new patient appointment:
## - gender : p = <0.01
## - hold_time_minutes : p = 0.04
## - age : p = 0.06
## - Medicaid_to_Medicare_Fee_Index : p = 0.13
## - Med_sch : p = 0.18
## - Grd_yr_category : p = 0.19
poisson_significant
Formula with only significant variables
\[
\begin{align*}
P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\
\sqrt{\lambda} &= \beta_0 \\
& + \beta_1 \cdot \text{Patient Insurance} \\
& + \beta_2 \cdot \text{US Census Bureau Subdivision} \\
& + \beta_3 \cdot \text{Physician Academic Affiliation} \\
& + (1 \mid \text{Physician Name})
\end{align*}
\]
where:
Fixed effects include…
Random effects account for variability between physicians, modeled as a random intercept.
The random effect for physician suggests that there is substantial variability in appointment wait times between physicians. Physicians with a higher random intercept will tend to have longer wait times than physicians with a lower random intercept.
poisson
Model with only significant variables
## Generalized linear mixed model fit by maximum likelihood (Adaptive
## Gauss-Hermite Quadrature, nAGQ = 0) [glmerMod]
## Family: poisson ( log )
## Formula: business_days_until_appointment ~ gender + hold_time_minutes +
## age + Medicaid_to_Medicare_Fee_Index + Med_sch + Grd_yr_category +
## (1 | NPI)
## Data: df3
##
## AIC BIC logLik deviance df.resid
## 4535.9 4576.7 -2258.0 4515.9 425
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -6.5839 -0.7957 0.0010 0.0884 7.0372
##
## Random effects:
## Groups Name Variance Std.Dev.
## NPI (Intercept) 3.613 1.901
## Number of obs: 435, groups: NPI, 321
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.991991 1.584621 1.888 0.059 .
## genderMale -0.362270 0.253352 -1.430 0.153
## hold_time_minutes -0.026211 0.015268 -1.717 0.086 .
## age -0.011317 0.022985 -0.492 0.622
## Medicaid_to_Medicare_Fee_Index -0.001761 0.007185 -0.245 0.806
## Med_schInternational Medical Graduate -0.174718 0.300588 -0.581 0.561
## Grd_yr_category1990 to 1999 0.021450 0.387624 0.055 0.956
## Grd_yr_category2000 to 2009 -0.274714 0.525830 -0.522 0.601
## Grd_yr_category2010 or greater -0.569103 0.711369 -0.800 0.424
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) gndrMl hld_t_ age M__M_F Md_IMG G__1t1 G__2t2
## genderMale 0.122
## hld_tm_mnts -0.032 0.006
## age -0.926 -0.256 0.019
## Mdcd__M_F_I -0.306 0.051 0.009 -0.018
## Md_schIntMG -0.088 -0.080 0.014 0.021 0.114
## G__1990t199 -0.644 0.021 0.022 0.550 0.023 0.087
## G__2000t200 -0.800 0.019 0.010 0.739 0.033 0.039 0.760
## Grd__2010og -0.866 -0.068 0.017 0.851 0.005 -0.021 0.723 0.833
poisson_significant Model Coefficients
Generic Interpretation of Significant Predictors: In a Poisson regression, significant predictors are those with p-values less than a chosen threshold (usually p < 0.05). These predictors have a statistically significant effect on the outcome variable, in this case business days until an appointment. The Incidence Rate Ratios (IRRs) help interpret the direction and magnitude of these effects: an IRR above 1 indicates a longer expected wait, and an IRR below 1 indicates a shorter expected wait.
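An IRR is the exponentiated model coefficient, and a Wald confidence interval is exponentiated the same way. The sketch below reproduces this for one row, using the genderMale estimate and standard error copied from the fixed-effects output of the model summary; the variable names are illustrative.

```r
# Values copied from the genderMale row of the fixed-effects output.
estimate  <- -0.362270
std_error <- 0.253352

z <- qnorm(0.975)  # about 1.96 for a 95% Wald interval

irr      <- exp(estimate)                  # coefficient on the log scale -> rate ratio
ci_lower <- exp(estimate - z * std_error)
ci_upper <- exp(estimate + z * std_error)

round(c(IRR = irr, CI_lower = ci_lower, CI_upper = ci_upper), 2)
```

The rounded values (0.70, 0.42, 1.14) agree with the gender [Male] row of the coefficient table, and because the interval contains 1 the effect is not statistically significant.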
Analysis Based on Current Results
Examples of Significant Predictors:
Non-Significant Predictors:
1. Gender (Male) (IRR = 0.70, p = 0.153): Being male is associated with a 30% reduction in waiting time compared to females (IRR = 0.70), but this effect is not statistically significant (p = 0.153).
Random Effects and Marginal/Conditional R²:
Random Effects:
Marginal R² (0.017): About 1.7% of the variance in waiting time is explained by the fixed effects in the model, such as hold time and gender.
Conditional R² (0.993): When accounting for both fixed effects and random effects (NPI variability), the model explains 99.3% of the total variance in waiting times.
Outcome: business days until appointment
Predictors | Incidence Rate Ratios | CI | p |
---|---|---|---|
(Intercept) | 19.93 | 0.89 – 444.87 | 0.059 |
gender [Male] | 0.70 | 0.42 – 1.14 | 0.153 |
hold time minutes | 0.97 | 0.95 – 1.00 | 0.086 |
age | 0.99 | 0.95 – 1.03 | 0.622 |
Medicaid to Medicare Fee Index | 1.00 | 0.98 – 1.01 | 0.806 |
Med sch [International Medical Graduate] | 0.84 | 0.47 – 1.51 | 0.561 |
Grd yr category [1990 to 1999] | 1.02 | 0.48 – 2.18 | 0.956 |
Grd yr category [2000 to 2009] | 0.76 | 0.27 – 2.13 | 0.601 |
Grd yr category [2010 or greater] | 0.57 | 0.14 – 2.28 | 0.424 |
Random Effects | | | |
σ2 | 0.02 | | |
τ00 NPI | 3.61 | | |
ICC | 0.99 | | |
N NPI | 321 | | |
Observations | 435 | | |
Marginal R2 / Conditional R2 | 0.017 / 0.993 | | |
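The ICC in this table can be related to the two variance components reported with it. As a worked check, assuming the usual adjusted ICC definition (random-intercept variance over the sum of random-intercept and residual variance) and using the rounded table values:

\[
\begin{align*}
\text{ICC} &= \frac{\tau_{00}}{\tau_{00} + \sigma^2} \\
&= \frac{3.61}{3.61 + 0.02} \\
&\approx 0.99
\end{align*}
\]

Because the inputs are rounded to two decimals, the result is only approximate, but it matches the reported ICC of 0.99.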
poisson_significant model
Fixed Effects
poisson_significant Model Performance
## We fitted a poisson mixed model (estimated using ML and BOBYQA optimizer) to
## predict business_days_until_appointment with gender, hold_time_minutes, age,
## Medicaid_to_Medicare_Fee_Index, Med_sch and Grd_yr_category (formula:
## business_days_until_appointment ~ gender + hold_time_minutes + age +
## Medicaid_to_Medicare_Fee_Index + Med_sch + Grd_yr_category). The model included
## NPI as random effect (formula: ~1 | NPI). The model's total explanatory power
## is substantial (conditional R2 = 0.99) and the part related to the fixed
## effects alone (marginal R2) is of 0.02. The model's intercept, corresponding to
## gender = Female, hold_time_minutes = 0, age = 0, Medicaid_to_Medicare_Fee_Index
## = 0, Med_sch = US Senior Medical Student and Grd_yr_category = Less than 1990,
## is at 2.99 (95% CI [-0.11, 6.10], p = 0.059). Within this model:
##
## - The effect of gender [Male] is statistically non-significant and negative
## (beta = -0.36, 95% CI [-0.86, 0.13], p = 0.153; Std. beta = -0.36, 95% CI
## [-0.86, 0.13])
## - The effect of hold time minutes is statistically non-significant and negative
## (beta = -0.03, 95% CI [-0.06, 3.71e-03], p = 0.086; Std. beta = -0.04, 95% CI
## [-0.08, 5.16e-03])
## - The effect of age is statistically non-significant and negative (beta =
## -0.01, 95% CI [-0.06, 0.03], p = 0.622; Std. beta = -0.12, 95% CI [-0.60,
## 0.36])
## - The effect of Medicaid to Medicare Fee Index is statistically non-significant
## and negative (beta = -1.76e-03, 95% CI [-0.02, 0.01], p = 0.806; Std. beta =
## -0.03, 95% CI [-0.25, 0.20])
## - The effect of Med sch [International Medical Graduate] is statistically
## non-significant and negative (beta = -0.17, 95% CI [-0.76, 0.41], p = 0.561;
## Std. beta = -0.17, 95% CI [-0.76, 0.41])
## - The effect of Grd yr category [1990 to 1999] is statistically non-significant
## and positive (beta = 0.02, 95% CI [-0.74, 0.78], p = 0.956; Std. beta = 0.02,
## 95% CI [-0.74, 0.78])
## - The effect of Grd yr category [2000 to 2009] is statistically non-significant
## and negative (beta = -0.27, 95% CI [-1.31, 0.76], p = 0.601; Std. beta = -0.27,
## 95% CI [-1.31, 0.76])
## - The effect of Grd yr category [2010 or greater] is statistically
## non-significant and negative (beta = -0.57, 95% CI [-1.96, 0.83], p = 0.424;
## Std. beta = -0.57, 95% CI [-1.96, 0.83])
##
## Standardized parameters were obtained by fitting the model on a standardized
## version of the dataset. 95% Confidence Intervals (CIs) and p-values were
## computed using a Wald z-distribution approximation.
## The marginal R² value of the model is 0.017 and the conditional R² value is 0.993
## The marginal R² represents the proportion of variance explained by the fixed effects ( (Intercept), genderMale, hold_time_minutes, age, Medicaid_to_Medicare_Fee_Index, Med_schInternational Medical Graduate, Grd_yr_category1990 to 1999, Grd_yr_category2000 to 2009, Grd_yr_category2010 or greater ) alone ( 1.67 %). The conditional R² represents the proportion of variance explained by both the fixed effects and the random effects ( NPI ) combined ( 99.34 %). This indicates how much of the variability in the outcome can be attributed to the fixed effects versus the entire model, including random effects.
For the poisson_significant model: Random effects do not have p-values the way fixed effects do. Their contribution is judged from the variance components and the corresponding standard deviations: if the variance is near zero, the random effect may not be contributing much to the model. The output below lists the random-effect group, term, variance, and standard deviation for poisson_significant:
## [1] "The random effects in the model are:\n NPI"
## [2] "The random effects in the model are:\n (Intercept)"
## [3] "The random effects in the model are:\n NA"
## [4] "The random effects in the model are:\n 3.61312108307961"
## [5] "The random effects in the model are:\n 1.90082116020409"
## [6] "The random effects in the model are:\n Yes"
## The significant random effects are: NPI
simr_poisson_full_model Model Power analysis
The power analysis, run with the powerSim function, estimates the statistical power of the model to detect the effect of a specific predictor, in this case insurance, in a Poisson mixed-effects model.
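The idea behind simulation-based power can be illustrated without the mixed-model machinery. The sketch below is a simplified stand-in, not the powerSim call used here: it uses a plain Poisson `glm` with no random intercept, and the baseline rate of 5 days, the IRR of 1.2, the sample size, and the function name `simulate_once` are all hypothetical choices.

```r
set.seed(123)

# Simulate one data set, refit the model, and report whether the
# insurance effect is detected at the 5% level.
simulate_once <- function(n, irr, baseline = 5) {
  insurance <- rep(c(0, 1), length.out = n)           # 0 = reference group, 1 = comparison group
  y <- rpois(n, lambda = baseline * irr^insurance)    # comparison group has rate baseline * irr
  fit <- glm(y ~ insurance, family = poisson)
  summary(fit)$coefficients["insurance", "Pr(>|z|)"] < 0.05
}

# Power is the share of simulated data sets in which the effect is detected.
power_estimate <- mean(replicate(200, simulate_once(n = 200, irr = 1.2)))
print(power_estimate)
```

Repeating this over a range of sample sizes or effect sizes gives a power curve; a mixed-model version would simulate physician-level random intercepts as well.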
poisson_significant model assumptions
Binned residuals were checked because the outcome is skewed count data, so the residuals are not expected to be normally distributed. Collinearity and heteroscedasticity were also checked.
The residuals appear to be spread out more as the fitted values
increase. This funnel shape (with wider dispersion of residuals at
higher fitted values) is an indication of heteroscedasticity. In a model
with homoscedasticity, the residuals would have a consistent spread
across all levels of fitted values, without a clear pattern.
Because the data are not normally distributed, the binned residuals are not expected to fall within the error bounds.
poisson_significant Collinearity
Variance Inflation Factors (VIF) were calculated to assess multicollinearity among predictors. All VIF values were below the commonly used threshold of 5, suggesting that multicollinearity is not a concern for this model.
Predictor | GVIF | Df | GVIF^(1/(2*Df)) |
---|---|---|---|
gender | 1.228796 | 1 | 1.108511 |
hold_time_minutes | 1.001360 | 1 | 1.000679 |
age | 4.677638 | 1 | 2.162785 |
Medicaid_to_Medicare_Fee_Index | 1.019788 | 1 | 1.009845 |
Med_sch | 1.054505 | 1 | 1.026891 |
Grd_yr_category | 4.628542 | 3 | 1.290944 |
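The last column of the table follows directly from the first two. A minimal sketch that recomputes it from the GVIF and Df values copied from the table (the vector names are illustrative):

```r
# GVIF and degrees of freedom copied from the collinearity table.
gvif <- c(gender = 1.228796, hold_time_minutes = 1.001360, age = 4.677638,
          Medicaid_to_Medicare_Fee_Index = 1.019788, Med_sch = 1.054505,
          Grd_yr_category = 4.628542)
dof <- c(1, 1, 1, 1, 1, 3)

adjusted       <- gvif^(1 / (2 * dof))  # the GVIF^(1/(2*Df)) column
comparable_vif <- adjusted^2            # squared so it can be read against an ordinary VIF threshold

round(cbind(gvif, dof, adjusted, comparable_vif), 3)
```

For single-degree-of-freedom terms the squared adjusted value equals the GVIF itself; for the three-degree-of-freedom Grd_yr_category term the adjustment makes it comparable with the others.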
## 86 outliers detected: cases 8, 9, 15, 17, 18, 21, 25, 30, 32, 35, 36,
## 43, 54, 55, 58, 60, 63, 68, 69, 72, 73, 92, 100, 105, 113, 133, 137,
## 145, 151, 152, 154, 158, 159, 171, 180, 181, 190, 196, 197, 198, 203,
## 208, 209, 216, 218, 226, 227, 247, 251, 252, 260, 261, 262, 263, 267,
## 268, 278, 284, 285, 295, 296, 298, 301, 309, 313, 319, 321, 331, 338,
## 342, 344, 348, 355, 357, 362, 363, 365, 366, 389, 396, 399, 408, 409,
## 413, 425, 428.
## - Based on the following method and threshold: cook (0.9).
## - For variable: (Whole model).
poisson
Intraclass Correlation Coefficient
The Intraclass Correlation Coefficient (ICC) is a statistical measure used to evaluate the proportion of variance in a dependent variable that can be attributed to differences between groups or clusters. It is commonly used in the context of hierarchical or mixed models to quantify the degree of similarity within clusters.
## The intraclass correlation (ICC) of the model for the random effect group ' NPI ' is 0.783 .
## This indicates that 78.3 % of the variance in the outcome variable is attributable to differences between the NPI groups.
##
## This is considered a high ICC for the NPI group, indicating that most of the variance is due to differences between these groups.
A high Intraclass Correlation Coefficient (ICC) for the physician NPI group indicates that most of the variation in the outcome variable (e.g., business days until appointment) is attributable to differences between individual physicians, and only a small portion occurs within physicians.
In practical terms, this indicates that:
Variation Between Physicians: With an ICC of 0.783, appointment times are strongly tied to the physician. Some physicians systematically have longer or shorter wait times, and this accounts for most of the variance in the data.
Variation Within Physicians: The remaining 21.7% of the variance occurs within the same physician. This could be due to factors such as the type of insurance, the scenario, or other factors that are not captured by the physician’s identity alone.
Implications: The high ICC suggests that the identity of the physician (as indicated by the NPI) is the dominant source of differences in appointment times, which is consistent with the low marginal R² and high conditional R² reported for this model.
In summary, who the physician is matters a great deal in determining how long a patient waits for an appointment. The random intercept for NPI should therefore be retained, and differences between physicians, for example in scheduling practices, deserve closer attention.
poisson_significant Dispersion
Overdispersion implies that the variability in the observed data is greater than what the model predicts under the Poisson assumption. Specifically, in a Poisson model, the mean and variance of the count data are assumed to be equal.
## [1] "Significant overdispersion detected. Consider using a Negative Binomial model or adding random effects to account for overdispersion."
## Warning: Autocorrelated residuals detected (p < .001).
## [1] FALSE
Testing assumptions: the logLik function returns the log-likelihood of the model, and the residual deviance can be calculated as -2 * logLik(model). The residual degrees of freedom are the number of observations minus the number of parameters estimated (which includes both fixed effects and random effects). The dispersion parameter is then the ratio of the two.
The number of parameters estimated can be calculated as the number of fixed effects plus the number of random effects parameters. The number of fixed effects can be obtained from the length of fixef(model), and the number of random effects parameters can be obtained from the length of VarCorr(model).
If the dispersion parameter is considerably greater than 1, it indicates overdispersion. If it is less than 1, it indicates underdispersion. A value around 1 is considered ideal for Poisson regression.
## 'log Lik.' 7.894981 (df=10)
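A dispersion check of this kind can be sketched in base R. Two assumptions are worth stating: the data are simulated to be overdispersed on purpose, and the sketch uses the Pearson chi-square statistic divided by the residual degrees of freedom, a commonly used alternative to the deviance-based ratio described above, on a plain `glm` rather than the mixed model.

```r
set.seed(42)
n <- 500

# Simulated counts with extra-Poisson variation (negative binomial draws).
x  <- rnorm(n)
mu <- exp(1.5 + 0.3 * x)
y  <- rnbinom(n, size = 1, mu = mu)

fit <- glm(y ~ x, family = poisson)

# Residual degrees of freedom: observations minus estimated parameters.
resid_df <- n - length(coef(fit))

# Pearson dispersion statistic: values well above 1 suggest overdispersion.
pearson_chisq <- sum(residuals(fit, type = "pearson")^2)
dispersion    <- pearson_chisq / resid_df

print(dispersion)
```

When the ratio is well above 1, as it is for these simulated counts, a negative binomial model or an observation-level random effect is the usual next step.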
The Poisson regression assumes that the log of the expected count is a linear function of the predictors. One way to check this is to plot the observed counts versus the predicted counts and see if the relationship looks linear.
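One way to make the observed-versus-predicted comparison concrete is to bin the predictions and compare group means. The sketch below uses simulated data and a plain Poisson `glm`; the names `calib` and `bins` are illustrative, and the five-bin split is an arbitrary choice.

```r
set.seed(1)
n <- 400

# Simulated counts whose log-mean is linear in x, so the assumption holds here.
x <- runif(n, min = 0, max = 2)
y <- rpois(n, lambda = exp(0.5 + 0.8 * x))

fit  <- glm(y ~ x, family = poisson)
pred <- fitted(fit)  # predicted counts on the response scale

# Group observations into five bins by predicted count.
bins <- cut(pred, breaks = quantile(pred, probs = seq(0, 1, by = 0.2)),
            include.lowest = TRUE)

# Compare average predicted and observed counts within each bin.
calib <- data.frame(
  mean_predicted = as.numeric(tapply(pred, bins, mean)),
  mean_observed  = as.numeric(tapply(y, bins, mean))
)
print(calib)
```

If the two columns track each other across bins, the linear-on-the-log-scale assumption looks reasonable; systematic gaps at the low or high end point to a misspecified relationship.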
```{r}
# # geography level: https://api.census.gov/data/2016/acs/acs5/geography.html
#
# # Vector of all state abbreviations, including DC
# state_abbreviations <- c(state.abb, "DC")
#
# # Retrieve ACS data for all ZIP codes (ZCTAs) without specifying a state
# income_data <- tidycensus::get_acs(
#   geography = "zcta",
#   variables = "B19013_001", # Median Household Income
#   survey = "acs5",
#   year = 2022,
#   cache_table = TRUE
# ) %>% dplyr::select(-moe)
#
# # Step 1: Prepare all_income_data with correct column names
# # Assuming data_dir is the directory where you want to save the file
# all_income_data <- income_data %>%
#   dplyr::rename(zip_code = GEOID, median_income = estimate) %>%
#   dplyr::mutate(zip_code = as.character(zip_code)) # Ensure zip_code is character
#
# # Save the all_income_data dataframe to the specified file path
# file_path <- file.path(data_dir, "median_household_income_by_zcta.rds")
# readr::write_rds(all_income_data, file_path)
#
# # Step 2: Prepare all_zips with consistent column names
# all_zips <- zipcodeR::zip_code_db %>%
#   dplyr::rename(zip_code = zipcode) %>% # Rename 'zipcode' to 'zip_code'
#   dplyr::select(zip_code, state) # Keep only relevant columns
#
# # Construct the full file path
# file_path <- file.path(data_dir, "Phase_2.rds")
# ucc_data <- readRDS(file_path) %>%
#   dplyr::rename(id_number = ID) %>%
#   dplyr::rename(zip_code = zip) %>%
#   mutate(state = exploratory::statecode(state, output = "alpha_code"))
#
# # Create a dataframe to store the top 10% affluent ZIP codes for each state
# affluent_zip_codes_summary <- data.frame(state = character(), zip_code = character(), median_income = numeric(), stringsAsFactors = FALSE)
#
# # Loop through each state abbreviation
# for (state_abbreviation in state_abbreviations) {
#
#   # Filter for the state's affluent ZIP codes (top 10%)
#   affluent_zip_codes <- all_income_data %>%
#     dplyr::inner_join(all_zips %>% dplyr::filter(state == state_abbreviation), by = "zip_code") %>%
#     dplyr::arrange(desc(median_income)) %>%
#     dplyr::slice(1:ceiling(n() * 0.10)) # Top 10% affluent ZIP codes in the state
#
#   # Append to the affluent ZIP codes summary
#   if (nrow(affluent_zip_codes) > 0) {
#     affluent_zip_codes <- affluent_zip_codes %>%
#       dplyr::mutate(state = state_abbreviation) %>%
#       dplyr::select(state, zip_code, median_income)
#
#     affluent_zip_codes_summary <- dplyr::bind_rows(affluent_zip_codes_summary, affluent_zip_codes) %>% dplyr::arrange(zip_code)
#   }
# }
#
# # View the summary of affluent ZIP codes
# print(affluent_zip_codes_summary)
#
```
```{r}
# # physicians_in_affluent_zips<- affluent_zip_codes_summary
# # Calculate the percentage of physicians in affluent ZIP codes for each state
# physician_affluent_summary <- ucc_data %>%
#   dplyr::group_by(state) %>%
#   dplyr::summarize(total_physicians = n()) %>%
#   dplyr::left_join(
#     physicians_in_affluent_zips %>%
#       dplyr::group_by(state) %>%
#       dplyr::summarize(physicians_in_affluent = n()),
#     by = "state"
#   ) %>%
#   dplyr::mutate(
#     percent_in_affluent = (physicians_in_affluent / total_physicians) * 100
#   )
#
# # Replace NA with 0 for states with no physicians in affluent ZIP codes
# physician_affluent_summary <- physician_affluent_summary %>%
#   dplyr::mutate(
#     physicians_in_affluent = ifelse(is.na(physicians_in_affluent), 0, physicians_in_affluent),
#     percent_in_affluent = ifelse(is.na(percent_in_affluent), 0, percent_in_affluent)
#   )
#
# # Calculate the overall percentage for physicians in U.S. affluent ZIP codes
# total_physicians_us <- nrow(ucc_data)
# physicians_in_affluent_us <- nrow(physicians_in_affluent_zips)
#
# # Calculate the percentage
# percent_in_affluent_us <- (physicians_in_affluent_us / total_physicians_us) * 100
#
# # Add US-level summary to the output
# us_summary <- data.frame(
#   state = "US",
#   total_physicians = total_physicians_us,
#   physicians_in_affluent = physicians_in_affluent_us,
#   percent_in_affluent = percent_in_affluent_us
# )
#
# # Combine state and US-level summaries
# physician_affluent_summary <- dplyr::bind_rows(physician_affluent_summary, us_summary)
#
# # View the summary of physicians in affluent ZIP codes for each state and the U.S. overall
# print(physician_affluent_summary, n = nrow(physician_affluent_summary))
#
```