FYI: If someone was told to go to the ED then we make their business days until appoint == 0.

tyler install

Read in data

## Rows: 1,114
## Columns: 45
## $ NPI                                     <dbl> 1265759062, 1265759062, 108300…
## $ age                                     <dbl> 53, 53, 36, 36, 51, 51, 52, 52…
## $ age_category                            <fct> 50 to 59 years old, 50 to 59 y…
## $ gender                                  <fct> Female, Female, Female, Female…
## $ Med_sch                                 <fct> US Senior Medical Student, US …
## $ Grd_yr                                  <dbl> 2010, 2010, 2015, 2015, 1998, …
## $ academic                                <fct> Private Practice, Private Prac…
## $ ACOG_District                           <fct> District V, District V, Distri…
## $ cbsatype10                              <fct> Metro, Metro, Metro, Metro, NA…
## $ scenario                                <fct> Prior trip to ED and was found…
## $ scenario_type                           <fct> Emergent, Emergent, Emergent, …
## $ insurance                               <fct> Blue Cross/Blue Shield, Medica…
## $ including_this_physician_in_the_study   <fct> No, No, No, Yes, No, No, Yes, …
## $ told_to_go_to_the_emergency_department  <fct> No, No, No, No, No, No, No, No…
## $ offered_a_clinic_appointment_to_be_seen <fct> No, No, No, Yes, No, No, Yes, …
## $ reason_for_exclusions                   <fct> Went to voicemail, Number cont…
## $ central_number                          <fct> No, No, Yes, No, No, No, No, N…
## $ number_of_transfers                     <fct> No transfers, No transfers, No…
## $ call_time_minutes                       <dbl> NA, NA, 1.4, 2.3, NA, 0.8, 1.0…
## $ hold_time_minutes                       <dbl> NA, NA, 0.0, 0.1, NA, NA, NA, …
## $ Provider.Enumeration.Date               <dbl> 2010, 2010, 2015, 2015, 2005, …
## $ day_of_the_week                         <ord> Thursday, Tuesday, Thursday, T…
## $ business_days_until_appointment         <dbl> NA, NA, NA, 1, NA, NA, 28, NA,…
## $ state                                   <chr> "Texas", "Texas", "Washington"…
## $ zip                                     <chr> "48001", "48001", "83535", "83…
## $ lat                                     <dbl> 42.63923, 42.63923, 46.53419, …
## $ lng                                     <dbl> -82.58170, -82.58170, -116.724…
## $ record_id                               <dbl> 1072, 201, 861, 296, 391, 1097…
## $ id_number                               <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,…
## $ Is.Sole.Proprietor                      <chr> "N", "N", "Y", "Y", "N", "N", …
## $ Grd_yr_category                         <fct> 2010 or greater, 2010 or great…
## $ Provider.Credential.Text                <chr> "MD", "MD", "DO", "DO", "MD", …
## $ median_household_income                 <dbl> 63272, 63272, 59414, 59414, NA…
## $ Medicaid_to_Medicare_Fee_Index          <dbl> 58, 58, 63, 63, 74, 74, 49, 49…
## $ basic_first_name                        <chr> NA, NA, "ABIGAIL", "ABIGAIL", …
## $ basic_last_name                         <chr> NA, NA, "PREST", "PREST", "RAS…
## $ basic_middle_name                       <chr> NA, NA, "ROSE", "ROSE", NA, NA…
## $ basic_credential                        <chr> NA, NA, "D.O.", "D.O.", "MD", …
## $ basic_sole_proprietor                   <chr> NA, NA, "NO", "NO", "NO", "NO"…
## $ basic_gender                            <chr> NA, NA, "F", "F", "F", "F", "F…
## $ basic_enumeration_date                  <date> NA, NA, 2015-04-09, 2015-04-0…
## $ basic_last_updated                      <date> NA, NA, 2024-08-22, 2024-08-2…
## $ taxonomies_code                         <chr> NA, NA, "207V00000X", "207V000…
## $ taxonomies_desc                         <chr> NA, NA, "Obstetrics & Gynecolo…
## $ taxonomies_state                        <chr> NA, NA, "Washington", "Washing…

Quality Check the Data

Are there any physicians included more than twice?

Included More than Twice
NPI	record_id	N
NA	NA	NA
—:	———:	–:

Variables of those physicians included more than twice?

Variables of Physicians Included More Than Twice
NPI	reason_for_exclusions	insurance	business_days_until_appointment
NA	NA	NA	NA
—:	:———————	:———	——————————-:

Find physicians called more than three times

NPI numbers called more than thrice
NPI	calls_count
NA	NA
—:	———–:

Do they have exclusion and have an appt?
NPI	id_number	reason_for_exclusions	business_days_until_appointment
NA	NA	NA	NA
—:	———:	:———————	——————————-:

Do they have business_days_until_appointment greater than zero but are an excluded category?

Records with Appointments but in Excluded Category
NPI	id_number	reason_for_exclusions	business_days_until_appointment
NA	NA	NA	NA
—:	———:	:———————	——————————-:

Do they have NA for business_days_until_appointment but are “Included” in the Reasons for exclusion category?

Included Records with NA for Appointments
NPI	id_number	reason_for_exclusions	business_days_until_appointment
NA	NA	NA	NA
—:	———:	:———————	——————————-:

Data Munging

Create Median Household Income Quantiles

`{r Create Median Household Income Quantiles, include = FALSE, message = FALSE} # physician_income_of_neighborhood <- tyler::results_section_physician_by_household_income( # year = 2022, # physician_information_with_zip = "Melanie/data/Phase_3/Phase_2.rds"); physician_income_of_neighborhood #`

## Zip Analysis

## National percentage of physicians in most affluent ZIP Codes

`{r, include = FALSE, message = FALSE} # tyler::results_section_analysis_household_income(physician_income_of_neighborhood) #`

Results

Check data normality

The data is not normally distributed. Plus it is count data. t-test assumes that data is normally distributed, and comparing the means of counts data is also not appropriate, we can check the incidence rate ratio for comparison of business_days_until_appointment among the categories of insurance. Better to use Poisson regression.

This Q-Q plot displays the distribution of the business_days_until_appointment variable against a theoretical normal distribution. Here’s an interpretation based on the plot’s characteristics:

Heavy Right Tail (Positive Skew): The data points deviate upward from the reference line on the right side, indicating that the business_days_until_appointment distribution has a heavy right tail or positive skew. This suggests that while most appointments are scheduled within a typical range, there are a few cases where the wait time is significantly longer.
Departure from Normality: The points deviate from the reference line at both ends, especially at the upper end (right tail). This indicates that the data does not follow a normal distribution closely. Instead, it appears to have a skewed, possibly exponential or log-normal distribution, given the pattern of points rising sharply at higher values.
Outliers: The data point at the top right, well above the line, is likely an outlier with a much longer wait time than the majority. This extreme value contributes to the non-normality and might need consideration, depending on the analysis goals.

In summary, the business_days_until_appointment variable is not normally distributed and shows positive skewness with some outliers, especially toward longer wait times.

## Starting normality check and summary calculation for variable: business_days_until_appointment

## Data extracted for variable: business_days_until_appointment

## Shapiro-Wilk normality test completed with p-value: 0.00000000000000000000000000000268309516735403

## The p-value is less than or equal to 0.05, indicating that the data is not normally distributed.

## Histogram with Density Plot created.

## Q-Q Plot created.

## Data is NOT normally distributed. Median: 8, IQR: 26

## $median
## [1] 8
## 
## $iqr
## [1] 26

## Summary calculation completed for variable: business_days_until_appointment

## $median
## [1] 8
## 
## $iqr
## [1] 26

Appointment Accessibility

## [1] "Physicians were successfully contacted in 49 states including the District of Columbia. The excluded states include North Dakota and Rhode Island."

Overall distribution of calls between BCBS and Medicaid

## [1] 1114

Insurance Acceptance Rates

## [1] 412

## [1] 416

## [1] 142

## [1] 1114

## [1] 556

## [1] "Out of the 557 unique physicians assigned Medicaid, 235 (42.2%; n = 235 / N = 557) were successfully contacted. Of the 235 physicians accepting Medicaid who were successfully contacted, 176 (74.9%; n = 176 / N = 235) provided an appointment date."

These acceptance rates reflect the proportion of physicians who were successfully contacted, accepted the respective insurance, and provided an appointment to the patient.

Medicaid Acceptance Rate: Out of the total number of physicians assigned Medicaid insurance (557), 176 physicians accepted Medicaid and provided an appointment, resulting in an acceptance rate of 74.9%.

Blue Cross/Blue Shield Acceptance Rate: Among the physicians assigned Blue Cross/Blue Shield insurance (557), 236 accepted this insurance and provided an appointment, yielding an acceptance rate of 73.1%.

Individual Insurance Rates of Successfully Making an Appointment

## [1] "Out of the 557 unique physicians assigned to be called for either insurance type, 557 (50%; n = 557 / N = 1114) were assigned to Blue Cross/Blue Shield, 557 (50%; n = 557 / N = 1114) were assigned to Medicaid, and 557 (50%; n = 557 / N = 1114) were assigned to at least one of these insurance types."

xx insurance (???%) was more likely yy insurance (???) to be referred to the emergency department (p=????)

## [1] "Blue Cross/Blue Shield insurance (16%) was more likely than Medicaid insurance (13%) to be referred to the emergency department (p=0.146)."

For emergent scenarios, xx insurance (???%) was more likely yy insurance (???) to be referred to the emergency department (p=????)

## For Emergent scenarios, Blue Cross/Blue Shield (13.6%) was more likely than Medicaid (11.1%) to be referred to the emergency department (p=0.4405).

## For Urgent scenarios, Blue Cross/Blue Shield (18.4%) was more likely than Medicaid (14.4%) to be referred to the emergency department (p=0.2515).

Told to seek Emergency Care

breakdown by both `scenario_type` and `scenario`

## For Emergent scenarios, patients presenting with Prior trip to ED and was found to have a 6 cm TOA accounted for 21.8% of ED referrals (n = 31 / N = 142).
## For Emergent scenarios, patients presenting with Positive pregnancy test after a tubal ligation accounted for 20.4% of ED referrals (n = 29 / N = 142).
## For Urgent scenarios, patients presenting with Acute cystitis accounted for 40.1% of ED referrals (n = 57 / N = 142).
## For Urgent scenarios, patients presenting with Recurrent/Treatment resistant vaginitis accounted for 17.6% of ED referrals (n = 25 / N = 142).

split the breakdown by insurance in addition to scenario_type and scenario:

## For Blue Cross/Blue Shield patients in Emergent scenarios, those presenting with Prior trip to ED and was found to have a 6 cm TOA accounted for 15.5% of ED referrals (n = 22 / N = 142).
## For Blue Cross/Blue Shield patients in Emergent scenarios, those presenting with Positive pregnancy test after a tubal ligation accounted for 9.9% of ED referrals (n = 14 / N = 142).
## For Blue Cross/Blue Shield patients in Urgent scenarios, those presenting with Acute cystitis accounted for 22.5% of ED referrals (n = 32 / N = 142).
## For Blue Cross/Blue Shield patients in Urgent scenarios, those presenting with Recurrent/Treatment resistant vaginitis accounted for 9.2% of ED referrals (n = 13 / N = 142).
## For Medicaid patients in Emergent scenarios, those presenting with Prior trip to ED and was found to have a 6 cm TOA accounted for 6.3% of ED referrals (n = 9 / N = 142).
## For Medicaid patients in Emergent scenarios, those presenting with Positive pregnancy test after a tubal ligation accounted for 10.6% of ED referrals (n = 15 / N = 142).
## For Medicaid patients in Urgent scenarios, those presenting with Acute cystitis accounted for 17.6% of ED referrals (n = 25 / N = 142).
## For Medicaid patients in Urgent scenarios, those presenting with Recurrent/Treatment resistant vaginitis accounted for 8.5% of ED referrals (n = 12 / N = 142).

scenario_type	n	percent
Emergent	60	42.25352
Urgent	82	57.74648

## For the 142 patients who were told to go to the Emergency Department, 42.3% were in the Emergent scenario type (n = 60 / N = 142) and 57.7% were in the Urgent scenario type (n = 82 / N = 142).

To calculate how many unique physicians referred their patients to the Emergency Department (ED)

## Number of unique physicians who referred patients to the ED: 120

May not need this

## Our sample included 557 calls to physician offices from 49 states excluding North Dakota and Rhode Island . We made calls to 557 unique physicians that accepted Blue Cross/Blue Shield. One Hundred Seventy-Six physician offices accepted Medicaid, giving a 74.9 % Medicaid acceptance rate for OBGYN practices (n = 176 /N = 235 ).  Physicians offices accepted Blue Cross/Blue Shield at a rate of 73.1 % (n = 236 /N = 323 ).

Age Physician Description

## The median age of the participants is 53 years, with an interquartile range (IQR) spanning from 44 years (25th percentile) to 61 years (75th percentile).

Gender Physician Description

## In our dataset, the most common physician gender was Female (n = 706/N = 1,114, 63.4%).

For the patient flow sheet

## For Blue Cross/Blue Shield patients presenting with Prior trip to ED and was found to have a 6 cm TOA (a Emergent scenario), the referral rate to the ED was 16.3% (n = 23 / N = 141).
## For Blue Cross/Blue Shield patients presenting with Positive pregnancy test after a tubal ligation (a Emergent scenario), the referral rate to the ED was 10.8% (n = 15 / N = 139).
## For Blue Cross/Blue Shield patients presenting with Acute cystitis (a Urgent scenario), the referral rate to the ED was 25% (n = 34 / N = 136).
## For Blue Cross/Blue Shield patients presenting with Recurrent/Treatment resistant vaginitis (a Urgent scenario), the referral rate to the ED was 12.1% (n = 17 / N = 141).
## For Medicaid patients presenting with Prior trip to ED and was found to have a 6 cm TOA (a Emergent scenario), the referral rate to the ED was 9.9% (n = 14 / N = 141).
## For Medicaid patients presenting with Positive pregnancy test after a tubal ligation (a Emergent scenario), the referral rate to the ED was 12.2% (n = 17 / N = 139).
## For Medicaid patients presenting with Acute cystitis (a Urgent scenario), the referral rate to the ED was 19.1% (n = 26 / N = 136).
## For Medicaid patients presenting with Recurrent/Treatment resistant vaginitis (a Urgent scenario), the referral rate to the ED was 9.9% (n = 14 / N = 141).

## 
## 
##  Of the 162 Emergent scenario calls for Blue Cross/Blue Shield patients that should have received ED referral instructions, only 22.2% (n = 36) were appropriately triaged to emergency care.
## Of the 156 Urgent scenario calls for Blue Cross/Blue Shield patients that should have received ED referral instructions, only 28.8% (n = 45) were appropriately triaged to emergency care.
## Of the 114 Emergent scenario calls for Medicaid patients that should have received ED referral instructions, only 21.1% (n = 24) were appropriately triaged to emergency care.
## Of the 118 Urgent scenario calls for Medicaid patients that should have received ED referral instructions, only 30.5% (n = 36) were appropriately triaged to emergency care.

Exclusions

## [1] "Of the total 1114 phone calls made, 558 were successfully connected, and 556 were excluded."

How many physician offices were successfully contact and had a business_days_until_appointment >0?

## [1] 401

## [1] 404

## [1] 16

## [1] 160

Answer: Subsequently, these phone calls resulted in ??? of the ???? physicians meeting exclusion criteria.

## [1] "Subsequently, these phone calls resulted in 378 of the 557 physicians meeting exclusion criteria."

Answer: Of the remaining ??? physicians meeting inclusion criteria, ??? (??%) did not accept Medicaid.

## [1] 371

## [1] 29

## [1] "Of the remaining 371 physicians meeting inclusion criteria, 29 ( 7.8 %) did not accept Medicaid and were not accepting new patients."

A total of??? physicians accepted Medicaid and/or BCBS and gave a date for the soonest available appointment

## [1] 302

## [1] "A total of 302 physicians accepted Medicaid and/or BCBS and gave a date for the soonest available appointment."

Physician offices accepted Blue Cross/Blue Shield at a rate of ?? % (n = ??? / N = ???)

## [1] "Physician offices accepted Blue Cross/Blue Shield at a rate of 100 % (n = 557 / N = 557 )."

4 Exploratory Data Analysis (EDA)

Graph each variable

4.1 Histogram Plots for Each Predictor

4.2 Business Days by Insurance

Interpretation: Highly Skewed Distribution:

Both distributions are right-skewed, meaning that most patients receive appointments relatively quickly, while a smaller number experience longer wait times.
The majority of appointments for both insurance types seem to occur within a shorter range of business days, with a notable decline in frequency as the waiting time increases. Differences Between Insurance Types:
The distribution for Blue Cross/Blue Shield shows more variation in the number of days, with many individuals receiving appointments within shorter time frames, but also some patients experiencing longer delays (greater than 100 days).
Medicaid appears to have more concentrated appointments in shorter time frames, suggesting fewer extreme wait times compared to Blue Cross/Blue Shield. However, the initial spike is much higher, implying a higher number of shorter wait times compared to Blue Cross/Blue Shield. Appointment Delays:
While both distributions drop significantly after the initial spike (indicating a substantial number of shorter wait times), Medicaid exhibits a sharper drop compared to Blue Cross/Blue Shield, suggesting that fewer Medicaid patients have extended wait times.
Overall, the distribution suggests that while there is a large volume of quick appointments for both insurance types, Medicaid tends to offer fewer appointments with long delays compared to Blue Cross/Blue Shield. However, Blue Cross/Blue Shield patients exhibit a broader range of wait times, possibly indicating more variability in scheduling.

4.3 Log Transformation for Business Days

## Plots saved to: output/density_plot_20250324_082233.tiff and output/density_plot_20250324_082233.png

The log transformation applied to the business_days_until_appointment variable has several significant effects:

Reducing Skewness:
The original business_days_until_appointment variable is highly skewed to the right, with a large number of values clustered at low numbers and a few extreme values extending into high numbers. By taking the logarithm, we compress these larger values and stretch out the smaller ones, reducing the extreme skewness. This makes it easier to visualize and interpret the underlying distributions.
Enhanced Comparability:
The log transformation reduces the range of values, which helps in comparing the distributions between Blue Cross/Blue Shield and Medicaid more effectively. Without this transformation, the plot would be dominated by very long tails, making it challenging to identify differences between the distributions.
Better Insight into Relative Differences:
By taking the logarithm, we can see relative differences more clearly. The log-transformed density plot reveals distinct peaks for the two insurance categories that wouldn’t be as apparent otherwise. For example, Blue Cross/Blue Shield shows a more pronounced peak around log 3 (approximately 20 days), whereas Medicaid has a more bimodal shape. This insight into the central tendencies and spread is made possible by transforming the values to a log scale.

In summary, the log transformation improves interpretability by reducing skewness, allowing for better comparison between groups, and highlighting important patterns that would otherwise be obscured in the original scale.

5. Appointment and Scenario Analysis

told_to_go_to_the_emergency_depa for emergency scenario types

5.1 Scenario Type by Insurance

5.1 Day of the week by insurance

5.1 Central Appointment Line by Insurance

5.2 Physician Characteristics by Insurance

5.2 Physician MD vs. DO by Insurance

6. Table 1

6.1 Display Summary Table

	Overall (N=557)
Age (years)
- Less than 50 years old	214 (39.9%)
- 50 to 55 years old	86 (16.0%)
- 56 to 60 years old	85 (15.8%)
- 61 to 65 years old	67 (12.5%)
- Greater than 65 years old	85 (15.8%)
Gender
- Female	353 (63.4%)
- Male	204 (36.6%)
Medical School Training
- Allopathic training	513 (93.8%)
- Osteopathic training	34 (6.2%)
Medical School Location
- US Senior Medical Student	408 (82.1%)
- International Medical Graduate	89 (17.9%)
Academic Affiliation
- Private Practice	500 (89.8%)
- University	57 (10.2%)
Rurality
- Metropolitan area	507 (91.0%)
- Rural area	50 (9.0%)
Number of Phone Transfers
- No transfers	352 (63.2%)
- One transfer	158 (28.4%)
- Two transfers	37 (6.6%)
- More than two transfers	10 (1.8%)
Insurance
- Blue Cross/Blue Shield	557 (100.0%)
age_category
- Less than 40 years old	76 (14.2%)
- 40. to 49 years old	138 (25.7%)
- 50 to 59 years old	171 (31.8%)
- 60 to 69 years old	118 (22.0%)
- 70 years and greater	34 (6.3%)
American College of OBGYNs Districts
- District I	34 (6.1%)
- District II	33 (5.9%)
- District III	37 (6.6%)
- District IV	65 (11.7%)
- District V	54 (9.7%)
- District VI	64 (11.5%)
- District VII	93 (16.7%)
- District VIII	93 (16.7%)
- District IX	30 (5.4%)
- District XI	20 (3.6%)
- District XII	34 (6.1%)
scenario
- Acute cystitis	136 (24.4%)
- Positive pregnancy test after a tubal ligation	139 (25.0%)
- Prior trip to ED and was found to have a 6 cm TOA	141 (25.3%)
- Recurrent/Treatment resistant vaginitis	141 (25.3%)
scenario_type
- Emergent	280 (50.3%)
- Urgent	277 (49.7%)
told_to_go_to_the_emergency_department
- Yes	89 (16.0%)
- No	468 (84.0%)
Central scheduling
- Yes, central scheduling number	201 (36.1%)
- No	356 (63.9%)
call_time_minutes
- n	447
- Median (Q1, Q3)	2.0 (1.0, 3.5)
hold_time_minutes
- n	419
- Median (Q1, Q3)	0.4 (0.0, 1.8)
Day of the week Called
- Thursday	132 (23.7%)
- Tuesday	112 (20.1%)
- Wednesday	172 (30.9%)
- Friday	82 (14.7%)
- Monday	58 (10.4%)
- Saturday	1 (0.2%)
business_days_until_appointment
- n	331
- Median (Q1, Q3)	9.0 (0.0, 26.0)

The majority are young (under 50), female, trained in allopathic programs, and primarily working in private practice????? and metropolitan areas. Most do not use central scheduling, and the calls were more likely to be made midweek.

6.2 Save Summary Table to Word

7. Wait Time by Insurance Figures

Waiting time in Days (Log Scale) for Blue Cross/Blue Shield versus Medicaid. The code you provided will create a scatter plot with points representing the relationship between the insurance variable (x-axis) and the days variable (y-axis). Additionally, it includes a line plot that connects points with the same npi value.

7.1 Line Plot of Waiting Time by Insurance Type

The line plot shows the relationship between Blue Cross/Blue Shield and Medicaid regarding the log-transformed values of business days until appointment.

Data Representation:
- The vertical axis (y-axis) represents the log-transformed “business days until appointment.”
- The horizontal axis (x-axis) represents two different insurance types: “Blue Cross/Blue Shield” and “Medicaid”.
- Each individual line connecting points on both sides represents changes in “business days until appointment” for each instance or individual between the two insurance types.
Log Transformation:
- The y-axis uses a logarithmic scale to visualize “business days until appointment” because the original values may be skewed with extreme values (outliers). The log transformation helps compress high values, making it easier to view the data distribution.
Connections Between Insurance Types:
- The lines connecting the dots show the changes in “business days until appointment” for each physician office between the two insurance types.
- There is a considerable number of crossed lines, indicating inconsistent differences between “business days until appointment” for “Blue Cross/Blue Shield” versus “Medicaid.”
General Observations:
- The red line represents the median “business days until appointment” for both insurance types.
- The median appears relatively flat between the two insurance types, indicating similar average wait times.
- However, there is high variability in the individual waiting times; for some offices, “Medicaid” results in longer waiting times, while for others, “Blue Cross/Blue Shield” has longer waits.
Insights:
- There appears to be no systematic difference between the waiting times for “Blue Cross/Blue Shield” versus “Medicaid” based on the median line, which is relatively similar for both groups.
- The variability in the lines indicates unequal access depending on the provider. While the average waiting times are similar, some patients experience significant differences depending on their insurance.

Overall, this plot suggests that while on average waiting times are similar between the two insurance groups, individual-level variability is significant. This implies potential disparities in access to timely appointments based on the specific provider or office handling the appointment.

7.2 Scatter Plot of Waiting Times by Insurance Type

## Plots saved to: Melanie/Figures/urgent_gyn_vs_insurance_none_20250324_082240.tiff and Melanie/Figures/urgent_gyn_vs_insurance_none_20250324_082240.png

The scatter plot depicts the distribution of business days until appointment for patients covered by Blue Cross/Blue Shield and Medicaid.

Insurance Type:
- The x-axis shows two different insurance types: “Blue Cross/Blue Shield” and “Medicaid”.
- The y-axis represents the business days until appointment, indicating the waiting period in days for appointments for patients under each insurance type.
Data Representation:
- Each point in the plot represents an appointment for an individual provider, with the purple dots for Blue Cross/Blue Shield and yellow dots for Medicaid.
- The height of the points along the y-axis represents the number of days until the next available appointment for each provider.
Distribution and Variability:
- For both Blue Cross/Blue Shield and Medicaid, the majority of the points are clustered below 50 days, indicating that most appointments are scheduled within approximately 50 business days.
- Medicaid data shows a greater spread, with a number of points rising significantly higher than 100 business days. This suggests that for some providers, Medicaid patients are facing longer delays compared to Blue Cross/Blue Shield.
- There is also noticeable variability within each insurance type, with some points reaching above 150 to 250 days, indicating outliers where patients had to wait considerably longer for appointments.
Overall Insights:
- The visual spread indicates that Medicaid tends to have more variability in waiting times, with some appointments requiring extended waiting periods.
- In contrast, Blue Cross/Blue Shield generally shows a tighter clustering of waiting times below 50 days, suggesting a more consistent access pattern.
- The difference in height and density of points suggests that while Blue Cross/Blue Shield has more consistent and possibly shorter waiting times, Medicaid patients may be more prone to experiencing significantly longer waits.
Potential Implications:
- This variability in Medicaid waiting times could imply potential barriers or delays in accessing timely care for patients depending on the provider or region.
- The differences suggest that Medicaid patients may not be receiving as consistent access to appointments as patients with Blue Cross/Blue Shield, potentially reflecting disparities in healthcare access.

The overall interpretation of this plot suggests notable differences in waiting times between Blue Cross/Blue Shield and Medicaid, with Medicaid patients generally facing a wider range and potentially longer waiting periods, indicating potential disparities in appointment availability.

7.3 Density Plot of Waiting Times by Insurance

## Plots saved to: Melanie/Figures/urgent_GYN_vs_insurance_density_20250324_082241.tiff and Melanie/Figures/urgent_GYN_vs_insurance_density_20250324_082241.png

Comparison of the Density Plot and Scatter Plot of Waiting Times by Insurance Type

The density plot and scatter plot provide different perspectives on the waiting times for appointments by insurance type (Blue Cross/Blue Shield vs. Medicaid). Each plot reveals unique insights, and here’s how they compare:

Density Plot:
- The density plot provides a smooth representation of the distribution of waiting times for each insurance type, showing the overall trends and concentration of data points.
- It allows for an immediate understanding of the shape of the distribution, highlighting where most of the data is concentrated (i.e., peak densities).
- The log-transformed x-axis helps compress the range of waiting times, which is particularly useful for visualizing data that varies widely. The transformation helps reduce the influence of outliers, making it easier to compare the central trends of the two insurance groups.
- The density plot captures differences in variability between the two groups:
  - Blue Cross/Blue Shield has a sharper peak, indicating a more consistent and shorter waiting time.
  - Medicaid has a broader, flatter peak, indicating greater variability and suggesting longer waiting times for some patients.
- Overall, the density plot is useful for understanding the underlying probability distribution of the waiting times and highlighting trends across the two insurance groups.
Scatter Plot:
- The scatter plot provides individual-level data points, showing each appointment’s waiting time by insurance type.
- It allows us to see the spread and range of waiting times without aggregating the data. The scatter plot shows each instance, giving an immediate sense of the variability and presence of outliers.
- Unlike the density plot, the scatter plot is more focused on the actual distribution of individual data points, which helps identify clusters, patterns, or any outlying values.
- One key insight from the scatter plot is that both insurance types have a few extreme outliers (very long waiting times), but it does not provide the same visual insight into how most of the data is distributed.
- The scatter plot shows that Blue Cross/Blue Shield patients generally have shorter waiting times, with many points clustered at the lower end. Medicaid patients also tend to have shorter waiting times, but the scatter plot reveals more variation in their waiting times, with some values spread across a larger range.
Unique Insights from the Density Plot:
- The density plot reveals the central tendencies and overall distribution shape of waiting times, which may be less apparent from the scatter plot. Specifically, the density plot shows where the data peaks, which can indicate the most common waiting times.
- It also provides a better visualization of the relative concentration of waiting times, helping to identify whether there is a skew in the data or bimodal tendencies.
- The density plot helps to compare the overall spread and concentration between the insurance types, highlighting systematic differences in access to care, which might be harder to discern from individual points in the scatter plot.
Unique Insights from the Scatter Plot:
- The scatter plot shows the specific distribution of waiting times, allowing for an easy identification of outliers and the spread of individual values.
- It provides a granular view of the data, illustrating exactly how waiting times vary for each appointment and showing every observation instead of summarizing it through density.
Conclusion:
- The density plot is excellent for understanding summary distributions and identifying general trends between insurance types, such as typical waiting times and where data clusters occur.
- The scatter plot, on the other hand, is better suited for observing individual data points, detecting outliers, and understanding raw spread and variability in waiting times.

Therefore, the density plot provides a more generalized view of the distribution of waiting times, which can make trends and patterns more immediately apparent, whereas the scatter plot offers a more detailed view of individual data points, providing insights into outliers and specific instances. Using both plots together gives a comprehensive understanding of the data, balancing overall trends with individual observations.

8. Line Plot of Waiting Times by Scenario

Waiting time in Days (Log Scale) for Blue Cross/Blue Shield versus Medicaid. The code you provided will create a scatter plot with points representing the relationship between the scenario variable (x-axis) and the days variable (y-axis). Additionally, it includes a line plot that connects points with the same NPI name value.

8.1 Create Line Plot for Different Scenarios

## # A tibble: 6 × 5
##   Group1                Group2               Direction p_value p_value_formatted
##   <fct>                 <fct>                <chr>       <dbl> <chr>            
## 1 TOA                   Pregnancy after tub… Higher    9.01e-1 p=0.901          
## 2 TOA                   UTI                  Higher    1.02e-2 p=0.01           
## 3 TOA                   Vaginitis            Lower     1.31e-1 p=0.131          
## 4 Pregnancy after tubal UTI                  Higher    4.05e-3 p<0.01           
## 5 Pregnancy after tubal Vaginitis            Lower     6.70e-2 p=0.067          
## 6 UTI                   Vaginitis            Lower     3.55e-5 p<0.01

## # A tibble: 6 × 5
##   Group1                Group2                Higher Lower Significant
##   <fct>                 <fct>                  <int> <int>       <dbl>
## 1 TOA                   Pregnancy after tubal      1     0           0
## 2 TOA                   UTI                        1     0           1
## 3 TOA                   Vaginitis                  0     1           0
## 4 Pregnancy after tubal UTI                        1     0           1
## 5 Pregnancy after tubal Vaginitis                  0     1           0
## 6 UTI                   Vaginitis                  0     1           1

## Payments for TOA (median: $9.0000; IQR [$0.0000 - $26.0000]) are higher than payments for Pregnancy after tubal (median: $9.0000; IQR [$1.0000 - $22.0000], p=0.901).
## Payments for TOA (median: $9.0000; IQR [$0.0000 - $26.0000]) are higher than payments for UTI (median: $1.0000; IQR [$0.0000 - $20.0000], p=0.01).
## Payments for TOA (median: $9.0000; IQR [$0.0000 - $26.0000]) are lower than payments for Vaginitis (median: $12.0000; IQR [$1.0000 - $33.7500], p=0.131).
## Payments for Pregnancy after tubal (median: $9.0000; IQR [$1.0000 - $22.0000]) are higher than payments for UTI (median: $1.0000; IQR [$0.0000 - $20.0000], p<0.01).
## Payments for Pregnancy after tubal (median: $9.0000; IQR [$1.0000 - $22.0000]) are lower than payments for Vaginitis (median: $12.0000; IQR [$1.0000 - $33.7500], p=0.067).
## Payments for UTI (median: $1.0000; IQR [$0.0000 - $20.0000]) are lower than payments for Vaginitis (median: $12.0000; IQR [$1.0000 - $33.7500], p<0.01).

Comparison and Unique Insights of the Line Plot

The line plot above provides unique insights compared to the previous figures (scatter plot and density plot) regarding waiting times for different medical scenarios. Let’s discuss what the line plot reveals that is not evident in the other figures:

Comparison of Median Waiting Times Across Scenarios

Unlike the density plot or scatter plot, the line plot explicitly highlights the median waiting times for each scenario through the red line connecting the median points.
This connection provides a clear comparison of waiting times across different scenarios, revealing that Pregnancy after Tubal Ligation has a significantly shorter waiting time compared to TOA, UTI, or Vaginitis.

Sequential Comparison Across Categories

The line plot offers a way to sequentially compare different scenarios using the connecting line, which isn’t possible with the other figures.
- The line connecting the median waiting times provides a smooth visual reference that makes it easy to see increases or decreases between scenarios.
- The dip for Pregnancy after Tubal Ligation stands out clearly in the line plot, showing that this scenario is prioritized differently compared to the others.

Trend Analysis

In contrast to the density plot and scatter plot, which mostly illustrate distribution and individual data points, the line plot provides an overall trend analysis for waiting times.
- It helps to understand whether there are any significant patterns or changes in waiting times between different scenarios, as indicated by the sharp decrease in the line for Pregnancy after Tubal Ligation.

Individual Points Plus Central Tendency

The scatter points in the line plot show the individual waiting times, similar to the scatter plot.
However, unlike the scatter plot alone, this figure combines individual variability with a summary metric (i.e., the median) across categories, giving a more holistic view of both the spread and the central tendency.
This helps in visualizing how the median relates to individual values within each scenario, making it easy to see the consistency or variability in responses.

Visualizing Differences in Central Tendency Between Scenarios

The density plot emphasizes the overall shape of waiting times, while the scatter plot focuses on individual values without centrality measures.
The line plot specifically shows how median waiting times differ across categories. This makes it easier to spot differences between scenarios, especially when those differences are relatively small but consistent, such as the comparison between UTI and Vaginitis.

Summary - The line plot uniquely contributes by providing a clear view of trends and central tendency (i.e., median waiting times) across different medical scenarios. - It effectively combines the strengths of a scatter plot (individual data points) with a trend line, making it possible to quickly compare scenarios. - The sharp dip for the Pregnancy after Tubal Ligation scenario is a key feature that stands out in this plot, indicating differential prioritization in scheduling that is less obvious in the scatter or density plots.

In conclusion, the line plot highlights the relationship between categories (medical scenarios) through trend lines that connect median values, something the other figures do not illustrate as effectively. This helps in understanding both the individual spread of data and the overall trend, making it a valuable addition to the data visualization toolkit for this study.

8.2 Scatter Plot of Waiting Times by Scenario

## Plots saved to: Melanie/Figures/urgent_GYN_vs_scenario_none_20250324_082242.tiff and Melanie/Figures/urgent_GYN_vs_scenario_none_20250324_082242.png

8.3 Density Plot of Waiting Times by Scenario

Understanding a Density Plot:

A density plot is a smoothed version of a histogram that shows the distribution of a continuous variable. It represents the relative frequency of data points in different ranges of values, with areas under the curve corresponding to proportions of the data.

X-axis (Log Waiting Times in Days):
- The x-axis shows the logarithm of waiting times in days, meaning the waiting times have been transformed to a logarithmic scale to make the distribution more manageable or easier to interpret. A log transformation is often used when the raw data is skewed.
- Values closer to the left (lower on the x-axis) represent shorter waiting times, while values to the right (higher on the x-axis) represent longer waiting times.
Y-axis (Density):
- The y-axis represents density, which is the relative concentration of data points for a given range of values on the x-axis. The area under the entire curve sums to 1, meaning it reflects the proportion of observations.
- Higher peaks represent regions where there is a higher concentration of data points, while lower regions represent ranges with fewer data points.
Colors (Insurance):
- The two colors (purple for Blue Cross/Blue Shield and yellow for Medicaid) represent the distribution of waiting times for the two different insurance groups.
- The overlap between the two distributions is shaded, showing regions where both groups have similar waiting times.

How to Read the Density Plot: 1. Shape of the Distribution: - The shape of each curve tells you about the distribution of waiting times within each insurance group. - A peak indicates the most common waiting times for that group. - A wider curve indicates a more spread-out distribution, meaning the waiting times vary more within that group. - A narrower curve indicates that waiting times are more concentrated around the peak.

## Plots saved to: Melanie/Figures/urgent_GYN_vs_scenario_density_20250324_082243.tiff and Melanie/Figures/urgent_GYN_vs_scenario_density_20250324_082243.png

Interpretation of Waiting Times by Scenario

The scatter plot shows the distribution of waiting times (in business days) for four medical scenarios: TOA (tubo-ovarian abscess), Pregnancy after tubal ligation, UTI (urinary tract infection), and Vaginitis.

Key Insights

Comparison of Waiting Times Across Scenarios
- The TOA (purple) and Vaginitis (yellow) scenarios show substantial variability in waiting times, with the range extending up to 250 days for some outliers.
- Pregnancy after tubal ligation (blue) generally has shorter waiting times, with most of the waiting times concentrated below 50 days, suggesting prioritization for these cases.
- UTI (green) exhibits a moderate distribution of waiting times, mainly clustered below 100 days, with fewer extreme values.
Variability and Prioritization
- Pregnancy after tubal ligation exhibits less variability compared to other scenarios, indicating more consistent scheduling of appointments. The clustering of points close to shorter waiting times implies that these cases may be given higher priority.
- TOA and Vaginitis have more outliers and a wider range of waiting times, indicating inconsistency in scheduling and possibly lower prioritization.
- UTI falls somewhere in between, with moderate waiting times and less variability compared to TOA and Vaginitis.
Outliers in the Data
- Both TOA and Vaginitis scenarios have several outliers where waiting times extend to around 250 days. This may indicate difficulties in prioritizing these cases or challenges in scheduling appointments promptly.
- In contrast, Pregnancy after tubal ligation and UTI scenarios show fewer outliers, implying a more consistent and predictable range of waiting times.

Summary - Pregnancy after tubal ligation is associated with shorter and more consistent waiting times, possibly reflecting a higher prioritization for these appointments. - TOA and Vaginitis have a wider range and more frequent longer delays, indicating variability in access to care. - UTI appointments are scheduled faster than TOA and Vaginitis but with moderate consistency.

This plot helps visualize both individual waiting times and the patterns of variability across the different medical conditions, highlighting disparities in how quickly patients are able to get appointments depending on their scenario. These insights can be used to understand differences in healthcare access and scheduling priorities.

9. Statistical testing

9.1 Fit the Interaction Model

Consider the following scenario:

You want to examine how insurance type affects waiting time for appointments. However, patients may differ in other ways, like their age and medical condition, which could also influence waiting times.

When fitting a regression model with waiting time as the dependent variable and insurance type as one of the predictors (along with other factors like age and medical condition), the EMMs would represent the average waiting time for each insurance type, adjusted for the effects of age and medical condition. This adjustment helps isolate the effect of insurance type on waiting time, ensuring the comparison between insurance types is fair.

Interpretation: In the plot you provided earlier, the Estimated Marginal Means for each scenario represent the average predicted waiting time for an appointment, adjusted for other factors in the model. This gives a clearer, model-based comparison of the expected waiting times across different medical scenarios, taking into account variability in other factors.

This image is a plot of Estimated Marginal Means (also known as least-squares means) for different scenarios. Each point represents the estimated marginal mean waiting time (in days) for a different medical scenario, and the error bars represent the 95% confidence intervals (CI) around these estimates.

Here’s a breakdown of the different components of the plot:

Y-axis:
- “Estimated Marginal Means for Waiting Time in Days”: The y-axis shows the estimated average waiting time in business days, which is the dependent variable in this model. The values range from about 15 to 30 days.
- The scale indicates that patients in different scenarios are estimated to wait between 15 and 30 days for an appointment, on average.
X-axis:
- “scenario”: The x-axis lists five different medical scenarios, which are the levels of the predictor variable. These are:
  1. Prior trip to ED and was found to have a 6 cm TOA: This scenario seems to be related to a prior emergency department (ED) visit and the discovery of a 6 cm tubo-ovarian abscess (TOA).
  2. Positive pregnancy test after a tubal ligation: This scenario involves a positive pregnancy test after a sterilization procedure (tubal ligation).
  3. Acute cystitis: This scenario involves a urinary tract infection, commonly known as acute cystitis.
  4. Recurrent/Treatment resistant vaginitis: This scenario refers to persistent or treatment-resistant vaginal infections.
The x-axis labels are rotated for readability, showing the different medical conditions (scenarios) being compared.
Estimated Marginal Means (Points on the Plot):
- Colored Points: Each colored point represents the estimated marginal mean waiting time for that specific scenario.
  - The different colors correspond to different scenarios, as explained by the legend at the bottom of the plot.
  - The vertical position of each point represents the estimated waiting time in business days.
Confidence Intervals (Error Bars):
- Error Bars: The vertical bars around each point represent the 95% confidence intervals. These intervals give a range within which the true mean waiting time for each scenario is expected to lie 95% of the time, based on the model.
  - Narrower intervals (e.g., the scenario “Positive pregnancy test after a tubal ligation”) indicate more precision in the estimate.
  - Wider intervals (e.g., “Recurrent/Treatment resistant vaginitis”) indicate more uncertainty or variability in the estimate.
Interpretation of the Estimated Marginal Means:

A simple rule of thumb is that if error bars for 95% confidence intervals overlap by less than about half the length of one error bar, the difference between the two groups might still be statistically significant. If the error bars overlap considerably, it’s more likely (but not guaranteed) that the difference between the groups is not statistically significant.

“Prior trip to ED and was found to have a 6 cm TOA”: The estimated mean waiting time for this scenario is around 20 days, with a relatively narrow confidence interval.
“Positive pregnancy test after a tubal ligation”: This scenario has the shortest estimated waiting time, just under 17 days, and the narrowest confidence interval, indicating a highly precise estimate.
“Acute cystitis”: This scenario has an estimated waiting time of around 25 days, and the confidence interval is slightly wider, indicating some variability.
“Recurrent/Treatment resistant vaginitis”: This scenario has the longest estimated waiting time, around 27 days, and the confidence interval is quite wide, indicating a high degree of uncertainty or variability in the estimate.

9.2 Extract and Plot the Interaction Data

Interpretation of Estimated Marginal Means of Waiting Times by Insurance Type and Scenario

The figure shows a Comparison of Waiting Times by Insurance Type for different medical scenarios using estimated marginal means (in days). The following key points can be observed:

Scenario-Based Differences in Waiting Times:
- The estimated waiting times are represented for each scenario—TOA, Pregnancy after tubal ligation, UTI, and Vaginitis—across different insurance types (Blue Cross/Blue Shield and Medicaid).
- The scenarios TOA (Tubo-Ovarian Abscess) and UTI tend to have shorter average waiting times, especially for Blue Cross/Blue Shield, as indicated by the points being lower on the y-axis.
Impact of Insurance Type:
- For each scenario, there is a visible difference between the waiting times for Blue Cross/Blue Shield versus Medicaid.
- Medicaid consistently shows slightly longer waiting times compared to Blue Cross/Blue Shield across all scenarios. This suggests that Medicaid patients may face delays in accessing appointments, highlighting a potential disparity in healthcare access based on insurance type.
Variability in Waiting Times:
- The error bars represent variability (e.g., confidence intervals) in waiting times. The error bars for Pregnancy after tubal ligation and Vaginitis show higher variability, particularly for Blue Cross/Blue Shield, which could indicate inconsistent access or other factors affecting appointment scheduling for these conditions.
- The UTI scenario, in contrast, shows less variability, indicating more consistent scheduling times irrespective of insurance type.
Overall Trends:
- The waiting times for Vaginitis are generally longer, especially for Medicaid, with significant variability, suggesting less predictability in appointment scheduling.
- TOA and UTI scenarios exhibit more consistent and shorter waiting times, which may reflect their higher prioritization in scheduling practices.

Summary This plot highlights the impact of insurance type on waiting times for medical appointments across various scenarios. It reveals a trend where Medicaid patients tend to experience longer waiting times and more variability compared to those with Blue Cross/Blue Shield. Additionally, certain scenarios like Vaginitis have a much broader range of waiting times, suggesting inconsistent scheduling practices, especially for non-urgent conditions.

These findings provide insight into potential inequities in healthcare access based on insurance type, and emphasize the importance of ensuring equitable care for patients irrespective of their insurance status.

9.3 Combined plot of Scenarios and Insurance

9.4 Overall Comparison Plot

9.5 Scenario-Based Dot Plot Creation

Interpretation of Median Business Days Until Appointment by Subspecialty

The figure titled “Comparison of Business Days Until Appointment by Scenario” shows the median waiting times for appointments across four different subspecialties based on insurance type (Blue Cross/Blue Shield and Medicaid). The vertical lines represent the variability in the waiting times, likely indicating confidence intervals around the median. Here are the key observations:

Impact of Insurance Type:
- Medicaid patients tend to have longer waiting times compared to those with Blue Cross/Blue Shield, especially for the Vaginitis subspecialty.
- In most subspecialties, Blue Cross/Blue Shield has shorter or comparable median waiting times compared to Medicaid, suggesting easier or faster access to appointments for those with private insurance.
Scenario-Based Differences:
- The waiting times for Vaginitis appointments appear to be the longest overall, with both Medicaid and Blue Cross/Blue Shield showing high median values and larger variability.
- Pregnancy after Tubal Ligation and TOA (Tubo-Ovarian Abscess) show relatively lower median waiting times compared to other conditions. The difference between insurance types in these cases is less pronounced, indicating more consistent access for patients regardless of insurance type.
Variability in Waiting Times:
- The length of the error bars, particularly for Vaginitis, highlights a considerable range in the waiting times, reflecting inconsistent access to appointments within this subspecialty.
- In contrast, other conditions such as UTI (Urinary Tract Infection) show smaller variability, suggesting more predictable scheduling times for both types of insurance.

Summary This plot reveals potential disparities in healthcare access based on insurance type, with Medicaid patients generally experiencing longer waiting times compared to those with Blue Cross/Blue Shield. The Vaginitis subspecialty appears to have the most extended and variable waiting times, indicating potential issues with appointment availability for this condition. These findings emphasize the need for more equitable healthcare access, particularly for Medicaid patients and in subspecialties with high waiting times.

9.6 Combined Plot of Scenarios and Insurance: Color Version

9.7 Combined Plot of Scenarios and Insurance: Black-and-White Version

9.8 Extracting and Reporting Interaction Results

## TOA: Patients with Blue Cross/Blue Shield insurance wait 4.8 days, with a 95% confidence interval (CI) ranging from 3.2 to 7.2 days. Medicaid recipients in this scenario experience longer waits, at 6.6 days with a CI of 4.4 to 9.8 days (p-value = NA).
## 
## Pregnancy after tubal: Patients with Blue Cross/Blue Shield insurance wait 6.4 days, with a 95% confidence interval (CI) ranging from 4.3 to 9.5 days. Medicaid recipients in this scenario experience shorter waits, at 5.6 days with a CI of 3.8 to 8.3 days (p-value = <0.01).
## 
## UTI: Patients with Blue Cross/Blue Shield insurance wait 3.2 days, with a 95% confidence interval (CI) ranging from 2.1 to 4.9 days. Medicaid recipients in this scenario experience shorter waits, at 2.9 days with a CI of 1.9 to 4.4 days (p-value = <0.01).
## 
## Vaginitis: Patients with Blue Cross/Blue Shield insurance wait 8.3 days, with a 95% confidence interval (CI) ranging from 5.7 to 12.1 days. Medicaid recipients in this scenario experience shorter waits, at 6.8 days with a CI of 4.6 to 10.0 days (p-value = <0.01).

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: business_days_until_appointment ~ scenario * insurance + (1 |  
##     NPI)
##    Data: df3
## 
##      AIC      BIC   logLik deviance df.resid 
##   6306.6   6345.9  -3144.3   6288.6      567 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -7.0560 -0.7810 -0.0995  0.1963  7.0571 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  NPI    (Intercept) 3.634    1.906   
## Number of obs: 576, groups:  NPI, 395
## 
## Fixed effects:
##                                     Estimate Std. Error z value
## (Intercept)                          1.85629    0.20098   9.236
## scenarioTOA                         -0.28459    0.28365  -1.003
## scenarioUTI                         -0.68879    0.29325  -2.349
## scenarioVaginitis                    0.25877    0.27822   0.930
## insuranceMedicaid                   -0.13355    0.05497  -2.430
## scenarioTOA:insuranceMedicaid        0.44351    0.07416   5.980
## scenarioUTI:insuranceMedicaid        0.02710    0.07680   0.353
## scenarioVaginitis:insuranceMedicaid -0.06473    0.06901  -0.938
##                                                 Pr(>|z|)    
## (Intercept)                         < 0.0000000000000002 ***
## scenarioTOA                                       0.3157    
## scenarioUTI                                       0.0188 *  
## scenarioVaginitis                                 0.3523    
## insuranceMedicaid                                 0.0151 *  
## scenarioTOA:insuranceMedicaid              0.00000000223 ***
## scenarioUTI:insuranceMedicaid                     0.7242    
## scenarioVaginitis:insuranceMedicaid               0.3483    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) scnTOA scnUTI scnrVg insrnM sTOA:M sUTI:M
## scenarioTOA -0.701                                          
## scenarioUTI -0.677  0.482                                   
## scenarVgnts -0.717  0.508  0.491                            
## insurncMdcd -0.105  0.074  0.072  0.076                     
## scnrTOA:nsM  0.078 -0.105 -0.053 -0.056 -0.741              
## scnrUTI:nsM  0.076 -0.053 -0.101 -0.054 -0.716  0.530       
## scnrVgnts:M  0.084 -0.059 -0.057 -0.093 -0.797  0.590  0.570

Poisson Model The models need to be able to deal with NA in the business_days_until_appointment outcome variable (538) and also non-parametric data.

10. Poisson Regression Model Analysis of Wait Time

business_days_until_appointment can be transformed with a square root function so that 0 is not infinity from log(business_days_until_appointment).

10.1 Descriptive Analysis of Wait Times

10.1.1 Wait Time with Single Predictor

In interpreting this output:

Poisson Model Appropriateness:
- Since we are dealing with count data for the outcome (business_days_until_appointment), Poisson regression is indeed more suitable than a Kruskal-Wallis test. The Kruskal-Wallis test would only indicate if there is a statistically significant difference across groups in insurance but would not provide specific information on the effect size or direction of differences, which the Poisson model offers.
Interpretation of the Medicaid Coefficient:
- The coefficient for Medicaid, -0.008725, suggests a slight (but statistically insignificant) reduction in the log count of days until the appointment for Medicaid patients compared to the baseline insurance group. The p-value of 0.659 shows this effect is not statistically significant, meaning we don’t have enough evidence to conclude that Medicaid influences wait time compared to the baseline insurance category.
Null and Residual Deviance:
- The null and residual deviance are nearly identical, indicating that adding insurance as a predictor does not improve the model’s fit substantially. This suggests that insurance may not be a strong predictor of business_days_until_appointment.
Overdispersion:
- If you find that the variance is significantly greater than the mean (overdispersion), a negative binomial regression might be more appropriate, as it allows for extra variation in the data.
Conclusion:
- This Poisson regression model indicates that insurance type does not significantly influence the wait time for an appointment (business_days_until_appointment) based on the p-value and the similarity in deviance values.

In summary, while Poisson regression provides more detailed insights than a Kruskal-Wallis test, this model suggests that insurance type does not significantly affect the wait time for an appointment.

10.2 Is there a Difference in Wait Times by Insurance?

## 
## Call:
## glm(formula = business_days_until_appointment ~ as.factor(insurance), 
##     family = "poisson", data = df)
## 
## Coefficients:
##                              Estimate Std. Error z value            Pr(>|z|)
## (Intercept)                   2.89272    0.01294 223.546 <0.0000000000000002
## as.factor(insurance)Medicaid -0.01169    0.01991  -0.587               0.557
##                                 
## (Intercept)                  ***
## as.factor(insurance)Medicaid    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 16661  on 575  degrees of freedom
## Residual deviance: 16660  on 574  degrees of freedom
##   (538 observations deleted due to missingness)
## AIC: 18504
## 
## Number of Fisher Scoring iterations: 6

11. Model Interpretation and Variable Extraction

11.1 Correct Extraction of Variable Names and Summary

## Using Poisson regression, the baseline rate of business_days_until_appointment (intercept) is estimated to be 18.04 times the reference category, with a 95% confidence interval ranging from 17.59 to 18.5 . For Medicaid compared to the reference category (Blue Cross/Blue Shield), the incidence rate ratio (IRR) of business_days_until_appointment is estimated to be 0.99 , indicating that the waiting time for Medicaid patients is lower than for those with Blue Cross/Blue Shield. The 95% confidence interval for the IRR ranges from 0.95 to 1.03 , with a p-value of 0.55715 . Given that the confidence interval includes 1 and the p-value is greater than 0.05, the effect is not statistically significant, suggesting no conclusive evidence that the type of insurance impacts the wait time for an appointment.

12. Scenarios and Mathematical Representation

\[ \begin{{align*}} P(\text{{Business Days until New Patient Appointment}} = x) &= \frac{{e^{{-\lambda}} \cdot \lambda^x}}{{x!}} \\sqrt{{\lambda}} &= \beta_0 \& + \beta_1 \cdot \underline{{\mathbf{{\large{{\textPatient Scenario}}}}}} \& + ( 1 | \text{{Physician NPI}}) \end{{align*}} \] \[ \begin{align*} P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\ \sqrt{\lambda} &= \beta_0 \\ &+ \beta_1 \cdot \text{Female Population per Physician Zip Code} \\ &+ \beta_2 \cdot \text{Rurality per Physician Zip Code} \\ &+ \beta_3 \cdot \text{Medicaid Expansion Status by State} \\ &+ \beta_4 \cdot \text{Proportion of Females on Medicaid by State} \\ &+ \beta_5 \cdot \text{Proportion of Medicaid to Medicare Reimbursement by State} \\ &+ \beta_6 \cdot \text{Median Household Income per Physician Zip Code in USD} \\ &+ \beta_7 \cdot \text{ACOG District by State} \\ &+ \beta_8 \cdot \text{Physician Age} \\ &+ \beta_9 \cdot \text{Physician Gender} \\ &+ \beta_{10} \cdot \text{Physician Practice Setting} \\ &+ \beta_{11} \cdot \text{Physician Year of Graduation} \\ &+ \beta_{12} \cdot \text{Central Appointment Phone Number} \\ &+ \beta_{13} \cdot \text{Number of Phone Transfers} \\ &+ \beta_{14} \cdot \text{Call time (minutes)} \\ &+ \beta_{15} \cdot \text{Hold time (minutes)} \\ &+ \beta_{16} \cdot \text{Day of the Week Physician Called} \\ &+ \beta_{17} \cdot \text{Median State Income} \\ &+ \beta_{18} \cdot \text{Physician Medical Training} \\ &+ \beta_{19} \cdot \text{Insurance of the Simulated Patient} \\ &+ \beta_{20} \cdot \text{Scenario of the Simulated Patient} \\ &+ ( 1 | \text{Physician NPI}) \end{align*} \]

13. Scenario-Based Poisson Regression Analysis

13.1 Fitting the Poisson Model for Scenario Differences

## Logging inputs...
## Model Object:  glm lm 
## Specs:  ~scenario | scenario 
## Variable of Interest:  scenario 
## Color By:  scenario 
## Output Directory:  Melanie/Figures 
## Y-Axis Min:  12 
## Y-Axis Max:  24 
## Using existing output directory:  Melanie/Figures 
## Computing estimated marginal means...
## Logging estimated marginal means data...
## # A tibble: 4 × 6
##   scenario               rate    SE    df asymp.LCL asymp.UCL
##   <fct>                 <dbl> <dbl> <dbl>     <dbl>     <dbl>
## 1 TOA                    19.4 0.371   Inf      18.6      20.1
## 2 Pregnancy after tubal  17.0 0.341   Inf      16.4      17.7
## 3 UTI                    13.1 0.309   Inf      12.5      13.8
## 4 Vaginitis              21.9 0.383   Inf      21.2      22.7
## Range of estimated marginal means with CIs:  12.54665 22.70932 
## Creating the plot...
## Plot created successfully.

## Saving plot to:  Melanie/Figures/interaction_scenario_comparison_plot_20250324_082252.png

## Plot saved successfully to:  Melanie/Figures/interaction_scenario_comparison_plot_20250324_082252.png 
## Returning the estimated data and plot object.

14. Scenario Summary and Wait Time Analysis

14.1 Calculating Scenario Call Counts

14.2 Scenario Summary Statement

## There were 1114 calls made with senarios having to do with 278 positive pregnancy test after a tubal ligation, 282 prior trip to ED and was found to have a 6 cm TOA, 272 Acute cystitis, and 282 with Recurrent/Treatment resistant vaginitis.

15. Wait Time Statistics by Scenario

15.1 Calculating Median Wait Times for Scenarios

15.2 Displaying Wait Time Statistics in Table

Business Days Until Next Appointment Joint Scenario
scenario	Median_business_days_until_appointment	Q1	Q3
TOA	9	0	26
Pregnancy after tubal	9	1	22
UTI	1	0	20
Vaginitis	12	1	34

Number of offices with each of the four scenarios successfully contacted: `business_days_until_appointment ~ scenario`

16. Successful Contacts by Scenario

16.1 Filtering for Successful Contacts

16.2 Counting Successful Contacts by Scenario

16.3 Displaying Contact Counts in Table

Number of successful calls contacted for each scenario
scenario	count
Prior trip to ED and was found to have a 6 cm TOA	135
Positive pregnancy test after a tubal ligation	144
Acute cystitis	135
Recurrent/Treatment resistant vaginitis	144

17. Poisson Regression Model for Wait Times by Scenario

17.1 Fitting Poisson Regression Model for Scenarios

## 
## Call:
## glm(formula = business_days_until_appointment ~ as.factor(scenario), 
##     family = "poisson", data = df)
## 
## Coefficients:
##                                                                   Estimate
## (Intercept)                                                        2.96330
## as.factor(scenario)Positive pregnancy test after a tubal ligation -0.12729
## as.factor(scenario)Acute cystitis                                 -0.38781
## as.factor(scenario)Recurrent/Treatment resistant vaginitis         0.12532
##                                                                   Std. Error
## (Intercept)                                                          0.01914
## as.factor(scenario)Positive pregnancy test after a tubal ligation    0.02766
## as.factor(scenario)Acute cystitis                                    0.03030
## as.factor(scenario)Recurrent/Treatment resistant vaginitis           0.02589
##                                                                   z value
## (Intercept)                                                       154.831
## as.factor(scenario)Positive pregnancy test after a tubal ligation  -4.601
## as.factor(scenario)Acute cystitis                                 -12.801
## as.factor(scenario)Recurrent/Treatment resistant vaginitis          4.841
##                                                                               Pr(>|z|)
## (Intercept)                                                       < 0.0000000000000002
## as.factor(scenario)Positive pregnancy test after a tubal ligation           0.00000420
## as.factor(scenario)Acute cystitis                                 < 0.0000000000000002
## as.factor(scenario)Recurrent/Treatment resistant vaginitis                  0.00000129
##                                                                      
## (Intercept)                                                       ***
## as.factor(scenario)Positive pregnancy test after a tubal ligation ***
## as.factor(scenario)Acute cystitis                                 ***
## as.factor(scenario)Recurrent/Treatment resistant vaginitis        ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 16661  on 575  degrees of freedom
## Residual deviance: 16318  on 572  degrees of freedom
##   (538 observations deleted due to missingness)
## AIC: 18165
## 
## Number of Fisher Scoring iterations: 6

17.2 Extracting P-Values for Scenario Comparisons

17.3 Calculating Wait Time Statistics by Scenario

17.4 Summary Sentence for Scenario-Based Wait Times

## The median wait time across all scenarios was 8 business days, with an interquartile range (IQR) of 0 to 26 days. Specifically, the median wait time was 9 days (IQR: 0 to 26) for 'Prior trip to ED and was found to have a 6 cm TOA', 9 days (IQR: 1 to 22) for 'Positive pregnancy test after a tubal ligation', 1 days (IQR: 0 to 20) for 'Acute cystitis', and 12 days (IQR: 1 to 34) for 'Recurrent/Treatment resistant vaginitis'. The p-value for the difference between 'Positive pregnancy test after a tubal ligation' and 'Prior trip to ED and was found to have a 6 cm TOA' scenarios was <0.01, for 'Acute cystitis' and 'Prior trip to ED and was found to have a 6 cm TOA', it was <0.01, and for 'Recurrent/Treatment resistant vaginitis' and 'Prior trip to ED and was found to have a 6 cm TOA', it was <0.01.

18. Insurance-Based Analysis of Wait Times

19.1 Calculating Wait Times by Insurance

19.2 Displaying Wait Times by Insurance in Table Format

Business Days Until Next Appointment By Each Insurance
insurance	Median_business_days_until_appointment	Q1	Q3
Blue Cross/Blue Shield	9	0	26
Medicaid	7	0	26

19.3 Summary Sentence for Insurance Comparison

## Medicaid patients experienced a 1.16 % shorter wait for a new patient appointment compared to patients with BCBS (Incidence Rate Ratio: 0.988 ; CI: 1 - 1 ; p = 0.56 ) with median wait times of 7 business days (IQR: 0 - 26 ) and 9 business days (IQR: 0 - 26 ) respectively.

20. Full Poisson Model: `poisson_full_model`

20.1 Specifying Full Model for Predictors

20.2 Single Predictor Models for Evaluating Significance

This analysis explores the significance of various predictors on the outcome variable business_days_until_appointment, accounting for the random effects associated with physicians. The goal is to identify which variables significantly influence the time to appointment while controlling for variability across individual physicians.

The step-by-step approach demonstrates how individual predictors are assessed for their significance in influencing the response variable while accounting for the random effects associated with repeated measures on physicians. Significant variables will be used in the final multivariate model to better understand their impact on appointment wait times.

20.2.1 Preparing Data for Single Predictor Models

For poisson_full_model: This analysis explores the significance of various predictors on the outcome variable business_days_until_appointment, accounting for the random effects associated with physicians. The goal is to identify which variables significantly influence the time to appointment while controlling for variability across individual physicians.

20.2.2 Running Single Predictor Models

##                        Predictor      P_Value
## 1                   basic_gender 0.0003591399
## 2                         gender 0.0004194205
## 3                       academic 0.0022693606
## 4               taxonomies_state 0.0038728703
## 5              hold_time_minutes 0.0480345076
## 6                            age 0.0561885749
## 7 Medicaid_to_Medicare_Fee_Index 0.1412262799
##                                             IRR
## 1      0.00014016244875362677819834178460922658
## 2      0.00016572868718421024749644365758882714
## 3 609164.43471115571446716785430908203125000000
## 4      0.00000000000000000000000000000001470771
## 5      4.91048046559481488060328047140501439571
## 6      0.80176390159387911360511225211666896939
## 7      0.89433024131783978027954162826063111424
##                                                           CI_Lower
## 1   0.000001126248530925099505152898841930930728949533659033477306
## 2   0.000001378471133572550876933513613886717052992025855928659439
## 3 124.624823729347568246339506004005670547485351562500000000000000
## 4   0.000000000000000000000000000000000000000000000000000005891588
## 5   1.017986049224947953106834575009997934103012084960937500000000
## 6   0.639603275208110555460905288782669231295585632324218750000000
## 7   0.770967156339567694089964788872748613357543945312500000000000
##                       CI_Upper  Wait_Time_Effect
## 1          0.01744331868249030 shorter wait time
## 2          0.01992497128657226 shorter wait time
## 3 2977587429.31386232376098633  longer wait time
## 4          0.00000000003671622 shorter wait time
## 5         23.68678669157279870  longer wait time
## 6          1.00503762068741809 shorter wait time
## 7          1.03743275437708382 shorter wait time

##                        Predictor P_Value       IRR CI_Lower      CI_Upper
## 1                   basic_gender   <0.01      0.00     0.00          0.02
## 2                         gender   <0.01      0.00     0.00          0.02
## 3                       academic   <0.01 609164.43   124.62 2977587429.31
## 4               taxonomies_state   <0.01      0.00     0.00          0.00
## 5              hold_time_minutes   0.048      4.91     1.02         23.69
## 6                            age   0.056      0.80     0.64          1.01
## 7 Medicaid_to_Medicare_Fee_Index   0.141      0.89     0.77          1.04
##    Wait_Time_Effect
## 1 shorter wait time
## 2 shorter wait time
## 3  longer wait time
## 4 shorter wait time
## 5  longer wait time
## 6 shorter wait time
## 7 shorter wait time

Significant Variables Predicting Number of Business Days until Appointment
Predictor	P_Value	IRR	CI_Lower	CI_Upper	Wait_Time_Effect
basic_gender	<0.01	0.00	0.00	0.02	shorter wait time
gender	<0.01	0.00	0.00	0.02	shorter wait time
academic	<0.01	609164.43	124.62	2977587429.31	longer wait time
taxonomies_state	<0.01	0.00	0.00	0.00	shorter wait time
hold_time_minutes	0.048	4.91	1.02	23.69	longer wait time
age	0.056	0.80	0.64	1.01	shorter wait time
Medicaid_to_Medicare_Fee_Index	0.141	0.89	0.77	1.04	shorter wait time

Troubleshooting large IRR for `academic`

From the analysis and boxplot you provided, the issue with the high IRR seems clearer now. Let’s break down the results and address what might be going on:

Key Insights: 1. Sample Imbalance: - There is a major imbalance in the number of observations between Private Practice (556 cases) and University (47 cases). This discrepancy could lead to inflated coefficients, especially if the smaller group (University) has greater variability in wait times. This could explain why the estimate for academicUniversity is so large and significant.

Fixed Effects:
- The model indicates that being at a University is associated with a longer wait time, with an Estimate of 13.905 (p = 0.00124). This suggests that patients at University settings wait, on average, about 13.9 more days than those at private practices.
- However, due to the imbalance in the dataset and some high variance in wait times for university cases, this estimate might be exaggerated. The few outliers seen in the boxplot for University settings could be contributing to this as well.
Random Effects:
- The random effects (NPI) show variability among individual providers (standard deviation of 17.53). This means that individual providers still account for a fair amount of variation in wait times, which is typical in mixed-effects models.

Recommendations to Address the IRR Issue:

Consider Balancing the Dataset:
- The imbalance between University and Private Practice may lead to inflated estimates. You could try down-sampling the larger group (Private Practice) or performing bootstrapping to create a more balanced dataset. This might provide a more realistic estimate for the effect of academicUniversity.

20.3 Refitting Full Model Without academic Predictor

##                        Predictor      P_Value
## 1                   basic_gender 0.0003591399
## 2                         gender 0.0004194205
## 3               taxonomies_state 0.0038728703
## 4              hold_time_minutes 0.0480345076
## 5                            age 0.0561885749
## 6 Medicaid_to_Medicare_Fee_Index 0.1412262799
##                                        IRR
## 1 0.00014016244875362677819834178460922658
## 2 0.00016572868718421024749644365758882714
## 3 0.00000000000000000000000000000001470771
## 4 4.91048046559481488060328047140501439571
## 5 0.80176390159387911360511225211666896939
## 6 0.89433024131783978027954162826063111424
##                                                         CI_Lower
## 1 0.000001126248530925099505152898841930930728949533659033477306
## 2 0.000001378471133572550876933513613886717052992025855928659439
## 3 0.000000000000000000000000000000000000000000000000000005891588
## 4 1.017986049224947953106834575009997934103012084960937500000000
## 5 0.639603275208110555460905288782669231295585632324218750000000
## 6 0.770967156339567694089964788872748613357543945312500000000000
##               CI_Upper  Wait_Time_Effect
## 1  0.01744331868249030 shorter wait time
## 2  0.01992497128657226 shorter wait time
## 3  0.00000000003671622 shorter wait time
## 4 23.68678669157279870  longer wait time
## 5  1.00503762068741809 shorter wait time
## 6  1.03743275437708382 shorter wait time

##                        Predictor P_Value  IRR CI_Lower CI_Upper
## 1                   basic_gender   <0.01 0.00     0.00     0.02
## 2                         gender   <0.01 0.00     0.00     0.02
## 3               taxonomies_state   <0.01 0.00     0.00     0.00
## 4              hold_time_minutes   0.048 4.91     1.02    23.69
## 5                            age   0.056 0.80     0.64     1.01
## 6 Medicaid_to_Medicare_Fee_Index   0.141 0.89     0.77     1.04
##    Wait_Time_Effect
## 1 shorter wait time
## 2 shorter wait time
## 3 shorter wait time
## 4  longer wait time
## 5 shorter wait time
## 6 shorter wait time

Significant Variables Predicting Number of Business Days until Appointment WITHOUT ACADEMIC
Predictor	P_Value	IRR	CI_Lower	CI_Upper	Wait_Time_Effect
basic_gender	<0.01	0.00	0.00	0.02	shorter wait time
gender	<0.01	0.00	0.00	0.02	shorter wait time
taxonomies_state	<0.01	0.00	0.00	0.00	shorter wait time
hold_time_minutes	0.048	4.91	1.02	23.69	longer wait time
age	0.056	0.80	0.64	1.01	shorter wait time
Medicaid_to_Medicare_Fee_Index	0.141	0.89	0.77	1.04	shorter wait time

20.4 Robust LMM With Log-Transformed Business Days and Academic Predictor

20.4.1 Exploring Relationship Between Academic Status and Wait Times

## 
## Private Practice       University 
##              531               45

20.4.2 Fitting Simple and Robust Linear Mixed Model for Academic Status

## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: formula_simple
##    Data: df3_filtered
## 
##      AIC      BIC   logLik deviance df.resid 
##   5336.3   5353.8  -2664.2   5328.3      572 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3280 -0.4538 -0.2240  0.2387  6.2875 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  NPI      (Intercept) 300.9    17.35   
##  Residual             355.6    18.86   
## Number of obs: 576, groups:  NPI, 395
## 
## Fixed effects:
##                    Estimate Std. Error      df t value             Pr(>|t|)    
## (Intercept)          16.797      1.244 342.181  13.502 < 0.0000000000000002 ***
## academicUniversity   13.320      4.334 379.737   3.073              0.00227 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## acdmcUnvrst -0.287

## Robust linear mixed model fit by DAStau 
## Formula: formula_simple 
##    Data: df3_filtered 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.0411 -0.7570 -0.2939  0.7481 13.3737 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  NPI      (Intercept)   0.0     0.00   
##  Residual             298.4    17.27   
## Number of obs: 576, groups: NPI, 395
## 
## Fixed effects:
##                    Estimate Std. Error t value
## (Intercept)         13.0769     0.7688  17.009
## academicUniversity   4.9077     2.7507   1.784
## 
## Correlation of Fixed Effects:
##             (Intr)
## acdmcUnvrst -0.280
## 
## Robustness weights for the residuals: 
##  479 weights are ~= 1. The remaining 97 ones are summarized as
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.101   0.388   0.568   0.575   0.750   0.993 
## 
## Robustness weights for the random effects: 
##  All 395 weights are ~= 1.
## 
## Rho functions used for fitting:
##   Residuals:
##     eff: smoothed Huber (k = 1.345, s = 10) 
##     sig: smoothed Huber, Proposal 2 (k = 1.345, s = 10) 
##   Random Effects, variance component 1 (NPI):
##     eff: smoothed Huber (k = 1.345, s = 10) 
##     vcp: smoothed Huber, Proposal 2 (k = 1.345, s = 10)

20.5 Robust LMM With Log-Transformed Business Days

20.5.1 Defining Robust Linear Mixed-Effects Model for Log-Transformed Wait Times

20.5.2 Running Robust Linear Mixed-Effects Model for Each Predictor

Robust LMM with log_business_days_until_appointments

## The following predictors were found to be significant predicting business days until new patient appointment:
## -  basic_gender : p = <0.01 
## -  gender : p = <0.01 
## -  taxonomies_state : p = <0.01 
## -  hold_time_minutes : p = 0.05 
## -  age : p = 0.06 
## -  Medicaid_to_Medicare_Fee_Index : p = 0.14

Model `poisson_significant` Formula with only significant variables

where:

Fixed effects include…
Random effects account for variability between physicians, modeled as a random intercept.

The random effect for physician suggests that there is substantial variability in appointment wait times between physician. Physicians with a higher random intercept will tend to have longer wait times compared to Physicians with a lower random intercept.

`poisson` Model with only significant variables

## Generalized linear mixed model fit by maximum likelihood (Adaptive
##   Gauss-Hermite Quadrature, nAGQ = 0) [glmerMod]
##  Family: poisson  ( log )
## Formula: 
## business_days_until_appointment ~ basic_gender + gender + taxonomies_state +  
##     hold_time_minutes + age + Medicaid_to_Medicare_Fee_Index +      (1 | NPI)
##    Data: df3
## 
##      AIC      BIC   logLik deviance df.resid 
##   4790.6   5010.2  -2342.3   4684.6      413 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -6.6326 -0.7748 -0.0121  0.1375  7.2534 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  NPI    (Intercept) 2.803    1.674   
## Number of obs: 466, groups:  NPI, 341
## 
## Fixed effects:
##                                         Estimate  Std. Error z value  Pr(>|z|)
## (Intercept)                             6.810582    1.627389   4.185 0.0000285
## basic_genderM                          -0.305741    0.233757  -1.308   0.19089
## taxonomies_stateAlaska                -19.126721 2103.363588  -0.009   0.99274
## taxonomies_stateArizona                -3.197713    1.321695  -2.419   0.01555
## taxonomies_stateArkansas               -3.520711    1.679784  -2.096   0.03609
## taxonomies_stateCalifornia             -2.261976    1.248235  -1.812   0.06996
## taxonomies_stateColorado               -1.961216    1.391654  -1.409   0.15876
## taxonomies_stateConnecticut            -2.070117    1.309561  -1.581   0.11393
## taxonomies_stateDistrict of Columbia   -1.780902    1.572826  -1.132   0.25751
## taxonomies_stateFlorida                -2.555552    1.312856  -1.947   0.05159
## taxonomies_stateGeorgia                -1.772211    1.292764  -1.371   0.17042
## taxonomies_stateHawaii                 -3.569325    1.530009  -2.333   0.01965
## taxonomies_stateIdaho                   0.577574    2.082235   0.277   0.78149
## taxonomies_stateIllinois               -2.813088    1.338316  -2.102   0.03556
## taxonomies_stateIndiana                -2.886477    1.498090  -1.927   0.05401
## taxonomies_stateIowa                   -2.268100    1.491896  -1.520   0.12844
## taxonomies_stateKansas                 -2.573459    1.599028  -1.609   0.10753
## taxonomies_stateKentucky               -4.435779    1.521959  -2.915   0.00356
## taxonomies_stateLouisiana              -2.516323    1.445682  -1.741   0.08176
## taxonomies_stateMaine                 -20.818401 2103.363543  -0.010   0.99210
## taxonomies_stateMaryland               -2.056208    1.305419  -1.575   0.11523
## taxonomies_stateMassachusetts          -0.811775    1.296130  -0.626   0.53111
## taxonomies_stateMichigan               -2.320000    1.348869  -1.720   0.08544
## taxonomies_stateMinnesota              -2.030315    1.358857  -1.494   0.13514
## taxonomies_stateMississippi            -1.219650    1.345611  -0.906   0.36473
## taxonomies_stateMissouri               -3.537544    1.309981  -2.700   0.00692
## taxonomies_stateMontana               -19.322715 1487.303378  -0.013   0.98963
## taxonomies_stateNebraska               -2.136055    1.365589  -1.564   0.11777
## taxonomies_stateNevada                 -2.171398    1.362735  -1.593   0.11107
## taxonomies_stateNew Jersey             -3.330276    1.436668  -2.318   0.02045
## taxonomies_stateNew Mexico             -1.700597    1.696606  -1.002   0.31617
## taxonomies_stateNew York               -2.759731    1.402315  -1.968   0.04907
## taxonomies_stateNorth Carolina         -1.526363    1.297747  -1.176   0.23953
## taxonomies_stateOhio                   -2.431309    1.324655  -1.835   0.06644
## taxonomies_stateOklahoma               -0.991649    1.473322  -0.673   0.50090
## taxonomies_stateOregon                 -1.697098    1.418255  -1.197   0.23146
## taxonomies_statePennsylvania           -2.456329    1.310211  -1.875   0.06083
## taxonomies_statePuerto Rico            -1.564389    1.702583  -0.919   0.35818
## taxonomies_stateRhode Island           -5.097105    2.309134  -2.207   0.02729
## taxonomies_stateSouth Carolina         -0.943069    1.471878  -0.641   0.52170
## taxonomies_stateTennessee              -2.232718    1.329920  -1.679   0.09318
## taxonomies_stateTexas                  -2.937998    1.314828  -2.235   0.02545
## taxonomies_stateUtah                  -20.970264 1348.638640  -0.016   0.98759
## taxonomies_stateVermont                -4.015323    2.172622  -1.848   0.06458
## taxonomies_stateVirginia               -2.931074    1.378281  -2.127   0.03345
## taxonomies_stateWashington             -3.896584    1.401531  -2.780   0.00543
## taxonomies_stateWest Virginia          -4.550914    1.870632  -2.433   0.01498
## taxonomies_stateWisconsin              -3.409228    1.376809  -2.476   0.01328
## taxonomies_stateWyoming                -4.709060    2.285424  -2.060   0.03935
## hold_time_minutes                      -0.035416    0.015006  -2.360   0.01827
## age                                    -0.009658    0.010884  -0.887   0.37490
## Medicaid_to_Medicare_Fee_Index         -0.026701    0.012243  -2.181   0.02919
##                                         
## (Intercept)                          ***
## basic_genderM                           
## taxonomies_stateAlaska                  
## taxonomies_stateArizona              *  
## taxonomies_stateArkansas             *  
## taxonomies_stateCalifornia           .  
## taxonomies_stateColorado                
## taxonomies_stateConnecticut             
## taxonomies_stateDistrict of Columbia    
## taxonomies_stateFlorida              .  
## taxonomies_stateGeorgia                 
## taxonomies_stateHawaii               *  
## taxonomies_stateIdaho                   
## taxonomies_stateIllinois             *  
## taxonomies_stateIndiana              .  
## taxonomies_stateIowa                    
## taxonomies_stateKansas                  
## taxonomies_stateKentucky             ** 
## taxonomies_stateLouisiana            .  
## taxonomies_stateMaine                   
## taxonomies_stateMaryland                
## taxonomies_stateMassachusetts           
## taxonomies_stateMichigan             .  
## taxonomies_stateMinnesota               
## taxonomies_stateMississippi             
## taxonomies_stateMissouri             ** 
## taxonomies_stateMontana                 
## taxonomies_stateNebraska                
## taxonomies_stateNevada                  
## taxonomies_stateNew Jersey           *  
## taxonomies_stateNew Mexico              
## taxonomies_stateNew York             *  
## taxonomies_stateNorth Carolina          
## taxonomies_stateOhio                 .  
## taxonomies_stateOklahoma                
## taxonomies_stateOregon                  
## taxonomies_statePennsylvania         .  
## taxonomies_statePuerto Rico             
## taxonomies_stateRhode Island         *  
## taxonomies_stateSouth Carolina          
## taxonomies_stateTennessee            .  
## taxonomies_stateTexas                *  
## taxonomies_stateUtah                    
## taxonomies_stateVermont              .  
## taxonomies_stateVirginia             *  
## taxonomies_stateWashington           ** 
## taxonomies_stateWest Virginia        *  
## taxonomies_stateWisconsin            *  
## taxonomies_stateWyoming              *  
## hold_time_minutes                    *  
## age                                     
## Medicaid_to_Medicare_Fee_Index       *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## fit warnings:
## fixed-effect model matrix is rank deficient so dropping 1 column / coefficient

Table of `poisson_significant` Model Coefficients

Generic Interpretation of Significant Predictors: In a Poisson regression, significant predictors are those with p-values less than a chosen threshold (usually p < 0.05). These predictors have a statistically significant effect on the outcome variable—in this case, business days until an appointment. The Incidence Rate Ratios (IRRs) help interpret the direction and magnitude of these effects:

IRRs > 1: The predictor increases the expected mean number of business days. For example, an IRR of 2 means the expected waiting time is twice as long for that category compared to the reference group.
IRRs < 1: The predictor decreases the expected mean number of business days. For instance, an IRR of 0.5 means the waiting time is halved compared to the reference group.
p-values < 0.05: Indicate that the effect of the predictor is statistically significant, meaning it’s unlikely that the observed effect is due to random chance.

Analysis Based on Current Results

Examples of Significant Predictors:

Hold Time (IRR = 0.97, p = 0.033):
- Interpretation: For each additional minute spent on hold, the waiting time for an appointment decreases by 3% (IRR = 0.97). This effect is small but statistically significant (p = 0.033).
- Example: If a patient spends an extra 5 minutes on hold, the expected waiting time could decrease from 15 days to approximately 14.25 days.

Non-Significant Predictors: 1. Gender (Male) (IRR = 0.74, p = 0.227): - Interpretation: Being male is associated with a 26% reduction in waiting time compared to females (IRR = 0.74), but this effect is not statistically significant (p = 0.227).

Academic Setting (University) (IRR = 1.12, p = 0.770):
- Interpretation: Patients in university-affiliated settings are expected to wait 12% longer than those in non-university settings, but this effect is not statistically significant (p = 0.770).
Age (IRR = 0.99, p = 0.617):
- Interpretation: Increasing age slightly reduces waiting time, but the effect is minimal and not statistically significant.
Graduation Year Category (2010 or greater) (IRR = 0.59, p = 0.451):
- Interpretation: Physicians who graduated in 2010 or later were associated with a 41% reduction in waiting time, though this result is not statistically significant.
International Medical Graduates (IRR = 0.91, p = 0.761):
- Interpretation: Being an international medical graduate is associated with a 9% reduction in waiting time, though this effect is not statistically significant.

Random Effects and Marginal/Conditional R²:

Random Effects:
- Variance (NPI): 3.59 (indicating significant variability between NPIs).
- Intraclass Correlation Coefficient (ICC): 0.98 (suggesting that 98% of the variation in waiting times is explained by differences between NPIs).
Marginal R² (0.856): This means that 85.6% of the variance in waiting time is explained by the fixed effects in the model, such as hold time, gender, and practice setting.
Conditional R² (0.996): When accounting for both fixed effects and random effects (NPI variability), the model explains 99.6% of the total variance in waiting times.

Summary: The random effects model demonstrates that while some predictors, such as hold time, have a small but significant effect on waiting time, other factors like gender, academic setting, and graduation year show non-significant effects in this model. The high ICC (0.98) indicates that the majority of the variability in waiting times is due to differences between providers (NPIs), and the conditional R² (0.996) suggests that the model is highly effective at explaining overall variance when including both fixed and random effects.

	business days until appointment
Predictors	Incidence Rate Ratios	CI	p
(Intercept)	907.40	37.37 – 22030.98	<0.001
basic gender [M]	0.74	0.47 – 1.16	0.191
taxonomies state [Alaska]	0.00	0.00 – Inf	0.993
taxonomies state [Arizona]	0.04	0.00 – 0.54	0.016
taxonomies state [Arkansas]	0.03	0.00 – 0.80	0.036
taxonomies state [California]	0.10	0.01 – 1.20	0.070
taxonomies state [Colorado]	0.14	0.01 – 2.15	0.159
taxonomies state [Connecticut]	0.13	0.01 – 1.64	0.114
taxonomies state [District of Columbia]	0.17	0.01 – 3.68	0.258
taxonomies state [Florida]	0.08	0.01 – 1.02	0.052
taxonomies state [Georgia]	0.17	0.01 – 2.14	0.170
taxonomies state [Hawaii]	0.03	0.00 – 0.57	0.020
taxonomies state [Idaho]	1.78	0.03 – 105.50	0.781
taxonomies state [Illinois]	0.06	0.00 – 0.83	0.036
taxonomies state [Indiana]	0.06	0.00 – 1.05	0.054
taxonomies state [Iowa]	0.10	0.01 – 1.93	0.128
taxonomies state [Kansas]	0.08	0.00 – 1.75	0.108
taxonomies state [Kentucky]	0.01	0.00 – 0.23	0.004
taxonomies state [Louisiana]	0.08	0.00 – 1.37	0.082
taxonomies state [Maine]	0.00	0.00 – Inf	0.992
taxonomies state [Maryland]	0.13	0.01 – 1.65	0.115
taxonomies state [Massachusetts]	0.44	0.04 – 5.63	0.531
taxonomies state [Michigan]	0.10	0.01 – 1.38	0.085
taxonomies state [Minnesota]	0.13	0.01 – 1.88	0.135
taxonomies state [Mississippi]	0.30	0.02 – 4.13	0.365
taxonomies state [Missouri]	0.03	0.00 – 0.38	0.007
taxonomies state [Montana]	0.00	0.00 – Inf	0.990
taxonomies state [Nebraska]	0.12	0.01 – 1.72	0.118
taxonomies state [Nevada]	0.11	0.01 – 1.65	0.111
taxonomies state [New Jersey]	0.04	0.00 – 0.60	0.020
taxonomies state [New Mexico]	0.18	0.01 – 5.08	0.316
taxonomies state [New York]	0.06	0.00 – 0.99	0.049
taxonomies state [North Carolina]	0.22	0.02 – 2.77	0.240
taxonomies state [Ohio]	0.09	0.01 – 1.18	0.066
taxonomies state [Oklahoma]	0.37	0.02 – 6.66	0.501
taxonomies state [Oregon]	0.18	0.01 – 2.95	0.231
taxonomies state [Pennsylvania]	0.09	0.01 – 1.12	0.061
taxonomies state [Puerto Rico]	0.21	0.01 – 5.89	0.358
taxonomies state [Rhode Island]	0.01	0.00 – 0.56	0.027
taxonomies state [South Carolina]	0.39	0.02 – 6.97	0.522
taxonomies state [Tennessee]	0.11	0.01 – 1.45	0.093
taxonomies state [Texas]	0.05	0.00 – 0.70	0.025
taxonomies state [Utah]	0.00	0.00 – Inf	0.988
taxonomies state [Vermont]	0.02	0.00 – 1.27	0.065
taxonomies state [Virginia]	0.05	0.00 – 0.79	0.033
taxonomies state [Washington]	0.02	0.00 – 0.32	0.005
taxonomies state [West Virginia]	0.01	0.00 – 0.41	0.015
taxonomies state [Wisconsin]	0.03	0.00 – 0.49	0.013
taxonomies state [Wyoming]	0.01	0.00 – 0.79	0.039
hold time minutes	0.97	0.94 – 0.99	0.018
age	0.99	0.97 – 1.01	0.375
Medicaid to Medicare Fee Index	0.97	0.95 – 1.00	0.029
Random Effects
σ²	0.02
τ₀₀ _NPI	2.80
ICC	0.99
N _NPI	341
Observations	466
Marginal R² / Conditional R²	0.630 / 0.997

Visualize the `poisson_significant` modelFixed Effects

`poisson_significant` Model Performance

## We fitted a poisson mixed model (estimated using ML and BOBYQA optimizer) to
## predict business_days_until_appointment with basic_gender, gender,
## taxonomies_state, hold_time_minutes, age and Medicaid_to_Medicare_Fee_Index
## (formula: business_days_until_appointment ~ basic_gender + gender +
## taxonomies_state + hold_time_minutes + age + Medicaid_to_Medicare_Fee_Index).
## The model included NPI as random effect (formula: ~1 | NPI). The model's total
## explanatory power is substantial (conditional R2 = 1.00) and the part related
## to the fixed effects alone (marginal R2) is of 0.63. The model's intercept,
## corresponding to basic_gender = F, gender = Female, taxonomies_state = Alabama,
## hold_time_minutes = 0, age = 0 and Medicaid_to_Medicare_Fee_Index = 0, is at
## 6.81 (95% CI [3.62, 10.00], p < .001). Within this model:
## 
##   - The effect of basic gender [M] is statistically non-significant and negative
## (beta = -0.31, 95% CI [-0.76, 0.15], p = 0.191; Std. beta = -0.31, 95% CI
## [-0.76, 0.15])
##   - The effect of taxonomies state [Alaska] is statistically non-significant and
## negative (beta = -19.13, 95% CI [-4141.64, 4103.39], p = 0.993; Std. beta =
## -19.13, 95% CI [-4141.64, 4103.39])
##   - The effect of taxonomies state [Arizona] is statistically significant and
## negative (beta = -3.20, 95% CI [-5.79, -0.61], p = 0.016; Std. beta = -3.20,
## 95% CI [-5.79, -0.61])
##   - The effect of taxonomies state [Arkansas] is statistically significant and
## negative (beta = -3.52, 95% CI [-6.81, -0.23], p = 0.036; Std. beta = -3.52,
## 95% CI [-6.81, -0.23])
##   - The effect of taxonomies state [California] is statistically non-significant
## and negative (beta = -2.26, 95% CI [-4.71, 0.18], p = 0.070; Std. beta = -2.26,
## 95% CI [-4.71, 0.18])
##   - The effect of taxonomies state [Colorado] is statistically non-significant
## and negative (beta = -1.96, 95% CI [-4.69, 0.77], p = 0.159; Std. beta = -1.96,
## 95% CI [-4.69, 0.77])
##   - The effect of taxonomies state [Connecticut] is statistically non-significant
## and negative (beta = -2.07, 95% CI [-4.64, 0.50], p = 0.114; Std. beta = -2.07,
## 95% CI [-4.64, 0.50])
##   - The effect of taxonomies state [District of Columbia] is statistically
## non-significant and negative (beta = -1.78, 95% CI [-4.86, 1.30], p = 0.258;
## Std. beta = -1.78, 95% CI [-4.86, 1.30])
##   - The effect of taxonomies state [Florida] is statistically non-significant and
## negative (beta = -2.56, 95% CI [-5.13, 0.02], p = 0.052; Std. beta = -2.56, 95%
## CI [-5.13, 0.02])
##   - The effect of taxonomies state [Georgia] is statistically non-significant and
## negative (beta = -1.77, 95% CI [-4.31, 0.76], p = 0.170; Std. beta = -1.77, 95%
## CI [-4.31, 0.76])
##   - The effect of taxonomies state [Hawaii] is statistically significant and
## negative (beta = -3.57, 95% CI [-6.57, -0.57], p = 0.020; Std. beta = -3.57,
## 95% CI [-6.57, -0.57])
##   - The effect of taxonomies state [Idaho] is statistically non-significant and
## positive (beta = 0.58, 95% CI [-3.50, 4.66], p = 0.781; Std. beta = 0.58, 95%
## CI [-3.50, 4.66])
##   - The effect of taxonomies state [Illinois] is statistically significant and
## negative (beta = -2.81, 95% CI [-5.44, -0.19], p = 0.036; Std. beta = -2.81,
## 95% CI [-5.44, -0.19])
##   - The effect of taxonomies state [Indiana] is statistically non-significant and
## negative (beta = -2.89, 95% CI [-5.82, 0.05], p = 0.054; Std. beta = -2.89, 95%
## CI [-5.82, 0.05])
##   - The effect of taxonomies state [Iowa] is statistically non-significant and
## negative (beta = -2.27, 95% CI [-5.19, 0.66], p = 0.128; Std. beta = -2.27, 95%
## CI [-5.19, 0.66])
##   - The effect of taxonomies state [Kansas] is statistically non-significant and
## negative (beta = -2.57, 95% CI [-5.71, 0.56], p = 0.108; Std. beta = -2.57, 95%
## CI [-5.71, 0.56])
##   - The effect of taxonomies state [Kentucky] is statistically significant and
## negative (beta = -4.44, 95% CI [-7.42, -1.45], p = 0.004; Std. beta = -4.44,
## 95% CI [-7.42, -1.45])
##   - The effect of taxonomies state [Louisiana] is statistically non-significant
## and negative (beta = -2.52, 95% CI [-5.35, 0.32], p = 0.082; Std. beta = -2.52,
## 95% CI [-5.35, 0.32])
##   - The effect of taxonomies state [Maine] is statistically non-significant and
## negative (beta = -20.82, 95% CI [-4143.34, 4101.70], p = 0.992; Std. beta =
## -20.82, 95% CI [-4143.34, 4101.70])
##   - The effect of taxonomies state [Maryland] is statistically non-significant
## and negative (beta = -2.06, 95% CI [-4.61, 0.50], p = 0.115; Std. beta = -2.06,
## 95% CI [-4.61, 0.50])
##   - The effect of taxonomies state [Massachusetts] is statistically
## non-significant and negative (beta = -0.81, 95% CI [-3.35, 1.73], p = 0.531;
## Std. beta = -0.81, 95% CI [-3.35, 1.73])
##   - The effect of taxonomies state [Michigan] is statistically non-significant
## and negative (beta = -2.32, 95% CI [-4.96, 0.32], p = 0.085; Std. beta = -2.32,
## 95% CI [-4.96, 0.32])
##   - The effect of taxonomies state [Minnesota] is statistically non-significant
## and negative (beta = -2.03, 95% CI [-4.69, 0.63], p = 0.135; Std. beta = -2.03,
## 95% CI [-4.69, 0.63])
##   - The effect of taxonomies state [Mississippi] is statistically non-significant
## and negative (beta = -1.22, 95% CI [-3.86, 1.42], p = 0.365; Std. beta = -1.22,
## 95% CI [-3.86, 1.42])
##   - The effect of taxonomies state [Missouri] is statistically significant and
## negative (beta = -3.54, 95% CI [-6.11, -0.97], p = 0.007; Std. beta = -3.54,
## 95% CI [-6.11, -0.97])
##   - The effect of taxonomies state [Montana] is statistically non-significant and
## negative (beta = -19.32, 95% CI [-2934.38, 2895.74], p = 0.990; Std. beta =
## -19.32, 95% CI [-2934.38, 2895.74])
##   - The effect of taxonomies state [Nebraska] is statistically non-significant
## and negative (beta = -2.14, 95% CI [-4.81, 0.54], p = 0.118; Std. beta = -2.14,
## 95% CI [-4.81, 0.54])
##   - The effect of taxonomies state [Nevada] is statistically non-significant and
## negative (beta = -2.17, 95% CI [-4.84, 0.50], p = 0.111; Std. beta = -2.17, 95%
## CI [-4.84, 0.50])
##   - The effect of taxonomies state [New Jersey] is statistically significant and
## negative (beta = -3.33, 95% CI [-6.15, -0.51], p = 0.020; Std. beta = -3.33,
## 95% CI [-6.15, -0.51])
##   - The effect of taxonomies state [New Mexico] is statistically non-significant
## and negative (beta = -1.70, 95% CI [-5.03, 1.62], p = 0.316; Std. beta = -1.70,
## 95% CI [-5.03, 1.62])
##   - The effect of taxonomies state [New York] is statistically significant and
## negative (beta = -2.76, 95% CI [-5.51, -0.01], p = 0.049; Std. beta = -2.76,
## 95% CI [-5.51, -0.01])
##   - The effect of taxonomies state [North Carolina] is statistically
## non-significant and negative (beta = -1.53, 95% CI [-4.07, 1.02], p = 0.240;
## Std. beta = -1.53, 95% CI [-4.07, 1.02])
##   - The effect of taxonomies state [Ohio] is statistically non-significant and
## negative (beta = -2.43, 95% CI [-5.03, 0.16], p = 0.066; Std. beta = -2.43, 95%
## CI [-5.03, 0.16])
##   - The effect of taxonomies state [Oklahoma] is statistically non-significant
## and negative (beta = -0.99, 95% CI [-3.88, 1.90], p = 0.501; Std. beta = -0.99,
## 95% CI [-3.88, 1.90])
##   - The effect of taxonomies state [Oregon] is statistically non-significant and
## negative (beta = -1.70, 95% CI [-4.48, 1.08], p = 0.231; Std. beta = -1.70, 95%
## CI [-4.48, 1.08])
##   - The effect of taxonomies state [Pennsylvania] is statistically
## non-significant and negative (beta = -2.46, 95% CI [-5.02, 0.11], p = 0.061;
## Std. beta = -2.46, 95% CI [-5.02, 0.11])
##   - The effect of taxonomies state [Puerto Rico] is statistically non-significant
## and negative (beta = -1.56, 95% CI [-4.90, 1.77], p = 0.358; Std. beta = -1.56,
## 95% CI [-4.90, 1.77])
##   - The effect of taxonomies state [Rhode Island] is statistically significant
## and negative (beta = -5.10, 95% CI [-9.62, -0.57], p = 0.027; Std. beta =
## -5.10, 95% CI [-9.62, -0.57])
##   - The effect of taxonomies state [South Carolina] is statistically
## non-significant and negative (beta = -0.94, 95% CI [-3.83, 1.94], p = 0.522;
## Std. beta = -0.94, 95% CI [-3.83, 1.94])
##   - The effect of taxonomies state [Tennessee] is statistically non-significant
## and negative (beta = -2.23, 95% CI [-4.84, 0.37], p = 0.093; Std. beta = -2.23,
## 95% CI [-4.84, 0.37])
##   - The effect of taxonomies state [Texas] is statistically significant and
## negative (beta = -2.94, 95% CI [-5.52, -0.36], p = 0.025; Std. beta = -2.94,
## 95% CI [-5.52, -0.36])
##   - The effect of taxonomies state [Utah] is statistically non-significant and
## negative (beta = -20.97, 95% CI [-2664.25, 2622.31], p = 0.988; Std. beta =
## -20.97, 95% CI [-2664.25, 2622.31])
##   - The effect of taxonomies state [Vermont] is statistically non-significant and
## negative (beta = -4.02, 95% CI [-8.27, 0.24], p = 0.065; Std. beta = -4.02, 95%
## CI [-8.27, 0.24])
##   - The effect of taxonomies state [Virginia] is statistically significant and
## negative (beta = -2.93, 95% CI [-5.63, -0.23], p = 0.033; Std. beta = -2.93,
## 95% CI [-5.63, -0.23])
##   - The effect of taxonomies state [Washington] is statistically significant and
## negative (beta = -3.90, 95% CI [-6.64, -1.15], p = 0.005; Std. beta = -3.90,
## 95% CI [-6.64, -1.15])
##   - The effect of taxonomies state [West Virginia] is statistically significant
## and negative (beta = -4.55, 95% CI [-8.22, -0.88], p = 0.015; Std. beta =
## -4.55, 95% CI [-8.22, -0.88])
##   - The effect of taxonomies state [Wisconsin] is statistically significant and
## negative (beta = -3.41, 95% CI [-6.11, -0.71], p = 0.013; Std. beta = -3.41,
## 95% CI [-6.11, -0.71])
##   - The effect of taxonomies state [Wyoming] is statistically significant and
## negative (beta = -4.71, 95% CI [-9.19, -0.23], p = 0.039; Std. beta = -4.71,
## 95% CI [-9.19, -0.23])
##   - The effect of hold time minutes is statistically significant and negative
## (beta = -0.04, 95% CI [-0.06, -6.00e-03], p = 0.018; Std. beta = -0.05, 95% CI
## [-0.09, -8.16e-03])
##   - The effect of age is statistically non-significant and negative (beta =
## -9.66e-03, 95% CI [-0.03, 0.01], p = 0.375; Std. beta = -0.10, 95% CI [-0.33,
## 0.13])
##   - The effect of Medicaid to Medicare Fee Index is statistically significant and
## negative (beta = -0.03, 95% CI [-0.05, -2.70e-03], p = 0.029; Std. beta =
## -0.44, 95% CI [-0.84, -0.04])
## 
## Standardized parameters were obtained by fitting the model on a standardized
## version of the dataset. 95% Confidence Intervals (CIs) and p-values were
## computed using a Wald z-distribution approximation.

## The marginal R² value of the model is 0.63 and the conditional R² value is 0.997

## The marginal R² represents the proportion of variance explained by the fixed effects ( (Intercept), basic_genderM, taxonomies_stateAlaska, taxonomies_stateArizona, taxonomies_stateArkansas, taxonomies_stateCalifornia, taxonomies_stateColorado, taxonomies_stateConnecticut, taxonomies_stateDistrict of Columbia, taxonomies_stateFlorida, taxonomies_stateGeorgia, taxonomies_stateHawaii, taxonomies_stateIdaho, taxonomies_stateIllinois, taxonomies_stateIndiana, taxonomies_stateIowa, taxonomies_stateKansas, taxonomies_stateKentucky, taxonomies_stateLouisiana, taxonomies_stateMaine, taxonomies_stateMaryland, taxonomies_stateMassachusetts, taxonomies_stateMichigan, taxonomies_stateMinnesota, taxonomies_stateMississippi, taxonomies_stateMissouri, taxonomies_stateMontana, taxonomies_stateNebraska, taxonomies_stateNevada, taxonomies_stateNew Jersey, taxonomies_stateNew Mexico, taxonomies_stateNew York, taxonomies_stateNorth Carolina, taxonomies_stateOhio, taxonomies_stateOklahoma, taxonomies_stateOregon, taxonomies_statePennsylvania, taxonomies_statePuerto Rico, taxonomies_stateRhode Island, taxonomies_stateSouth Carolina, taxonomies_stateTennessee, taxonomies_stateTexas, taxonomies_stateUtah, taxonomies_stateVermont, taxonomies_stateVirginia, taxonomies_stateWashington, taxonomies_stateWest Virginia, taxonomies_stateWisconsin, taxonomies_stateWyoming, hold_time_minutes, age, Medicaid_to_Medicare_Fee_Index ) alone ( 63.01 %). The conditional R² represents the proportion of variance explained by both the fixed effects and the random effects ( NPI ) combined ( 99.68 %). This indicates how much of the variability in the outcome can be attributed to the fixed effects versus the entire model, including random effects.

For poisson_significant model: To determine which random effects were significant in your model, you need to look at the variance components for the random effects and their corresponding standard deviations. In mixed models, random effects themselves do not have p-values like fixed effects do. Instead, you evaluate their significance by looking at the variance of the random effects. If the variance is near zero, the random effect may not be contributing much to the model.

Here’s how you can extract and interpret the variance of the random effects to assess their significance for poisson_significant:

## [1] "The random effects in the model are:\n NPI"             
## [2] "The random effects in the model are:\n (Intercept)"     
## [3] "The random effects in the model are:\n NA"              
## [4] "The random effects in the model are:\n 2.80299979854244"
## [5] "The random effects in the model are:\n 1.67421617437607"
## [6] "The random effects in the model are:\n Yes"

## The significant random effects are: NPI

`simr_poisson_full_model` Model Power analysis

The power analysis you’ve conducted with the powerSim function is used to estimate the statistical power of your model for detecting effects of a specific predictor—in this case, the predictor insurance in a Poisson mixed-effects model.

Checking the binned residuals because the data is non-parametric the residuals will not be normally distributed. Collinearity was tested as well as heteroscedasticity was checked.

The residuals appear to be spread out more as the fitted values increase. This funnel shape (with wider dispersion of residuals at higher fitted values) is an indication of heteroscedasticity. In a model with homoscedasticity, the residuals would have a consistent spread across all levels of fitted values, without a clear pattern. The data is non-parametric so the residuals will not be within error bounds.

`poisson_significant` Collinearity

Variance Inflation Factors (VIF) were calculated to assess multicollinearity among predictors. All VIF values were below the commonly used threshold of 5, suggesting that multicollinearity is not a concern for this model.

Variable Importance Factors
	GVIF	Df	GVIF^(1/(2*Df))
basic_gender	1.374102	1	1.172221
gender	570678934847172982603776.000000	0	Inf
taxonomies_state	5.657115	47	1.018606
hold_time_minutes	1.005786	1	1.002889
age	1.413102	1	1.188740
Medicaid_to_Medicare_Fee_Index	4.042736	1	2.010655

## 39 outliers detected: cases 23, 63, 66, 72, 74, 75, 104, 105, 110, 125,
##   148, 156, 157, 194, 203, 211, 212, 217, 223, 230, 242, 263, 268, 271,
##   305, 319, 322, 368, 370, 372, 373, 381, 388, 391, 392, 414, 438, 454,
##   457.
## - Based on the following method and threshold: cook (0.9).
## - For variable: (Whole model).

`poisson` Intraclass Correlation Coefficient

The Intraclass Correlation Coefficient (ICC) is a statistical measure used to evaluate the proportion of variance in a dependent variable that can be attributed to differences between groups or clusters. It is commonly used in the context of hierarchical or mixed models to quantify the degree of similarity within clusters.

20.7 Model Summary and Interpretation

20.7.1 Intraclass Correlation Coefficient (ICC) Calculation and Interpretation

## The intraclass correlation (ICC) of the model for the random effect group ' NPI ' is 0.737 .
## This indicates that 73.7 % of the variance in the outcome variable is attributable to differences between the NPI groups.

## 
##  This is a moderate ICC for the NPI group, suggesting that a considerable portion of the variance is due to differences between these groups.

A low to moderate Intraclass Correlation Coefficient (ICC) for the group “physician NPI name” suggests that while there is some variation in the outcome variable (e.g., business days until appointment) that can be attributed to differences between individual physicians, a substantial portion of the variation occurs within these groups—meaning that much of the variability in appointment times is due to factors other than just the differences between physicians.

In practical terms, this indicates that:

Variation Between Physicians: The fact that the ICC is not zero means that there is some consistency in the appointment times associated with each physician. Some physicians might systematically have longer or shorter wait times, contributing to the variance in the data.
Variation Within Physicians: Since the ICC is low to moderate, it means that even within the same physician, there is considerable variability in appointment times. This could be due to a variety of factors, such as the type of insurance, the scenario, or other factors that are not captured by the physician’s identity alone.
Implications: The low to moderate ICC suggests that while the identity of the physician (as indicated by the NPI name) does have an effect, it is not the dominant factor driving differences in appointment times. Other factors—potentially those captured by fixed effects or residual variance—are also playing a significant role.

In summary, while who the physician is does matter to some extent, other variables are likely more influential in determining how long a patient waits for an appointment. This insight can guide you to look more closely at those other factors in your analysis or to consider whether there are ways to reduce variability within physicians, such as through standardized scheduling practices.

`poisson_significant` Dispersion

Overdispersion in your model implies that the variability in the observed data is greater than what the model predicts under the Poisson assumption. Specifically, in a Poisson model, the mean and variance of the count data are assumed to be equal.

POWER ANALYSIS

Read in Data

data_dir <- here::here("Melanie", "data", "Phase_3")
# Construct the full file path
file_path <- file.path(data_dir, "Phase_2.rds")
df <- readRDS(file_path) %>%
  dplyr::rename(id_number = ID)

a priori power analysis

#' A Priori Sample Size Analysis for Multiple Insurance Levels
#'
#' @description
#' Calculates the required sample size for studies involving multiple insurance
#' levels based on effect size, power, significance level, and number of groups.
#'
#' @param effect_size Numeric. Cohen's f effect size for ANOVA.
#' @param alpha Numeric. Significance level (Type I error probability), default 0.05.
#' @param power Numeric. Desired statistical power (1 - Type II error probability),
#'   default 0.80.
#' @param num_groups Integer. Number of insurance levels/groups to compare, default 3.
#' @param analysis_type Character. Type of analysis: "anova" for one-way ANOVA,
#'   "regression" for linear regression, default "anova".
#' @param verbose Logical. Whether to display detailed logging messages, default TRUE.
#'
#' @return A list containing sample size results and analysis parameters.
#'
#' @importFrom pwr pwr.anova.test pwr.f2.test
#' @importFrom assertthat assert_that
#' @importFrom logger log_info log_debug log_error
#'
#' @examples
#' # Example 1: Basic ANOVA sample size calculation with 3 insurance levels
#' insurance_sample_size(effect_size = 0.25, alpha = 0.05, power = 0.80,
#'                       num_groups = 3, analysis_type = "anova", 
#'                       verbose = TRUE)
#'
#' # Example 2: Higher power requirement with 4 insurance levels
#' insurance_sample_size(effect_size = 0.20, alpha = 0.01, power = 0.90,
#'                       num_groups = 4, analysis_type = "anova", 
#'                       verbose = TRUE)
#'
#' # Example 3: Using regression analysis instead of ANOVA
#' insurance_sample_size(effect_size = 0.15, alpha = 0.05, power = 0.85,
#'                       num_groups = 5, analysis_type = "regression", 
#'                       verbose = FALSE)
#'
#' @export
insurance_sample_size <- function(effect_size, alpha = 0.05, power = 0.80,
                                 num_groups = 3, analysis_type = "anova",
                                 verbose = TRUE) {
  # Set up logger
  if (verbose) {
    logger::log_threshold(logger::DEBUG)
  } else {
    logger::log_threshold(logger::INFO)
  }
  
  logger::log_info("Starting a priori sample size analysis for insurance levels")
  logger::log_debug(paste("Input parameters: effect_size =", effect_size,
                         ", alpha =", alpha, ", power =", power,
                         ", num_groups =", num_groups,
                         ", analysis_type =", analysis_type))
  
  # Input validation
  validate_inputs(effect_size, alpha, power, num_groups, analysis_type)
  
  # Calculate sample size based on analysis type
  sample_size_results <- calculate_sample_size(
    effect_size = effect_size,
    alpha = alpha,
    power = power,
    num_groups = num_groups,
    analysis_type = analysis_type
  )
  
  # Create results structure
  analysis_results <- structure_results(
    sample_size_results = sample_size_results,
    effect_size = effect_size,
    alpha = alpha,
    power = power,
    num_groups = num_groups,
    analysis_type = analysis_type
  )
  
  # Log the outcomes
  log_results(analysis_results)
  
  return(analysis_results)
}

#' @noRd
validate_inputs <- function(effect_size, alpha, power, num_groups, analysis_type) {
  # Validate numeric inputs
  assertthat::assert_that(is.numeric(effect_size), 
                         msg = "Effect size must be numeric")
  assertthat::assert_that(effect_size > 0, 
                         msg = "Effect size must be positive")
  
  assertthat::assert_that(is.numeric(alpha), 
                         msg = "Alpha must be numeric")
  assertthat::assert_that(alpha > 0 && alpha < 1, 
                         msg = "Alpha must be between 0 and 1")
  
  assertthat::assert_that(is.numeric(power), 
                         msg = "Power must be numeric")
  assertthat::assert_that(power > 0 && power < 1, 
                         msg = "Power must be between 0 and 1")
  
  assertthat::assert_that(is.numeric(num_groups), 
                         msg = "Number of groups must be numeric")
  assertthat::assert_that(num_groups >= 2, 
                         msg = "Number of groups must be at least 2")
  assertthat::assert_that(num_groups == round(num_groups), 
                         msg = "Number of groups must be an integer")
  
  # Validate analysis type
  assertthat::assert_that(is.character(analysis_type), 
                         msg = "Analysis type must be a character string")
  assertthat::assert_that(analysis_type %in% c("anova", "regression"), 
                         msg = "Analysis type must be 'anova' or 'regression'")
  
  logger::log_debug("Input validation passed")
}

#' @noRd
calculate_sample_size <- function(effect_size, alpha, power, num_groups, 
                                 analysis_type) {
  logger::log_debug("Calculating sample size")
  
  if (analysis_type == "anova") {
    logger::log_debug("Performing ANOVA sample size calculation")
    # For ANOVA, we use Cohen's f
    anova_result <- pwr::pwr.anova.test(
      k = num_groups,
      f = effect_size,
      sig.level = alpha,
      power = power
    )
    
    sample_per_group <- ceiling(anova_result$n)
    total_sample <- sample_per_group * num_groups
    
    logger::log_debug(paste("ANOVA calculation: n per group =", sample_per_group,
                           ", total sample size =", total_sample))
    
    return(list(
      sample_per_group = sample_per_group,
      total_sample = total_sample,
      calculation_type = "ANOVA"
    ))
    
  } else if (analysis_type == "regression") {
    logger::log_debug("Performing regression sample size calculation")
    # For regression, we convert Cohen's f to f² 
    # (f² is used in regression power calculations)
    f_squared <- effect_size^2
    
    regression_result <- pwr::pwr.f2.test(
      u = num_groups - 1,  # Numerator df
      v = NULL,            # Denominator df (to be calculated)
      f2 = f_squared,
      sig.level = alpha,
      power = power
    )
    
    # Total sample size needed
    total_sample <- ceiling(regression_result$v) + num_groups
    
    logger::log_debug(paste("Regression calculation: total sample size =", 
                           total_sample))
    
    return(list(
      total_sample = total_sample,
      calculation_type = "Regression"
    ))
  }
}

#' @noRd
structure_results <- function(sample_size_results, effect_size, alpha, power, 
                             num_groups, analysis_type) {
  logger::log_debug("Structuring results")
  
  # Common elements for both types
  analysis_results <- list(
    effect_size = effect_size,
    alpha = alpha,
    power = power,
    num_groups = num_groups,
    analysis_type = analysis_type
  )
  
  # Add type-specific elements
  if (analysis_type == "anova") {
    analysis_results$sample_per_group <- sample_size_results$sample_per_group
    analysis_results$total_sample <- sample_size_results$total_sample
    analysis_results$interpretation <- paste(
      "For an ANOVA with", num_groups, "insurance levels, effect size of", 
      effect_size, "(Cohen's f), alpha of", alpha, "and power of", power, 
      "you need", sample_size_results$sample_per_group, 
      "participants per group (total:", sample_size_results$total_sample, ")."
    )
  } else {
    analysis_results$total_sample <- sample_size_results$total_sample
    analysis_results$interpretation <- paste(
      "For a regression analysis with", num_groups, "insurance levels, effect size of", 
      effect_size, "(Cohen's f), alpha of", alpha, "and power of", power, 
      "you need a total of", sample_size_results$total_sample, "participants."
    )
  }
  
  return(analysis_results)
}

#' @noRd
log_results <- function(analysis_results) {
  logger::log_info("Sample size analysis completed")
  logger::log_info(analysis_results$interpretation)
  
  if (analysis_results$analysis_type == "anova") {
    logger::log_debug(paste("Sample size per group:", 
                           analysis_results$sample_per_group))
  }
  
  logger::log_debug(paste("Total sample size:", analysis_results$total_sample))
}

a priori test with ANOVA testing for insurance

# Example 1: Basic ANOVA sample size calculation with 2 insurance levels
a_priori_result1 <- insurance_sample_size(
  effect_size = 0.1,
  num_groups = 2,              # Two insurance groups
  analysis_type = "anova",
  verbose = TRUE
); a_priori_result1

## $effect_size
## [1] 0.1
## 
## $alpha
## [1] 0.05
## 
## $power
## [1] 0.8
## 
## $num_groups
## [1] 2
## 
## $analysis_type
## [1] "anova"
## 
## $sample_per_group
## [1] 394
## 
## $total_sample
## [1] 788
## 
## $interpretation
## [1] "For an ANOVA with 2 insurance levels, effect size of 0.1 (Cohen's f), alpha of 0.05 and power of 0.8 you need 394 participants per group (total: 788 )."

a priori test with proportion testing for ED referral rates

# Example: If you expect 20% ED referral rate for private insurance and 30% for Medicaid
binary_result <- ed_referral_power_analysis(
  prop1 = 0.20,  # Expected proportion in private insurance group
  prop2 = 0.30,  # Expected proportion in Medicaid group
  power = 0.80,
  alpha = 0.05,
  verbose = TRUE
); binary_result

## $prop1
## [1] 0.2
## 
## $prop2
## [1] 0.3
## 
## $effect_size_h
## [1] -0.2319843
## 
## $alpha
## [1] 0.05
## 
## $power
## [1] 0.8
## 
## $alternative
## [1] "two.sided"
## 
## $sample_size_per_group
## [1] 292
## 
## $total_sample_size
## [1] 584
## 
## $interpretation
## [1] "To detect a difference between proportions of 0.2 and 0.3 (effect size h = -0.232) with 80% power and alpha = 0.05, you need 292 participants per group (total: 584)."

SUPPLEMENT

Introduction

This statistical supplement provides a comprehensive account of the a priori sample size analyses used to guide the design of this study, which examines differences in healthcare access by insurance type. Our primary outcome was ED referral rate, a binary variable. A secondary outcome was business days until appointment, modeled as a continuous variable. All calculations were conducted in R using the pwr package, with clearly documented assumptions and reproducible code.

Methods

A Priori Power Analysis

We conducted two independent a priori power analyses tailored to our study’s primary and secondary outcomes. These included: (1) a binary outcome comparison for ED referrals between insurance groups using a two-sample test of proportions, and (2) a continuous outcome comparison of appointment wait times using one-way ANOVA.

Sample Size Calculation for Binary Outcome (ED Referral)

Our primary analysis focused on detecting a difference in ED referral rates between patients with private insurance (BCBS) and those with Medicaid. Based on pilot data and clinical judgment, we expected 20% of privately insured patients to be referred to the ED, compared to 30% of Medicaid patients — a 10-percentage-point absolute difference.

We calculated effect size using Cohen’s \(h\) for proportions:

\[ h = 2 \cdot \left[\arcsin(\sqrt{p_1}) - \arcsin(\sqrt{p_2})\right] \]

Substituting our expected proportions:

\[ h = 2 \cdot \left[\arcsin(\sqrt{0.20}) - \arcsin(\sqrt{0.30})\right] \approx -0.232 \]

This represents a small-to-moderate effect size. Using pwr::pwr.2p.test() in R and specifying: - Effect size (Cohen’s \(h\)) = 0.232 - Type I error rate (α) = 0.05 - Statistical power (1 - β) = 0.80 - Two-sided hypothesis test

We calculated that 292 participants per group (584 total) were needed to detect the expected difference in ED referral rates.

ANOVA-Based Sample Size Calculation for Appointment Wait Times

Our secondary outcome was business days until appointment, treated as a continuous variable. For this analysis, we applied a one-way ANOVA framework, treating insurance type as a categorical fixed effect.

The ANOVA model is expressed as:

\[ Y_{ij} = \mu + \alpha_i + \epsilon_{ij} \]

Where: - \(Y_{ij}\) is the number of business days until appointment for the \(j\)th patient in the \(i\)th insurance group - \(\mu\) is the grand mean - \(\alpha_i\) is the group-level deviation (e.g., Medicaid vs. BCBS) - \(\epsilon_{ij} \sim N(0, \sigma^2)\) is the error term

We assumed: - Effect size (Cohen’s \(f\)) = 0.10, reflecting a small but potentially meaningful difference in mean wait times - Significance level (α) = 0.05 - Power (1 - β) = 0.80 - Number of groups = 2 (BCBS and Medicaid)

Using pwr::pwr.anova.test() in R, we determined that 394 participants per group (788 total) were required to detect this effect with 80% power.

Regression-Based Framework (Conceptual Justification)

Although the ANOVA approach guided the sample size estimation for wait times, the primary analysis used a linear regression model. This allowed us to directly estimate the difference in business days between groups, and potentially adjust for covariates.

The equivalent regression model is:

\[ Y_i = \beta_0 + \beta_1 X_{i1} + \epsilon_i \]

Where: - \(Y_i\) is the number of business days until appointment for the \(i\)th patient - \(\beta_0\) is the intercept (representing BCBS patients) - \(\beta_1\) is the coefficient for Medicaid (compared to BCBS) - \(X_{i1}\) is a dummy variable (1 = Medicaid, 0 = BCBS) - \(\epsilon_i \sim N(0, \sigma^2)\)

This regression formulation is statistically equivalent to the ANOVA and offers greater flexibility for modeling.

Summary

Our a priori power analysis identified the required sample sizes for both primary and secondary outcomes. To detect a 10% difference in ED referral rates with 80% power, we needed 584 total participants. For detecting a small difference in appointment wait times, a total of 788 participants was required. These estimates guided recruitment targets and ensured the study was adequately powered to detect effects of clinical and policy relevance.

References

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). GPower 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39*(2), 175–191.
Lenth, R. V. (2001). Some practical guidelines for effective sample size determination. The American Statistician, 55(3), 187–193.
Peng, C. Y. J., Long, H., & Abaci, S. (2012). Power analysis software for educational researchers. The Journal of Experimental Education, 80(2), 113–136.
Gelman, A., & Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.

DRAFT: Urgent GYN Issue Mystery Caller Study

Tyler M. Muffly, MD and Melanie Mandell

24 March, 2025