FYI: If a caller was told to go to the ED, we set their business_days_until_appointment to 0.
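The recoding rule above can be sketched in base R as follows. The data frame `calls` and its four rows are hypothetical; only the two column names are taken from the variables listed in this report.

```r
# Hypothetical call records (illustrative values only).
calls <- data.frame(
  told_to_go_to_the_emergency_department = c("Yes", "No", "No", "Yes"),
  business_days_until_appointment = c(NA, 12, NA, 5)
)

# Callers sent to the ED are treated as seen the same day (0 business days).
sent_to_ed <- calls$told_to_go_to_the_emergency_department == "Yes"
calls$business_days_until_appointment[sent_to_ed] <- 0

calls$business_days_until_appointment
```

Rows not sent to the ED keep their original value, including `NA` when no appointment was offered.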

tyler install

Read in data

Quality Check the Data

Are there any physicians included more than twice?

Included More than Twice
NPI	record_id	N
NA	NA	NA

Variables of those physicians included more than twice?

Variables of Physicians Included More Than Twice
NPI	reason_for_exclusions	insurance	business_days_until_appointment
NA	NA	NA	NA

Find physicians called more than three times

id_numbers called more than thrice
NPI	calls_count
NA	NA

Do they have exclusion and have a business_days_until_appointment >0?

Do they have exclusion and have a business_days_until_appointment >0?
NPI	id_number	reason_for_exclusions	business_days_until_appointment
NA	NA	NA	NA

Do they have business_days_until_appointment >0 but are an excluded category?

Records with Appointments but in Excluded Category
NPI	id_number	reason_for_exclusions	business_days_until_appointment
NA	NA	NA	NA

Do they have NA for business_days_until_appointment but are “Included” in the Reasons for exclusion category?

Included Records with NA for Appointments
NPI	id_number	reason_for_exclusions	business_days_until_appointment
NA	NA	NA	NA
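The quality checks in this section can be expressed as simple filters. The following is a minimal sketch in base R, assuming a data frame with the columns named in the tables above; the six example rows are hypothetical and are not taken from the study data.

```r
# Hypothetical records; "Included" marks physicians kept in the study.
df <- data.frame(
  NPI = c(1, 1, 1, 2, 3, 4),
  reason_for_exclusions = c("Included", "Included", "Included",
                            "Not accepting new patients", "Included", "Included"),
  business_days_until_appointment = c(3, 5, 8, 4, NA, 10)
)

# Physicians that appear more than twice.
npi_counts <- table(df$NPI)
repeated_npi <- names(npi_counts[npi_counts > 2])

# Excluded records that nevertheless carry a positive wait time.
days <- df$business_days_until_appointment
excluded_with_wait <- df[df$reason_for_exclusions != "Included" &
                           !is.na(days) & days > 0, ]

# Included records with a missing wait time.
included_missing <- df[df$reason_for_exclusions == "Included" & is.na(days), ]
```

An empty result from each filter is the desired outcome: it means no record contradicts its exclusion status.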

Check data normality

The data are not normally distributed, and they are count data. A t-test assumes normally distributed data, and comparing the means of count data is not appropriate. Instead, we can compare business_days_until_appointment across the insurance categories using the incidence rate ratio from a Poisson regression.

This Q-Q plot displays the distribution of the business_days_until_appointment variable against a theoretical normal distribution. Here’s an interpretation based on the plot’s characteristics:

Heavy Right Tail (Positive Skew): The data points deviate upward from the reference line on the right side, indicating that the business_days_until_appointment distribution has a heavy right tail or positive skew. This suggests that while most appointments are scheduled within a typical range, there are a few cases where the wait time is significantly longer.
Departure from Normality: The points deviate from the reference line at both ends, especially at the upper end (right tail). This indicates that the data does not follow a normal distribution closely. Instead, it appears to have a skewed, possibly exponential or log-normal distribution, given the pattern of points rising sharply at higher values.
Outliers: The data point at the top right, well above the line, is likely an outlier with a much longer wait time than the majority. This extreme value contributes to the non-normality and might need consideration, depending on the analysis goals.

In summary, the business_days_until_appointment variable is not normally distributed and shows positive skewness with some outliers, especially toward longer wait times.

## Starting normality check and summary calculation for variable: business_days_until_appointment

## Data extracted for variable: business_days_until_appointment

## Shapiro-Wilk normality test completed with p-value: 0.0000000000000000000000000000023401989208789

## The p-value is less than or equal to 0.05, indicating that the data is not normally distributed.

## Histogram with Density Plot created.

## Q-Q Plot created.

## Data is NOT normally distributed. Use non-parametric measures like median: 8, IQR: 26

## $median
## [1] 8
## 
## $iqr
## [1] 26

## Summary calculation completed for variable: business_days_until_appointment

## $median
## [1] 8
## 
## $iqr
## [1] 26
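The decision logged above (test for normality, then report the median and IQR when normality is rejected) can be approximated in base R. This is a sketch under stated assumptions: `wait_days` is simulated right-skewed count data standing in for business_days_until_appointment, and the 0.05 threshold mirrors the one quoted in the log. It is not the report's own helper function.

```r
set.seed(1)
# Simulated, right-skewed counts (not the study data).
wait_days <- rnbinom(582, size = 0.5, mu = 18)

sw <- shapiro.test(wait_days)      # Shapiro-Wilk test of normality
normal <- sw$p.value > 0.05

# Report mean/SD when the data look normal, median/IQR otherwise.
summary_stats <- if (normal) {
  list(mean = mean(wait_days), sd = sd(wait_days))
} else {
  list(median = median(wait_days), iqr = IQR(wait_days))
}
summary_stats
```

`IQR()` returns the width of the interquartile range (Q3 minus Q1), which is why the log reports a single number (26) rather than the two quartiles.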

In interpreting this output:

Poisson Model Appropriateness:
- Since we are dealing with count data for the outcome (business_days_until_appointment), Poisson regression is indeed more suitable than a Kruskal-Wallis test. The Kruskal-Wallis test would only indicate if there is a statistically significant difference across groups in insurance but would not provide specific information on the effect size or direction of differences, which the Poisson model offers.
Interpretation of the Medicaid Coefficient:
- The coefficient for Medicaid, -0.008725, suggests a slight (but statistically insignificant) reduction in the log count of days until the appointment for Medicaid patients compared to the baseline insurance group. The p-value of 0.659 shows this effect is not statistically significant, meaning we don’t have enough evidence to conclude that Medicaid influences wait time compared to the baseline insurance category.
Null and Residual Deviance:
- The null and residual deviance are nearly identical, indicating that adding insurance as a predictor does not improve the model’s fit substantially. This suggests that insurance may not be a strong predictor of business_days_until_appointment.
Overdispersion:
- If you find that the variance is significantly greater than the mean (overdispersion), a negative binomial regression might be more appropriate, as it allows for extra variation in the data.
Conclusion:
- This Poisson regression model indicates that insurance type does not significantly influence the wait time for an appointment (business_days_until_appointment) based on the p-value and the similarity in deviance values.

In summary, while Poisson regression provides more detailed insights than a Kruskal-Wallis test, this model suggests that insurance type does not significantly affect the wait time for an appointment.
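The overdispersion point can be checked with a Pearson dispersion statistic (sum of squared Pearson residuals divided by the residual degrees of freedom); values well above 1 suggest that the Poisson variance assumption does not hold. The sketch below uses simulated data, and the names `sim`, `fit`, and `dispersion` are illustrative. In the model output reported in this section, the residual deviance of 16741 on 580 degrees of freedom is far above its degrees of freedom, which is usually read as a sign of overdispersion.

```r
set.seed(42)
n <- 582
# Simulated overdispersed counts with a two-level insurance factor.
sim <- data.frame(
  insurance = factor(rep(c("BCBS", "Medicaid"), length.out = n)),
  days = rnbinom(n, size = 0.5, mu = 18)
)

fit <- glm(days ~ insurance, family = poisson, data = sim)

# Pearson dispersion: close to 1 for well-behaved Poisson data.
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
dispersion
```

If the statistic is large, a negative binomial model (for example `MASS::glm.nb`) or a quasi-Poisson fit are common alternatives.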

## 
## Call:
## glm(formula = business_days_until_appointment ~ as.factor(insurance), 
##     family = "poisson", data = df)
## 
## Coefficients:
##                               Estimate Std. Error z value            Pr(>|z|)
## (Intercept)                   2.893030   0.012880 224.615 <0.0000000000000002
## as.factor(insurance)Medicaid -0.008725   0.019781  -0.441               0.659
##                                 
## (Intercept)                  ***
## as.factor(insurance)Medicaid    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 16741  on 581  degrees of freedom
## Residual deviance: 16741  on 580  degrees of freedom
##   (558 observations deleted due to missingness)
## AIC: 18609
## 
## Number of Fisher Scoring iterations: 6

## For the reference category (BCBS), the estimated rate of business_days_until_appointment (the exponentiated intercept) is 18.05 business days, with a 95% confidence interval ranging from 17.6 to 18.51. For Medicaid compared to the reference category (BCBS), the incidence rate ratio is approximately 0.99, with a 95% confidence interval between 0.95 and 1.03. The waiting time for an appointment is therefore estimated to be about 1% shorter for Medicaid patients than for those with BCBS insurance, but because the interval includes 1 this difference is not statistically significant.
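The exponentiated values quoted here (18.05 and 0.99, with their intervals) can be reproduced from the coefficient table by exponentiating each estimate and its Wald limits. The sketch below assumes a Wald-type interval (estimate plus or minus 1.96 standard errors on the log scale); the report's own code may have used a different method.

```r
# Estimates and standard errors copied from the Poisson model output.
est <- c(intercept = 2.893030, medicaid = -0.008725)
se  <- c(intercept = 0.012880, medicaid = 0.019781)

z <- qnorm(0.975)  # about 1.96 for a 95% interval

irr <- data.frame(
  estimate = exp(est),
  lower    = exp(est - z * se),
  upper    = exp(est + z * se)
)
round(irr, 2)
```

The intercept row is the expected number of business days for the reference group; the Medicaid row is a ratio relative to that group.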

Results

Create Median Household Income Quantiles

## # A tibble: 139 × 6
##     state income_quartile income_range   physicians_in_quartile total_physicians
##     <chr> <chr>           <chr>                           <int>            <int>
##   1 AK    Q3              $61,107 - $81…                      8                8
##   2 AL    Q1 (Lowest)     $7,609 - $42,…                      6               13
##   3 AL    Q4 (Highest)    $66,830 - $17…                      5               13
##   4 AR    Q2              $40,891 - $50…                      2                8
##   5 AR    Q4 (Highest)    $61,128 - $25…                      2                8
##   6 AZ    Q2              $46,851 - $62…                      6               35
##   7 AZ    Q3              $62,108 - $80…                     12               35
##   8 AZ    Q4 (Highest)    $80,324 - $17…                     13               35
##   9 CA    Q1 (Lowest)     $2,499 - $63,…                     24               68
##  10 CA    Q2              $63,980 - $86…                     10               68
##  11 CA    Q3              $86,216 - $11…                     16               68
##  12 CA    Q4 (Highest)    $114,510 - $2…                     12               68
##  13 CO    Q1 (Lowest)     $18,125 - $57…                      8               29
##  14 CO    Q2              $57,934 - $75…                      3               29
##  15 CO    Q3              $75,458 - $97…                     10               29
##  16 CO    Q4 (Highest)    $97,288 - $17…                      6               29
##  17 CT    Q1 (Lowest)     $14,852 - $79…                      6               38
##  18 CT    Q2              $79,811 - $10…                      8               38
##  19 CT    Q3              $101,458 - $1…                     14               38
##  20 CT    Q4 (Highest)    $121,560 - $2…                     10               38
##  21 DE    Q3              $77,454 - $90…                      2                4
##  22 FL    Q1 (Lowest)     $12,894 - $53…                     26               70
##  23 FL    Q2              $53,856 - $66…                      9               70
##  24 FL    Q3              $66,302 - $82…                     13               70
##  25 FL    Q4 (Highest)    $82,948 - $25…                     18               70
##  26 GA    Q1 (Lowest)     $14,306 - $46…                      2               35
##  27 GA    Q2              $46,696 - $59…                      4               35
##  28 GA    Q3              $59,036 - $75…                      3               35
##  29 GA    Q4 (Highest)    $75,180 - $25…                     13               35
##  30 HI    Q1 (Lowest)     $35,221 - $73…                     10               18
##  31 HI    Q2              $73,118 - $87…                      2               18
##  32 HI    Q3              $87,926 - $10…                      2               18
##  33 IA    Q1 (Lowest)     $17,452 - $60…                      8               16
##  34 IA    Q3              $69,375 - $81…                      6               16
##  35 IA    Q4 (Highest)    $81,250 - $16…                      2               16
##  36 ID    Q4 (Highest)    $76,010 - $14…                      2                6
##  37 IL    Q1 (Lowest)     $12,663 - $57…                      6               49
##  38 IL    Q2              $57,692 - $70…                      6               49
##  39 IL    Q3              $70,304 - $87…                     20               49
##  40 IL    Q4 (Highest)    $87,349 - $25…                     15               49
##  41 IN    Q1 (Lowest)     $22,677 - $56…                      2               21
##  42 IN    Q2              $56,954 - $66…                      2               21
##  43 IN    Q3              $66,519 - $78…                      8               21
##  44 IN    Q4 (Highest)    $78,594 - $15…                      5               21
##  45 KS    Q2              $53,314 - $63…                      4               11
##  46 KS    Q3              $63,327 - $76…                      4               11
##  47 KS    Q4 (Highest)    $76,673 - $20…                      3               11
##  48 KY    Q1 (Lowest)     $2,499 - $41,…                      3               17
##  49 KY    Q2              $41,636 - $52…                      2               17
##  50 KY    Q3              $52,804 - $65…                      6               17
##  51 KY    Q4 (Highest)    $65,042 - $25…                      6               17
##  52 LA    Q4 (Highest)    $67,596 - $16…                     10               22
##  53 MA    Q1 (Lowest)     $20,202 - $78…                     12               39
##  54 MA    Q2              $78,849 - $10…                     13               39
##  55 MA    Q4 (Highest)    $126,844 - $2…                     12               39
##  56 MD    Q1 (Lowest)     $19,722 - $77…                      2               30
##  57 MD    Q2              $77,875 - $10…                     12               30
##  58 MD    Q3              $100,573 - $1…                      6               30
##  59 MD    Q4 (Highest)    $130,008 - $2…                      6               30
##  60 ME    Q1 (Lowest)     $21,161 - $52…                      2                6
##  61 ME    Q2              $52,258 - $61…                      2                6
##  62 ME    Q3              $61,470 - $77…                      2                6
##  63 MI    Q2              $53,727 - $63…                      8               39
##  64 MI    Q3              $63,472 - $78…                      8               39
##  65 MI    Q4 (Highest)    $78,063 - $18…                     13               39
##  66 MN    Q1 (Lowest)     $14,107 - $62…                      2               20
##  67 MN    Q2              $62,500 - $72…                      8               20
##  68 MN    Q3              $72,469 - $87…                      4               20
##  69 MN    Q4 (Highest)    $87,688 - $22…                      6               20
##  70 MO    Q1 (Lowest)     $2,499 - $48,…                      2               41
##  71 MO    Q2              $48,556 - $58…                      2               41
##  72 MO    Q3              $58,333 - $71…                     22               41
##  73 MO    Q4 (Highest)    $71,942 - $25…                     11               41
##  74 MS    Q1 (Lowest)     $2,499 - $36,…                      4               18
##  75 MS    Q2              $36,698 - $46…                      2               18
##  76 MS    Q3              $46,736 - $58…                      4               18
##  77 MS    Q4 (Highest)    $58,120 - $16…                      6               18
##  78 MT    Q3              $61,250 - $74…                      2                4
##  79 NC    Q1 (Lowest)     $2,499 - $49,…                      6               38
##  80 NC    Q2              $49,157 - $59…                      4               38
##  81 NC    Q3              $59,413 - $72…                      8               38
##  82 NC    Q4 (Highest)    $72,752 - $21…                     18               38
##  83 NE    Q1 (Lowest)     $23,393 - $58…                      1               21
##  84 NE    Q2              $58,438 - $68…                      6               21
##  85 NE    Q3              $68,125 - $80…                      2               21
##  86 NE    Q4 (Highest)    $80,078 - $17…                     12               21
##  87 NH    Q1 (Lowest)     $31,750 - $72…                      2                2
##  88 NJ    Q1 (Lowest)     $23,780 - $84…                      8               30
##  89 NJ    Q2              $84,466 - $10…                      2               30
##  90 NJ    Q3              $106,339 - $1…                     12               30
##  91 NJ    Q4 (Highest)    $138,235 - $2…                      8               30
##  92 NM    Q1 (Lowest)     $16,096 - $37…                      2               14
##  93 NM    Q4 (Highest)    $63,839 - $25…                     10               14
##  94 NV    Q2              $55,935 - $75…                      6               28
##  95 NV    Q3              $75,089 - $93…                      6               28
##  96 NY    Q1 (Lowest)     $2,499 - $61,…                      3               40
##  97 NY    Q3              $76,046 - $10…                     13               40
##  98 NY    Q4 (Highest)    $101,250 - $2…                     22               40
##  99 OH    Q2              $53,634 - $65…                      6               37
## 100 OH    Q3              $65,619 - $79…                      9               37
## 101 OH    Q4 (Highest)    $79,724 - $25…                     18               37
## 102 OK    Q2              $46,944 - $55…                      2               14
## 103 OK    Q3              $55,341 - $67…                      8               14
## 104 OK    Q4 (Highest)    $67,915 - $17…                      4               14
## 105 OR    Q1 (Lowest)     $2,499 - $55,…                      5               13
## 106 OR    Q3              $66,506 - $85…                      4               13
## 107 OR    Q4 (Highest)    $85,440 - $15…                      4               13
## 108 PA    Q1 (Lowest)     $14,319 - $55…                      6               42
## 109 PA    Q2              $55,979 - $67…                      6               42
## 110 PA    Q3              $67,750 - $82…                     10               42
## 111 PA    Q4 (Highest)    $82,341 - $25…                     18               42
## 112 SC    Q2              $45,084 - $55…                      4                8
## 113 SC    Q3              $55,095 - $69…                      2                8
## 114 SC    Q4 (Highest)    $69,025 - $17…                      2                8
## 115 SD    Q1 (Lowest)     $2,499 - $55,…                      2                2
## 116 TN    Q1 (Lowest)     $2,499 - $47,…                      6               35
## 117 TN    Q2              $47,208 - $56…                      9               35
## 118 TN    Q3              $56,804 - $69…                     10               35
## 119 TN    Q4 (Highest)    $69,372 - $18…                     10               35
## 120 TX    Q1 (Lowest)     $2,499 - $52,…                      2               58
## 121 TX    Q2              $52,264 - $64…                      8               58
## 122 TX    Q3              $64,792 - $82…                     19               58
## 123 TX    Q4 (Highest)    $82,312 - $25…                     19               58
## 124 UT    Q1 (Lowest)     $16,685 - $62…                      4                6
## 125 UT    Q2              $62,464 - $74…                      2                6
## 126 VA    Q1 (Lowest)     $4,016 - $53,…                      2               19
## 127 VA    Q2              $53,246 - $70…                      4               19
## 128 VA    Q4 (Highest)    $96,605 - $25…                     11               19
## 129 VT    Q4 (Highest)    $87,078 - $14…                      4                4
## 130 WA    Q1 (Lowest)     $26,823 - $62…                      8               26
## 131 WA    Q2              $62,500 - $76…                      2               26
## 132 WA    Q3              $76,683 - $96…                      4               26
## 133 WA    Q4 (Highest)    $96,627 - $25…                      4               26
## 134 WI    Q1 (Lowest)     $17,746 - $61…                     12               22
## 135 WI    Q4 (Highest)    $82,467 - $14…                     10               22
## 136 WV    Q4 (Highest)    $62,942 - $25…                      8                8
## 137 WY    Q1 (Lowest)     $25,809 - $57…                      2                4
## 138 WY    Q2              $57,412 - $69…                      2                4
## 139 US    Q4 (Highest)    $85,313 - $25…                     38             1140
## # ℹ 1 more variable: percent_in_quartile <dbl>

Zip Analysis

National percentage of physicians in most affluent ZIP Codes

Insurance Acceptance Rates

Steps to Calculate Medicaid Acceptance Rate

These acceptance rates reflect the proportion of physicians who were successfully contacted, accepted the respective insurance, and provided an appointment to the patient.

Medicaid Acceptance Rate: 573 physicians were assigned Medicaid insurance. Of these, 179 accepted Medicaid and provided an appointment. The reported acceptance rate of 75.2% is computed against a denominator of 238 (n = 179 / N = 238), not against the 573 assigned.

Blue Cross/Blue Shield Acceptance Rate: 567 physicians were assigned Blue Cross/Blue Shield insurance. Of these, 238 accepted this insurance and provided an appointment. The reported acceptance rate of 73% is computed against a denominator of 326 (n = 238 / N = 326), not against the 567 assigned.
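The two percentages follow from the n and N values reported in this section (179 of 238 for Medicaid, 238 of 326 for Blue Cross/Blue Shield). A minimal arithmetic sketch, where `acceptance_rate` is a hypothetical helper rather than a function from the analysis code:

```r
# Percentage of offices that accepted the insurance and offered an appointment.
acceptance_rate <- function(accepted, denominator) {
  round(100 * accepted / denominator, 1)
}

medicaid_rate <- acceptance_rate(179, 238)  # 75.2
bcbs_rate     <- acceptance_rate(238, 326)  # 73
c(Medicaid = medicaid_rate, BCBS = bcbs_rate)
```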

Told to seek Emergency Care

scenario_type	n	percent
Emergent	70	43.47826
Urgent	91	56.52174

## For the 161 patients who were told to go to the Emergency Department, 43.5% were in the Emergent scenario type (n = 70 / N = 161) and 56.5% were in the Urgent scenario type (n = 91 / N = 161).

Appointment Accessibility

## Our sample included 1140 calls to physician offices from 49 states, excluding North Dakota and Rhode Island. We made calls to 567 unique physicians that accepted Blue Cross/Blue Shield. One hundred seventy-nine physician offices accepted Medicaid, giving a 75.2% Medicaid acceptance rate for OBGYN practices (n = 179 / N = 238). Physician offices accepted Blue Cross/Blue Shield at a rate of 73% (n = 238 / N = 326).

## # A tibble: 6 × 34
##          NPI   age age_category     gender Med_sch Grd_yr academic ACOG_District
##        <dbl> <dbl> <ord>            <fct>  <fct>    <dbl> <fct>    <fct>        
## 1 1265759062    53 50 to 59 years … Female US Sen…   2010 Private… District V   
## 2 1265759062    53 50 to 59 years … Female US Sen…   2010 Private… District V   
## 3 1083000731    36 Less than 40 ye… Female Intern…   2015 Private… District II  
## 4 1083000731    36 Less than 40 ye… Female Intern…   2015 Private… District II  
## 5 1144207358    51 50 to 59 years … Female US Sen…   1998 Private… District V   
## 6 1144207358    51 50 to 59 years … Female US Sen…   1998 Private… District V   
## # ℹ 26 more variables: cbsatype10 <fct>, scenario <fct>, scenario_type <fct>,
## #   insurance <fct>, including_this_physician_in_the_study <fct>,
## #   told_to_go_to_the_emergency_department <fct>,
## #   offered_a_clinic_appointment_to_be_seen <fct>, reason_for_exclusions <fct>,
## #   central_number <fct>, number_of_transfers <fct>, call_time_minutes <dbl>,
## #   hold_time_minutes <dbl>, Provider.Enumeration.Date <dbl>,
## #   day_of_the_week <ord>, business_days_until_appointment <dbl>, …
## # A tibble: 2 × 3
##   insurance                  n percent
##   <fct>                  <int>   <dbl>
## 1 Blue Cross/Blue Shield   567    49.7
## 2 Medicaid                 573    50.3
## # A tibble: 2 × 3
##   insurance                  n percent
##   <fct>                  <dbl>   <dbl>
## 1 Blue Cross/Blue Shield   567    49.7
## 2 Medicaid                 573    50.3

Univariate Analysis

The median physician age was 53 years (IQR: 44 to 61).

Variable Selection

Wait Time with single predictor

Wait Times for All Insurances
Median_business_days_until_appointment	Q1	Q3
8	0	26

The median wait time across all insurance types was 8 business days, with an interquartile range (IQR) of 0 to 26.

Use tyler::generate_latex_equation functions.

Scenarios for Variable Selection

\[ \begin{align*} P(\text{{Business Days until New Patient Appointment}} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\ \sqrt{\lambda} &= \beta_0 \\ & + \beta_1 \cdot \underline{\mathbf{\large{\text{{Patient Scenario}}}}} \\ & + ( 1 | \text{{Physician NPI}}) \end{align*} \]

## Logging inputs...
## Model Object:  glm lm 
## Specs:  ~scenario | scenario 
## Variable of Interest:  scenario 
## Color By:  scenario 
## Output Directory:  Melanie/Figures 
## Y-Axis Min:  12 
## Y-Axis Max:  24 
## Using existing output directory:  Melanie/Figures 
## Computing estimated marginal means...
## Logging estimated marginal means data...
## # A tibble: 4 × 6
##   scenario                                  rate    SE    df asymp.LCL asymp.UCL
##   <fct>                                    <dbl> <dbl> <dbl>     <dbl>     <dbl>
## 1 Prior trip to ED and was found to have …  19.2 0.366   Inf      18.5      19.9
## 2 Positive pregnancy test after a tubal l…  17.0 0.341   Inf      16.4      17.7
## 3 Acute cystitis                            13.4 0.308   Inf      12.8      14.0
## 4 Recurrent/Treatment resistant vaginitis   22.0 0.382   Inf      21.3      22.8
## Range of estimated marginal means with CIs:  12.7995 22.80813 
## Creating the plot...
## Plot created successfully.

## Saving plot to:  Melanie/Figures/interaction_scenario_comparison_plot_20241111_203055.png

## Plot saved successfully to:  Melanie/Figures/interaction_scenario_comparison_plot_20241111_203055.png 
## Returning the estimated data and plot object.

## There were 1140 calls made across four scenarios: 284 for positive pregnancy test after a tubal ligation, 287 for prior trip to ED and was found to have a 6 cm TOA, 282 for acute cystitis, and 287 for recurrent/treatment resistant vaginitis.

Business Days Until Next Appointment Joint Scenario
scenario	Median_business_days_until_appointment	Q1	Q3
Prior trip to ED and was found to have a 6 cm TOA	9	0	26
Positive pregnancy test after a tubal ligation	9	1	22
Acute cystitis	2	0	20
Recurrent/Treatment resistant vaginitis	12	1	34

Number of offices with each of the four scenarios successfully contacted: `business_days_until_appointment ~ scenario`

\[ \begin{align*} P(\text{{Business Days until New Patient Appointment}} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\ \sqrt{\lambda} &= \beta_0 \\ & + \beta_1 \cdot \underline{\mathbf{\large{\text{{Number of Offices Contacted}}}}} \\ & + ( 1 | \text{{Physician NPI}}) \end{align*} \]

Number of successful calls contacted for each scenario
scenario	count
Prior trip to ED and was found to have a 6 cm TOA	137
Positive pregnancy test after a tubal ligation	144
Acute cystitis	138
Recurrent/Treatment resistant vaginitis	145

## 
## Call:
## glm(formula = business_days_until_appointment ~ as.factor(scenario), 
##     family = "poisson", data = df)
## 
## Coefficients:
##                                                                   Estimate
## (Intercept)                                                        2.95360
## as.factor(scenario)Positive pregnancy test after a tubal ligation -0.11759
## as.factor(scenario)Acute cystitis                                 -0.35908
## as.factor(scenario)Recurrent/Treatment resistant vaginitis         0.13955
##                                                                   Std. Error
## (Intercept)                                                          0.01910
## as.factor(scenario)Positive pregnancy test after a tubal ligation    0.02764
## as.factor(scenario)Acute cystitis                                    0.02991
## as.factor(scenario)Recurrent/Treatment resistant vaginitis           0.02579
##                                                                   z value
## (Intercept)                                                       154.663
## as.factor(scenario)Positive pregnancy test after a tubal ligation  -4.255
## as.factor(scenario)Acute cystitis                                 -12.007
## as.factor(scenario)Recurrent/Treatment resistant vaginitis          5.411
##                                                                               Pr(>|z|)
## (Intercept)                                                       < 0.0000000000000002
## as.factor(scenario)Positive pregnancy test after a tubal ligation         0.0000209144
## as.factor(scenario)Acute cystitis                                 < 0.0000000000000002
## as.factor(scenario)Recurrent/Treatment resistant vaginitis                0.0000000626
##                                                                      
## (Intercept)                                                       ***
## as.factor(scenario)Positive pregnancy test after a tubal ligation ***
## as.factor(scenario)Acute cystitis                                 ***
## as.factor(scenario)Recurrent/Treatment resistant vaginitis        ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 16741  on 581  degrees of freedom
## Residual deviance: 16412  on 578  degrees of freedom
##   (558 observations deleted due to missingness)
## AIC: 18284
## 
## Number of Fisher Scoring iterations: 6

## The median wait time across all scenarios was 8 business days, with an interquartile range (IQR) of 0 to 26 days. Specifically, the median wait time was 9 days (IQR: 0 to 26) for 'Prior trip to ED and was found to have a 6 cm TOA', 9 days (IQR: 1 to 22) for 'Positive pregnancy test after a tubal ligation', 2 days (IQR: 0 to 20) for 'Acute cystitis', and 12 days (IQR: 1 to 34) for 'Recurrent/Treatment resistant vaginitis'. The p-value for the difference between 'Positive pregnancy test after a tubal ligation' and 'Prior trip to ED and was found to have a 6 cm TOA' scenarios was <0.01, for 'Acute cystitis' and 'Prior trip to ED and was found to have a 6 cm TOA', it was <0.01, and for 'Recurrent/Treatment resistant vaginitis' and 'Prior trip to ED and was found to have a 6 cm TOA', it was <0.01.
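The estimated marginal means logged earlier in this section (19.2, 17.0, 13.4, and 22.0) are consistent with the scenario coefficients: with a log link, each scenario's expected wait is the exponentiated intercept multiplied by that scenario's incidence rate ratio. The sketch below reproduces them from the printed estimates; the short names in `coefs` are illustrative labels, not identifiers from the analysis code.

```r
# Estimates copied from the scenario model output (log scale).
coefs <- c(
  toa_reference  = 2.95360,   # intercept: prior trip to ED, 6 cm TOA
  tubal_ligation = -0.11759,
  acute_cystitis = -0.35908,
  vaginitis      = 0.13955
)

# Incidence rate ratios of each scenario relative to the reference scenario.
irr <- exp(coefs[-1])

# Expected business days: reference rate, then reference rate times each ratio.
rates <- c(exp(coefs[1]), exp(coefs[1]) * irr)
round(rates, 1)
```

Read this way, acute cystitis has roughly 30% shorter expected waits than the reference scenario, and recurrent vaginitis roughly 15% longer.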

Insurance

\[ \begin{align*} P(\text{{Business Days until New Patient Appointment}} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\ \sqrt{\lambda} &= \beta_0 \\ & + \beta_1 \cdot \underline{\mathbf{\large{\text{{Patient Insurance}}}}} \\ & + ( 1 | \text{{Physician NPI}}) \end{align*} \]

Business Days Until Next Appointment By Each Insurance
insurance	Median_business_days_until_appointment	Q1	Q3
Blue Cross/Blue Shield	9.0	0	26
Medicaid	7.5	0	26

## Medicaid patients experienced a 0.87% shorter wait for a new patient appointment compared to patients with BCBS (Incidence Rate Ratio: 0.991; 95% CI: 0.95 to 1.03; p = 0.66), with median wait times of 7.5 business days (IQR: 0 to 26) and 9 business days (IQR: 0 to 26), respectively.

Exclusions

## Of the total 1140 phone calls made, 871 (76%) successfully reached a representative, while 269 calls (24%) did not yield a connection even after two attempts. For the unsuccessful connections, 73 (27%) were redirected to voicemail, 138 (51%) listed an incorrect telephone number, and 58 (22%) reached a busy signal. For successful connections, the reasons for exclusion were: 39 (4%) required a prior referral, 63 (7%) reported that they were not currently accepting new patients, and 179 physician offices (21%) put the caller on hold for more than five minutes.
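The percentages in the exclusion summary can be re-derived from the counts it reports. This is an arithmetic sketch only; the vector names are illustrative, and the denominators (all calls, unsuccessful calls, successful calls) are inferred from the way the figures add up.

```r
total_calls <- 1140
reached     <- 871

# Unsuccessful connections, as a share of the 269 unsuccessful calls.
not_reached <- c(voicemail = 73, wrong_number = 138, busy_signal = 58)
pct_reached     <- round(100 * reached / total_calls)
pct_not_reached <- round(100 * not_reached / sum(not_reached))

# Exclusions among successful connections, as a share of the 871 reached.
excluded <- c(referral_required = 39, not_accepting = 63, long_hold = 179)
pct_excluded <- round(100 * excluded / reached)

list(reached = pct_reached, not_reached = pct_not_reached, excluded = pct_excluded)
```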

Visualizing Each Individual Predictor

Graph each variable

Business days by insurance

Log Business Days

## Plots saved to: output/density_plot_20241111_203058.tiff and output/density_plot_20241111_203058.png

told_to_go_to_the_emergency_depa for emergency scenario types

Emergency vs Urgent scenario types

Day of the week by insurance

Central Appointment Line by Insurance

Physician Gender by Insurance

Physician MD vs. DO by Insurance

Scenario

Descriptive Tables

Table 1 - Split across Insurances

	Blue Cross/Blue Shield (N=568)	Medicaid (N=15)	Total (N=583)	p value
Age (years)				0.06
- Less than 50 years old	215 (39.2%)	4 (26.7%)	219 (38.9%)
- 50 to 55 years old	88 (16.1%)	4 (26.7%)	92 (16.3%)
- 56 to 60 years old	88 (16.1%)	0 (0.0%)	88 (15.6%)
- 61 to 65 years old	69 (12.6%)	5 (33.3%)	74 (13.1%)
- Greater than 65 years old	88 (16.1%)	2 (13.3%)	90 (16.0%)
Gender				0.43
- Female	360 (63.4%)	8 (53.3%)	368 (63.1%)
- Male	208 (36.6%)	7 (46.7%)	215 (36.9%)
Medical School Training				0.92
- Allopathic training	522 (93.5%)	13 (92.9%)	535 (93.5%)
- Osteopathic training	36 (6.5%)	1 (7.1%)	37 (6.5%)
Medical School Location				0.40
- US Senior Medical Student	414 (81.8%)	11 (73.3%)	425 (81.6%)
- International Medical Graduate	92 (18.2%)	4 (26.7%)	96 (18.4%)
Academic Affiliation				0.20
- Private Practice	511 (90.0%)	15 (100.0%)	526 (90.2%)
- University	57 (10.0%)	0 (0.0%)	57 (9.8%)
Rurality				0.60
- Metropolitan area	515 (90.7%)	13 (86.7%)	528 (90.6%)
- Rural area	53 (9.3%)	2 (13.3%)	55 (9.4%)
Number of Phone Transfers				0.60
- No transfers	358 (63.5%)	11 (78.6%)	369 (63.8%)
- One transfer	158 (28.0%)	3 (21.4%)	161 (27.9%)
- Two transfers	37 (6.6%)	0 (0.0%)	37 (6.4%)
- More than two transfers	11 (2.0%)	0 (0.0%)	11 (1.9%)
age_category				0.20
- N-Miss	20	0	20
- Less than 40 years old	76 (13.9%)	2 (13.3%)	78 (13.9%)
- 40 to 49 years old	139 (25.4%)	2 (13.3%)	141 (25.0%)
- 50 to 59 years old	176 (32.1%)	4 (26.7%)	180 (32.0%)
- 60 to 69 years old	121 (22.1%)	5 (33.3%)	126 (22.4%)
- 70 years and greater	36 (6.6%)	2 (13.3%)	38 (6.7%)
American College of OBGYNs Districts				0.28
- District I	35 (6.2%)	2 (13.3%)	37 (6.3%)
- District II	33 (5.8%)	0 (0.0%)	33 (5.7%)
- District III	38 (6.7%)	1 (6.7%)	39 (6.7%)
- District IV	65 (11.4%)	3 (20.0%)	68 (11.7%)
- District V	55 (9.7%)	2 (13.3%)	57 (9.8%)
- District VI	65 (11.4%)	0 (0.0%)	65 (11.1%)
- District VII	96 (16.9%)	1 (6.7%)	97 (16.6%)
- District VIII	95 (16.7%)	4 (26.7%)	99 (17.0%)
- District IX	31 (5.5%)	0 (0.0%)	31 (5.3%)
- District XI	20 (3.5%)	2 (13.3%)	22 (3.8%)
- District XII	35 (6.2%)	0 (0.0%)	35 (6.0%)
scenario				0.01
- Acute cystitis	137 (24.1%)	9 (60.0%)	146 (25.0%)
- Positive pregnancy test after a tubal ligation	143 (25.2%)	2 (13.3%)	145 (24.9%)
- Prior trip to ED and was found to have a 6 cm TOA	143 (25.2%)	3 (20.0%)	146 (25.0%)
- Recurrent/Treatment resistant vaginitis	145 (25.5%)	1 (6.7%)	146 (25.0%)
scenario_type				0.19
- Emergent	286 (50.4%)	5 (33.3%)	291 (49.9%)
- Urgent	282 (49.6%)	10 (66.7%)	292 (50.1%)
told_to_go_to_the_emergency_department				0.10
- Yes	91 (16.1%)	0 (0.0%)	91 (15.7%)
- No	473 (83.9%)	14 (100.0%)	487 (84.3%)
Central scheduling				0.21
- Yes, central scheduling number	202 (35.6%)	3 (20.0%)	205 (35.2%)
- No	366 (64.4%)	12 (80.0%)	378 (64.8%)
call_time_minutes				0.01
- n	451	9	460
- Median (Q1, Q3)	2.0 (1.0, 3.5)	0.9 (0.8, 1.7)	2.0 (1.0, 3.5)
hold_time_minutes				0.12
- n	423	8	431
- Median (Q1, Q3)	0.4 (0.0, 1.8)	0.0 (0.0, 0.4)	0.4 (0.0, 1.7)
Day of the week Called				0.76
- N-Miss	1	0	1
- Thursday	133 (23.5%)	3 (20.0%)	136 (23.4%)
- Tuesday	114 (20.1%)	3 (20.0%)	117 (20.1%)
- Wednesday	174 (30.7%)	4 (26.7%)	178 (30.6%)
- Friday	84 (14.8%)	4 (26.7%)	88 (15.1%)
- Monday	61 (10.8%)	1 (6.7%)	62 (10.7%)
- Saturday	1 (0.2%)	0 (0.0%)	1 (0.2%)
business_days_until_appointment				0.57
- n	333	3	336
- Median (Q1, Q3)	14.0 (2.0, 32.0)	13.0 (12.5, 28.0)	13.5 (2.0, 32.0)

Wait Time by Insurance Figures

Waiting time in days (log scale) for Blue Cross/Blue Shield versus Medicaid. The figure is a scatter plot of the days variable (y-axis) against the insurance variable (x-axis), with lines connecting points that share the same npi value.

Line Plot

## Plots saved to: Melanie/Figures/urgent_GYN_vs_insurance_20241111_203102.tiff and Melanie/Figures/urgent_GYN_vs_insurance_20241111_203102.png

Here we show a scatterplot that compares the Private and Medicaid times. Notice that the graph is in logarithmic scale. Points above the diagonal line are providers for whom the Medicaid waiting time was longer than the private insurance waiting time.

We also see a strong linear association, indicating that providers with longer waiting time for private insurance tend to also have longer waiting times for Medicaid.

Scatter Plot

## Plots saved to: Melanie/Figures/urgent_gyn_vs_insurance_none_20241111_203103.tiff and Melanie/Figures/urgent_gyn_vs_insurance_none_20241111_203103.png

Density Plot

## Plots saved to: Melanie/Figures/urgent_GYN_vs_insurance_density_20241111_203104.tiff and Melanie/Figures/urgent_GYN_vs_insurance_density_20241111_203104.png

Wait Time by Scenario Figures

Waiting time in days (log scale) for Blue Cross/Blue Shield versus Medicaid. The figure is a scatter plot of the days variable (y-axis) against the scenario variable (x-axis), with lines connecting points that share the same NPI.

Line Plot

## Plots saved to: Melanie/Figures/urgent_GYN_vs_scenario_20241111_203105.tiff and Melanie/Figures/urgent_GYN_vs_scenario_20241111_203105.png

Here we show a scatterplot that compares waiting times across the four clinical scenarios. Notice that the graph is in logarithmic scale.

Scatter Plot

## Plots saved to: Melanie/Figures/urgent_GYN_vs_scenario_none_20241111_203106.tiff and Melanie/Figures/urgent_GYN_vs_scenario_none_20241111_203106.png

Density Plot

Understanding a Density Plot:

A density plot is a smoothed version of a histogram that shows the distribution of a continuous variable. It represents the relative frequency of data points in different ranges of values, with areas under the curve corresponding to proportions of the data.

X-axis (Log Waiting Times in Days):
- The x-axis shows the logarithm of waiting times in days, meaning the waiting times have been transformed to a logarithmic scale to make the distribution more manageable or easier to interpret. A log transformation is often used when the raw data is skewed.
- Values closer to the left (lower on the x-axis) represent shorter waiting times, while values to the right (higher on the x-axis) represent longer waiting times.
Y-axis (Density):
- The y-axis represents density, which is the relative concentration of data points for a given range of values on the x-axis. The area under the entire curve sums to 1, meaning it reflects the proportion of observations.
- Higher peaks represent regions where there is a higher concentration of data points, while lower regions represent ranges with fewer data points.
Colors (Insurance):
- The two colors (purple for Blue Cross/Blue Shield and yellow for Medicaid) represent the distribution of waiting times for the two different insurance groups.
- The overlap between the two distributions is shaded, showing regions where both groups have similar waiting times.

How to Read the Density Plot:

1. Shape of the Distribution:
- The shape of each curve tells you about the distribution of waiting times within each insurance group.
- A peak indicates the most common waiting times for that group.
- A wider curve indicates a more spread-out distribution, meaning the waiting times vary more within that group.
- A narrower curve indicates that waiting times are more concentrated around the peak.
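To make the "area under the curve sums to 1" property concrete, here is a minimal sketch of a Gaussian kernel density estimate on log waiting times. It is written in plain Python rather than the plotting code behind the figures; the sample values, the bandwidth of 0.5, and the function name `gaussian_kde` are assumptions chosen for illustration only.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x."""
    n = len(samples)
    norm = n * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Sum one Gaussian bump per observation, then normalise.
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density

# Hypothetical waiting times in business days (not the study data).
wait_days = [1, 2, 2, 5, 9, 14, 14, 21, 32, 60]
log_days = [math.log(d) for d in wait_days]  # log scale, as on the plot's x-axis

density = gaussian_kde(log_days, bandwidth=0.5)

# Approximate the area under the curve with the trapezoid rule.
lo, hi, steps = min(log_days) - 4, max(log_days) + 4, 2000
width = (hi - lo) / steps
grid = [lo + i * width for i in range(steps + 1)]
area = sum((density(a) + density(b)) / 2 * width for a, b in zip(grid, grid[1:]))
print(round(area, 3))  # should be very close to 1
```

A larger bandwidth gives a smoother, wider curve; a smaller one follows the individual observations more closely.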

## Plots saved to: Melanie/Figures/urgent_GYN_vs_scenario_density_20241111_203106.tiff and Melanie/Figures/urgent_GYN_vs_scenario_density_20241111_203106.png

Statistical testing

Consider the following scenario:

You want to examine how insurance type affects waiting time for appointments. However, patients may differ in other ways, like their age and medical condition, which could also influence waiting times.

When fitting a regression model with waiting time as the dependent variable and insurance type as one of the predictors (along with other factors like age and medical condition), the estimated marginal means (EMMs) represent the average waiting time for each insurance type, adjusted for the effects of age and medical condition. This adjustment helps isolate the effect of insurance type on waiting time, so the comparison between insurance types is fair.

Interpretation: In the plot, the estimated marginal means for each scenario represent the average predicted waiting time for an appointment, adjusted for other factors in the model. This gives a model-based comparison of expected waiting times across the medical scenarios that accounts for variability in other factors.
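As a rough illustration of what "adjusted for other factors" means, the sketch below computes marginal means for insurance by averaging model predictions over an equally weighted grid of scenarios. The coefficients are hypothetical and the model is a simple additive linear one; it is not the model fitted in this report, and equal weighting is only one common convention.

```python
# Hypothetical coefficients of an additive linear model for waiting time (days).
# These are illustrative values, not estimates from the study.
intercept = 12.0
insurance_effect = {"Blue Cross/Blue Shield": 0.0, "Medicaid": 3.0}
scenario_effect = {
    "TOA": 0.0,
    "Pregnancy after tubal": -2.0,
    "UTI": 4.0,
    "Vaginitis": 6.0,
}

def predict(insurance, scenario):
    """Predicted waiting time for one insurance/scenario combination."""
    return intercept + insurance_effect[insurance] + scenario_effect[scenario]

def emm_by_insurance():
    """Average the predictions over all scenarios, weighting each scenario equally."""
    return {
        ins: sum(predict(ins, sc) for sc in scenario_effect) / len(scenario_effect)
        for ins in insurance_effect
    }

emms = emm_by_insurance()
print(emms)
```

Because every insurance type is averaged over the same set of scenarios, a difference between the two marginal means cannot be caused by one group having been assigned more of the slow scenarios.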

This image is a plot of Estimated Marginal Means (also known as least-squares means) for different scenarios. Each point represents the estimated marginal mean waiting time (in days) for a different medical scenario, and the error bars represent the 95% confidence intervals (CI) around these estimates.

Here’s a breakdown of the different components of the plot:

Y-axis:
- “Estimated Marginal Means for Waiting Time in Days”: The y-axis shows the estimated average waiting time in business days, which is the dependent variable in this model. The values range from about 15 to 30 days.
- The scale indicates that patients in different scenarios are estimated to wait between 15 and 30 days for an appointment, on average.
X-axis:
- “scenario”: The x-axis lists four different medical scenarios, which are the levels of the predictor variable. These are:
  1. Prior trip to ED and was found to have a 6 cm TOA: This scenario seems to be related to a prior emergency department (ED) visit and the discovery of a 6 cm tubo-ovarian abscess (TOA).
  2. Positive pregnancy test after a tubal ligation: This scenario involves a positive pregnancy test after a sterilization procedure (tubal ligation).
  3. Acute cystitis: This scenario involves a urinary tract infection, commonly known as acute cystitis.
  4. Recurrent/Treatment resistant vaginitis: This scenario refers to persistent or treatment-resistant vaginal infections.
The x-axis labels are rotated for readability, showing the different medical conditions (scenarios) being compared.
Estimated Marginal Means (Points on the Plot):
- Colored Points: Each colored point represents the estimated marginal mean waiting time for that specific scenario.
  - The different colors correspond to different scenarios, as explained by the legend at the bottom of the plot.
  - The vertical position of each point represents the estimated waiting time in business days.
Confidence Intervals (Error Bars):
- Error Bars: The vertical bars around each point represent the 95% confidence intervals. Under repeated sampling, intervals constructed this way would contain the true mean waiting time for each scenario about 95% of the time, based on the model.
  - Narrower intervals (e.g., the scenario “Positive pregnancy test after a tubal ligation”) indicate more precision in the estimate.
  - Wider intervals (e.g., “Recurrent/Treatment resistant vaginitis”) indicate more uncertainty or variability in the estimate.
Interpretation of the Estimated Marginal Means:

A simple rule of thumb is that if error bars for 95% confidence intervals overlap by less than about half the length of one error bar, the difference between the two groups might still be statistically significant. If the error bars overlap considerably, it’s more likely (but not guaranteed) that the difference between the groups is not statistically significant.

“Prior trip to ED and was found to have a 6 cm TOA”: The estimated mean waiting time for this scenario is around 20 days, with a relatively narrow confidence interval.
“Positive pregnancy test after a tubal ligation”: This scenario has the shortest estimated waiting time, just under 17 days, and the narrowest confidence interval, indicating a highly precise estimate.
“Acute cystitis”: This scenario has an estimated waiting time of around 25 days, and the confidence interval is slightly wider, indicating some variability.
“Recurrent/Treatment resistant vaginitis”: This scenario has the longest estimated waiting time, around 27 days, and the confidence interval is quite wide, indicating a high degree of uncertainty or variability in the estimate.

Combined plot of Scenario and Insurance

## Extracted interaction data:

##  scenario              insurance                  rate        SE  df asymp.LCL
##  TOA                   Blue Cross/Blue Shield 4.718326 0.9436969 Inf  3.188173
##  Pregnancy after tubal Blue Cross/Blue Shield 6.408744 1.2848820 Inf  4.326298
##  UTI                   Blue Cross/Blue Shield 3.459394 0.7314328 Inf  2.285743
##  Vaginitis             Blue Cross/Blue Shield 8.426966 1.6220978 Inf  5.778623
##  TOA                   Medicaid               6.438937 1.2938995 Inf  4.342760
##  Pregnancy after tubal Medicaid               5.607467 1.1338873 Inf  3.772637
##  UTI                   Medicaid               3.114619 0.6627911 Inf  2.052433
##  Vaginitis             Medicaid               6.909619 1.3393328 Inf  4.725639
##  asymp.UCL
##   6.982870
##   9.493566
##   5.235677
##  12.289045
##   9.546902
##   8.334670
##   4.726511
##  10.102936
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale

## 
## Scenario: TOA 
## Filtered data for scenario:
##  scenario insurance                  rate        SE  df asymp.LCL asymp.UCL
##  TOA      Blue Cross/Blue Shield 4.718326 0.9436969 Inf  3.188173  6.982870
##  TOA      Medicaid               6.438937 1.2938995 Inf  4.342760  9.546902
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Blue Cross/Blue Shield data:
##  scenario insurance                  rate        SE  df asymp.LCL asymp.UCL
##  TOA      Blue Cross/Blue Shield 4.718326 0.9436969 Inf  3.188173   6.98287
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Medicaid data:
##  scenario insurance     rate       SE  df asymp.LCL asymp.UCL
##  TOA      Medicaid  6.438937 1.293899 Inf   4.34276  9.546902
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Interaction p-value for scenario TOA : NA 
## Wait times for Medicaid are longer compared to Blue Cross/Blue Shield.
## 
## Scenario: Pregnancy after tubal 
## Filtered data for scenario:
##  scenario              insurance                  rate       SE  df asymp.LCL
##  Pregnancy after tubal Blue Cross/Blue Shield 6.408744 1.284882 Inf  4.326298
##  Pregnancy after tubal Medicaid               5.607467 1.133887 Inf  3.772637
##  asymp.UCL
##   9.493566
##   8.334670
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Blue Cross/Blue Shield data:
##  scenario              insurance                  rate       SE  df asymp.LCL
##  Pregnancy after tubal Blue Cross/Blue Shield 6.408744 1.284882 Inf  4.326298
##  asymp.UCL
##   9.493566
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Medicaid data:
##  scenario              insurance     rate       SE  df asymp.LCL asymp.UCL
##  Pregnancy after tubal Medicaid  5.607467 1.133887 Inf  3.772637   8.33467
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Interaction p-value for scenario Pregnancy after tubal : <0.01 
## Wait times for Medicaid are shorter compared to Blue Cross/Blue Shield.
## 
## Scenario: UTI 
## Filtered data for scenario:
##  scenario insurance                  rate        SE  df asymp.LCL asymp.UCL
##  UTI      Blue Cross/Blue Shield 3.459394 0.7314328 Inf  2.285743  5.235677
##  UTI      Medicaid               3.114619 0.6627911 Inf  2.052433  4.726511
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Blue Cross/Blue Shield data:
##  scenario insurance                  rate        SE  df asymp.LCL asymp.UCL
##  UTI      Blue Cross/Blue Shield 3.459394 0.7314328 Inf  2.285743  5.235677
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Medicaid data:
##  scenario insurance     rate        SE  df asymp.LCL asymp.UCL
##  UTI      Medicaid  3.114619 0.6627911 Inf  2.052433  4.726511
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Interaction p-value for scenario UTI : <0.01 
## Wait times for Medicaid are shorter compared to Blue Cross/Blue Shield.
## 
## Scenario: Vaginitis 
## Filtered data for scenario:
##  scenario  insurance                  rate       SE  df asymp.LCL asymp.UCL
##  Vaginitis Blue Cross/Blue Shield 8.426966 1.622098 Inf  5.778623  12.28904
##  Vaginitis Medicaid               6.909619 1.339333 Inf  4.725639  10.10294
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Blue Cross/Blue Shield data:
##  scenario  insurance                  rate       SE  df asymp.LCL asymp.UCL
##  Vaginitis Blue Cross/Blue Shield 8.426966 1.622098 Inf  5.778623  12.28904
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Medicaid data:
##  scenario  insurance     rate       SE  df asymp.LCL asymp.UCL
##  Vaginitis Medicaid  6.909619 1.339333 Inf  4.725639  10.10294
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## Interaction p-value for scenario Vaginitis : <0.01 
## Wait times for Medicaid are shorter compared to Blue Cross/Blue Shield.

## 
## Generated sentences:

## TOA: Patients with Blue Cross/Blue Shield insurance wait 4.7 days, with a 95% confidence interval (CI) ranging from 3.2 to 7.0 days. Medicaid recipients in this scenario experience longer waits, at 6.4 days with a CI of 4.3 to 9.5 days (p-value = NA).
## 
## Pregnancy after tubal: Patients with Blue Cross/Blue Shield insurance wait 6.4 days, with a 95% confidence interval (CI) ranging from 4.3 to 9.5 days. Medicaid recipients in this scenario experience shorter waits, at 5.6 days with a CI of 3.8 to 8.3 days (p-value = <0.01).
## 
## UTI: Patients with Blue Cross/Blue Shield insurance wait 3.5 days, with a 95% confidence interval (CI) ranging from 2.3 to 5.2 days. Medicaid recipients in this scenario experience shorter waits, at 3.1 days with a CI of 2.1 to 4.7 days (p-value = <0.01).
## 
## Vaginitis: Patients with Blue Cross/Blue Shield insurance wait 8.4 days, with a 95% confidence interval (CI) ranging from 5.8 to 12.3 days. Medicaid recipients in this scenario experience shorter waits, at 6.9 days with a CI of 4.7 to 10.1 days (p-value = <0.01).
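The output above notes that the intervals are back-transformed from the log scale. The sketch below reproduces the TOA, Blue Cross/Blue Shield interval from the printed rate and SE. It assumes the printed SE is on the response scale and is converted to the log scale with the delta method (SE divided by the rate); that assumption is not stated in the output, but it matches the printed limits. The function name `backtransformed_ci` is made up for this example.

```python
import math
from statistics import NormalDist

def backtransformed_ci(rate, se_rate, level=0.95):
    """Build a Wald interval on the log scale and exponentiate its limits."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    se_log = se_rate / rate                    # delta-method SE on the log scale (assumption)
    center = math.log(rate)
    return math.exp(center - z * se_log), math.exp(center + z * se_log)

# Rate and SE for TOA, Blue Cross/Blue Shield, copied from the output above.
lower, upper = backtransformed_ci(4.718326, 0.9436969)
print(round(lower, 2), round(upper, 2))
```

This is why the intervals are not symmetric around the rate: they are symmetric on the log scale and become skewed after exponentiation.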

Poisson Model

The models need to handle NA values in the business_days_until_appointment outcome variable (558) and data that are not normally distributed.

business_days_until_appointment can be transformed with a square root function so that zero values stay finite: sqrt(0) = 0, whereas log(0) is undefined (it tends to negative infinity).
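A small sketch of why zero waiting times matter for the choice of transformation. It uses only Python's standard `math` module; the helper name `safe_log` is hypothetical.

```python
import math

def safe_log(x):
    """Return log(x), or None when x is 0 and the logarithm is undefined."""
    try:
        return math.log(x)
    except ValueError:
        return None

# Waiting times of 0 days occur when the caller is sent to the ED.
for days in (0, 1, 4, 9):
    print(days, math.sqrt(days), safe_log(days))
```

The square root keeps a zero-day wait at 0, while the logarithm has no finite value there.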

Full Poisson Model `poisson_full_model`

\[ \begin{align*}
P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\
\sqrt{\lambda} &= \beta_0 \\
& + \beta_1 \cdot \text{Patient Insurance} \\
& + \beta_2 \cdot \text{US Census Bureau Subdivision} \\
& + \beta_3 \cdot \text{Physician Academic Affiliation} \\
& + \beta_4 \cdot \text{Physician Age} \\
& + \beta_5 \cdot \text{Physician Gender} \\
& + \beta_6 \cdot \text{Physician Honorific} \\
& + \beta_7 \cdot \text{Physician US Census Bureau} \\
& + \beta_8 \cdot \text{UTI, TOA, Vaginitis, Ectopic Scenario} \\
& + \beta_9 \cdot \text{Date that the call was made} \\
& + \beta_{10} \cdot \text{Appointment Central Number} \\
& + \beta_{11} \cdot \text{Number of Phone Transfers} \\
& + \beta_{12} \cdot \text{Minutes on the phone} \\
& + \beta_{13} \cdot \text{Minutes on hold} \\
& + \beta_{14} \cdot \text{Rurality} \\
& + ( 1 \mid \text{Physician NPI})
\end{align*} \]
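The first line of the formula is the Poisson probability mass function. The sketch below evaluates it for a hypothetical linear predictor, using the square-root link written above (so the rate is the square of the linear predictor). The value 3.0 and the function names are assumptions for illustration.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def rate_from_sqrt_link(eta):
    """Invert sqrt(lambda) = eta to recover the Poisson mean."""
    return eta ** 2

lam = rate_from_sqrt_link(3.0)  # hypothetical linear predictor, giving lambda = 9
probs = [poisson_pmf(x, lam) for x in range(60)]
mean_days = sum(x * p for x, p in enumerate(probs))
print(round(sum(probs), 6), round(mean_days, 6))
```

The probabilities sum to 1 and their mean equals lambda, which is why the regression is written in terms of lambda.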

Single predictor models for `poisson_full_model`

What variables are significant in poisson_full_model?

\[ \begin{align*} \log(\lambda) &= \beta_0 \\ & + \beta_1 \cdot \text{Individual Predictor} \\ & + (1 \mid \text{Physician NPI}) \end{align*} \]

This analysis explores the significance of various predictors on the outcome variable business_days_until_appointment, accounting for the random effects associated with physicians. The goal is to identify which variables significantly influence the time to appointment while controlling for variability across individual physicians.

The step-by-step approach demonstrates how individual predictors are assessed for their significance in influencing the response variable while accounting for the random effects associated with repeated measures on physicians. Significant variables will be used in the final multivariate model to better understand their impact on appointment wait times.

##                        Predictor      P_Value               IRR
## 1                         gender 0.0004052041      0.0001794285
## 2                       academic 0.0022440459 578923.1830971744
## 3              hold_time_minutes 0.0403542942      5.1010672992
## 4                            age 0.0631738778      0.8078885085
## 5 Medicaid_to_Medicare_Fee_Index 0.1296461022      0.8919231411
## 6                        Med_sch 0.1789727708      0.0121337674
## 7                Grd_yr_category 0.1855833254    122.8246619928
##           CI_Lower            CI_Upper  Wait_Time_Effect
## 1   0.000001577188          0.02041266 shorter wait time
## 2 123.402476751912 2715926460.70752954  longer wait time
## 3   1.078788441501         24.12047311  longer wait time
## 4   0.645570901010          1.01101806 shorter wait time
## 5   0.769558824876          1.03374409 shorter wait time
## 6   0.000019773598          7.44570167 shorter wait time
## 7   0.100535232272     150055.82871536  longer wait time

##                        Predictor P_Value       IRR CI_Lower      CI_Upper
## 1                         gender   <0.01      0.00     0.00          0.02
## 2                       academic   <0.01 578923.18   123.40 2715926460.71
## 3              hold_time_minutes   0.040      5.10     1.08         24.12
## 4                            age   0.063      0.81     0.65          1.01
## 5 Medicaid_to_Medicare_Fee_Index   0.130      0.89     0.77          1.03
## 6                        Med_sch   0.179      0.01     0.00          7.45
## 7                Grd_yr_category   0.186    122.82     0.10     150055.83
##    Wait_Time_Effect
## 1 shorter wait time
## 2  longer wait time
## 3  longer wait time
## 4 shorter wait time
## 5 shorter wait time
## 6 shorter wait time
## 7  longer wait time

Significant Variables Predicting Number of Business Days until Appointment
Predictor	P_Value	IRR	CI_Lower	CI_Upper	Wait_Time_Effect
gender	<0.01	0.00	0.00	0.02	shorter wait time
academic	<0.01	578923.18	123.40	2715926460.71	longer wait time
hold_time_minutes	0.040	5.10	1.08	24.12	longer wait time
age	0.063	0.81	0.65	1.01	shorter wait time
Medicaid_to_Medicare_Fee_Index	0.130	0.89	0.77	1.03	shorter wait time
Med_sch	0.179	0.01	0.00	7.45	shorter wait time
Grd_yr_category	0.186	122.82	0.10	150055.83	longer wait time

Troubleshooting large IRR for `academic`

The analysis and boxplot make the source of the high IRR clearer. The main findings are:

Key Insights:

Sample Imbalance:
- There is a major imbalance in the number of observations between Private Practice (556 cases) and University (47 cases). This discrepancy could lead to inflated coefficients, especially if the smaller group (University) has greater variability in wait times. This could explain why the estimate for academicUniversity is so large and significant.

Fixed Effects:
- The model indicates that being at a University is associated with a longer wait time, with an Estimate of 13.905 (p = 0.00124). This suggests that patients at University settings wait, on average, about 13.9 more days than those at private practices.
- However, due to the imbalance in the dataset and some high variance in wait times for university cases, this estimate might be exaggerated. The few outliers seen in the boxplot for University settings could be contributing to this as well.
Random Effects:
- The random effects (NPI) show variability among individual providers (standard deviation of 17.53). This means that individual providers still account for a fair amount of variation in wait times, which is typical in mixed-effects models.

Recommendations to Address the IRR Issue:

Consider Balancing the Dataset:
- The imbalance between University and Private Practice may lead to inflated estimates. You could try down-sampling the larger group (Private Practice) or performing bootstrapping to create a more balanced dataset. This might provide a more realistic estimate for the effect of academicUniversity.
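The down-sampling idea can be sketched as follows. The records are hypothetical placeholders that only reproduce the group sizes quoted above (556 vs 47), and the helper `downsample` is a made-up name. Note that this simple version samples individual calls and ignores the clustering of calls within NPI, which a real analysis would need to respect.

```python
import random

def downsample(records, group_key, seed=0):
    """Randomly trim every group to the size of the smallest group."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[group_key], []).append(rec)
    target = min(len(members) for members in groups.values())
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    balanced = []
    for members in groups.values():
        balanced.extend(rng.sample(members, target))
    return balanced

# Hypothetical records with the group sizes quoted above.
records = (
    [{"academic": "Private Practice", "id": i} for i in range(556)]
    + [{"academic": "University", "id": i} for i in range(47)]
)
balanced = downsample(records, "academic")
counts = {}
for rec in balanced:
    counts[rec["academic"]] = counts.get(rec["academic"], 0) + 1
print(counts)
```

Down-sampling discards most of the Private Practice calls, so repeating the draw with several seeds and comparing the estimates is one way to see how stable the result is.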

Rerun `poisson_full_model` by removing `academic`

##                        Predictor      P_Value            IRR       CI_Lower
## 1                         gender 0.0004052041   0.0001794285 0.000001577188
## 2              hold_time_minutes 0.0403542942   5.1010672992 1.078788441501
## 3                            age 0.0631738778   0.8078885085 0.645570901010
## 4 Medicaid_to_Medicare_Fee_Index 0.1296461022   0.8919231411 0.769558824876
## 5                        Med_sch 0.1789727708   0.0121337674 0.000019773598
## 6                Grd_yr_category 0.1855833254 122.8246619928 0.100535232272
##          CI_Upper  Wait_Time_Effect
## 1      0.02041266 shorter wait time
## 2     24.12047311  longer wait time
## 3      1.01101806 shorter wait time
## 4      1.03374409 shorter wait time
## 5      7.44570167 shorter wait time
## 6 150055.82871536  longer wait time

##                        Predictor P_Value    IRR CI_Lower  CI_Upper
## 1                         gender   <0.01   0.00     0.00      0.02
## 2              hold_time_minutes   0.040   5.10     1.08     24.12
## 3                            age   0.063   0.81     0.65      1.01
## 4 Medicaid_to_Medicare_Fee_Index   0.130   0.89     0.77      1.03
## 5                        Med_sch   0.179   0.01     0.00      7.45
## 6                Grd_yr_category   0.186 122.82     0.10 150055.83
##    Wait_Time_Effect
## 1 shorter wait time
## 2  longer wait time
## 3 shorter wait time
## 4 shorter wait time
## 5 shorter wait time
## 6  longer wait time

Significant Variables Predicting Number of Business Days until Appointment WITHOUT ACADEMIC
Predictor	P_Value	IRR	CI_Lower	CI_Upper	Wait_Time_Effect
gender	<0.01	0.00	0.00	0.02	shorter wait time
hold_time_minutes	0.040	5.10	1.08	24.12	longer wait time
age	0.063	0.81	0.65	1.01	shorter wait time
Medicaid_to_Medicare_Fee_Index	0.130	0.89	0.77	1.03	shorter wait time
Med_sch	0.179	0.01	0.00	7.45	shorter wait time
Grd_yr_category	0.186	122.82	0.10	150055.83	longer wait time

Robust LMM with `log_business_days_until_appointments` with `academic`

## 
## Private Practice       University 
##              537               45

## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: formula_simple
##    Data: df3_filtered
## 
##      AIC      BIC   logLik deviance df.resid 
##   5388.4   5405.9  -2690.2   5380.4      578 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3249 -0.4584 -0.2239  0.2451  6.3292 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  NPI      (Intercept) 296.6    17.22   
##  Residual             354.8    18.84   
## Number of obs: 582, groups:  NPI, 401
## 
## Fixed effects:
##                    Estimate Std. Error      df t value             Pr(>|t|)    
## (Intercept)          16.851      1.230 348.861  13.701 < 0.0000000000000002 ***
## academicUniversity   13.269      4.313 385.358   3.076              0.00224 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## acdmcUnvrst -0.285

## Robust linear mixed model fit by DAStau 
## Formula: formula_simple 
##    Data: df3_filtered 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.0400 -0.7605 -0.2985  0.7410 13.3397 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  NPI      (Intercept)   0.0     0.00   
##  Residual             299.8    17.32   
## Number of obs: 582, groups: NPI, 401
## 
## Fixed effects:
##                    Estimate Std. Error t value
## (Intercept)         13.1685     0.7664  17.182
## academicUniversity   4.8409     2.7562   1.756
## 
## Correlation of Fixed Effects:
##             (Intr)
## acdmcUnvrst -0.278
## 
## Robustness weights for the residuals: 
##  483 weights are ~= 1. The remaining 99 ones are summarized as
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.101   0.389   0.570   0.583   0.754   0.995 
## 
## Robustness weights for the random effects: 
##  All 401 weights are ~= 1.
## 
## Rho functions used for fitting:
##   Residuals:
##     eff: smoothed Huber (k = 1.345, s = 10) 
##     sig: smoothed Huber, Proposal 2 (k = 1.345, s = 10) 
##   Random Effects, variance component 1 (NPI):
##     eff: smoothed Huber (k = 1.345, s = 10) 
##     vcp: smoothed Huber, Proposal 2 (k = 1.345, s = 10)

Robust LMM with log_business_days_until_appointments

\[ \begin{align*} P(\text{{Business Days until New Patient Appointment}} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\ \sqrt{\lambda} &= \beta_0 \\ & + \beta_1 \cdot \underline{\mathbf{\large{\text{{SINGLE PREDICTOR}}}}} \\ & + ( 1 | \text{{Physician NPI}}) \end{align*} \]

## The following predictors were found to be significant predicting business days until new patient appointment:
## -  gender : p = <0.01 
## -  hold_time_minutes : p = 0.04 
## -  age : p = 0.06 
## -  Medicaid_to_Medicare_Fee_Index : p = 0.13 
## -  Med_sch : p = 0.18 
## -  Grd_yr_category : p = 0.19

Model `poisson_significant` Formula with only significant variables

\[ \begin{align*}
P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\
\log(\lambda) &= \beta_0 \\
& + \beta_1 \cdot \text{Physician Gender} \\
& + \beta_2 \cdot \text{Minutes on hold} \\
& + \beta_3 \cdot \text{Physician Age} \\
& + \beta_4 \cdot \text{Medicaid-to-Medicare Fee Index} \\
& + \beta_5 \cdot \text{Medical School Location} \\
& + \beta_6 \cdot \text{Graduation Year Category} \\
& + ( 1 \mid \text{Physician NPI})
\end{align*} \]

where:

Fixed effects include…
Random effects account for variability between physicians, modeled as a random intercept.

The random effect for physician suggests that there is substantial variability in appointment wait times between physicians. Physicians with a higher random intercept will tend to have longer wait times than physicians with a lower random intercept.

`poisson` Model with only significant variables

## Generalized linear mixed model fit by maximum likelihood (Adaptive
##   Gauss-Hermite Quadrature, nAGQ = 0) [glmerMod]
##  Family: poisson  ( log )
## Formula: business_days_until_appointment ~ gender + hold_time_minutes +  
##     age + Medicaid_to_Medicare_Fee_Index + Med_sch + Grd_yr_category +  
##     (1 | NPI)
##    Data: df3
## 
##      AIC      BIC   logLik deviance df.resid 
##   4535.9   4576.7  -2258.0   4515.9      425 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -6.5839 -0.7957  0.0010  0.0884  7.0372 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  NPI    (Intercept) 3.613    1.901   
## Number of obs: 435, groups:  NPI, 321
## 
## Fixed effects:
##                                        Estimate Std. Error z value Pr(>|z|)  
## (Intercept)                            2.991991   1.584621   1.888    0.059 .
## genderMale                            -0.362270   0.253352  -1.430    0.153  
## hold_time_minutes                     -0.026211   0.015268  -1.717    0.086 .
## age                                   -0.011317   0.022985  -0.492    0.622  
## Medicaid_to_Medicare_Fee_Index        -0.001761   0.007185  -0.245    0.806  
## Med_schInternational Medical Graduate -0.174718   0.300588  -0.581    0.561  
## Grd_yr_category1990 to 1999            0.021450   0.387624   0.055    0.956  
## Grd_yr_category2000 to 2009           -0.274714   0.525830  -0.522    0.601  
## Grd_yr_category2010 or greater        -0.569103   0.711369  -0.800    0.424  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) gndrMl hld_t_ age    M__M_F Md_IMG G__1t1 G__2t2
## genderMale   0.122                                                 
## hld_tm_mnts -0.032  0.006                                          
## age         -0.926 -0.256  0.019                                   
## Mdcd__M_F_I -0.306  0.051  0.009 -0.018                            
## Md_schIntMG -0.088 -0.080  0.014  0.021  0.114                     
## G__1990t199 -0.644  0.021  0.022  0.550  0.023  0.087              
## G__2000t200 -0.800  0.019  0.010  0.739  0.033  0.039  0.760       
## Grd__2010og -0.866 -0.068  0.017  0.851  0.005 -0.021  0.723  0.833

Table of `poisson_significant` Model Coefficients

Generic Interpretation of Significant Predictors: In a Poisson regression, significant predictors are those with p-values below a chosen threshold (usually p < 0.05). These predictors have a statistically significant association with the outcome variable, in this case business days until an appointment. The Incidence Rate Ratios (IRRs), which are the exponentiated model coefficients, describe the direction and magnitude of these effects:

- IRRs > 1: The predictor increases the expected mean number of business days. For example, an IRR of 2 means the expected waiting time is twice as long for that category compared to the reference group.
- IRRs < 1: The predictor decreases the expected mean number of business days. For instance, an IRR of 0.5 means the expected waiting time is half that of the reference group.
- p-values < 0.05: Indicate that the effect of the predictor is statistically significant, meaning an effect at least this large would be unlikely to arise by chance if the true effect were zero.
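As a minimal sketch of where the IRRs come from: each IRR is the exponentiated log-scale estimate, and a Wald 95% interval is obtained by exponentiating the estimate plus or minus about 1.96 standard errors. The two estimates and standard errors below are copied from the fixed-effects summary above; the object names (`est`, `se`, `irr_table`) are illustrative only and are not taken from the analysis code.

```r
# Log-scale estimates and standard errors copied from the model summary
est <- c(genderMale = -0.362270, hold_time_minutes = -0.026211)
se  <- c(genderMale =  0.253352, hold_time_minutes =  0.015268)

# IRR = exp(estimate); Wald 95% CI = exp(estimate -/+ z * SE)
z <- qnorm(0.975)
irr_table <- data.frame(
  IRR   = exp(est),
  lower = exp(est - z * se),
  upper = exp(est + z * se)
)

print(round(irr_table, 2))
```

Rounded to two decimals, these values should agree with the gender and hold time rows of the coefficient table.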

Analysis Based on Current Results

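Significant Predictors:

No fixed-effect predictor reaches p < 0.05 in the coefficient table for this model. Hold time comes closest.

Hold Time (IRR = 0.97, 95% CI 0.95 to 1.00, p = 0.086):
- Interpretation: Each additional minute spent on hold is associated with a roughly 3% shorter expected wait for an appointment (IRR = 0.97). The confidence interval reaches 1, so this effect is not statistically significant at the 0.05 level.
- Example: IRRs compound multiplicatively, so an extra 5 minutes on hold corresponds to a factor of about 0.97^5 ≈ 0.86. An expected wait of 15 days would then decrease to approximately 13 days.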
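A minimal sketch of how a per-minute IRR compounds over several minutes, using the `hold_time_minutes` estimate from the model summary (-0.026211). The 15-day baseline is purely illustrative and is not a model output; the variable names are hypothetical.

```r
# hold_time_minutes estimate (log scale) copied from the model summary
beta_hold <- -0.026211

# Effects compound multiplicatively on the rate scale
irr_per_minute <- exp(beta_hold)
irr_per_5_min  <- exp(5 * beta_hold)  # same as irr_per_minute^5

# Illustrative baseline of 15 business days (assumed, not a model output)
baseline_days <- 15
expected_days <- baseline_days * irr_per_5_min

round(c(irr_per_minute = irr_per_minute,
        irr_per_5_min  = irr_per_5_min,
        expected_days  = expected_days), 2)
```

Under these assumptions the expected wait after five extra minutes on hold is roughly 13 business days.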

Non-Significant Predictors:

Gender (Male) (IRR = 0.70, p = 0.153):
- Interpretation: Male physicians are associated with a 30% reduction in expected waiting time compared to female physicians (IRR = 0.70), but this effect is not statistically significant (p = 0.153).
Academic Setting (University):
- Interpretation: This term does not appear in the coefficient table for this model, so no IRR is reported for it here.
Age (IRR = 0.99, p = 0.622):
- Interpretation: Each additional year of age is associated with about a 1% reduction in expected waiting time; the effect is minimal and not statistically significant.
Graduation Year Category (2010 or greater) (IRR = 0.57, p = 0.424):
- Interpretation: Physicians who graduated in 2010 or later were associated with a 43% reduction in expected waiting time relative to the reference category (Less than 1990), though this result is not statistically significant.
International Medical Graduates (IRR = 0.84, p = 0.561):
- Interpretation: Being an international medical graduate is associated with a 16% reduction in expected waiting time, though this effect is not statistically significant.

Random Effects and Marginal/Conditional R²:

Random Effects:
- Variance (NPI): 3.61 (indicating substantial variability between NPIs).
- Intraclass Correlation Coefficient (ICC): 0.99 (suggesting that 99% of the variation in waiting times is attributable to differences between NPIs).
Marginal R² (0.017): Only 1.7% of the variance in waiting time is explained by the fixed effects in the model, such as hold time, gender, age, and graduation year.
Conditional R² (0.993): When accounting for both fixed effects and random effects (NPI variability), the model explains 99.3% of the total variance in waiting times.

Summary: In the random effects model, no fixed-effect predictor is statistically significant at the 0.05 level. Hold time shows a small negative association with waiting time (IRR = 0.97, p = 0.086), and gender, age, medical school type, and graduation year show non-significant effects. The high ICC (0.99) indicates that the majority of the variability in waiting times is due to differences between providers (NPIs). The conditional R² (0.993) is high almost entirely because of the NPI random effect; the fixed effects alone explain very little of the variance (marginal R² = 0.017).

	business days until appointment
Predictors	Incidence Rate Ratios	CI	p
(Intercept)	19.93	0.89 – 444.87	0.059
gender [Male]	0.70	0.42 – 1.14	0.153
hold time minutes	0.97	0.95 – 1.00	0.086
age	0.99	0.95 – 1.03	0.622
Medicaid to Medicare Fee Index	1.00	0.98 – 1.01	0.806
Med sch [International Medical Graduate]	0.84	0.47 – 1.51	0.561
Grd yr category [1990 to 1999]	1.02	0.48 – 2.18	0.956
Grd yr category [2000 to 2009]	0.76	0.27 – 2.13	0.601
Grd yr category [2010 or greater]	0.57	0.14 – 2.28	0.424
Random Effects
σ²	0.02
τ₀₀ _NPI	3.61
ICC	0.99
N _NPI	321
Observations	435
Marginal R² / Conditional R²	0.017 / 0.993
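A minimal sketch of how the ICC in the table relates to the two variance components reported with it, assuming the usual ratio of between-group variance to between-group plus residual variance. The inputs are the rounded values from the table, so only about two decimals can be reproduced; the variable names are illustrative.

```r
# Variance components as reported (rounded) in the table above
tau00_npi <- 3.61  # between-NPI (random intercept) variance
sigma2    <- 0.02  # residual (observation-level) variance

# ICC = between-group variance / (between-group + residual variance)
icc <- tau00_npi / (tau00_npi + sigma2)
round(icc, 2)
```

Because the residual variance is tiny relative to the NPI variance, the ratio is close to 1, which is also why the conditional R² is so much larger than the marginal R².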

Visualize the `poisson_significant` model Fixed Effects

`poisson_significant` Model Performance

## We fitted a poisson mixed model (estimated using ML and BOBYQA optimizer) to
## predict business_days_until_appointment with gender, hold_time_minutes, age,
## Medicaid_to_Medicare_Fee_Index, Med_sch and Grd_yr_category (formula:
## business_days_until_appointment ~ gender + hold_time_minutes + age +
## Medicaid_to_Medicare_Fee_Index + Med_sch + Grd_yr_category). The model included
## NPI as random effect (formula: ~1 | NPI). The model's total explanatory power
## is substantial (conditional R2 = 0.99) and the part related to the fixed
## effects alone (marginal R2) is of 0.02. The model's intercept, corresponding to
## gender = Female, hold_time_minutes = 0, age = 0, Medicaid_to_Medicare_Fee_Index
## = 0, Med_sch = US Senior Medical Student and Grd_yr_category = Less than 1990,
## is at 2.99 (95% CI [-0.11, 6.10], p = 0.059). Within this model:
## 
##   - The effect of gender [Male] is statistically non-significant and negative
## (beta = -0.36, 95% CI [-0.86, 0.13], p = 0.153; Std. beta = -0.36, 95% CI
## [-0.86, 0.13])
##   - The effect of hold time minutes is statistically non-significant and negative
## (beta = -0.03, 95% CI [-0.06, 3.71e-03], p = 0.086; Std. beta = -0.04, 95% CI
## [-0.08, 5.16e-03])
##   - The effect of age is statistically non-significant and negative (beta =
## -0.01, 95% CI [-0.06, 0.03], p = 0.622; Std. beta = -0.12, 95% CI [-0.60,
## 0.36])
##   - The effect of Medicaid to Medicare Fee Index is statistically non-significant
## and negative (beta = -1.76e-03, 95% CI [-0.02, 0.01], p = 0.806; Std. beta =
## -0.03, 95% CI [-0.25, 0.20])
##   - The effect of Med sch [International Medical Graduate] is statistically
## non-significant and negative (beta = -0.17, 95% CI [-0.76, 0.41], p = 0.561;
## Std. beta = -0.17, 95% CI [-0.76, 0.41])
##   - The effect of Grd yr category [1990 to 1999] is statistically non-significant
## and positive (beta = 0.02, 95% CI [-0.74, 0.78], p = 0.956; Std. beta = 0.02,
## 95% CI [-0.74, 0.78])
##   - The effect of Grd yr category [2000 to 2009] is statistically non-significant
## and negative (beta = -0.27, 95% CI [-1.31, 0.76], p = 0.601; Std. beta = -0.27,
## 95% CI [-1.31, 0.76])
##   - The effect of Grd yr category [2010 or greater] is statistically
## non-significant and negative (beta = -0.57, 95% CI [-1.96, 0.83], p = 0.424;
## Std. beta = -0.57, 95% CI [-1.96, 0.83])
## 
## Standardized parameters were obtained by fitting the model on a standardized
## version of the dataset. 95% Confidence Intervals (CIs) and p-values were
## computed using a Wald z-distribution approximation.

## The marginal R² value of the model is 0.017 and the conditional R² value is 0.993

## The marginal R² represents the proportion of variance explained by the fixed effects ( (Intercept), genderMale, hold_time_minutes, age, Medicaid_to_Medicare_Fee_Index, Med_schInternational Medical Graduate, Grd_yr_category1990 to 1999, Grd_yr_category2000 to 2009, Grd_yr_category2010 or greater ) alone ( 1.67 %). The conditional R² represents the proportion of variance explained by both the fixed effects and the random effects ( NPI ) combined ( 99.34 %). This indicates how much of the variability in the outcome can be attributed to the fixed effects versus the entire model, including random effects.

For poisson_significant model: To determine which random effects were significant in your model, you need to look at the variance components for the random effects and their corresponding standard deviations. In mixed models, random effects themselves do not have p-values like fixed effects do. Instead, you evaluate their significance by looking at the variance of the random effects. If the variance is near zero, the random effect may not be contributing much to the model.

Here’s how you can extract and interpret the variance of the random effects to assess their significance for poisson_significant:

## [1] "The random effects in the model are:\n NPI"             
## [2] "The random effects in the model are:\n (Intercept)"     
## [3] "The random effects in the model are:\n NA"              
## [4] "The random effects in the model are:\n 3.61312108307961"
## [5] "The random effects in the model are:\n 1.90082116020409"
## [6] "The random effects in the model are:\n Yes"

## The significant random effects are: NPI
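The numbered lines above read like one message pasted across every column of a variance-components table. A minimal sketch of a per-row message instead, assuming the table has the column layout of `as.data.frame(lme4::VarCorr(model))` (`grp`, `var1`, `var2`, `vcov`, `sdcor`). The data frame here is built by hand from the printed values rather than extracted from the fitted model, and the near-zero cutoff is a hypothetical choice.

```r
# Hand-built stand-in for as.data.frame(lme4::VarCorr(model)),
# using the values printed above
vc <- data.frame(
  grp   = "NPI",
  var1  = "(Intercept)",
  var2  = NA_character_,
  vcov  = 3.61312108307961,
  sdcor = 1.90082116020409,
  stringsAsFactors = FALSE
)

# Hypothetical cutoff: treat a variance this close to zero as negligible
near_zero <- 1e-4
vc$contributes <- vc$vcov > near_zero

# One readable line per random-effect term
msgs <- sprintf(
  "Random effect %s %s: variance = %.2f, SD = %.2f, contributes = %s",
  vc$grp, vc$var1, vc$vcov, vc$sdcor, ifelse(vc$contributes, "Yes", "No")
)
cat(msgs, sep = "\n")
```

Formatting row by row keeps the group name, variance, and standard deviation together, and extends naturally if more random-effect terms are added.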

`simr_poisson_full_model` Model Power analysis

The power analysis uses the powerSim function to estimate, by simulation, the statistical power of the model to detect the effect of a specific predictor, in this case insurance, in a Poisson mixed-effects model.

Test the `poisson_significant` model assumptions

Binned residuals were checked because, with count data, the raw residuals are not expected to be normally distributed. Collinearity and heteroscedasticity were checked as well.

The residuals spread out more as the fitted values increase. This funnel shape (wider dispersion of residuals at higher fitted values) is an indication of heteroscedasticity. In a model with homoscedasticity, the residuals would have a consistent spread across all levels of fitted values, without a clear pattern. Because the outcome is skewed count data, the residuals are not expected to fall within the error bounds.

`poisson_significant` Collinearity

Variance Inflation Factors (VIF) were calculated to assess multicollinearity among predictors. All VIF values were below the commonly used threshold of 5, suggesting that multicollinearity is not a concern for this model.

Variance Inflation Factors
	GVIF	Df	GVIF^(1/(2*Df))
gender	1.228796	1	1.108511
hold_time_minutes	1.001360	1	1.000679
age	4.677638	1	2.162785
Medicaid_to_Medicare_Fee_Index	1.019788	1	1.009845
Med_sch	1.054505	1	1.026891
Grd_yr_category	4.628542	3	1.290944
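A minimal sketch that reproduces the third column of the table from the first two. The GVIF and Df values are copied from the table above; squaring the Df-adjusted value to compare it against the usual VIF threshold of 5 is a common convention and is stated here as an assumption about how the threshold was applied.

```r
# GVIF and degrees of freedom copied from the table above
gvif <- c(gender = 1.228796, hold_time_minutes = 1.001360, age = 4.677638,
          Medicaid_to_Medicare_Fee_Index = 1.019788, Med_sch = 1.054505,
          Grd_yr_category = 4.628542)
dfs  <- c(1, 1, 1, 1, 1, 3)

# Df-adjusted value, comparable across terms with different Df
adj_gvif <- gvif^(1 / (2 * dfs))

# Squaring the adjusted value puts it back on the usual VIF scale
vif_scale <- adj_gvif^2

data.frame(GVIF = gvif, Df = dfs, adjusted = round(adj_gvif, 6),
           vif_scale = round(vif_scale, 3), below_5 = vif_scale < 5)
```

Age and graduation year category carry the largest values, which is consistent with the strong correlations between those terms in the fixed-effects correlation matrix.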

## 86 outliers detected: cases 8, 9, 15, 17, 18, 21, 25, 30, 32, 35, 36,
##   43, 54, 55, 58, 60, 63, 68, 69, 72, 73, 92, 100, 105, 113, 133, 137,
##   145, 151, 152, 154, 158, 159, 171, 180, 181, 190, 196, 197, 198, 203,
##   208, 209, 216, 218, 226, 227, 247, 251, 252, 260, 261, 262, 263, 267,
##   268, 278, 284, 285, 295, 296, 298, 301, 309, 313, 319, 321, 331, 338,
##   342, 344, 348, 355, 357, 362, 363, 365, 366, 389, 396, 399, 408, 409,
##   413, 425, 428.
## - Based on the following method and threshold: cook (0.9).
## - For variable: (Whole model).

`poisson` Intraclass Correlation Coefficient

The Intraclass Correlation Coefficient (ICC) is a statistical measure used to evaluate the proportion of variance in a dependent variable that can be attributed to differences between groups or clusters. It is commonly used in the context of hierarchical or mixed models to quantify the degree of similarity within clusters.

## The intraclass correlation (ICC) of the model for the random effect group ' NPI ' is 0.783 .
## This indicates that 78.3 % of the variance in the outcome variable is attributable to differences between the NPI groups.

## 
##  This is considered a high ICC for the NPI group, indicating that most of the variance is due to differences between these groups.

A high Intraclass Correlation Coefficient (ICC) for the group “physician NPI name” suggests that most of the variation in the outcome variable (e.g., business days until appointment) can be attributed to differences between individual physicians, while a smaller portion of the variation occurs within these groups, that is, between repeated calls to the same physician.

In practical terms, this indicates that:

- Variation Between Physicians: With an ICC of 0.783, appointment times are largely consistent within each physician. Some physicians systematically have longer or shorter wait times, and these differences account for about 78% of the variance in the data.
- Variation Within Physicians: The remaining 22% or so reflects variability in appointment times within the same physician. This could be due to a variety of factors, such as the type of insurance, the scenario, or other factors that are not captured by the physician’s identity alone.
- Implications: The high ICC suggests that the identity of the physician (as indicated by the NPI name) is the dominant source of differences in appointment times, which supports modeling NPI as a random effect. Factors captured by fixed effects or residual variance play a smaller role.

In summary, who the physician is matters a great deal in determining how long a patient waits for an appointment, and the other variables in the model explain comparatively little. This insight can guide you to look more closely at physician-level and practice-level factors in your analysis, or to consider whether there are ways to reduce variability, such as through standardized scheduling practices.

`poisson_significant` Dispersion

Overdispersion in your model implies that the variability in the observed data is greater than what the model predicts under the Poisson assumption. Specifically, in a Poisson model, the mean and variance of the count data are assumed to be equal.

## [1] "Significant overdispersion detected. Consider using a Negative Binomial model or adding random effects to account for overdispersion."

## Warning: Autocorrelated residuals detected (p < .001).

## [1] FALSE

To test this assumption, you can use the logLik function to get the log-likelihood of the model and calculate the residual deviance as -2 * logLik(model). The residual degrees of freedom can be computed as the number of observations minus the number of parameters estimated (which includes both fixed effects and random effects). The dispersion parameter is then the residual deviance divided by the residual degrees of freedom.

The number of parameters estimated can be calculated as the number of fixed effects plus the number of random effects parameters. The number of fixed effects can be obtained from the length of fixef(model), and the number of random effects parameters can be obtained from the length of VarCorr(model).

If the dispersion parameter is considerably greater than 1, it indicates overdispersion. If it is less than 1, it indicates underdispersion. A value around 1 is considered ideal for Poisson regression.
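A minimal sketch of a dispersion check. It swaps in the Pearson chi-square statistic divided by the residual degrees of freedom, a common alternative to the log-likelihood calculation described above, and it fits an ordinary Poisson `glm` to R's built-in `warpbreaks` data as a stand-in for the study's mixed model, so the resulting number says nothing about this analysis.

```r
# Ordinary Poisson GLM on a built-in dataset (stand-in for the study model)
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# Pearson chi-square divided by residual degrees of freedom
pearson_chisq    <- sum(residuals(fit, type = "pearson")^2)
dispersion_ratio <- pearson_chisq / df.residual(fit)

# Ratios well above 1 suggest overdispersion; well below 1, underdispersion
dispersion_ratio
```

For a mixed model, the same ratio can be formed from the model's Pearson residuals and its residual degrees of freedom, with the caveat that counting parameters for random effects is itself an approximation.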

## 'log Lik.' 7.894981 (df=10)

Linearity of the log link

The Poisson regression assumes that the log of the expected count is a linear function of the predictors. One way to check this is to plot the observed counts versus the predicted counts and see if the relationship looks linear.
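A minimal sketch of the observed-versus-predicted check, again using an ordinary Poisson `glm` on the built-in `warpbreaks` data as a stand-in for the study's mixed model. The data frame name `check` is illustrative.

```r
# Stand-in Poisson GLM on built-in data (not the study's mixed model)
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

check <- data.frame(
  observed  = warpbreaks$breaks,
  predicted = fitted(fit)  # expected counts on the response scale
)

# On log axes, points should scatter around the identity line
# if the log-linear form is adequate
plot(check$predicted, check$observed, log = "xy",
     xlab = "Predicted count", ylab = "Observed count")
abline(0, 1, lty = 2)
```

Systematic curvature away from the dashed line, rather than random scatter around it, would suggest that a predictor needs a transformation or an additional term.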

{r}
# # geography level: https://api.census.gov/data/2016/acs/acs5/geography.html
#
# # Vector of all state abbreviations, including DC
# state_abbreviations <- c(state.abb, "DC")
#
# # Retrieve ACS data for all ZIP codes (ZCTAs) without specifying a state
# income_data <- tidycensus::get_acs(
#   geography = "zcta",
#   variables = "B19013_001", # Median Household Income
#   survey = "acs5",
#   year = 2022,
#   cache_table = TRUE
# ) %>% dplyr::select(-moe)
#
# # Step 1: Prepare all_income_data with correct column names
# # Assuming data_dir is the directory where you want to save the file
# all_income_data <- income_data %>%
#   dplyr::rename(zip_code = GEOID, median_income = estimate) %>%
#   dplyr::mutate(zip_code = as.character(zip_code)) # Ensure zip_code is character
#
# # Save the all_income_data dataframe to the specified file path
# file_path <- file.path(data_dir, "median_household_income_by_zcta.rds")
# readr::write_rds(all_income_data, file_path)
#
# # Step 2: Prepare all_zips with consistent column names
# all_zips <- zipcodeR::zip_code_db %>%
#   dplyr::rename(zip_code = zipcode) %>% # Rename 'zipcode' to 'zip_code'
#   dplyr::select(zip_code, state) # Keep only relevant columns
#
# # Construct the full file path
# file_path <- file.path(data_dir, "Phase_2.rds")
# ucc_data <- readRDS(file_path) %>%
#   dplyr::rename(id_number = ID) %>%
#   dplyr::rename(zip_code = zip) %>%
#   mutate(state = exploratory::statecode(state, output = "alpha_code"))
#
# # Create a dataframe to store the top 10% affluent ZIP codes for each state
# affluent_zip_codes_summary <- data.frame(state = character(), zip_code = character(), median_income = numeric(), stringsAsFactors = FALSE)
#
# # Loop through each state abbreviation
# for (state_abbreviation in state_abbreviations) {
#
#   # Filter for the state's affluent ZIP codes (top 10%)
#   affluent_zip_codes <- all_income_data %>%
#     dplyr::inner_join(all_zips %>% dplyr::filter(state == state_abbreviation), by = "zip_code") %>%
#     dplyr::arrange(desc(median_income)) %>%
#     dplyr::slice(1:ceiling(n() * 0.10)) # Top 10% affluent ZIP codes in the state
#
#   # Append to the affluent ZIP codes summary
#   if (nrow(affluent_zip_codes) > 0) {
#     affluent_zip_codes <- affluent_zip_codes %>%
#       dplyr::mutate(state = state_abbreviation) %>%
#       dplyr::select(state, zip_code, median_income)
#
#     affluent_zip_codes_summary <- dplyr::bind_rows(affluent_zip_codes_summary, affluent_zip_codes) %>% dplyr::arrange(zip_code)
#   }
# }
#
# # View the summary of affluent ZIP codes
# print(affluent_zip_codes_summary)
#

{r}
# # physicians_in_affluent_zips <- affluent_zip_codes_summary
# # Calculate the percentage of physicians in affluent ZIP codes for each state
# physician_affluent_summary <- ucc_data %>%
#   dplyr::group_by(state) %>%
#   dplyr::summarize(total_physicians = n()) %>%
#   dplyr::left_join(
#     physicians_in_affluent_zips %>%
#       dplyr::group_by(state) %>%
#       dplyr::summarize(physicians_in_affluent = n()),
#     by = "state"
#   ) %>%
#   dplyr::mutate(
#     percent_in_affluent = (physicians_in_affluent / total_physicians) * 100
#   )
#
# # Replace NA with 0 for states with no physicians in affluent ZIP codes
# physician_affluent_summary <- physician_affluent_summary %>%
#   dplyr::mutate(
#     physicians_in_affluent = ifelse(is.na(physicians_in_affluent), 0, physicians_in_affluent),
#     percent_in_affluent = ifelse(is.na(percent_in_affluent), 0, percent_in_affluent)
#   )
#
# # Calculate the overall percentage for physicians in U.S. affluent ZIP codes
# total_physicians_us <- nrow(ucc_data)
# physicians_in_affluent_us <- nrow(physicians_in_affluent_zips)
#
# # Calculate the percentage
# percent_in_affluent_us <- (physicians_in_affluent_us / total_physicians_us) * 100
#
# # Add US-level summary to the output
# us_summary <- data.frame(
#   state = "US",
#   total_physicians = total_physicians_us,
#   physicians_in_affluent = physicians_in_affluent_us,
#   percent_in_affluent = percent_in_affluent_us
# )
#
# # Combine state and US-level summaries
# physician_affluent_summary <- dplyr::bind_rows(physician_affluent_summary, us_summary)
#
# # View the summary of physicians in affluent ZIP codes for each state and the U.S. overall
# print(physician_affluent_summary, n = nrow(physician_affluent_summary))
#

DRAFT: Urgent GYN Issue Mystery Caller Study

Tyler M. Muffly, MD and Melanie Mandell

11 November, 2024

