FYI: If a caller was told to go to the emergency department (ED), we set business_days_until_appointment to 0.

tyler install

Relative Directories

Read in data

## Rows: 585
## Columns: 29
## $ first                                                                                             <chr> …
## $ last                                                                                              <chr> …
## $ scenario                                                                                          <fct> …
## $ Subspecialty                                                                                      <fct> …
## $ state                                                                                             <fct> …
## $ practice_setting                                                                                  <fct> …
## $ NPI                                                                                               <dbl> …
## $ able_to_contact_office                                                                            <fct> …
## $ call_date_wday                                                                                    <ord> …
## $ central_number                                                                                    <fct> …
## $ number_of_transfers                                                                               <fct> …
## $ call_time_minutes                                                                                 <dbl> …
## $ business_days_until_appointment                                                                   <dbl> …
## $ will_this_physician_see_children_if_they_ask_about_insurance_say_they_have_blue_cross_blue_shield <fct> …
## $ reason_for_exclusions                                                                             <fct> …
## $ hold_time_minutes                                                                                 <dbl> …
## $ day_of_the_week                                                                                   <ord> …
## $ contacted                                                                                         <dbl> …
## $ city                                                                                              <fct> …
## $ gender                                                                                            <fct> …
## $ honorrific                                                                                        <fct> …
## $ Teledermatology                                                                                   <fct> …
## $ languages_spoken                                                                                  <chr> …
## $ CareCredit_accepted                                                                               <fct> …
## $ Age                                                                                               <dbl> …
## $ Division                                                                                          <fct> …
## $ Rural_Urban                                                                                       <fct> …
## $ median_household_income_2022                                                                      <dbl> …
## $ total_under_21                                                                                    <dbl> …

Check: does the data have factor levels set?

Quality Check the Data

Are there any physicians included more than three times?

Included More Than Three Times
NPI N
1427169531 4
1598893851 4

Variables of the physicians included more than three times

Variables of Physicians Included More Than Three Times
NPI         reason_for_exclusions                              business_days_until_appointment
1427169531  Able to contact                                    3
1427169531  Went to voicemail                                  NA
1427169531  Able to contact                                    7
1427169531  Phone not answered or busy signal on repeat calls  NA
1598893851  Able to contact                                    8
1598893851  Able to contact                                    120
1598893851  Able to contact                                    120
1598893851  Able to contact                                    120

Find physicians called more than three times

NPI numbers called more than three times
NPI         calls_count
1427169531  4
1598893851  4
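
The repeat-call check above can be sketched in base R as follows. This is a minimal illustration on a made-up data frame (the NPI values and the object name `calls` are hypothetical), not the code that produced the table.

```r
# Toy call log; NPI values are invented for illustration only.
calls <- data.frame(
  NPI = c(1001, 1001, 1001, 1001, 1002, 1002, 1003),
  business_days_until_appointment = c(3, NA, 7, NA, 8, 120, 15)
)

# Count calls per NPI, then keep physicians called more than three times.
calls_per_npi <- as.data.frame(table(NPI = calls$NPI), responseName = "calls_count")
repeat_npis <- calls_per_npi[calls_per_npi$calls_count > 3, ]
print(repeat_npis)
```

Using a strict `> 3` threshold matches the tables above, where both flagged physicians have exactly four calls.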

Do any records have an exclusion reason and an appointment?
city            NPI         reason_for_exclusions                                             business_days_until_appointment
Seattle         1861507113  Physician referral required before scheduling appointment         118
Marlton         1245423854  Greater than 5 minutes on hold                                    5
Salt Lake City  1124040704  Number contacted did not correspond to expected office/specialty  40

Do they have business_days_until_appointment greater than zero but are in an excluded category?

Records with Appointments but in Excluded Category
NPI         reason_for_exclusions                                             business_days_until_appointment
1124040704  Number contacted did not correspond to expected office/specialty  40
1245423854  Greater than 5 minutes on hold                                    5
1861507113  Physician referral required before scheduling appointment         118
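
A check like this one can be written as a logical filter. The sketch below uses an invented `records` data frame and assumes that "Able to contact" is the only non-excluded value of reason_for_exclusions.

```r
# Toy records; NPIs are invented. "Able to contact" is assumed to mark included calls.
records <- data.frame(
  NPI = c(2001, 2002, 2003, 2004),
  reason_for_exclusions = c("Able to contact", "Greater than 5 minutes on hold",
                            "Went to voicemail", "Able to contact"),
  business_days_until_appointment = c(10, 5, NA, NA)
)

# Excluded category, yet a positive wait time was recorded.
is_excluded <- records$reason_for_exclusions != "Able to contact"
has_appt <- !is.na(records$business_days_until_appointment) &
  records$business_days_until_appointment > 0
conflicts <- records[is_excluded & has_appt, ]
print(conflicts)
```

The explicit `!is.na()` guard matters because comparing `NA > 0` yields `NA`, which would otherwise produce all-NA rows in the subset.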

Do they have NA for business_days_until_appointment but are included (“Able to contact”) in the reason_for_exclusions category?

Included Records with NA for Appointments
NPI reason_for_exclusions business_days_until_appointment
1255569273 Able to contact NA
1386732923 Able to contact NA
1457402109 Able to contact NA
1952501066 Able to contact NA
1487040275 Able to contact NA
1861434094 Able to contact NA
1578506218 Able to contact NA
1710902382 Able to contact NA
1992853154 Able to contact NA
1902947484 Able to contact NA
1386965408 Able to contact NA
1053734905 Able to contact NA
1255893517 Able to contact NA
1578514824 Able to contact NA
1578514824 Able to contact NA
1124552690 Able to contact NA
1184019010 Able to contact NA
1376507913 Able to contact NA
1457662561 Able to contact NA
1306342811 Able to contact NA
1184688657 Able to contact NA
1760915078 Able to contact NA
1801843537 Able to contact NA
1306340914 Able to contact NA
1396057105 Able to contact NA
1215490644 Able to contact NA
1720044134 Able to contact NA
1760449292 Able to contact NA
1568528081 Able to contact NA
1659697431 Able to contact NA
1164560637 Able to contact NA
1740252493 Able to contact NA
1366425290 Able to contact NA
1235633637 Able to contact NA
1245723550 Able to contact NA
1982968103 Able to contact NA
1275506081 Able to contact NA
1063401305 Able to contact NA
1962851345 Able to contact NA
1154454353 Able to contact NA
1912931635 Able to contact NA
1538172051 Able to contact NA
1659543577 Able to contact NA
1063434223 Able to contact NA
1316383730 Able to contact NA
1992757751 Able to contact NA
1669818589 Able to contact NA
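
The reverse check (included but no wait time) follows the same pattern. In this sketch the data frame is invented; counting distinct NPIs separately from rows is useful because a physician can appear more than once, as NPI 1578514824 does in the listing above.

```r
# Toy records; NPIs are invented for illustration.
records <- data.frame(
  NPI = c(3001, 3002, 3002, 3003),
  reason_for_exclusions = c("Able to contact", "Able to contact",
                            "Able to contact", "Went to voicemail"),
  business_days_until_appointment = c(12, NA, NA, NA)
)

# Included ("Able to contact") but no wait time recorded.
included_na <- records[records$reason_for_exclusions == "Able to contact" &
                         is.na(records$business_days_until_appointment), ]

n_records <- nrow(included_na)                   # rows (calls)
n_physicians <- length(unique(included_na$NPI))  # distinct physicians
cat("Records:", n_records, " Physicians:", n_physicians, "\n")
```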

Results

Check data normality

The data represented in the Q-Q plot is not normally distributed. Specifically:

Positive Skewness: The data points deviate significantly above the reference line on the right-hand side, indicating a heavy right tail. This suggests that the business_days_until_appointment variable includes a few cases with much longer wait times than the majority.

Non-Normal Distribution: The points diverge from the reference line, especially at the tails, confirming that the data does not follow a normal distribution. This indicates that the assumption of normality for methods like a t-test is violated.

Presence of Outliers: Several points, particularly in the upper-right region of the plot, deviate considerably from the line. These are likely outliers with unusually long wait times.

Given these observations:

A t-test is not appropriate because it assumes normality and the data represents counts.

A better approach would be to use Poisson regression, which is well-suited for count data and allows for a more appropriate comparison of the incidence rate of business_days_until_appointment across categories such as insurance types.
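
As a rough sketch of what such a model could look like, the example below fits a Poisson regression with base R's `glm()` on simulated data. The grouping variable (subspecialty), the simulated rates, and all object names are assumptions made for illustration; this is not the model fitted in this report.

```r
set.seed(42)

# Simulated wait times; the rates (40 and 90 days) are arbitrary.
n <- 200
subspecialty <- factor(rep(c("General Dermatology", "Pediatric Dermatology"),
                           each = n / 2))
days <- rpois(n, lambda = ifelse(subspecialty == "Pediatric Dermatology", 90, 40))
toy <- data.frame(subspecialty, days)

# Poisson regression with a log link; rows with NA outcomes would be dropped by default.
fit <- glm(days ~ subspecialty, family = poisson(link = "log"), data = toy)

# Exponentiated coefficients are incidence rate ratios (IRRs).
irr <- exp(coef(fit))
print(round(irr, 2))

# Rough overdispersion check: values well above 1 suggest a negative binomial model.
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
print(dispersion)
```

Real wait-time data is often overdispersed, so the dispersion statistic is worth checking before relying on Poisson standard errors.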

## Starting normality check and summary calculation for variable: business_days_until_appointment
## Data extracted for variable: business_days_until_appointment
## Shapiro-Wilk normality test completed with p-value: 0.000000000000320937002443766
## The p-value is less than or equal to 0.05, indicating that the data is not normally distributed.
## Histogram with Density Plot created.
## Q-Q Plot created.

## Data is NOT normally distributed. Median: 73, IQR: 91
## Summary calculation completed for variable: business_days_until_appointment
## $median
## [1] 73
## 
## $iqr
## [1] 91
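
The logic behind the log above (test normality, then report median and IQR when the test rejects) can be sketched in base R. The data is simulated and the object names are hypothetical; the report's own helper function is not reproduced here.

```r
set.seed(1)

# Right-skewed toy data with one missing value.
wait_days <- c(round(rexp(150, rate = 1 / 80)), NA)
x <- wait_days[!is.na(wait_days)]  # drop NA before testing

# Shapiro-Wilk test; a small p-value indicates departure from normality.
sw <- shapiro.test(x)

if (sw$p.value <= 0.05) {
  # Non-normal: summarize with median and interquartile range.
  summary_stats <- list(median = median(x), iqr = IQR(x))
} else {
  # Approximately normal: summarize with mean and standard deviation.
  summary_stats <- list(mean = mean(x), sd = sd(x))
}
print(summary_stats)
```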

Appointment Accessibility

## [1] "Physicians were successfully contacted in 30 states including the District of Columbia. The excluded states include Alabama, Alaska, Delaware, Hawaii, Idaho, Iowa, Kansas, Kentucky, Louisiana, Maine, Mississippi, Montana, Nevada, New Hampshire, North Dakota, Oklahoma, South Carolina, South Dakota, Vermont, West Virginia and Wyoming."

Age Physician Description

Gender Physician Description

## In our dataset, the most common physician gender was Female (n = 448/N = 580, 77.2%).

Exclusions

Table of Exclusions

Count the number of unique physicians contacted

To determine the number of unique physicians contacted, you can count the distinct NPI values in df_filtered, as NPI (National Provider Identifier) is unique to each physician.

## [1] 372
## [1] 230
## [1] 363

To determine the total number of phone calls made and the unique number of physicians contacted, you can run the following R code:

Step 1: Count the total number of calls

## [1] 585

Step 2: Count the unique number of physicians contacted

## [1] 372
## Total Phone Calls Made: 585
## Unique Physicians Contacted: 372
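
A minimal base R sketch of these two counts, using an invented data frame named `df_filtered` to mirror the name mentioned above (the rows are made up):

```r
# Toy data: one row per phone call; NPIs are invented.
df_filtered <- data.frame(NPI = c(1, 1, 2, 3, 3, 3))

# Step 1: each row is one call.
total_calls <- nrow(df_filtered)

# Step 2: each distinct NPI is one physician.
unique_physicians <- length(unique(df_filtered$NPI))

cat("Total Phone Calls Made:", total_calls, "\n")
cat("Unique Physicians Contacted:", unique_physicians, "\n")
```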

To count the number of calls and unique physicians that were excluded, you can use the following R code:

Step 1: Count the total number of excluded phone calls

## [1] 222

Step 2: Count the unique number of excluded physicians

## [1] 181
## Total Excluded Phone Calls: 222
## Unique Excluded Physicians: 181
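
The excluded counts can be sketched the same way after filtering. This example assumes that any reason_for_exclusions value other than "Able to contact" marks an excluded call; the data frame is invented.

```r
# Toy call log; NPIs are invented.
calls <- data.frame(
  NPI = c(1, 1, 2, 3, 4, 4),
  reason_for_exclusions = c("Able to contact", "Went to voicemail",
                            "Not accepting new patients", "Able to contact",
                            "Went to voicemail", "Greater than 5 minutes on hold")
)

# Keep only excluded calls.
excluded <- calls[calls$reason_for_exclusions != "Able to contact", ]

total_excluded_calls <- nrow(excluded)
unique_excluded_physicians <- length(unique(excluded$NPI))

cat("Total Excluded Phone Calls:", total_excluded_calls, "\n")
cat("Unique Excluded Physicians:", unique_excluded_physicians, "\n")
```
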

To count the number of phone calls and unique physicians excluded for each of these categories, you can use the following R code:

Steps 1 and 2: Count the excluded phone calls and the unique excluded physicians by category

## Excluded Phone Calls by Category:
## # A tibble: 7 × 2
##   reason_for_exclusions                                                n
##   <fct>                                                            <int>
## 1 Not accepting new patients                                          88
## 2 Physician referral required before scheduling appointment           40
## 3 Number contacted did not correspond to expected office/specialty    37
## 4 Greater than 5 minutes on hold                                      26
## 5 Went to voicemail                                                   21
## 6 Phone not answered or busy signal on repeat calls                    9
## 7 Physician's personal phone                                           1
## 
## Excluded Physicians by Category:
## # A tibble: 7 × 2
##   reason_for_exclusions                                                n
##   <fct>                                                            <int>
## 1 Not accepting new patients                                          77
## 2 Number contacted did not correspond to expected office/specialty    33
## 3 Physician referral required before scheduling appointment           32
## 4 Greater than 5 minutes on hold                                      23
## 5 Went to voicemail                                                   20
## 6 Phone not answered or busy signal on repeat calls                    9
## 7 Physician's personal phone                                           1
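
The two tables above can be approximated in base R with `table()`. This is a sketch on invented data; the physician-level count first removes duplicate NPI and category pairs so that repeat calls to the same physician are counted once per category.

```r
# Toy excluded calls; NPIs are invented.
excluded <- data.frame(
  NPI = c(1, 1, 2, 3, 4, 4),
  reason_for_exclusions = c("Went to voicemail", "Went to voicemail",
                            "Not accepting new patients", "Not accepting new patients",
                            "Not accepting new patients", "Greater than 5 minutes on hold")
)

# Calls per category, largest first.
calls_by_category <- sort(table(excluded$reason_for_exclusions), decreasing = TRUE)

# Physicians per category: drop repeated (NPI, category) pairs before counting.
unique_pairs <- unique(excluded[, c("NPI", "reason_for_exclusions")])
physicians_by_category <- sort(table(unique_pairs$reason_for_exclusions),
                               decreasing = TRUE)

print(calls_by_category)
print(physicians_by_category)
```

A physician can fall into more than one category, which is why the per-category physician counts above sum to 195 while only 181 unique physicians were excluded.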

Count Unique Physicians That Were Excluded

## [1] 181

Step 2: Count Phone Calls by Exclusion Category

## Of the excluded calls, 88 (40%) not accepting new patients, 40 (18%) physician referral required before scheduling appointment, 37 (17%) number contacted did not correspond to expected office/specialty, 26 (12%) greater than 5 minutes on hold, 21 (9%) went to voicemail, 9 (4%) phone not answered or busy signal on repeat calls, and 1 (<1%) physician's personal phone.

Phone Calls Successfully Connected

## [1] "Of the total 585 phone calls made, 363 were successfully connected, and 222 were excluded."

Number of physicians accepting new patients

Review All Exclusions

Venn Diagram of Exclusions

This Venn diagram shows the overlap among three criteria for General Dermatology calls:

  1. “Able to Contact” (Red Circle): Calls that were successfully connected.
  2. “Accepts Pediatric” (Green Circle): Calls to physicians who accept pediatric patients.
  3. “Business Days > 0” (Blue Circle): Calls with a positive number of business days until an appointment was available.

Breakdown of the Diagram:

  • 44 (Red Only): These calls were successfully connected, but the physician does not accept pediatric patients and no appointment with business days > 0 was recorded.
  • 37 (Green Only): These calls reached physicians who accept pediatric patients, but the call was not successfully connected and no appointment with business days > 0 was recorded.
  • 2 (Red & Green Overlap): These calls were successfully connected and the physician accepts pediatric patients, but no appointment with business days > 0 was recorded.
  • 91 (All Three Overlap - Center): These calls met all three criteria: successfully connected, physician accepts pediatric patients, and business days > 0.
  • 1 (Blue & Green Overlap): This call has a physician who accepts pediatric patients and business days > 0, but it was not successfully connected.

This Venn diagram shows the overlap among three criteria for Pediatric Dermatology calls:

Criteria Represented:

  1. “Business Days > 0” (Blue Circle): Calls with an appointment available (i.e., business days until the appointment is greater than 0).
  2. “Able to Contact” (Green Circle): Calls that were successfully connected.
  3. “Accepts Pediatric” (Pink Circle): Calls to physicians who accept pediatric patients.

Breakdown of the Venn Diagram:

  • 66 (“Accepts Pediatric” Only): Calls to physicians who accept pediatric patients but that were neither successfully connected nor had an appointment with business days > 0.
  • 2 (“Business Days > 0” Only): Calls that had an appointment with business days > 0 but were not successfully connected and whose physician does not accept pediatric patients.
  • 3 (“Accepts Pediatric” & “Business Days > 0” Overlap): Calls that had an appointment with business days > 0 and a physician who accepts pediatric patients but were not successfully connected.
  • 223 (Center - All Three Overlap): Calls that:
    • Had an appointment with business days > 0,
    • Were successfully connected, and
    • Reached a physician who accepts pediatric patients.
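
The region counts in diagrams like these can be cross-checked by tabulating the three criteria as logical flags. The sketch below uses invented flags and hypothetical column names; it is not the code that drew the diagrams.

```r
# Toy flags, one row per call (invented values).
calls <- data.frame(
  able_to_contact   = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  accepts_pediatric = c(TRUE, FALSE, TRUE, TRUE, TRUE),
  has_appointment   = c(TRUE, FALSE, FALSE, TRUE, TRUE)  # business days > 0
)

# Cross-tabulate all three flags; each non-empty row is one Venn region.
region_counts <- as.data.frame(table(calls), responseName = "n")
region_counts <- region_counts[region_counts$n > 0, ]
print(region_counts)

# Center of the diagram: all three criteria met.
center <- region_counts$n[region_counts$able_to_contact == "TRUE" &
                            region_counts$accepts_pediatric == "TRUE" &
                            region_counts$has_appointment == "TRUE"]
```

Summing the `n` column should return the number of calls, which is a quick sanity check against the diagram totals.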

City data

Pediatric vs. General Wait Times by City

How many cities had a longer wait time for a pediatric dermatologist compared to a general dermatologist?

Cities ranked by how many business days longer it takes to see a Pediatric Dermatologist versus General Dermatologist
city_state               General Dermatology  Pediatric Dermatology  diff_ped_vs_gen
Saint Louis, Missouri    25.0                 235.0                  210.0
Minneapolis, Minnesota   43.5                 213.0                  169.5
Chicago, Illinois        11.0                 174.5                  163.5
Phoenix, Arizona         5.0                  164.0                  159.0
Atlanta, Georgia         6.5                  149.0                  142.5
Gainesville, Florida     71.0                 200.0                  129.0
Gilbert, Arizona         14.0                 109.5                  95.5
Baltimore, Maryland      114.0                207.0                  93.0
Houston, Texas           6.0                  96.5                   90.5
Austin, Texas            9.0                  99.0                   90.0

Cities ranked by how many business days longer it takes to see a General Dermatologist versus Pediatric Dermatologist
city_state               General Dermatology  Pediatric Dermatology  diff_ped_vs_gen
Clackamas, Oregon        314.0                29.0                   -285.0
Cincinnati, Ohio         175.5                39.0                   -136.5
Sacramento, California   135.0                83.5                   -51.5
New Hyde Park, New York  167.0                120.0                  -47.0
Tucson, Arizona          159.0                129.5                  -29.5
Cleveland, Ohio          101.0                74.0                   -27.0
Mesa, Arizona            40.0                 23.0                   -17.0
Seattle, Washington      120.0                104.0                  -16.0
Bronx, New York          26.5                 11.5                   -15.0
Indianapolis, Indiana    12.0                 2.0                    -10.0
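
A per-city comparison like the one in these tables could be computed as sketched below. The example assumes the table values are per-city medians (the report does not state the summary statistic) and uses invented cities and wait times.

```r
# Toy data: one row per call with a recorded wait time (invented values).
waits <- data.frame(
  city_state = c("City A, ST", "City A, ST", "City A, ST", "City B, ST", "City B, ST"),
  Subspecialty = c("General Dermatology", "Pediatric Dermatology",
                   "Pediatric Dermatology", "General Dermatology",
                   "Pediatric Dermatology"),
  business_days_until_appointment = c(10, 100, 120, 50, 20)
)

# Median wait per city and subspecialty (assumed summary statistic).
med <- tapply(waits$business_days_until_appointment,
              list(waits$city_state, waits$Subspecialty),
              median, na.rm = TRUE)

city_diff <- data.frame(
  city_state = rownames(med),
  general = med[, "General Dermatology"],
  pediatric = med[, "Pediatric Dermatology"],
  row.names = NULL
)

# Positive values mean the pediatric wait is longer.
city_diff$diff_ped_vs_gen <- city_diff$pediatric - city_diff$general
city_diff <- city_diff[order(-city_diff$diff_ped_vs_gen), ]

# Number of cities where the pediatric wait exceeds the general wait.
n_longer_ped <- sum(city_diff$diff_ped_vs_gen > 0, na.rm = TRUE)
print(city_diff)
```

The final count, `n_longer_ped`, is the quantity asked for in the question that opens this subsection.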

The models must handle NA values in the business_days_until_appointment outcome variable (266) as well as non-normally distributed data.

4 Exploratory Data Analysis (EDA)

Graph each variable

4.1 Histogram Plots for Each Predictor

4.2 Business Days by Insurance

Interpretation of the Histogram Plot

This histogram displays the distribution of business days until an appointment across two categories: General Dermatology and Pediatric Dermatology. Here’s the interpretation:

  1. General Dermatology:
    • The distribution is heavily right-skewed, with the majority of appointments scheduled within a short wait time (0–50 business days).
    • A significant drop-off is observed as the number of business days increases, with very few appointments exceeding 150 days.
    • This suggests that most general dermatology appointments are scheduled quickly, but there are occasional outliers with extended wait times.
  2. Pediatric Dermatology:
    • The distribution also shows a positive skew, though it appears more concentrated around 50–150 business days compared to general dermatology.
    • Pediatric dermatology has a higher concentration of longer wait times (e.g., 100–200 days) compared to general dermatology, indicating that scheduling pediatric dermatology appointments may generally take longer.
  3. Comparison Between Specialties:
    • General Dermatology has a higher frequency of short wait times (0–50 days) compared to Pediatric Dermatology, where the wait times are more spread out.
    • Pediatric dermatology shows a relatively higher median wait time, with a notable shift of its peak towards longer wait durations.

Conclusion:

  • Both distributions are skewed and represent non-normal count data, suggesting a need for statistical methods like Poisson regression or negative binomial regression for analysis.
  • Pediatric dermatology appointments generally take longer than general dermatology appointments, indicating potential differences in availability or demand across these specialties.

4.3 Log Transformation for Business Days

## Plots saved to: output/density_plot_20250324_152138.tiff and output/density_plot_20250324_152138.png

The log transformation applied to the business_days_until_appointment variable has one main effect:

  1. Reducing Skewness:
    The original business_days_until_appointment variable is highly right-skewed, with many values clustered at low numbers and a few extreme values extending into high numbers. Taking the logarithm compresses the larger values and stretches out the smaller ones, which reduces the skewness and makes the underlying distributions easier to visualize and interpret.
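
A small sketch of the transformation and its back-transformation, assuming the natural logarithm. The values are invented. Because a wait of 0 business days is possible in this dataset, the sketch also shows `log1p()` as one way to keep zeros finite; whether the plots used it is not stated in this report.

```r
# Invented wait times, including a zero.
days <- c(0, 2, 7, 55, 120, 300)

log_days <- log(days)      # natural log; log(0) is -Inf
log1p_days <- log1p(days)  # log(1 + x) keeps zeros finite

print(round(log_days, 2))
print(round(log1p_days, 2))

# Back-transform log-scale positions to business days.
back <- exp(c(2, 4))
print(round(back, 1))  # about 7.4 and 54.6 business days
```

The back-transformed values show how positions on a natural-log axis map to business days: a log value of 2 is roughly 7 days and a log value of 4 is roughly 55 days.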

Interpretation of the Density Plot

This density plot compares the log-transformed business_days_until_appointment variable between General Dermatology and Pediatric Dermatology. Here’s an interpretation:

  1. Log Transformation:
    • The data has been log-transformed, which is a common approach for skewed data. This transformation helps normalize the distribution and highlight relative differences between the two subspecialties.
  2. General Dermatology:
    • The purple density curve shows two peaks: one around a log value of ~2 (corresponding to approximately 7–10 business days) and another around ~4 (approximately 50–60 business days).
    • This bimodal pattern may indicate that appointments in general dermatology fall into two distinct categories: one group with shorter wait times and another with moderate delays.
  3. Pediatric Dermatology:
    • The yellow density curve has a single, prominent peak around a log value of ~4 (approximately 50–60 business days).
    • This suggests that pediatric dermatology appointments generally take longer than general dermatology appointments and are more consistent in their scheduling duration.
  4. Comparison Between Subspecialties:
    • Pediatric dermatology consistently shows longer wait times, as indicated by the higher density around the log value of 4.
    • General dermatology has a wider distribution, with a substantial proportion of shorter wait times compared to pediatric dermatology.

Conclusion:

  • General Dermatology appointments are more variable, with a mix of shorter and moderate wait times, while Pediatric Dermatology appointments tend to cluster around longer wait durations.
  • The log transformation highlights these differences effectively, making it clear that the two subspecialties operate with distinct scheduling patterns.
  • Further statistical analysis, such as Poisson regression, could quantify these differences and test their significance.

5. Appointment

5.1 Scenario Type by Insurance

Interpretation of the Bar Plot

This bar plot compares the count of cases across different scenarios between General Dermatology and Pediatric Dermatology. Here’s the interpretation:

  1. Scenarios:
    • Four scenarios are represented: Hemangioma Case, Teenage Acne Case, Toddler Eczema Case, and Infantile Hemangioma Case.
    • Each scenario is compared across the two subspecialties.
  2. General Dermatology:
    • The count of cases varies significantly across scenarios:
      • Hemangioma Case: Highest count (107 cases).
      • Teenage Acne Case: Second-highest count (105 cases), similar to Hemangioma Case.
      • Toddler Eczema Case: Lowest count (56 cases), indicating fewer cases in this scenario.
    • This indicates variability in case distribution, with certain scenarios (e.g., Toddler Eczema Case) being less represented.
  3. Pediatric Dermatology:
    • The counts for all scenarios are relatively even:
      • Hemangioma Case, Teenage Acne Case, and Toddler Eczema Case: Counts are consistently around 104–106 cases.
    • This consistency suggests that Pediatric Dermatology handles a similar volume of cases across these scenarios.
  4. Comparison Between Subspecialties:
    • General Dermatology shows more variability in the distribution of cases across scenarios, with a noticeably lower count for the Toddler Eczema Case.
    • Pediatric Dermatology exhibits balanced case counts, indicating a more consistent distribution of workload across scenarios.

Conclusion:

  • General Dermatology focuses more on Hemangioma and Teenage Acne cases while handling fewer Toddler Eczema cases.
  • Pediatric Dermatology manages cases more evenly across scenarios, suggesting less specialization or prioritization by case type.
  • This distribution may reflect differences in subspecialty focus, patient needs, or scheduling practices. Further analysis could explore reasons for the discrepancies, such as patient demand or resource allocation.

5.2 Day of the Week

5.3 Central Appointment Line

5.4 Physician Characteristics by Gender

Interpretation of the Bar Plot

This bar plot shows the distribution of physicians’ gender (Female, Male, NA) across three scenarios: Infantile Hemangioma Case, Teenage Acne Case, and Toddler Eczema Case. Here’s the interpretation:

  1. Infantile Hemangioma Case:
    • Female physicians dominate this scenario, with a count of 158.
    • Male physicians account for 53, which is significantly lower.
    • A small number of cases (2) have missing or unreported gender (NA).
  2. Teenage Acne Case:
    • Female physicians again have the highest count (132), though slightly lower than in the Infantile Hemangioma Case.
    • Male physicians account for 26, which is noticeably fewer than female physicians in this scenario.
    • NA values remain minimal (2).
  3. Toddler Eczema Case:
    • Similar to the Infantile Hemangioma Case, female physicians lead with 158 cases.
    • Male physicians account for 52, which is comparable to their count in the Infantile Hemangioma Case.
    • Only 1 case is categorized as NA.
  4. Comparison Across Scenarios:
    • Female physicians consistently represent the majority across all three scenarios, highlighting a possible gender imbalance in these specialties or cases.
    • Male physicians have relatively consistent counts across scenarios but are significantly outnumbered by female physicians.
    • The NA category is negligible and does not affect the overall trends.

Conclusion:

  • Female physicians are the primary contributors across all three scenarios, indicating their dominant representation in these cases.
  • The low representation of male physicians may reflect workforce demographics or preferences in handling these specific scenarios.
  • The consistent dominance of female physicians, especially in the Infantile Hemangioma and Toddler Eczema cases, might warrant further investigation into gender-based workload or specialization patterns.

5.5 Physician MD vs. DO by Insurance

6. Table 1

6.1 Display Summary Table

General Dermatologist (N=269) Pediatric Dermatologist (N=316) Total (N=585) p value
Physician Age < 0.01
- n 261 301 562
- Median (Q1, Q3) 51.0 (42.0, 63.0) 45.0 (40.0, 53.0) 47.0 (41.0, 59.0)
Physician Gender < 0.01
- Female 153 (57.3%) 295 (94.2%) 448 (77.2%)
- Male 114 (42.7%) 18 (5.8%) 132 (22.8%)
Physician Honorific < 0.01
- Allopathic medical training 250 (93.6%) 307 (99.0%) 557 (96.5%)
- Osteopathic medical training 17 (6.4%) 3 (1.0%) 20 (3.5%)
Practice Setting < 0.01
- University 52 (36.9%) 139 (55.2%) 191 (48.6%)
- Private Practice 89 (63.1%) 113 (44.8%) 202 (51.4%)
Physician Sees Children < 0.01
- Yes 131 (57.7%) 294 (94.2%) 425 (78.8%)
- No 96 (42.3%) 18 (5.8%) 114 (21.2%)
Central Appointment Phone Number 0.08
- Yes 120 (44.6%) 164 (51.9%) 284 (48.5%)
- No 149 (55.4%) 152 (48.1%) 301 (51.5%)
Number of Phone Transfers < 0.01
- No transfers 82 (30.5%) 56 (17.7%) 138 (23.6%)
- One transfer 141 (52.4%) 174 (55.1%) 315 (53.8%)
- Two transfers 30 (11.2%) 52 (16.5%) 82 (14.0%)
- More than two transfers 16 (5.9%) 34 (10.8%) 50 (8.5%)
Call time (minutes) < 0.01
- n 268 315 583
- Median (Q1, Q3) 2.1 (1.4, 3.7) 3.2 (2.0, 5.0) 2.8 (1.6, 4.5)
Hold time (minutes) 0.59
- n 243 300 543
- Median (Q1, Q3) 0.3 (0.0, 1.4) 0.3 (0.0, 1.7) 0.3 (0.0, 1.6)
Offers Teledermatology 0.08
- Yes 64 (24.2%) 45 (32.4%) 109 (27.0%)
- No 201 (75.8%) 94 (67.6%) 295 (73.0%)
Fluent in Language < 0.01
- Physician Speaks Another Language 17 (6.3%) 6 (1.9%) 23 (3.9%)
- Physician Speaks English 238 (88.5%) 307 (97.2%) 545 (93.2%)
- Physician Speaks Spanish 14 (5.2%) 3 (0.9%) 17 (2.9%)
Care Credit Accepted 0.01
- Yes 22 (8.3%) 3 (2.2%) 25 (6.2%)
- No 243 (91.7%) 136 (97.8%) 379 (93.8%)
Day Called Physician < 0.01
- N-Miss 1 0 1
- Monday 3 (1.1%) 59 (18.7%) 62 (10.6%)
- Tuesday 63 (23.5%) 148 (46.8%) 211 (36.1%)
- Wednesday 34 (12.7%) 58 (18.4%) 92 (15.8%)
- Thursday 76 (28.4%) 40 (12.7%) 116 (19.9%)
- Friday 92 (34.3%) 11 (3.5%) 103 (17.6%)
US Census Bureau Subdivision 1.00
- South Atlantic 48 (17.8%) 57 (18.0%) 105 (17.9%)
- East North Central 45 (16.7%) 52 (16.5%) 97 (16.6%)
- East South Central 5 (1.9%) 6 (1.9%) 11 (1.9%)
- Middle Atlantic 28 (10.4%) 34 (10.8%) 62 (10.6%)
- Mountain 35 (13.0%) 39 (12.3%) 74 (12.6%)
- New England 18 (6.7%) 20 (6.3%) 38 (6.5%)
- Pacific 46 (17.1%) 55 (17.4%) 101 (17.3%)
- West North Central 23 (8.6%) 27 (8.5%) 50 (8.5%)
- West South Central 21 (7.8%) 26 (8.2%) 47 (8.0%)
Rurality 0.04
- Metropolitan area 264 (100.0%) 313 (100.0%) 577 (100.0%)
Median Household Income by Zip Code 0.75
- n 233 271 504
- Median (Q1, Q3) 82232.0 (57698.0, 106703.0) 82232.0 (57698.0, 106703.0) 82232.0 (57698.0, 106703.0)
Population Less than 21 years old by Zip Code 0.78
- n 265 310 575
- Median (Q1, Q3) 6217.0 (2843.0, 11330.0) 6251.5 (2843.0, 11330.0) 6217.0 (2843.0, 11330.0)
  1. Significant differences exist between General Dermatologists and Pediatric Dermatologists in gender, training, practice setting, phone transfers, call time, and other characteristics.
  2. Pediatric dermatologists tend to have longer call times, more phone transfers, and are overwhelmingly female.
  3. General dermatologists are more likely to be in private practice, speak another language (including Spanish), and accept Care Credit.

6.2 Save Summary Table to Word

7. Wait Time by Insurance Figures

Waiting time in days (log scale) for Blue Cross/Blue Shield versus Medicaid. The code creates a scatter plot of the insurance variable (x-axis) against the days variable (y-axis) and adds a line plot that connects points with the same NPI value.

7.1 Line Plot of Waiting Time by Insurance Type

Interpretation of the Scatter Plot with Mean Line

This scatter plot displays the business days until appointment for three scenarios: Infantile Hemangioma Case, Teenage Acne Case, and Toddler Eczema Case. Each point represents a data entry for a specific scenario, and the red line indicates the mean value for each scenario. Here’s the interpretation:


1. Data Distribution

  • Infantile Hemangioma Case:
    • Most appointments are clustered between 0 and 100 business days.
    • A few outliers extend beyond 300 business days, indicating significantly delayed appointments in rare cases.
  • Teenage Acne Case:
    • The spread of data is broader compared to the Infantile Hemangioma Case.
    • A larger proportion of data points are above 100 business days, suggesting longer average wait times.
  • Toddler Eczema Case:
    • The distribution is similar to the Teenage Acne Case, with most appointments between 0 and 100 business days but some extending beyond 300 days.

2. Variability and Outliers

  • All three scenarios show considerable variability, with a few extreme outliers (>300 days).
  • The presence of outliers in all scenarios suggests that while most appointments are scheduled in a reasonable timeframe, occasional cases face significant delays.

Conclusion

  • The Teenage Acne Case has the highest mean wait time, followed by the Toddler Eczema Case, while the Infantile Hemangioma Case has the shortest average wait time.
  • Outliers in all three scenarios highlight occasional delays that might skew the averages and warrant further investigation into systemic or case-specific factors causing these delays.
  • The observed differences in average wait times could reflect the complexity of cases, availability of specialists, or prioritization policies in scheduling. Statistical testing (e.g., a Kruskal-Wallis test or Poisson regression, since the data are not normally distributed) could help determine whether these differences are significant.
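
One way to test such differences without assuming normality is a rank-based comparison. The sketch below runs a Kruskal-Wallis test followed by pairwise Wilcoxon tests with Holm adjustment on simulated data; the scale parameters and object names are arbitrary, and this is not the report's own pairwise routine.

```r
set.seed(7)

# Simulated right-skewed wait times for three scenarios (arbitrary scales).
toy <- data.frame(
  scenario = factor(rep(c("Infantile Hemangioma", "Teenage Acne", "Toddler Eczema"),
                        each = 40)),
  days = c(rexp(40, rate = 1 / 30), rexp(40, rate = 1 / 90), rexp(40, rate = 1 / 60))
)

# Omnibus rank-based test across the three scenarios.
kw <- kruskal.test(days ~ scenario, data = toy)
print(kw)

# Pairwise follow-up with Holm correction for multiple comparisons.
pw <- pairwise.wilcox.test(toy$days, toy$scenario, p.adjust.method = "holm")
print(pw$p.value)
```

The pairwise step returns a lower-triangular matrix of adjusted p-values, one per pair of scenarios.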

7.2 Scatter Plot of Waiting Times by Insurance Type

## Plots saved to: Lizzy/Figures/urgent_gyn_vs_insurance_none_20250324_152143.tiff and Lizzy/Figures/urgent_gyn_vs_insurance_none_20250324_152143.png

7.3 Density Plot of Waiting Times by Insurance

## Plots saved to: Lizzy/Figures/scenario_density_20250324_152144.tiff and Lizzy/Figures/scenario_density_20250324_152144.png

Interpretation of the Density Plot: Waiting Times by Insurance Type (Log Scale)

This density plot visualizes the log-transformed waiting times (in days) for three case types: Infantile Hemangioma Case, Teenage Acne Case, and Toddler Eczema Case. The log transformation helps normalize the data and make patterns more interpretable. Here’s the interpretation:


1. Comparison Between Cases

  • Shortest Waiting Times:
    • The Infantile Hemangioma Case has the shortest waiting times overall, as reflected by its peak at a lower log value and rapid drop-off at longer wait times.
  • Longest Waiting Times:
    • Teenage Acne Case and Toddler Eczema Case have similar peaks at longer wait times, but Teenage Acne Case shows a more pronounced density, indicating it may take longer on average.
  • Variability:
    • Toddler Eczema Case shows the broadest distribution, with notable density at both shorter and longer wait times, suggesting more variability in scheduling compared to the other cases.

2. Insights from Log Transformation

  • Log transformation helps reveal the clustering of waiting times at different scales. Without this transformation, the skewness of the raw data may obscure patterns.
  • The differences in peak locations highlight how the scheduling patterns differ significantly by case type.

Conclusion

  • Infantile Hemangioma Case has the shortest waiting times, with most appointments scheduled within a couple of weeks.
  • Teenage Acne Case consistently has the longest wait times, with a significant number of cases requiring nearly two months or more for scheduling.
  • Toddler Eczema Case shows high variability in waiting times, with a notable portion of appointments taking either short or long durations.
  • These differences suggest case type and complexity likely influence scheduling, warranting further statistical analysis to quantify these disparities.

8. Line Plot of Waiting Times by Scenario

Waiting time in days (log scale) by scenario. The plot is a scatter plot with the scenario variable on the x-axis and the days variable on the y-axis. A line plot overlaid on the points connects observations that share the same NPI value.

8.1 Create Line Plot for Different Scenarios

## Starting the analyze_pairwise_trends_programmatically function...
## Step 1: Performing pairwise Kruskal-Wallis tests...
## Step 2: Analyzing directionality trends...
## Step 3: Analyzing significance trends...
## Step 4: Combining trends...
## Trends Summary:
## # A tibble: 2 × 9
##   Scenario1       Higher_count Lower_count Total_comparisons.x Higher_percentage
##   <fct>                  <int>       <int>               <int>             <dbl>
## 1 Infantile Hema…            0           1                   1                 0
## 2 Toddler Eczema             1           1                   2                50
## # ℹ 4 more variables: Lower_percentage <dbl>, Significant_count <int>,
## #   Total_comparisons.y <int>, Significant_percentage <dbl>
## analyze_pairwise_trends_programmatically function completed successfully.
## # A tibble: 3 × 5
##   Scenario1            Scenario2            Direction p_value p_value_formatted
##   <fct>                <fct>                <chr>       <dbl> <chr>            
## 1 Toddler Eczema       Infantile Hemangioma Higher    0.111   p=0.111          
## 2 Toddler Eczema       Teenage Acne         Lower     0.249   p=0.249          
## 3 Infantile Hemangioma Teenage Acne         Lower     0.00621 p<0.01

Key Observations:

  1. Toddler Eczema Case vs. Infantile Hemangioma Case: The Toddler Eczema Case has higher wait times on average, but the difference is not statistically significant (p=0.111).
  2. Toddler Eczema Case vs. Teenage Acne Case: The Toddler Eczema Case has lower wait times on average, but the difference is also not statistically significant (p=0.249).
  3. Infantile Hemangioma Case vs. Teenage Acne Case: The Teenage Acne Case has significantly higher wait times (p<0.01), indicating a strong difference between these two scenarios.

8.2 Scatter Plot of Waiting Times by Scenario

## Plots saved to: Lizzy/Figures/scenario_none_20250324_152144.tiff and Lizzy/Figures/scenario_none_20250324_152144.png

8.3 Density Plot of Waiting Times by Scenario

Understanding a Density Plot:

A density plot is a smoothed version of a histogram that shows the distribution of a continuous variable. It represents the relative frequency of data points in different ranges of values, with areas under the curve corresponding to proportions of the data.

  • X-axis (Log Waiting Times in Days):
    • The x-axis shows the logarithm of waiting times in days, meaning the waiting times have been transformed to a logarithmic scale to make the distribution more manageable or easier to interpret. A log transformation is often used when the raw data is skewed.
    • Values closer to the left (lower on the x-axis) represent shorter waiting times, while values to the right (higher on the x-axis) represent longer waiting times.
  • Y-axis (Density):
    • The y-axis represents density, which is the relative concentration of data points for a given range of values on the x-axis. The area under the entire curve sums to 1, meaning it reflects the proportion of observations.
    • Higher peaks represent regions where there is a higher concentration of data points, while lower regions represent ranges with fewer data points.
  • Colors (Scenario):
    • Each color represents the distribution of waiting times for one scenario.
    • Where the distributions overlap, the scenarios have similar waiting times.

How to Read the Density Plot:

  1. Shape of the Distribution:
    • The shape of each curve tells you about the distribution of waiting times within each scenario.
    • A peak indicates the most common waiting times for that scenario.
    • A wider curve indicates a more spread-out distribution, meaning the waiting times vary more within that scenario.
    • A narrower curve indicates that waiting times are more concentrated around the peak.

## Plots saved to: Lizzy/Figures/scenario_density_20250324_152145.tiff and Lizzy/Figures/scenario_density_20250324_152145.png

9. Statistical testing

9.1 Fit the Interaction Model

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: business_days_until_appointment ~ scenario * Subspecialty + (1 |  
##     NPI)
##    Data: df
## Control: control
## 
##      AIC      BIC   logLik deviance df.resid 
##   4623.6   4650.0  -2304.8   4609.6      312 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -10.6227  -0.7472  -0.0565   0.4454   7.0345 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  NPI    (Intercept) 1.43     1.196   
## Number of obs: 319, groups:  NPI, 186
## 
## Fixed effects:
##                                                                Estimate
## (Intercept)                                                     3.39814
## scenarioInfantile Hemangioma                                   -0.16022
## scenarioToddler Eczema                                         -0.17185
## SubspecialtyPediatric Dermatology                               0.72248
## scenarioInfantile Hemangioma:SubspecialtyPediatric Dermatology -0.08125
## scenarioToddler Eczema:SubspecialtyPediatric Dermatology        0.23085
##                                                                Std. Error
## (Intercept)                                                       0.28071
## scenarioInfantile Hemangioma                                      0.36337
## scenarioToddler Eczema                                            0.33500
## SubspecialtyPediatric Dermatology                                 0.30703
## scenarioInfantile Hemangioma:SubspecialtyPediatric Dermatology    0.36385
## scenarioToddler Eczema:SubspecialtyPediatric Dermatology          0.33545
##                                                                z value
## (Intercept)                                                     12.105
## scenarioInfantile Hemangioma                                    -0.441
## scenarioToddler Eczema                                          -0.513
## SubspecialtyPediatric Dermatology                                2.353
## scenarioInfantile Hemangioma:SubspecialtyPediatric Dermatology  -0.223
## scenarioToddler Eczema:SubspecialtyPediatric Dermatology         0.688
##                                                                           Pr(>|z|)
## (Intercept)                                                    <0.0000000000000002
## scenarioInfantile Hemangioma                                                0.6593
## scenarioToddler Eczema                                                      0.6080
## SubspecialtyPediatric Dermatology                                           0.0186
## scenarioInfantile Hemangioma:SubspecialtyPediatric Dermatology              0.8233
## scenarioToddler Eczema:SubspecialtyPediatric Dermatology                    0.4913
##                                                                   
## (Intercept)                                                    ***
## scenarioInfantile Hemangioma                                      
## scenarioToddler Eczema                                            
## SubspecialtyPediatric Dermatology                              *  
## scenarioInfantile Hemangioma:SubspecialtyPediatric Dermatology    
## scenarioToddler Eczema:SubspecialtyPediatric Dermatology          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) scnrIH scnrTE SbspPD sIH:SD
## scnrInfntlH -0.772                            
## scnrTddlrEc -0.837  0.647                     
## SbspcltyPdD -0.914  0.706  0.766              
## scnrIHm:SPD  0.771 -0.999 -0.646 -0.706       
## scnrTEc:SPD  0.836 -0.646 -0.999 -0.766  0.646

Interpretation of the Results: Interaction Model and Estimated Marginal Means (EMMs)

The analysis fits a Poisson regression model with an interaction between scenario and Subspecialty to examine their effects on the number of business days until an appointment. Here’s a detailed interpretation of the results:

The main effect of SubspecialtyPediatric Dermatology is significant (p = 0.0186), indicating a difference in wait times between Subspecialties. Other fixed effects, including the interaction terms, are not statistically significant. Because the model includes an interaction, this main effect applies to the reference scenario (Teenage Acne): for that scenario, patients calling Pediatric Dermatology wait longer on average than those calling the reference Subspecialty (General Dermatology). The non-significant interaction terms give no evidence that this gap differs across scenarios.


1. Interaction Model Summary

The model formula:
\[ \text{business\_days\_until\_appointment} \sim \text{scenario} * \text{Subspecialty} + (1 | \text{NPI}) \]

  • Family: Poisson (log link function), which is appropriate for modeling count data such as business days until an appointment.
  • Random Effect: NPI (National Provider Identifier) accounts for variability among providers, reflecting that appointments may vary by individual providers.
  • Fixed Effects: Includes the main effects of scenario, Subspecialty, and their interaction, allowing the model to evaluate whether the impact of scenario differs by Subspecialty.

Key Model Metrics:

  • AIC (Akaike Information Criterion): 4623.6. Lower AIC indicates better fit when comparing candidate models.
  • Random Effects: Standard deviation of the random intercept for NPI is 1.196, showing substantial variability among providers.
  • Fixed Effects:
    • The significant coefficient for Subspecialty (Pediatric Dermatology) suggests that subspecialty has a notable effect on waiting times.
    • Interaction terms (scenario:Subspecialty) help to understand how scenario-specific effects differ between subspecialties.


2. Estimated Marginal Means (EMMs)

The EMMs summarize the predicted waiting times (in days) for combinations of scenario and Subspecialty, adjusted for variability in the data. Confidence intervals (CIs) provide a range of uncertainty.

Scenario                   Subspecialty           Rate (days)  SE    95% CI Lower  95% CI Upper
Infantile Hemangioma Case  General Dermatology    25.4         6.13  15.8          40.7
Teenage Acne Case          General Dermatology    29.7         8.72  16.7          52.8
Toddler Eczema Case        General Dermatology    25.0         4.79  17.2          36.4
Infantile Hemangioma Case  Pediatric Dermatology  45.3         4.61  37.1          55.3
Teenage Acne Case          Pediatric Dermatology  56.7         5.76  46.5          69.2
Toddler Eczema Case        Pediatric Dermatology  59.2         6.01  48.5          72.2

3. Key Observations from EMMs

  • General Dermatology:
    • Predicted waiting times range between 25.0 and 29.7 days.
    • Teenage Acne Case has the longest waiting time (29.7 days), but the wide confidence interval suggests variability.
  • Pediatric Dermatology:
    • Predicted waiting times are consistently longer, ranging from 45.3 to 59.2 days.
    • Toddler Eczema Case has the longest predicted waiting time (59.2 days) with the highest upper bound (72.2 days), suggesting significant delays in scheduling for this scenario.
  • Scenario Comparison:
    • Infantile Hemangioma Case has the shortest waiting times for both subspecialties.
    • The difference between subspecialties is most pronounced for Toddler Eczema Case, with Pediatric Dermatology showing a waiting time nearly double that of General Dermatology.

4. Fixed Effect Coefficients

Effect                                                         Estimate  Interpretation
(Intercept)                                                    3.23403   Baseline log waiting time (business days) for General Dermatology in the Infantile Hemangioma Case scenario.
scenarioTeenage Acne Case                                      0.15743   Slight increase in log waiting time for the Teenage Acne Case compared to the baseline scenario.
scenarioToddler Eczema Case                                    -0.01502  Minimal decrease in log waiting time for Toddler Eczema Case compared to the baseline.
SubspecialtyPediatric Dermatology                              0.57862   Substantial increase in log waiting time for Pediatric Dermatology compared to General Dermatology.
scenarioTeenage Acne Case:SubspecialtyPediatric Dermatology    0.06840   Interaction effect suggesting a slight additional increase for Teenage Acne Case in Pediatric Dermatology.
scenarioToddler Eczema Case:SubspecialtyPediatric Dermatology  0.28275   Larger additional increase in waiting time for Toddler Eczema Case in Pediatric Dermatology.

5. Conclusions

  • Subspecialty Matters: Pediatric Dermatology consistently shows longer waiting times compared to General Dermatology across all scenarios.
  • Scenario-Specific Effects:
    • Toddler Eczema Case and Teenage Acne Case have longer waiting times, especially in Pediatric Dermatology.
    • Infantile Hemangioma Case consistently has the shortest waiting times.
  • Interaction Effects:
    • The interaction terms indicate that scenario-specific effects vary between subspecialties, with Pediatric Dermatology showing amplified delays for certain scenarios.

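These findings suggest that subspecialty is the main driver of waiting times in this model, with Pediatric Dermatology facing more challenges in timely scheduling. Differences between scenarios and the interaction terms do not reach statistical significance, so the apparently longer delays for scenarios such as Toddler Eczema Case should be read as descriptive.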

9.2 Extract and Plot the Interaction Data

9.3 Combined plot of Scenarios and Subspecialty

9.4 Overall Comparison Plot

9.5 Scenario-Based Dot Plot Creation

9.6 Combined Plot of Scenarios and Subspecialty: Color Version

9.7 Combined Plot of Scenarios and Subspecialty: Black-and-White Version

9.8 Extracting and Reporting Interaction Results

## Teenage Acne: Patients with Pediatric Dermatology wait 61.6 days (95% CI 48.3–78.6). Patients with General Dermatology wait 29.9 days (95% CI 17.3–51.8), which is shorter (51.4% difference) compared to Pediatric Dermatology (p = NA).
## 
## Infantile Hemangioma: Patients with Pediatric Dermatology wait 48.4 days (95% CI 37.9–61.8). Patients with General Dermatology wait 25.5 days (95% CI 16.2–40.1), which is shorter (47.3% difference) compared to Pediatric Dermatology (p = NA).
## 
## Toddler Eczema: Patients with Pediatric Dermatology wait 65.3 days (95% CI 51.2–83.4). Patients with General Dermatology wait 25.2 days (95% CI 17.6–36.1), which is shorter (61.5% difference) compared to Pediatric Dermatology (p = NA).
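As a cross-check, the point estimates above can be reproduced by hand from the fixed effects printed in the 9.1 model summary: under the log link, each predicted wait is the exponential of the sum of the relevant coefficients, evaluated for a provider whose random intercept is zero. The sketch below is not the report's own code. The object names (`b`, `pred`, `pct_shorter`) are illustrative, and the coefficients are copied from the printed summary, so results agree only up to rounding.

```r
# Fixed-effect estimates copied from the glmer summary in 9.1
# (reference levels: Teenage Acne, General Dermatology).
b <- c(
  intercept = 3.39814,
  ih        = -0.16022,  # scenarioInfantile Hemangioma
  te        = -0.17185,  # scenarioToddler Eczema
  peds      = 0.72248,   # SubspecialtyPediatric Dermatology
  ih_peds   = -0.08125,  # Infantile Hemangioma x Pediatric Dermatology
  te_peds   = 0.23085    # Toddler Eczema x Pediatric Dermatology
)

# Sum the coefficients that apply to each cell, then back-transform with exp().
pred <- rbind(
  "Teenage Acne" = c(
    General   = exp(b[["intercept"]]),
    Pediatric = exp(b[["intercept"]] + b[["peds"]])
  ),
  "Infantile Hemangioma" = c(
    General   = exp(b[["intercept"]] + b[["ih"]]),
    Pediatric = exp(b[["intercept"]] + b[["ih"]] + b[["peds"]] + b[["ih_peds"]])
  ),
  "Toddler Eczema" = c(
    General   = exp(b[["intercept"]] + b[["te"]]),
    Pediatric = exp(b[["intercept"]] + b[["te"]] + b[["peds"]] + b[["te_peds"]])
  )
)

# Percent by which the General Dermatology wait is shorter than the Pediatric one.
pct_shorter <- 100 * (1 - pred[, "General"] / pred[, "Pediatric"])

print(round(pred, 1))
print(round(pct_shorter, 1))
```

The confidence intervals in the sentences above cannot be rebuilt this way, because they also depend on the covariance between coefficients.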

Poisson Model: The models need to handle the 266 missing (NA) values in the business_days_until_appointment outcome variable, as well as wait times that are not normally distributed.

10. Poisson Regression Model Analysis of Wait Time

business_days_until_appointment can be transformed with a square root rather than a log, because log(0) is -Inf whereas sqrt(0) is 0, so same-day appointments (0 business days) remain finite.
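A minimal illustration of why zeros matter when transforming the raw outcome. The `wait_days` values are hypothetical and chosen only to include a same-day (0) appointment.

```r
# Hypothetical wait times in business days, including a same-day (0) appointment.
wait_days <- c(0, 1, 4, 25, 100)

log_wait  <- log(wait_days)   # log(0) is -Inf, which breaks means, plots, and axis limits
sqrt_wait <- sqrt(wait_days)  # sqrt(0) is 0, so every transformed value stays finite

print(log_wait)
print(sqrt_wait)
print(all(is.finite(sqrt_wait)))
```

This concerns transforming the observed values for plots or summaries. A Poisson model with a log link applies the log to the expected count rather than to each observation, so zero outcomes are permitted in the model itself.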

10.1 Descriptive Analysis of Wait Times

10.1.1 Wait Time with Single Predictor

Interpretation of Results

This analysis combines summary statistics for waiting times across subspecialties and the results from a Poisson regression model assessing the effect of subspecialty on waiting times.


1. Summary Statistics

Group                  Median Wait Time  IQR (Q1 – Q3)      Summary Sentence
Overall                73.0 days         20.5 – 111.5 days  “The overall median wait time was 73.0 business days (IQR: 20.5 – 111.5).”
General Dermatology    32.5 days         7.8 – 82.2 days    “For General Dermatology, the median wait time was 32.5 business days (IQR: 7.8 – 82.2).”
Pediatric Dermatology  86.0 days         29.5 – 120.0 days  “For Pediatric Dermatology, the median wait time was 86.0 business days (IQR: 29.5 – 120.0).”

Key Observations:

  • Overall: Patients wait a median of 73 days for appointments, with a large variability (IQR: 20.5–111.5 days).
  • General Dermatology: Has a much shorter median wait time (32.5 days) with less variability.
  • Pediatric Dermatology: Wait times are significantly longer, with a median of 86 days and wider variability (IQR: 29.5–120.0 days).

The difference in medians highlights a notable disparity in appointment availability between General and Pediatric Dermatology.


2. Poisson Regression Results

The Poisson regression model estimates the relationship between Subspecialty and business days until appointment, using General Dermatology as the reference group.

Model Summary:

  • Intercept (General Dermatology):
    • The log of the expected waiting time for General Dermatology is 3.971, corresponding to an expected waiting time of \(e^{3.971} \approx 53.0\) days.
  • Subspecialty: Pediatric Dermatology:
    • The coefficient for Pediatric Dermatology is 0.520, indicating a multiplicative increase in waiting time relative to General Dermatology. The expected waiting time for Pediatric Dermatology is \(e^{3.971 + 0.520} \approx 89.2\) days.
    • The p-value (< 0.0001) indicates this difference is highly statistically significant.

Model Fit:

  • AIC: 19384, suggesting the model’s goodness of fit. Lower AIC values indicate better fit, but comparison with alternative models is necessary for full interpretation.
  • Residual Deviance: 17592, slightly lower than the null deviance (18770), indicating the model explains some variation but could likely be improved.
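One quick way to judge whether the fit "could likely be improved" is the ratio of residual deviance to residual degrees of freedom, which a Poisson model assumes to be close to 1. The sketch below uses the two values printed in the 10.2 model summary. It is a rough rule-of-thumb check, not a formal test, and the variable names are illustrative.

```r
# Values copied from the glm summary in 10.2.
residual_deviance <- 17592
residual_df       <- 317

# Rough overdispersion check: a Poisson model assumes this ratio is near 1.
dispersion_ratio <- residual_deviance / residual_df
print(round(dispersion_ratio, 1))
```

A ratio this far above 1 points to overdispersion, in which case the Poisson standard errors and p-values are likely too optimistic. Quasi-Poisson or negative binomial models are commonly considered alternatives in that situation.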

3. Combined Insights

  1. Median Wait Times:
    • Pediatric Dermatology has a significantly longer median wait time compared to General Dermatology (86.0 vs. 32.5 days).
  2. Poisson Regression:
    • The regression model confirms that Pediatric Dermatology results in a significantly longer expected waiting time (\(e^{0.520} \approx 1.68\)), or 68% longer, compared to General Dermatology.

10.2 Is there a Difference in Wait Times by Subspecialty?

## 
## Call:
## glm(formula = business_days_until_appointment ~ as.factor(Subspecialty), 
##     family = "poisson", data = df)
## 
## Coefficients:
##                                              Estimate Std. Error z value
## (Intercept)                                   3.97111    0.01431  277.41
## as.factor(Subspecialty)Pediatric Dermatology  0.52024    0.01595   32.62
##                                                         Pr(>|z|)    
## (Intercept)                                  <0.0000000000000002 ***
## as.factor(Subspecialty)Pediatric Dermatology <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18770  on 318  degrees of freedom
## Residual deviance: 17592  on 317  degrees of freedom
##   (266 observations deleted due to missingness)
## AIC: 19384
## 
## Number of Fisher Scoring iterations: 5

11. Model Interpretation and Variable Extraction

11.1 Correct Extraction of Variable Names and Summary

## Using Poisson regression, the baseline rate of business_days_until_appointment (intercept) is estimated to be 53 (95% CI 52 - 55 ) times the reference category ( General Dermatology ). For Pediatric Dermatology compared to General Dermatology the incidence rate ratio (IRR) of business_days_until_appointment is estimated to be 1.68 (95% CI 1.6 - 1.7 ), indicating that the waiting time for Pediatric Dermatology is 68.2 % higher than for those in General Dermatology (p <0.01 ).
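The numbers in this sentence follow from exponentiating the coefficients in the 10.2 summary. The sketch below reproduces them with Wald intervals (estimate ± z × standard error on the log scale). Whether the report's helper uses Wald or profile-likelihood intervals is not shown, so treat the interval method as an assumption; the object names are illustrative.

```r
# Estimates and standard errors copied from the glm summary in 10.2.
est <- c(intercept = 3.97111, peds = 0.52024)
se  <- c(intercept = 0.01431, peds = 0.01595)

z <- qnorm(0.975)  # about 1.96 for a 95% interval

# Exponentiate the estimate and its Wald limits to leave the log scale.
irr <- cbind(
  estimate = exp(est),
  lower    = exp(est - z * se),
  upper    = exp(est + z * se)
)

# Percent increase in expected wait for Pediatric vs. General Dermatology.
pct_longer <- 100 * (irr["peds", "estimate"] - 1)

print(round(irr, 2))
print(round(pct_longer, 1))
```

The `intercept` row is the expected wait in days for General Dermatology, and the `peds` row is the incidence rate ratio relative to it.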

12. Scenarios and Mathematical Representation

\[ \begin{align*} P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\ \sqrt{\lambda} &= \beta_0 \\ & + \beta_1 \cdot \underline{\mathbf{\large{\text{Patient Scenario}}}} \\ & + ( 1 | \text{Physician NPI}) \end{align*} \]

13. Scenario-Based Poisson Regression Analysis - NO DIFFERENCES FOUND

13.1 Fitting the Poisson Model for Scenario Differences

## Logging inputs...
## Model Object:  glm lm 
## Specs:  ~scenario | scenario 
## Variable of Interest:  scenario 
## Color By:  scenario 
## Output Directory:  Lizzy/Figures 
## Y-Axis Min:  
## Y-Axis Max:  
## Using existing output directory:  Lizzy/Figures 
## Computing estimated marginal means...
## Logging estimated marginal means data...
## # A tibble: 3 × 6
##   scenario              rate    SE    df asymp.LCL asymp.UCL
##   <fct>                <dbl> <dbl> <dbl>     <dbl>     <dbl>
## 1 Teenage Acne          89.5 0.951   Inf      87.7      91.4
## 2 Infantile Hemangioma  64.7 0.796   Inf      63.1      66.3
## 3 Toddler Eczema        82.0 0.834   Inf      80.4      83.7
## Range of estimated marginal means with CIs:  63.13445 91.3984 
## Y-axis min set to:  58.13445 
## Y-axis max set to:  96.3984 
## Creating the plot...
## Plot created successfully.
## Saving plot to:  Lizzy/Figures/interaction_scenario_comparison_plot_20250324_152151.png

## Plot saved successfully to:  Lizzy/Figures/interaction_scenario_comparison_plot_20250324_152151.png 
## Returning the estimated data and plot object.

14. Scenario Summary and Wait Time Analysis

14.1 Calculating Scenario Call Counts

14.2 Scenario Summary Statement

## There were 585 calls made across scenarios including 161 with Teenage Acne, 213 with Infantile Hemangioma, 211 with Toddler Eczema.

15. Wait Time Statistics by Scenario

15.1 Calculating Median Wait Times for Scenarios

15.2 Displaying Wait Time Statistics in Table

Business Days Until Next Appointment Joint Scenario
scenario              Median_business_days_until_appointment  Q1  Q3
Teenage Acne          85.0                                    32  129
Infantile Hemangioma  44.5                                    14  101
Toddler Eczema        78.0                                    22  112

Number of offices with each of the three scenarios successfully contacted: business_days_until_appointment ~ scenario

\[ \begin{align*} P(\text{Business Days until New Patient Appointment} = x) &= \frac{e^{-\lambda} \cdot \lambda^x}{x!} \\ \sqrt{\lambda} &= \beta_0 \\ & + \beta_1 \cdot \underline{\mathbf{\large{\text{Number of Offices Contacted}}}} \\ & + ( 1 | \text{Physician NPI}) \end{align*} \]

16. Successful Contacts by Scenario

16.1 Filtering for Successful Contacts

16.2 Counting Successful Contacts by Scenario

16.3 Displaying Contact Counts in Table

Number of successful calls for each scenario

scenario              count  percentage  cumulative_count
Teenage Acne          105    28.9        105
Infantile Hemangioma  125    34.4        230
Toddler Eczema        133    36.6        363

17. Poisson Regression Model for Wait Times by Scenario

17.0 Can this be run as a mixed effects Poisson Regression Model?


## Mixed-Effects Model Diagnostic:
## -----------------------------
## Unique NPI levels: 186
## Total observations: 319
## Concerns:
## 1. Number of unique group levels is VERY CLOSE to total observations
## 2. This can lead to unstable or unreliable model estimation
## 
## Recommended actions:
## - Carefully review model convergence
## - Consider alternative modeling approaches
## - Potentially reduce the number of group levels

17.1 Fitting Poisson Regression Model for Scenarios

## 
## Call:
## glm(formula = business_days_until_appointment ~ as.factor(scenario), 
##     family = "poisson", data = df)
## 
## Coefficients:
##                                         Estimate Std. Error z value
## (Intercept)                              4.49441    0.01062 423.096
## as.factor(scenario)Infantile Hemangioma -0.32501    0.01626 -19.987
## as.factor(scenario)Toddler Eczema       -0.08738    0.01470  -5.943
##                                                     Pr(>|z|)    
## (Intercept)                             < 0.0000000000000002 ***
## as.factor(scenario)Infantile Hemangioma < 0.0000000000000002 ***
## as.factor(scenario)Toddler Eczema               0.0000000028 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18770  on 318  degrees of freedom
## Residual deviance: 18341  on 316  degrees of freedom
##   (266 observations deleted due to missingness)
## AIC: 20135
## 
## Number of Fisher Scoring iterations: 5

17.2 Extracting P-Values for Scenario Comparisons

17.3 Calculating Wait Time Statistics by Scenario

17.4 Summary Sentence for Scenario-Based Wait Times

18. Logistic Regression to Predict Which Physicians Accept Pediatric Patients

## $Odds_Ratios
##                                                         Predictor Odds_Ratio
## (Intercept)                                           (Intercept)  0.2256672
## practice_settingPrivate Practice practice_settingPrivate Practice  9.5441954
## Age                                                           Age  1.0469725
## genderMale                                             genderMale  0.3289797
## languages_spokenSpanish                   languages_spokenSpanish  0.4386240
## CareCredit_acceptedYes                     CareCredit_acceptedYes  0.2441869
##                                      Lower_CI   Upper_CI     Interpretation
## (Intercept)                      0.0003776308  94.253215  77.4% less likely
## practice_settingPrivate Practice 0.9401983812 278.243750 854.4% more likely
## Age                              0.9442608771   1.178423   4.7% more likely
## genderMale                       0.0250479207   3.080304  67.1% less likely
## languages_spokenSpanish          0.0493128709   3.182783  56.1% less likely
## CareCredit_acceptedYes           0.0085245155   3.796738  75.6% less likely
## 
## $Summary
## 
## Call:
## glm(formula = target ~ ., family = binomial, data = replication_data)
## 
## Coefficients:
##                                  Estimate Std. Error z value Pr(>|z|)  
## (Intercept)                      -1.48869    2.99148  -0.498   0.6187  
## practice_settingPrivate Practice  2.25593    1.35878   1.660   0.0969 .
## Age                               0.04590    0.05399   0.850   0.3952  
## genderMale                       -1.11176    1.17964  -0.942   0.3460  
## languages_spokenSpanish          -0.82411    1.02912  -0.801   0.4232  
## CareCredit_acceptedYes           -1.40982    1.44015  -0.979   0.3276  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 30.553  on 23  degrees of freedom
## Residual deviance: 25.458  on 18  degrees of freedom
## AIC: 37.458
## 
## Number of Fisher Scoring iterations: 5
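The Odds_Ratio and Interpretation columns can be recovered from the coefficient table: each odds ratio is the exponential of the log-odds estimate, and the percentage is the change in the odds relative to 1. The sketch below copies the estimates from the summary above. The object names are illustrative, and it does not attempt to reproduce the printed confidence limits, whose method is not shown.

```r
# Log-odds coefficients copied from the logistic regression summary above.
log_odds <- c(
  "(Intercept)"                      = -1.48869,
  "practice_settingPrivate Practice" = 2.25593,
  "Age"                              = 0.04590,
  "genderMale"                       = -1.11176,
  "languages_spokenSpanish"          = -0.82411,
  "CareCredit_acceptedYes"           = -1.40982
)

odds_ratio <- exp(log_odds)
pct_change <- 100 * (odds_ratio - 1)  # percent change in the odds, not in the probability

result <- data.frame(
  Odds_Ratio         = round(odds_ratio, 3),
  Pct_Change_In_Odds = round(pct_change, 1)
)
print(result)
```

Note that "more likely" and "less likely" in the Interpretation column describe changes in odds. With 24 observations (null deviance on 23 degrees of freedom), every interval for a predictor spans 1, so none of these associations is statistically distinguishable from no effect.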

18.1 Forest plot

##                                                         Predictor Odds_Ratio
## (Intercept)                                           (Intercept)  0.2256672
## practice_settingPrivate Practice practice_settingPrivate Practice  9.5441954
## Age                                                           Age  1.0469725
## genderMale                                             genderMale  0.3289797
## languages_spokenSpanish                   languages_spokenSpanish  0.4386240
## CareCredit_acceptedYes                     CareCredit_acceptedYes  0.2441869
##                                      Lower_CI   Upper_CI
## (Intercept)                      0.0003776308  94.253215
## practice_settingPrivate Practice 0.9401983812 278.243750
## Age                              0.9442608771   1.178423
## genderMale                       0.0250479207   3.080304
## languages_spokenSpanish          0.0493128709   3.182783
## CareCredit_acceptedYes           0.0085245155   3.796738

18.2 AUC of the logistic regression

## Area under the curve: 0.7188

19. Results

19.1 Physician Characteristics

The study population included 585 dermatologists, split between 269 (46%) general dermatologists and 316 (54%) pediatric dermatologists. Among successfully reached physicians, 425 (78.8%) accepted pediatric patients, with general dermatologists being less likely to do so (57.7%) compared to pediatric dermatologists (94.2%).

The median age of the dermatologists was 47 years (IQR 41–59 years), with 77.2% identifying as female. General dermatologists were older (median 51 years) compared to pediatric dermatologists (median 45 years; p < 0.01). A higher proportion of pediatric dermatologists were women (94.2%) compared to general dermatologists (57.3%; p < 0.01).

## # A tibble: 2 × 3
##   Subspecialty          Count Percentage
##   <fct>                 <int>      <dbl>
## 1 General Dermatology     269       46.0
## 2 Pediatric Dermatology   316       54.0
## # A tibble: 2 × 4
##   Subspecialty          Accepted Total Acceptance_Rate
##   <fct>                    <int> <int>           <dbl>
## 1 General Dermatology        131   227            57.7
## 2 Pediatric Dermatology      294   312            94.2
## # A tibble: 1 × 3
##   Median_Age IQR_Lower IQR_Upper
##        <dbl>     <dbl>     <dbl>
## 1         47        41        59
## # A tibble: 2 × 4
##   Subspecialty          Median_Age IQR_Lower IQR_Upper
##   <fct>                      <dbl>     <dbl>     <dbl>
## 1 General Dermatology           51        42        63
## 2 Pediatric Dermatology         45        40        53
## # A tibble: 6 × 4
## # Groups:   Subspecialty [2]
##   Subspecialty          gender Count Percentage
##   <fct>                 <fct>  <int>      <dbl>
## 1 General Dermatology   Female   153     56.9  
## 2 General Dermatology   Male     114     42.4  
## 3 General Dermatology   <NA>       2      0.743
## 4 Pediatric Dermatology Female   295     93.4  
## 5 Pediatric Dermatology Male      18      5.70 
## 6 Pediatric Dermatology <NA>       3      0.949
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  Age by Subspecialty
## W = 47177, p-value = 0.00003875
## alternative hypothesis: true location shift is not equal to 0
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  base::table(df$Subspecialty, df$gender)
## X-squared = 109.79, df = 1, p-value < 0.00000000000000022
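The gender comparison can be re-run from the printed counts alone. The sketch below rebuilds the 2 × 2 table from the gender tibble above, dropping the rows with missing gender (which `table()` also excludes by default), and applies `chisq.test()`. The object names are illustrative.

```r
# Counts copied from the gender table above; rows with missing gender are excluded.
gender_counts <- matrix(
  c(153, 114,
    295,  18),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Subspecialty = c("General Dermatology", "Pediatric Dermatology"),
    gender       = c("Female", "Male")
  )
)

# Share of women among physicians with known gender, by subspecialty.
pct_female <- 100 * gender_counts[, "Female"] / rowSums(gender_counts)

# chisq.test() applies Yates' continuity correction to 2 x 2 tables by default.
test <- chisq.test(gender_counts)

print(round(pct_female, 1))
print(test)
```

The percentages computed here use physicians with known gender as the denominator, which is why they match the 57.3% and 94.2% quoted in the text rather than the 56.9% and 93.4% in the tibble, where the missing rows are included.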

19.2 Appointment Wait Times

The analysis of appointment wait times revealed significant differences between general dermatologists and pediatric dermatologists. Using Poisson regression, the baseline rate of business days until an appointment (intercept) for general dermatologists was estimated to be 53 days (95% CI: 52–55). Pediatric dermatologists had a significantly longer wait time, with an incidence rate ratio (IRR) of 1.68 (95% CI: 1.60–1.70), representing a 68.2% longer wait time compared to general dermatologists (p < 0.01). For a baseline wait time of 53 days, this corresponds to an additional 36 days, resulting in an estimated wait time of 89 days for pediatric dermatologists.

Wait times varied significantly across medical scenarios. For “Toddler Eczema,” appointments had an IRR of 0.84 (95% CI: 0.78–0.90; p < 0.001), indicating a 16% shorter wait time compared to the reference scenario, “Teenage Acne.” For an estimated baseline wait time of 53 days, this corresponds to a reduction of 8 days, resulting in an estimated wait time of 45 days. Similarly, “Infantile Hemangioma” appointments had an IRR of 0.72 (95% CI: 0.65–0.80; p < 0.001), reflecting a 28% shorter wait time. For the same baseline, this corresponds to a reduction of 15 days, resulting in an estimated wait time of 38 days.

## # A tibble: 1 × 3
##   Median_Wait IQR_Lower IQR_Upper
##         <dbl>     <dbl>     <dbl>
## 1        32.5      7.75      82.2
## 
## Call:
## glm(formula = business_days_until_appointment ~ Subspecialty + 
##     scenario, family = poisson(link = "log"), data = filtered_df)
## 
## Coefficients:
##                                   Estimate Std. Error z value
## (Intercept)                        4.21221    0.02081  202.37
## SubspecialtyPediatric Dermatology  0.37331    0.01973   18.92
## scenarioInfantile Hemangioma      -0.40222    0.02524  -15.93
## scenarioToddler Eczema            -0.26388    0.02273  -11.61
##                                              Pr(>|z|)    
## (Intercept)                       <0.0000000000000002 ***
## SubspecialtyPediatric Dermatology <0.0000000000000002 ***
## scenarioInfantile Hemangioma      <0.0000000000000002 ***
## scenarioToddler Eczema            <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 9756.0  on 164  degrees of freedom
## Residual deviance: 9018.8  on 161  degrees of freedom
##   (207 observations deleted due to missingness)
## AIC: 9908.6
## 
## Number of Fisher Scoring iterations: 5
## Baseline Wait Time (General Dermatologists):
## 32.50 days (95% CI: 2105.86–2284.89)
## 
## Pediatric Dermatologists Wait Time:
## NA days (95% CI: 45.42–49.07)
## 
## Scenario Wait Times:
## scenarioInfantile Hemangioma: 21.74 days (95% CI: 20.69–22.84)
## scenarioToddler Eczema: 24.96 days (95% CI: 23.87–26.10)
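
The wait-time lines printed above appear to scale each rate ratio by the sample median (32.5 days) rather than by the fitted intercept, which would also account for the implausible baseline interval and the NA. A minimal sketch of deriving expected waits directly from the printed coefficients, assuming the log link shown in the call; the object names are illustrative:

```r
# Estimates copied from the printed glm() summary
# (business_days_until_appointment ~ Subspecialty + scenario).
coefs <- c(
  "(Intercept)"                       =  4.21221,
  "SubspecialtyPediatric Dermatology" =  0.37331,
  "scenarioInfantile Hemangioma"      = -0.40222,
  "scenarioToddler Eczema"            = -0.26388
)

# With a log link, the intercept is the log of the expected wait for the
# reference cell, so exp(intercept) is the baseline in business days.
baseline_days <- exp(coefs[["(Intercept)"]])

# Every other coefficient is a log rate ratio relative to that baseline.
irr <- exp(coefs[-1])

# Expected wait with one term switched on and the others at reference.
predicted_days <- baseline_days * irr

round(baseline_days, 1)
round(predicted_days, 1)

# For comparison: multiplying the sample median by the same rate ratios
# reproduces the scenario values in the printed summary.
round(32.5 * irr, 2)
```

In practice the same numbers come from `predict(model, newdata, type = "response")`, which avoids matching coefficient names by hand.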

20. Full Poisson Model: poisson_full_model

## Creating formula with response variable: business_days_until_appointment 
## Predictor variables identified: first, last, scenario, Subspecialty, state, practice_setting, NPI, able_to_contact_office, call_date_wday, central_number, number_of_transfers, call_time_minutes, will_this_physician_see_children_if_they_ask_about_insurance_say_they_have_blue_cross_blue_shield, reason_for_exclusions, hold_time_minutes, day_of_the_week, contacted, city, gender, honorrific, Teledermatology, languages_spoken, CareCredit_accepted, Age, Division, Rural_Urban, median_household_income_2022, total_under_21 
## Predictor variables after formatting: `first`, `last`, `scenario`, `Subspecialty`, `state`, `practice_setting`, `NPI`, `able_to_contact_office`, `call_date_wday`, `central_number`, `number_of_transfers`, `call_time_minutes`, `will_this_physician_see_children_if_they_ask_about_insurance_say_they_have_blue_cross_blue_shield`, `reason_for_exclusions`, `hold_time_minutes`, `day_of_the_week`, `contacted`, `city`, `gender`, `honorrific`, `Teledermatology`, `languages_spoken`, `CareCredit_accepted`, `Age`, `Division`, `Rural_Urban`, `median_household_income_2022`, `total_under_21` 
## Initial formula string: business_days_until_appointment ~ `first` + `last` + `scenario` + `Subspecialty` + `state` + `practice_setting` + `NPI` + `able_to_contact_office` + `call_date_wday` + `central_number` + `number_of_transfers` + `call_time_minutes` + `will_this_physician_see_children_if_they_ask_about_insurance_say_they_have_blue_cross_blue_shield` + `reason_for_exclusions` + `hold_time_minutes` + `day_of_the_week` + `contacted` + `city` + `gender` + `honorrific` + `Teledermatology` + `languages_spoken` + `CareCredit_accepted` + `Age` + `Division` + `Rural_Urban` + `median_household_income_2022` + `total_under_21` 
## Final formula object created:
## business_days_until_appointment ~ first + last + scenario + Subspecialty + 
##     state + practice_setting + NPI + able_to_contact_office + 
##     call_date_wday + central_number + number_of_transfers + call_time_minutes + 
##     will_this_physician_see_children_if_they_ask_about_insurance_say_they_have_blue_cross_blue_shield + 
##     reason_for_exclusions + hold_time_minutes + day_of_the_week + 
##     contacted + city + gender + honorrific + Teledermatology + 
##     languages_spoken + CareCredit_accepted + Age + Division + 
##     Rural_Urban + median_household_income_2022 + total_under_21
## <environment: 0x7f866243c0a0>
## [1] "formula"
## Filtering data for model...
## Near-zero variance variables removed:
## [1] "contacted"   "honorrific"  "Rural_Urban"
## Variables retained for the model:
##  [1] "scenario"                        "Subspecialty"                   
##  [3] "state"                           "practice_setting"               
##  [5] "NPI"                             "call_date_wday"                 
##  [7] "central_number"                  "number_of_transfers"            
##  [9] "call_time_minutes"               "business_days_until_appointment"
## [11] "hold_time_minutes"               "day_of_the_week"                
## [13] "city"                            "gender"                         
## [15] "languages_spoken"                "CareCredit_accepted"            
## [17] "Age"                             "Division"                       
## [19] "median_household_income_2022"    "total_under_21"

20.1 Specifying Full Model for Predictors

20.2 Single Predictor Models for Evaluating Significance

This analysis screens each candidate predictor of business_days_until_appointment one at a time. The goal is to identify which variables are significantly associated with time to appointment while accounting for variability across individual physicians, who may contribute repeated measures.

Each predictor is fit in its own model and assessed for significance. Predictors that are significant at this stage are carried into the final multivariable model to better estimate their effect on appointment wait times.
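
A minimal sketch of the single-predictor screening loop, using plain `glm()` fits like the calls printed in the output below. The data frame `df_demo` is simulated and the helper `screen_one()` is hypothetical; neither comes from the original analysis. A physician-level random effect would need a mixed-model fitter and is not shown here.

```r
set.seed(1)

# Simulated stand-in for the analysis data. Column names mirror the report;
# the values are invented purely for illustration.
n <- 200
df_demo <- data.frame(
  business_days_until_appointment = rpois(n, lambda = 40),
  scenario = factor(
    sample(c("Teenage Acne", "Toddler Eczema", "Infantile Hemangioma"), n, replace = TRUE),
    levels = c("Teenage Acne", "Toddler Eczema", "Infantile Hemangioma")
  ),
  Subspecialty = factor(
    sample(c("General Dermatology", "Pediatric Dermatology"), n, replace = TRUE)
  ),
  call_time_minutes = runif(n, 1, 10)
)

predictors <- c("scenario", "Subspecialty", "call_time_minutes")

# Fit one Poisson GLM for a single predictor and return IRRs with Wald CIs.
screen_one <- function(var, data) {
  fit <- glm(reformulate(var, response = "business_days_until_appointment"),
             family = poisson(link = "log"), data = data)
  est <- coef(summary(fit))[-1, , drop = FALSE]  # drop the intercept row
  z   <- qnorm(0.975)
  data.frame(
    Predictor = rownames(est),
    IRR       = exp(est[, "Estimate"]),
    CI_Lower  = exp(est[, "Estimate"] - z * est[, "Std. Error"]),
    CI_Upper  = exp(est[, "Estimate"] + z * est[, "Std. Error"]),
    P_Value   = est[, "Pr(>|z|)"],
    row.names = NULL
  )
}

screening <- do.call(rbind, lapply(predictors, screen_one, data = df_demo))
screening[order(screening$P_Value), ]
```

Building each formula with `reformulate()` avoids pasting strings together and handles one predictor per call, so a failure in one fit does not affect the others.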

20.2.1 Preparing Data for Single Predictor Models

##  [1] "scenario"                        "Subspecialty"                   
##  [3] "practice_setting"                "NPI"                            
##  [5] "able_to_contact_office"          "call_date_wday"                 
##  [7] "central_number"                  "number_of_transfers"            
##  [9] "call_time_minutes"               "hold_time_minutes"              
## [11] "city"                            "gender"                         
## [13] "honorrific"                      "Teledermatology"                
## [15] "languages_spoken"                "CareCredit_accepted"            
## [17] "Age"                             "Division"                       
## [19] "Rural_Urban"                     "median_household_income_2022"   
## [21] "total_under_21"                  "business_days_until_appointment"
## 
## Call:
## glm(formula = business_days_until_appointment ~ scenario, family = poisson(link = "log"), 
##     data = df_filtered)
## 
## Coefficients:
##                              Estimate Std. Error z value             Pr(>|z|)
## (Intercept)                   4.49441    0.01062 423.096 < 0.0000000000000002
## scenarioInfantile Hemangioma -0.32501    0.01626 -19.987 < 0.0000000000000002
## scenarioToddler Eczema       -0.08738    0.01470  -5.943         0.0000000028
##                                 
## (Intercept)                  ***
## scenarioInfantile Hemangioma ***
## scenarioToddler Eczema       ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18770  on 318  degrees of freedom
## Residual deviance: 18341  on 316  degrees of freedom
##   (266 observations deleted due to missingness)
## AIC: 20135
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ Subspecialty, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                                   Estimate Std. Error z value
## (Intercept)                        3.97111    0.01431  277.41
## SubspecialtyPediatric Dermatology  0.52024    0.01595   32.62
##                                              Pr(>|z|)    
## (Intercept)                       <0.0000000000000002 ***
## SubspecialtyPediatric Dermatology <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18770  on 318  degrees of freedom
## Residual deviance: 17592  on 317  degrees of freedom
##   (266 observations deleted due to missingness)
## AIC: 19384
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ practice_setting, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                                   Estimate Std. Error z value
## (Intercept)                       4.609343   0.009515  484.44
## practice_settingPrivate Practice -0.389459   0.014703  -26.49
##                                             Pr(>|z|)    
## (Intercept)                      <0.0000000000000002 ***
## practice_settingPrivate Practice <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 12932  on 226  degrees of freedom
## Residual deviance: 12220  on 225  degrees of freedom
##   (358 observations deleted due to missingness)
## AIC: 13516
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ call_date_wday, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                   Estimate Std. Error z value             Pr(>|z|)    
## (Intercept)       4.334407   0.007247 598.109 < 0.0000000000000002 ***
## call_date_wday.L -0.318682   0.018310 -17.404 < 0.0000000000000002 ***
## call_date_wday.Q -0.078145   0.017245  -4.531           0.00000586 ***
## call_date_wday.C  0.069436   0.014827   4.683           0.00000283 ***
## call_date_wday^4  0.193470   0.014065  13.755 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18770  on 318  degrees of freedom
## Residual deviance: 18218  on 314  degrees of freedom
##   (266 observations deleted due to missingness)
## AIC: 20016
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ central_number, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                    Estimate Std. Error z value            Pr(>|z|)    
## (Intercept)        4.381149   0.008554 512.202 <0.0000000000000002 ***
## central_numberYes -0.030871   0.012663  -2.438              0.0148 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18770  on 318  degrees of freedom
## Residual deviance: 18764  on 317  degrees of freedom
##   (266 observations deleted due to missingness)
## AIC: 20556
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ number_of_transfers, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                                            Estimate Std. Error z value
## (Intercept)                                 4.16515    0.01420  293.32
## number_of_transfersOne transfer             0.24737    0.01647   15.02
## number_of_transfersTwo transfers            0.23070    0.02195   10.51
## number_of_transfersMore than two transfers  0.37947    0.02538   14.95
##                                                       Pr(>|z|)    
## (Intercept)                                <0.0000000000000002 ***
## number_of_transfersOne transfer            <0.0000000000000002 ***
## number_of_transfersTwo transfers           <0.0000000000000002 ***
## number_of_transfersMore than two transfers <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18770  on 318  degrees of freedom
## Residual deviance: 18454  on 315  degrees of freedom
##   (266 observations deleted due to missingness)
## AIC: 20250
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ call_time_minutes, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                   Estimate Std. Error z value            Pr(>|z|)    
## (Intercept)       4.019668   0.015732  255.51 <0.0000000000000002 ***
## call_time_minutes 0.108935   0.004365   24.96 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18770  on 318  degrees of freedom
## Residual deviance: 18145  on 317  degrees of freedom
##   (266 observations deleted due to missingness)
## AIC: 19937
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ hold_time_minutes, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                   Estimate Std. Error z value            Pr(>|z|)    
## (Intercept)       4.315117   0.007798  553.38 <0.0000000000000002 ***
## hold_time_minutes 0.052915   0.004393   12.04 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18643  on 315  degrees of freedom
## Residual deviance: 18503  on 314  degrees of freedom
##   (269 observations deleted due to missingness)
## AIC: 20278
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ city, family = poisson(link = "log"), 
##     data = df_filtered)
## 
## Coefficients:
##                                 Estimate            Std. Error z value
## (Intercept)         3.663561646129641414  0.060522753247079744  60.532
## cityNew Hyde Park   0.904252753314679913  0.073437779560899585  12.313
## citySan Francisco   0.610786974746050482  0.077351078049734304   7.896
## cityAnn Arbor       0.948726007627312740  0.071284243538940362  13.309
## cityAtlanta         0.325422400434626780  0.099175817871493099   3.281
## cityBaltimore       1.414732296440427417  0.072253969159906978  19.580
## cityBronx          -0.719122666963202928  0.129695406231088217  -5.545
## cityBrooklyn       -0.845163387858570481  0.105478387481150626  -8.013
## cityChapel Hill     0.375583366737843483  0.074966331262983521   5.010
## cityChicago         1.102268599871224497  0.069853766346363602  15.780
## cityCincinnati      0.730887508542799269  0.072154095775287225  10.130
## cityClackamas       0.671548068756488203  0.076461818964563250   8.783
## cityCleveland       0.841788204576238486  0.074193155255440651  11.346
## cityDetroit         0.805215914952894796  0.071383950696594964  11.280
## cityDurham          0.620024915730986281  0.084328402822986509   7.353
## cityGainesville     1.284184559417679283  0.068390075621893587  18.777
## cityGilbert         0.737722805148654626  0.070890066152408462  10.407
## cityHouston         0.444028142842479490  0.083381516805677805   5.325
## cityKansas City     0.762780872318751246  0.071817629949189513  10.621
## cityLittle Rock    -0.994351278343695744  0.116464560101418022  -8.538
## cityMemphis         1.134979932838989125  0.080064076887575802  14.176
## cityMesa            0.514664400073160389  0.074690779032001511   6.891
## cityMiami          -0.791882021245630563  0.114441975610784494  -6.920
## cityMilwaukee       0.896872646017058295  0.070499597183888721  12.722
## cityMinneapolis     1.486110689983824562  0.066243239823534389  22.434
## cityNew York        0.546712383099519306  0.074285201799593720   7.360
## cityOakland         0.784954729813072793  0.074915428313521187  10.478
## cityPalo Alto       0.016949558313777650  0.109847007194083149   0.154
## cityPhoenix         0.798026811380426748  0.071456316050560090  11.168
## cityPortland        0.613104472886524698  0.084470256565076507   7.258
## cityProvidence      0.105360515657831555  0.106561303262087681   0.989
## cityRichmond        1.093716318055908410  0.069928517256106548  15.640
## cityRochester       0.715335095535312382  0.082448771360472226   8.676
## citySacramento      0.903387327038252974  0.070439370804318113  12.825
## citySaint Louis     0.993568278789160542  0.072424320096892536  13.719
## citySan Diego       0.529118816813320958  0.081756661483090326   6.472
## citySeattle         0.942857759258880712  0.070081688844473050  13.454
## cityTucson          1.235343684023308741  0.068760209328589852  17.966
## cityWashington     -0.157003748809659477  0.116888851595432539  -1.343
## cityAustin          0.325422400434632053  0.091063896744015008   3.574
## cityColumbus        0.487478259769003941  0.079294864755653519   6.148
## cityJacksonville   -0.127444946568117390  0.092289612593322246  -1.381
## cityMadison         1.087150845149109424  0.080900235110970969  13.438
## citySalt Lake City  0.148026961976734373  0.078230467962357025   1.892
## cityWorcester       1.066771682812456268  0.081266106903155777  13.127
## cityAlbuquerque     0.961411167154628488  0.092547093119776755  10.388
## cityEl Paso        -0.202300023724837325  0.090262442872669935  -2.241
## cityIndianapolis   -1.717651497074332623  0.274028420221657576  -6.268
## cityAurora          0.911149332373741627  0.071664201039853589  12.714
## cityBoston          0.000000000000005458  0.171184196997364840   0.000
## cityDallas         -0.619039208406217178  0.165748386025719063  -3.735
## cityMarlton        -1.977162692559410129  0.201742510883212683  -9.800
## cityNew Haven       0.796582767808191483  0.080951695911892579   9.840
## cityOmaha           0.156346070390697311  0.089716044121927324   1.743
## citySilver Spring   0.566915090417038692  0.085436334758523858   6.636
## cityDenver         -1.717651497074328848  0.382779501172344827  -4.487
## cityMonroe          1.523824159711113069  0.096174819154380767  15.844
##                                Pr(>|z|)    
## (Intercept)        < 0.0000000000000002 ***
## cityNew Hyde Park  < 0.0000000000000002 ***
## citySan Francisco   0.00000000000000287 ***
## cityAnn Arbor      < 0.0000000000000002 ***
## cityAtlanta                    0.001033 ** 
## cityBaltimore      < 0.0000000000000002 ***
## cityBronx           0.00000002944514786 ***
## cityBrooklyn        0.00000000000000112 ***
## cityChapel Hill     0.00000054422312780 ***
## cityChicago        < 0.0000000000000002 ***
## cityCincinnati     < 0.0000000000000002 ***
## cityClackamas      < 0.0000000000000002 ***
## cityCleveland      < 0.0000000000000002 ***
## cityDetroit        < 0.0000000000000002 ***
## cityDurham          0.00000000000019453 ***
## cityGainesville    < 0.0000000000000002 ***
## cityGilbert        < 0.0000000000000002 ***
## cityHouston         0.00000010080949323 ***
## cityKansas City    < 0.0000000000000002 ***
## cityLittle Rock    < 0.0000000000000002 ***
## cityMemphis        < 0.0000000000000002 ***
## cityMesa            0.00000000000555569 ***
## cityMiami           0.00000000000453219 ***
## cityMilwaukee      < 0.0000000000000002 ***
## cityMinneapolis    < 0.0000000000000002 ***
## cityNew York        0.00000000000018441 ***
## cityOakland        < 0.0000000000000002 ***
## cityPalo Alto                  0.877372    
## cityPhoenix        < 0.0000000000000002 ***
## cityPortland        0.00000000000039219 ***
## cityProvidence                 0.322795    
## cityRichmond       < 0.0000000000000002 ***
## cityRochester      < 0.0000000000000002 ***
## citySacramento     < 0.0000000000000002 ***
## citySaint Louis    < 0.0000000000000002 ***
## citySan Diego       0.00000000009679479 ***
## citySeattle        < 0.0000000000000002 ***
## cityTucson         < 0.0000000000000002 ***
## cityWashington                 0.179211    
## cityAustin                     0.000352 ***
## cityColumbus        0.00000000078631917 ***
## cityJacksonville               0.167302    
## cityMadison        < 0.0000000000000002 ***
## citySalt Lake City             0.058466 .  
## cityWorcester      < 0.0000000000000002 ***
## cityAlbuquerque    < 0.0000000000000002 ***
## cityEl Paso                    0.025010 *  
## cityIndianapolis    0.00000000036536057 ***
## cityAurora         < 0.0000000000000002 ***
## cityBoston                     1.000000    
## cityDallas                     0.000188 ***
## cityMarlton        < 0.0000000000000002 ***
## cityNew Haven      < 0.0000000000000002 ***
## cityOmaha                      0.081390 .  
## citySilver Spring   0.00000000003233456 ***
## cityDenver          0.00000721270140830 ***
## cityMonroe         < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 17423  on 313  degrees of freedom
## Residual deviance: 11094  on 257  degrees of freedom
##   (271 observations deleted due to missingness)
## AIC: 12963
## 
## Number of Fisher Scoring iterations: 6
## 
## Call:
## glm(formula = business_days_until_appointment ~ gender, family = poisson(link = "log"), 
##     data = df_filtered)
## 
## Coefficients:
##              Estimate Std. Error z value            Pr(>|z|)    
## (Intercept)  4.409704   0.006878  641.11 <0.0000000000000002 ***
## genderMale  -0.475817   0.019763  -24.08 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 17423  on 313  degrees of freedom
## Residual deviance: 16770  on 312  degrees of freedom
##   (271 observations deleted due to missingness)
## AIC: 18529
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ honorrific, family = poisson(link = "log"), 
##     data = df_filtered)
## 
## Coefficients:
##               Estimate Std. Error z value            Pr(>|z|)    
## (Intercept)   4.348082   0.006565  662.27 <0.0000000000000002 ***
## honorrificDO -0.093114   0.036516   -2.55              0.0108 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 17258  on 310  degrees of freedom
## Residual deviance: 17251  on 309  degrees of freedom
##   (274 observations deleted due to missingness)
## AIC: 18995
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ Teledermatology, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                    Estimate Std. Error z value            Pr(>|z|)    
## (Intercept)         4.13134    0.01112 371.686 <0.0000000000000002 ***
## TeledermatologyYes  0.17272    0.01922   8.989 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 10527  on 184  degrees of freedom
## Residual deviance: 10448  on 183  degrees of freedom
##   (400 observations deleted due to missingness)
## AIC: 11444
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ languages_spoken, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                         Estimate Std. Error z value            Pr(>|z|)    
## (Intercept)              3.36198    0.05164  65.104 <0.0000000000000002 ***
## languages_spokenSpanish  0.18700    0.07656   2.443              0.0146 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 997.29  on 21  degrees of freedom
## Residual deviance: 991.36  on 20  degrees of freedom
##   (563 observations deleted due to missingness)
## AIC: 1088.3
## 
## Number of Fisher Scoring iterations: 6
## 
## Call:
## glm(formula = business_days_until_appointment ~ CareCredit_accepted, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                        Estimate Std. Error z value             Pr(>|z|)    
## (Intercept)            4.173326   0.009408 443.593 < 0.0000000000000002 ***
## CareCredit_acceptedYes 0.192664   0.035260   5.464         0.0000000465 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 10527  on 184  degrees of freedom
## Residual deviance: 10499  on 183  degrees of freedom
##   (400 observations deleted due to missingness)
## AIC: 11495
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ Age, family = poisson(link = "log"), 
##     data = df_filtered)
## 
## Coefficients:
##               Estimate Std. Error z value            Pr(>|z|)    
## (Intercept)  5.0887866  0.0296846  171.43 <0.0000000000000002 ***
## Age         -0.0151475  0.0006301  -24.04 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 17713  on 306  degrees of freedom
## Residual deviance: 17102  on 305  degrees of freedom
##   (278 observations deleted due to missingness)
## AIC: 18835
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ Division, family = poisson(link = "log"), 
##     data = df_filtered)
## 
## Coefficients:
##                            Estimate Std. Error z value             Pr(>|z|)    
## (Intercept)                 4.37296    0.01528 286.120 < 0.0000000000000002 ***
## DivisionEast North Central  0.12236    0.02072   5.905  0.00000000353145625 ***
## DivisionEast South Central  0.42558    0.05460   7.795  0.00000000000000644 ***
## DivisionMiddle Atlantic    -0.59613    0.03118 -19.122 < 0.0000000000000002 ***
## DivisionMountain            0.03448    0.02174   1.586               0.1127    
## DivisionNew England         0.45536    0.02837  16.049 < 0.0000000000000002 ***
## DivisionPacific            -0.05321    0.02201  -2.417               0.0156 *  
## DivisionWest North Central  0.27236    0.02331  11.683 < 0.0000000000000002 ***
## DivisionWest South Central -0.80512    0.03691 -21.814 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18770  on 318  degrees of freedom
## Residual deviance: 16733  on 310  degrees of freedom
##   (266 observations deleted due to missingness)
## AIC: 18538
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ median_household_income_2022, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                                   Estimate    Std. Error z value
## (Intercept)                   4.4130839229  0.0165435302 266.756
## median_household_income_2022 -0.0000009136  0.0000001877  -4.868
##                                          Pr(>|z|)    
## (Intercept)                  < 0.0000000000000002 ***
## median_household_income_2022           0.00000113 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 15801  on 277  degrees of freedom
## Residual deviance: 15777  on 276  degrees of freedom
##   (307 observations deleted due to missingness)
## AIC: 17336
## 
## Number of Fisher Scoring iterations: 5
## 
## Call:
## glm(formula = business_days_until_appointment ~ total_under_21, 
##     family = poisson(link = "log"), data = df_filtered)
## 
## Coefficients:
##                    Estimate   Std. Error z value             Pr(>|z|)    
## (Intercept)     4.404275184  0.010227247 430.641 < 0.0000000000000002 ***
## total_under_21 -0.000005249  0.000001020  -5.145          0.000000268 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 18706  on 315  degrees of freedom
## Residual deviance: 18680  on 314  degrees of freedom
##   (269 observations deleted due to missingness)
## AIC: 20452
## 
## Number of Fisher Scoring iterations: 5

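Changing to a Negative Binomial Model Because the Poisson Model Does Not Fit Well

The Poisson fits above are heavily overdispersed: the residual deviance far exceeds the residual degrees of freedom (for example, 18341 on 316 for the scenario model), so the Poisson standard errors are too small and the p-values too optimistic. A negative binomial model adds a dispersion parameter to absorb this extra variance.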
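
A minimal sketch of the dispersion check and the refit, assuming `MASS::glm.nb()` as the negative binomial fitter (the report does not state which function was used). The deviance and degrees of freedom are copied from the scenario output above; the data frame `demo` is simulated for illustration only.

```r
library(MASS)  # provides glm.nb(); MASS ships with standard R distributions

# Dispersion check from the printed scenario model: residual deviance divided
# by residual degrees of freedom. Values near 1 are consistent with Poisson;
# values far above 1 indicate overdispersion.
residual_deviance <- 18341
residual_df       <- 316
residual_deviance / residual_df

# Simulated overdispersed counts (invented data, not the study data).
set.seed(42)
n <- 300
demo <- data.frame(
  Subspecialty = factor(
    rep(c("General Dermatology", "Pediatric Dermatology"), each = n / 2)
  )
)
mu <- ifelse(demo$Subspecialty == "Pediatric Dermatology", 89, 53)
demo$business_days_until_appointment <- rnbinom(n, size = 1, mu = mu)

pois_fit <- glm(business_days_until_appointment ~ Subspecialty,
                family = poisson(link = "log"), data = demo)
nb_fit   <- glm.nb(business_days_until_appointment ~ Subspecialty, data = demo)

# Compare fits, then report IRRs with Wald intervals from the NB model.
AIC(pois_fit, nb_fit)
exp(cbind(IRR = coef(nb_fit), confint.default(nb_fit)))
```

The same pattern shows up in the table that follows: the Subspecialty IRR stays at 1.68, but its interval widens to 1.32–2.14 once the extra variance is modeled.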

## Skipping predictor 'Rural_Urban' because it has only one unique value.
Significant Variables Predicting Number of Business Days until Appointment
| Predictor | P_Value | IRR | CI_Lower | CI_Upper | Wait_Time_Effect | reference_level |
|---|---|---|---|---|---|---|
| SubspecialtyPediatric Dermatology | <0.01 | 1.68 | 1.32 | 2.14 | longer wait time | NA |
| Age | <0.01 | 0.98 | 0.97 | 0.99 | shorter wait time | NA |
| cityMarlton | <0.01 | 0.14 | 0.05 | 0.39 | shorter wait time | NA |
| cityMinneapolis | <0.01 | 4.42 | 1.88 | 10.38 | longer wait time | NA |
| DivisionWest South Central | <0.01 | 0.45 | 0.28 | 0.71 | shorter wait time | NA |
| genderMale | <0.01 | 0.62 | 0.47 | 0.83 | shorter wait time | NA |
| practice_settingPrivate Practice | <0.01 | 0.68 | 0.53 | 0.87 | shorter wait time | NA |
| call_time_minutes | <0.01 | 1.12 | 1.04 | 1.21 | longer wait time | NA |
| cityGainesville | <0.01 | 3.61 | 1.50 | 8.72 | longer wait time | NA |
| cityTucson | <0.01 | 3.44 | 1.42 | 8.31 | longer wait time | NA |
| DivisionMiddle Atlantic | <0.01 | 0.55 | 0.36 | 0.85 | shorter wait time | NA |
| cityBaltimore | <0.01 | 4.12 | 1.47 | 11.55 | longer wait time | NA |
| cityChicago | 0.014 | 3.01 | 1.25 | 7.28 | longer wait time | NA |
| cityRichmond | 0.015 | 2.99 | 1.24 | 7.21 | longer wait time | NA |
| cityIndianapolis | 0.017 | 0.18 | 0.04 | 0.74 | shorter wait time | NA |
| scenarioInfantile Hemangioma | 0.022 | 0.72 | 0.55 | 0.95 | shorter wait time | NA |
| cityLittle Rock | 0.031 | 0.37 | 0.15 | 0.91 | shorter wait time | NA |
| citySeattle | 0.031 | 2.57 | 1.09 | 6.04 | longer wait time | NA |
| citySaint Louis | 0.034 | 2.70 | 1.08 | 6.77 | longer wait time | NA |
| cityAnn Arbor | 0.035 | 2.58 | 1.07 | 6.24 | longer wait time | NA |
| citySacramento | 0.038 | 2.47 | 1.05 | 5.80 | longer wait time | NA |
| call_date_wday.L | 0.040 | 0.73 | 0.54 | 0.98 | shorter wait time | Monday |
| cityMilwaukee | 0.040 | 2.45 | 1.04 | 5.77 | longer wait time | NA |
| cityAurora | 0.043 | 2.49 | 1.03 | 6.01 | longer wait time | NA |
| cityMemphis | 0.050 | 3.11 | 1.00 | 9.69 | longer wait time | NA |
| cityNew Hyde Park | 0.054 | 2.47 | 0.99 | 6.19 | longer wait time | NA |
| cityBrooklyn | 0.057 | 0.43 | 0.18 | 1.02 | shorter wait time | NA |
| cityMadison | 0.061 | 2.97 | 0.95 | 9.24 | longer wait time | NA |
| cityDetroit | 0.065 | 2.24 | 0.95 | 5.26 | longer wait time | NA |
| cityWorcester | 0.066 | 2.91 | 0.93 | 9.05 | longer wait time | NA |
| cityPhoenix | 0.068 | 2.22 | 0.94 | 5.23 | longer wait time | NA |
| cityCleveland | 0.073 | 2.32 | 0.93 | 5.82 | longer wait time | NA |
| number_of_transfersOne transfer | 0.073 | 1.28 | 0.98 | 1.68 | longer wait time | NA |
| cityDenver | 0.076 | 0.18 | 0.03 | 1.20 | shorter wait time | NA |
| cityKansas City | 0.081 | 2.14 | 0.91 | 5.05 | longer wait time | NA |
| cityGilbert | 0.083 | 2.09 | 0.91 | 4.81 | longer wait time | NA |
| cityMonroe | 0.089 | 4.59 | 0.79 | 26.50 | longer wait time | NA |
| cityCincinnati | 0.094 | 2.08 | 0.88 | 4.89 | longer wait time | NA |
| cityOakland | 0.094 | 2.19 | 0.87 | 5.50 | longer wait time | NA |
| cityMiami | 0.097 | 0.45 | 0.18 | 1.15 | shorter wait time | NA |
| number_of_transfersMore than two transfers | 0.107 | 1.46 | 0.92 | 2.32 | longer wait time | NA |
| DivisionNew England | 0.118 | 1.58 | 0.89 | 2.79 | longer wait time | NA |
| call_date_wday^4 | 0.125 | 1.21 | 0.95 | 1.55 | longer wait time | Monday |
| cityNew Haven | 0.131 | 2.22 | 0.79 | 6.24 | longer wait time | NA |
| cityClackamas | 0.152 | 1.96 | 0.78 | 4.91 | longer wait time | NA |
| cityAlbuquerque | 0.153 | 2.62 | 0.70 | 9.79 | longer wait time | NA |
| cityRochester | 0.175 | 2.04 | 0.73 | 5.76 | longer wait time | NA |
| cityBronx | 0.181 | 0.49 | 0.17 | 1.40 | shorter wait time | NA |
| hold_time_minutes | 0.187 | 1.06 | 0.97 | 1.15 | longer wait time | NA |
| citySan Francisco | 0.193 | 1.84 | 0.73 | 4.62 | longer wait time | NA |
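
The IRR and interval columns in a table like this are typically obtained by exponentiating a count model's log-scale coefficients. The sketch below assumes Wald intervals (estimate plus or minus z times the standard error); the source does not show how its intervals were computed, and `irr_summary` and its input values are hypothetical.

```r
# Convert a log-scale coefficient and its standard error into an IRR
# with a Wald confidence interval, then label the direction of the effect.
irr_summary <- function(estimate, std_error, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  irr <- exp(estimate)
  data.frame(
    IRR = irr,
    CI_Lower = exp(estimate - z * std_error),
    CI_Upper = exp(estimate + z * std_error),
    Wait_Time_Effect = ifelse(irr > 1, "longer wait time", "shorter wait time")
  )
}

# Hypothetical coefficient and standard error, chosen only for illustration.
res <- irr_summary(estimate = log(1.5), std_error = 0.1)
print(res)
```

Read this way, an IRR of 1.68 means about 68% more expected business days than the reference level, and an IRR of 0.62 means about 38% fewer.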

Troubleshooting large IRR for academic

The analysis and boxplot make the likely source of the high IRR clearer. The main points are as follows.

Key Insights:

  1. Sample Imbalance:
    • There is a major imbalance in the number of observations between Private Practice (556 cases) and University (47 cases). With so few University cases, the estimate for that group is less stable, especially if the group has greater variability in wait times. This could explain why the estimate for academicUniversity is so large and significant.
  2. Fixed Effects:
    • The model indicates that being at a University is associated with a longer wait time, with an Estimate of 13.905 (p = 0.00124). This suggests that patients at University settings wait, on average, about 13.9 more days than those at private practices.
    • However, given the imbalance in the dataset and the high variance in wait times for University cases, this estimate might be exaggerated. The few outliers seen in the boxplot for University settings could also be contributing.
  3. Random Effects:
    • The random effects (NPI) show variability among individual providers (standard deviation of 17.53). Individual providers therefore still account for a fair amount of the variation in wait times, which is typical in mixed-effects models.

Recommendations to Address the IRR Issue:

  1. Consider Balancing the Dataset:
    • The imbalance between University and Private Practice may lead to inflated estimates. You could try down-sampling the larger group (Private Practice) or performing bootstrapping to create a more balanced dataset. This might provide a more realistic estimate for the effect of academicUniversity.
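
A minimal sketch of the down-sampling idea follows, using simulated data. The column `academic` mirrors the predictor named above; `wait_days`, `calls`, `downsample`, and all values are hypothetical. Down-sampling discards observations, so it is best treated as a sensitivity check rather than a replacement for the full-data model.

```r
# Hypothetical toy data; group sizes and values are illustrative only.
set.seed(1)
calls <- data.frame(
  academic = rep(c("Private Practice", "University"), times = c(40, 8)),
  wait_days = c(rpois(40, 10), rpois(8, 20))
)

# Down-sample every group to the size of the smallest group.
downsample <- function(df, group_col) {
  groups <- split(df, df[[group_col]])
  n_min <- min(vapply(groups, nrow, integer(1)))
  sampled <- lapply(groups, function(g) {
    g[sample.int(nrow(g), n_min), , drop = FALSE]
  })
  do.call(rbind, sampled)
}

balanced <- downsample(calls, "academic")
table(balanced$academic)
```

Repeating the draw many times and refitting the model on each balanced sample would show how much the academicUniversity estimate moves from sample to sample.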

20.3 Refitting Full Model Without academic Predictor

## Skipping predictor 'Rural_Urban' because it has only one unique value.
## [1] Predictor        P_Value          IRR              CI_Lower        
## [5] CI_Upper         Wait_Time_Effect
## <0 rows> (or 0-length row.names)
Significant Variables Predicting Number of Business Days until Appointment WITHOUT ACADEMIC
Predictor P_Value IRR CI_Lower CI_Upper Wait_Time_Effect

20.4 Robust LMM With Log-Transformed Business Days and Academic Predictor

20.4.1 Exploring Relationship Between Academic Status and Wait Times


  • Fixed effects include…

  • Random effects account for variability between physicians, modeled as a random intercept.

The random effect for physician suggests that there is substantial variability in appointment wait times between physicians. Physicians with a higher random intercept tend to have longer wait times than physicians with a lower random intercept.
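
As a hedged sketch, a random-intercept model of this kind can be written as follows, where \(Y_{ij}\) is the number of business days for call \(i\) to physician \(j\), \(\mathbf{x}_{ij}\) collects the fixed-effect predictors, and \(u_j\) is the physician-level random intercept. The notation is an assumption for illustration: the source does not show the exact transformation (for example, whether a constant was added to handle zero-day waits) or the full list of fixed effects, and a robust fit relaxes the strict normality assumptions shown here.

\[ \log(Y_{ij}) = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + \epsilon_{ij}, \qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \epsilon_{ij} \sim \mathcal{N}(0, \sigma^2) \]

A physician with \(u_j > 0\) has longer-than-average wait times after accounting for the fixed effects, and a physician with \(u_j < 0\) has shorter ones.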

Poisson Model with only significant variables

Read in data

A PRIORI POWER ANALYSIS

## 
## ============================================
## DERMATOLOGY WAIT TIME POWER ANALYSIS RESULTS
## ============================================
## 
## Analysis type: ANOVA 
## Number of groups: 2 
## Effect size (Cohen's f):0.25 (medium)
## Group sample size: 64 
## Significance level (alpha): 0.05 
## Statistical power: 0.8 
## 
## REQUIRED SAMPLE SIZE:
## ---------------------
## Basic sample size: 128 
## 
## RECOMMENDATION:
## --------------
## To detect an effect size of 0.25 ( medium ) with 80 % power using anova with 2 groups,
##  recruit a total of 128 participants ( 64 per group).
## 
## CONTEXT FROM DERMATOLOGY STUDY:
## ------------------------------
## The original study included 585 total phone calls with 363 (62%) successfully connected.
## City-specific factors accounted for 28% of variability in wait times (ICC = 0.28).
## Consider these findings when designing similar studies.
## 
## ============================================
## DERMATOLOGY WAIT TIME POWER ANALYSIS RESULTS
## ============================================
## 
## Analysis type: ANOVA 
## Number of groups: 3 
## Effect size (Cohen's f):0.2 (small)
## Group sample size: 107 
## Significance level (alpha): 0.05 
## Statistical power: 0.9 
## 
## REQUIRED SAMPLE SIZE:
## ---------------------
## Basic sample size: 320 
## 
## RECOMMENDATION:
## --------------
## To detect an effect size of 0.2 ( small ) with 90 % power using anova with 3 groups,
##  recruit a total of 320 participants ( 107 per group).
## 
## CONTEXT FROM DERMATOLOGY STUDY:
## ------------------------------
## The original study included 585 total phone calls with 363 (62%) successfully connected.
## City-specific factors accounted for 28% of variability in wait times (ICC = 0.28).
## Consider these findings when designing similar studies.
## 
## ============================================
## DERMATOLOGY WAIT TIME POWER ANALYSIS RESULTS
## ============================================
## 
## Analysis type: Regression 
## Number of predictors: 8 
## Effect size (f²):0.15 (medium)
## Significance level (alpha): 0.05 
## Statistical power: 0.8 
## 
## REQUIRED SAMPLE SIZE:
## ---------------------
## Basic sample size: 109 
## 
## RECOMMENDATION:
## --------------
## To detect an effect size of 0.15 ( medium ) with 80 % power using regression with 8 predictors,
##  recruit a total of 109 participants.
## 
## CONTEXT FROM DERMATOLOGY STUDY:
## ------------------------------
## The original study included 585 total phone calls with 363 (62%) successfully connected.
## City-specific factors accounted for 28% of variability in wait times (ICC = 0.28).
## Consider these findings when designing similar studies.

STATISTICAL SUPPLEMENT

Introduction

This supplement describes the power analyses conducted for our study, “The Pediatric Dermatology Drought: Are General Dermatologists Leaving Kids Behind?” We conducted a series of a priori power analyses to determine the sample size required to detect clinically meaningful differences in appointment wait times between general and pediatric dermatologists. The analyses covered ANOVA and linear regression models. All calculations were performed in the R statistical environment (version 4.3.3) with the pwr and simr packages.

Methods

ANOVA-Based Power Analysis (Primary Outcome)

To compare the number of business days until the first available appointment between general and pediatric dermatologists, we used a one-way analysis of variance (ANOVA) framework. This approach allowed us to assess whether the mean wait time differed significantly by subspecialty.

The ANOVA model used can be represented by:

\[ Y_{ij} = \mu + \alpha_i + \epsilon_{ij} \]

Where:

  • \(Y_{ij}\) is the number of business days until an appointment for the \(j\)th patient in the \(i\)th group (general or pediatric dermatologist)
  • \(\mu\) is the overall mean wait time
  • \(\alpha_i\) is the effect of the \(i\)th group (with \(\sum \alpha_i = 0\))
  • \(\epsilon_{ij}\) is the random error term, assumed to follow \(\mathcal{N}(0, \sigma^2)\)

We specified an effect size of \(f = 0.25\) (Cohen’s medium effect size), a Type I error rate of \(\alpha = 0.05\), and a desired power of 0.80. These inputs yielded a required total sample size of 128 participants, or 64 per group.
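
The calculation above can be sketched in base R with the noncentral F distribution. This is an illustration, not the code used for the study (the supplement names the pwr package); `anova_power` and `anova_n_per_group` are hypothetical helper names, and balanced groups are assumed.

```r
# Power of a balanced one-way ANOVA with k groups, n per group, and Cohen's f.
anova_power <- function(n, k, f, alpha = 0.05) {
  df1 <- k - 1
  df2 <- k * (n - 1)
  ncp <- k * n * f^2                      # noncentrality parameter
  f_crit <- qf(1 - alpha, df1, df2)       # critical value under the null
  pf(f_crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

# Smallest whole-number per-group n that reaches the target power.
anova_n_per_group <- function(k, f, alpha = 0.05, power = 0.80) {
  n <- 2
  while (anova_power(n, k, f, alpha) < power) n <- n + 1
  n
}

n_group <- anova_n_per_group(k = 2, f = 0.25)
c(per_group = n_group, total = 2 * n_group)
```

With two groups, \(f = 0.25\), \(\alpha = 0.05\), and power 0.80, this search should return 64 per group, matching the 128 total reported above.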

Regression-Based Power Analysis

We also conducted a regression-based power analysis to account for multiple predictors beyond physician type. This approach is appropriate for modeling multiple practice- and physician-level characteristics that may influence wait times.

The regression model can be expressed as:

\[ Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_k X_{ik} + \epsilon_i \]

Where:

  • \(Y_i\) is the wait time in business days for the \(i\)th observation
  • \(\beta_0\) is the intercept
  • \(\beta_1 \dots \beta_k\) are coefficients for \(k\) predictor variables (e.g., subspecialty, practice setting, physician gender)
  • \(X_{i1} \dots X_{ik}\) are the corresponding predictor variables
  • \(\epsilon_i \sim \mathcal{N}(0, \sigma^2)\) is the error term

We assumed a medium effect size (\(f^2 = 0.15\)), 8 predictors, \(\alpha = 0.05\), and a desired power of 0.80. The estimated sample size was 109 participants.
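
A comparable base R sketch for the regression case follows. It assumes the overall F test of the model with noncentrality \(f^2 (u + v + 1)\), where \(u\) is the number of predictors and \(v\) the error degrees of freedom. `regression_power` and `regression_n` are hypothetical helper names, not functions from the pwr package.

```r
# Power of the overall F test in a linear regression with u predictors,
# v error degrees of freedom, and Cohen's f^2.
regression_power <- function(v, u, f2, alpha = 0.05) {
  ncp <- f2 * (u + v + 1)                 # noncentrality parameter
  f_crit <- qf(1 - alpha, u, v)           # critical value under the null
  pf(f_crit, u, v, ncp = ncp, lower.tail = FALSE)
}

# Smallest total sample size (u + v + 1) that reaches the target power.
regression_n <- function(u, f2, alpha = 0.05, power = 0.80) {
  v <- 1
  while (regression_power(v, u, f2, alpha) < power) v <- v + 1
  u + v + 1
}

n_total <- regression_n(u = 8, f2 = 0.15)
n_total
```

Under these assumptions the search should reproduce the 109 participants reported above.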

Alternative Scenario: Comparing Three Clinical Scenarios

To evaluate power in a design comparing wait times across three different pediatric dermatology clinical scenarios, we conducted an additional ANOVA-based power analysis with 3 groups, assuming an effect size of \(f = 0.20\) (small effect) and power of 0.90. This yielded a required total sample size of 320 participants, or about 107 per group (three equal groups of 107 would total 321).

Results

All power analyses were completed successfully. The following table summarizes the required sample sizes under each scenario:

| Analysis Type | Effect Size | Alpha | Power | Groups/Predictors | Required Total Sample Size |
|---|---|---|---|---|---|
| ANOVA (2 groups) | f = 0.25 (medium) | 0.05 | 0.80 | 2 | 128 (64 per group) |
| ANOVA (3 groups) | f = 0.20 (small) | 0.05 | 0.90 | 3 | 320 (107 per group) |
| Regression | f² = 0.15 (medium) | 0.05 | 0.80 | 8 predictors | 109 |

The original study sample exceeded these thresholds, with 585 total phone calls and 363 successful connections (62%). These results indicate that the study was sufficiently powered to detect medium or larger effects in primary and secondary outcomes.

City-specific variation accounted for 28% of the variability in wait times (intraclass correlation coefficient = 0.28), underscoring the importance of modeling multilevel data structures.
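
For reference, the intraclass correlation in a two-level random-intercept model is conventionally defined as the share of total variance attributable to the grouping level. The symbols below are assumptions for illustration; the source does not report the individual variance components.

\[ \mathrm{ICC} = \frac{\sigma^2_{\text{city}}}{\sigma^2_{\text{city}} + \sigma^2_{\text{residual}}} \]

An ICC of 0.28 therefore means that between-city variance makes up 28% of the total variance in wait times, so calls within the same city cannot be treated as fully independent observations.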

Interpretation

To detect a medium effect (\(f = 0.25\)) in mean wait times between general and pediatric dermatologists, a total of 128 participants (64 per group) would be sufficient. For analyses involving additional clinical scenarios or predictors, larger sample sizes were required. The observed study sample was well above these thresholds, suggesting robust statistical power for primary and secondary analyses.

References

  1. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Lawrence Erlbaum Associates; 1988.
  2. Faul F, Erdfelder E, Lang AG, Buchner A. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods. 2007;39(2):175-191.
  3. Lenth RV. Some practical guidelines for effective sample size determination. The American Statistician. 2001;55(3):187-193.
  4. Gelman A, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press; 2006.
  5. Peng CYJ, Long H, Abaci S. Power analysis software for educational researchers. The Journal of Experimental Education. 2012;80(2):113-136.
