The goal of this analysis is to understand whether healthcare infrastructure, particularly diagnostic capacity and resource availability, meaningfully influences hospital length of stay, and to what extent targeted investments can help reduce unnecessary inpatient days. The dataset originates from Kaggle and captures healthcare infrastructure indicators for 32 countries, including hospital stay duration, MRI unit availability, and CT scanner capacity. While the dataset reflects broad international health system characteristics, it is used here for analytical and educational purposes rather than for evaluating any specific country or hospital network.
By examining variation in imaging resources and their relationship to patient stay duration, the project investigates how differences in diagnostic capacity may contribute to delays or extended hospitalizations. To quantify these relationships and support evidence‑based planning, we develop a regression model that links infrastructure indicators to hospital stay length, allowing us to assess which investments, such as additional MRI or CT units in different countries, are most strongly associated with reduced inpatient time. This modeling approach provides both interpretability and predictive insight, forming a foundation for evaluating how strategic healthcare investments can improve patient flow, reduce costs, and alleviate logistical pressures across the system.
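The regression output reported further down indicates a log-log specification with country indicators. As a sketch of that model, assuming each observation $i$ belongs to one country $c(i)$ (the symbols $\alpha$, $\beta_1$, $\beta_2$, $\gamma$, and $\varepsilon$ are notation introduced here for illustration, not taken from the original analysis):

$$
\log(\text{Hospital\_Stay}_i) = \alpha + \beta_1 \log(\text{MRI\_Units}_i) + \beta_2 \log(\text{CT\_Scanners}_i) + \gamma_{c(i)} + \varepsilon_i
$$

Here $\gamma_{c(i)}$ is the country effect measured relative to an omitted reference country, and $\beta_1$ and $\beta_2$ are the coefficients on the two imaging variables.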
| Statistic | Hospital_Stay | MRI_Units | CT_Scanners |
|---|---|---|---|
| Min | 3.40 | 0.10 | 1.48 |
| 1st Quartile | 5.80 | 4.07 | 10.33 |
| Median | 6.65 | 8.77 | 15.38 |
| Mean | 7.14 | 10.57 | 19.65 |
| 3rd Quartile | 7.50 | 13.88 | 26.59 |
| Max | 32.70 | 55.21 | 111.49 |
The summary statistics offer a snapshot of healthcare capacity across the countries in the dataset. Hospital stays average 7.14 days, and the middle half of observations falls between 5.8 and 7.5 days, so the typical stay is short to moderate; the maximum of 32.7 days pulls the mean above the median of 6.65. MRI availability varies widely, from as few as 0.1 units to over 55, yet the median of 8.77 units indicates that the typical observation has a moderate level of imaging resources. CT scanner counts show an even broader spread, ranging from 1.48 to more than 111 units, with a median of 15.38, reflecting substantial differences in diagnostic infrastructure across countries. Together, these patterns point to uneven resource distribution, where some health systems operate with minimal imaging capacity while others are equipped for high‑volume diagnostic demand.
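The spread described above can be checked directly from the table. The following is a minimal Python sketch that uses only the tabulated values; the dictionary layout and the `spread` helper are illustrative assumptions, and the 1.5 × IQR fence is a common rule of thumb rather than something the original analysis applies.

```python
# Summary statistics copied from the table above (units as reported in the dataset).
summary = {
    "Hospital_Stay": {"q1": 5.80, "median": 6.65, "mean": 7.14, "q3": 7.50, "max": 32.70},
    "MRI_Units": {"q1": 4.07, "median": 8.77, "mean": 10.57, "q3": 13.88, "max": 55.21},
    "CT_Scanners": {"q1": 10.33, "median": 15.38, "mean": 19.65, "q3": 26.59, "max": 111.49},
}


def spread(stats):
    """Return simple spread and skew indicators for one variable (hypothetical helper)."""
    iqr = stats["q3"] - stats["q1"]
    # Rule-of-thumb upper fence: values above it are often treated as outliers.
    upper_fence = stats["q3"] + 1.5 * iqr
    return {
        "iqr": iqr,
        "mean_minus_median": stats["mean"] - stats["median"],
        "upper_fence": upper_fence,
        "max_beyond_fence": stats["max"] > upper_fence,
    }


for name, stats in summary.items():
    result = spread(stats)
    print(
        f"{name}: IQR={result['iqr']:.2f}, "
        f"mean-median={result['mean_minus_median']:.2f}, "
        f"max beyond fence={result['max_beyond_fence']}"
    )
```

For all three variables the mean exceeds the median and the maximum lies well beyond the fence, which is consistent with right-skewed distributions and with modeling these variables in logs.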
##
## ===============================================
## Dependent variable:
## ---------------------------
## log(Hospital_Stay)
## -----------------------------------------------
## LocationAUT 0.320***
## (0.031)
##
## LocationBEL 0.248***
## (0.038)
##
## LocationCAN 0.201***
## (0.037)
##
## LocationCZE 0.132***
## (0.036)
##
## LocationDEU 0.546***
## (0.035)
##
## LocationDNK -0.476***
## (0.055)
##
## LocationESP 0.130***
## (0.045)
##
## LocationEST 0.015
## (0.040)
##
## LocationFIN 0.154***
## (0.037)
##
## LocationFRA -0.082**
## (0.039)
##
## LocationGBR 0.043
## (0.049)
##
## LocationGRC 0.099***
## (0.038)
##
## LocationHUN -0.052
## (0.041)
##
## LocationIRL 0.074*
## (0.042)
##
## LocationISL 0.151***
## (0.034)
##
## LocationISR -0.230***
## (0.040)
##
## LocationITA 0.293***
## (0.032)
##
## LocationJPN 1.612***
## (0.039)
##
## LocationKOR 0.601***
## (0.031)
##
## LocationLTU 0.183***
## (0.032)
##
## LocationLUX 0.322***
## (0.034)
##
## LocationLVA 0.147***
## (0.030)
##
## LocationNLD 0.098**
## (0.044)
##
## LocationNZL 0.007
## (0.043)
##
## LocationPOL 0.141***
## (0.037)
##
## LocationPRT 0.431***
## (0.054)
##
## LocationRUS 0.454***
## (0.047)
##
## LocationSVK 0.136***
## (0.037)
##
## LocationSVN 0.081*
## (0.043)
##
## LocationTUR -0.329***
## (0.042)
##
## LocationUSA 0.162***
## (0.036)
##
## log(MRI_Units) -0.094***
## (0.012)
##
## log(CT_Scanners) -0.096***
## (0.024)
##
## Constant 2.209***
## (0.073)
##
## -----------------------------------------------
## Observations 518
## R2 0.905
## Adjusted R2 0.899
## Residual Std. Error 0.086 (df = 484)
## F Statistic 140.303*** (df = 33; 484)
## ===============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
##
## Call:
## randomForest(formula = log(Hospital_Stay) ~ Location + MRI_Units + CT_Scanners, data = df, ntree = 300, mtry = 2, importance = TRUE)
## Type of random forest: regression
## Number of trees: 300
## No. of variables tried at each split: 2
##
## Mean of squared residuals: 0.006480983
## % Var explained: 91.2
Japan has the highest number of MRI units among the countries in the dataset, followed by the United States and Germany. These countries therefore have comparatively greater MRI availability, which can influence diagnostic capacity and potentially affect hospital stay durations. Oddly enough, Japan also records the longest hospital stays in the dataset, which on the surface may suggest that greater availability of diagnostic equipment is associated with longer stays. However, other factors such as population size, admission practices, and the structure of the healthcare system itself may contribute to this pattern. The scatter plot, which shows a weak or nearly non-existent relationship between hospital stay and MRI units, supports the idea that external and structural factors affect hospital stay durations.
Because both the dependent variable and the imaging variables are in logs, the coefficients on log(MRI_Units) and log(CT_Scanners) are interpreted as elasticities. The coefficient on log(MRI_Units) is -0.094, which implies that a 1% increase in MRI units is associated with about a 0.094% decrease in average hospital stay, holding country and CT scanners constant. Similarly, the coefficient on log(CT_Scanners) is -0.096, meaning a 1% increase in CT scanners is associated with about a 0.096% decrease in hospital stay, holding country and MRI units constant. Scaled up, a 10% increase in CT scanners corresponds to roughly a 0.96% reduction in hospital stay, and a 10% increase in MRI units to roughly a 0.94% reduction. These results suggest that expanding diagnostic capacity through more MRI and CT units is systematically associated with shorter inpatient stays, consistent with faster diagnosis and treatment reducing time spent in hospital. They describe associations and do not by themselves establish a causal effect.
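The elasticity reading above is a linear approximation that works best for small changes. As a minimal Python sketch (the function name is an illustrative assumption; the coefficients are those reported in the regression table), the exact change implied by a log-log coefficient for a given percentage increase in the predictor is `(1 + p/100)^beta - 1`:

```python
# Coefficients as reported in the regression table.
BETA_MRI = -0.094
BETA_CT = -0.096


def pct_change_in_stay(beta, pct_increase):
    """Exact % change in the outcome implied by a log-log coefficient.

    beta: coefficient on log(x) in a model for log(y)
    pct_increase: percentage increase in x (e.g. 10 for +10%)
    """
    ratio = 1.0 + pct_increase / 100.0
    return (ratio ** beta - 1.0) * 100.0


for label, beta in [("MRI_Units", BETA_MRI), ("CT_Scanners", BETA_CT)]:
    for pct in (1, 10, 50):
        change = pct_change_in_stay(beta, pct)
        print(f"{label}: +{pct}% units -> {change:.3f}% change in hospital stay")
```

For a 10% increase the exact figures are slightly smaller in magnitude than the linear approximation (about 0.89% for MRI units and 0.91% for CT scanners), and the gap widens for larger changes.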
The country fixed effects reveal substantial cross‑national differences in average hospital stay even after controlling for MRI and CT capacity. Because the dependent variable is in logs, a country coefficient b corresponds to a proportional difference of exp(b) - 1 relative to the omitted reference country. Several countries exhibit longer stays relative to the reference category, including Japan, Korea, Germany, Italy, Austria, and the United States. Japan stands out with the largest positive coefficient (1.612), indicating hospital stays about five times the baseline level. Korea and Germany also show large increases, with stays roughly 80% and 70% longer, respectively. The United States displays a more moderate but still significant increase of about 18%. In contrast, countries such as Denmark, Turkey, Israel, and France show significantly shorter stays, with Denmark exhibiting the largest reduction (approximately 38% below the reference level). These cross-national patterns suggest that structural, cultural, or policy differences across health systems play a major role in shaping hospital stay duration beyond diagnostic capacity alone.
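The percentages quoted above can be reproduced from the reported coefficients. This is a minimal Python sketch, assuming the coefficients are log-point differences relative to the omitted reference country; the dictionary holds a selection of values copied from the regression table, and the helper name is hypothetical.

```python
import math

# Selected country coefficients copied from the regression table
# (log points relative to the omitted reference country).
coefs = {
    "JPN": 1.612,
    "KOR": 0.601,
    "DEU": 0.546,
    "USA": 0.162,
    "FRA": -0.082,
    "ISR": -0.230,
    "TUR": -0.329,
    "DNK": -0.476,
}


def pct_vs_reference(coef):
    """Convert a log-point coefficient into a % difference from the reference."""
    return (math.exp(coef) - 1.0) * 100.0


# Print countries from the longest to the shortest stays relative to the reference.
for country, coef in sorted(coefs.items(), key=lambda kv: kv[1], reverse=True):
    multiplier = math.exp(coef)
    print(f"{country}: x{multiplier:.2f} ({pct_vs_reference(coef):+.1f}%)")
```

Under this conversion Japan's coefficient maps to a multiplier of about 5, while Denmark's maps to a multiplier of about 0.62, that is, stays roughly 38% shorter than in the reference country.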
Notably, model performance is strong, with an R-squared of 0.905 and an adjusted R-squared of 0.899, indicating that the model explains roughly 90% of the variation in log hospital stay across observations. The F‑statistic confirms that the full set of predictors is jointly highly significant. This high explanatory power reflects both the substantial contribution of country‑level differences and the meaningful, systematic relationship between diagnostic resources and inpatient duration.
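The fit statistics in the regression output are mutually consistent, which can be verified with the standard formulas for adjusted R-squared and the overall F-statistic. The sketch below is in Python and uses only values from the output (518 observations, 33 predictors, R-squared of 0.905); the function names are illustrative assumptions, and small discrepancies are expected because the reported R-squared is rounded to three decimals.

```python
# Values reported in the regression output.
N_OBS = 518
N_PREDICTORS = 33  # 31 country indicators + 2 log imaging variables
R2 = 0.905


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k predictors and an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F-statistic for the joint significance of all k predictors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Residual degrees of freedom should match the df = 484 shown in the output.
print("residual df:", N_OBS - N_PREDICTORS - 1)
print("adjusted R2:", round(adjusted_r2(R2, N_OBS, N_PREDICTORS), 3))
print("F statistic:", round(f_statistic(R2, N_OBS, N_PREDICTORS), 1))
```

The recomputed adjusted R-squared agrees with the reported 0.899, and the recomputed F-statistic lands close to the reported 140.303, with the small difference attributable to rounding in the R-squared input.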