(Weight: ±25%)
Students are required to:
(Weight: ±20%)
Students are required to:
(Weight: ±20%)
Students are required to apply an advanced analytical approach that is appropriate to the dataset, such as:
(Weight: ±25%)
Students are required to:
(Weight: ±10%)
Students are required to:
Number of observations after cleaning: 4636
Number of columns: 16
tibble [4,636 × 16] (S3: tbl_df/tbl/data.frame)
$ date : Date[1:4636], format: "2010-01-01" "2010-01-02" ...
$ patient_visits : num [1:4636] 3.11e+14 3.34e+13 4.59e+14 3.55e+13 3.59e+14 ...
$ staff_workload : num [1:4636] 2.70e+14 3.23e+14 3.90e+14 3.49e+14 3.41e+14 ...
$ avg_treatment_cost : num [1:4636] 1.49e+14 NA 8.21e+12 4.93e+13 1.29e+14 ...
$ bed_occupancy_rate : num [1:4636] 6.51e+14 8.06e+14 6.91e+14 5.31e+14 7.21e+14 ...
$ treatment_intensity : num [1:4636] 1.02e+14 4.71e+14 6.23e+14 4.16e+13 8.91e+14 ...
$ operational_cost : num [1:4636] 2.11e+14 2.19e+14 2.12e+14 2.34e+14 2.61e+14 ...
$ num_procedures : num [1:4636] 3.83e+14 4.98e+14 3.65e+14 2.46e+14 5.71e+13 ...
$ patient_satisfaction: num [1:4636] 7.31e+14 7.76e+14 6.96e+14 6.32e+12 8.40e+14 ...
$ efficiency_index : num [1:4636] 1.15e+14 1.03e+14 1.18e+14 1.02e+14 1.05e+14 ...
$ clinical_noise : num [1:4636] -8.88e+14 3.60e+14 4.22e+14 -7.43e+14 1.80e+14 ...
$ revenue : num [1:4636] 3.14e+14 2.18e+14 2.51e+14 1.93e+14 2.96e+14 ...
$ profit : num [1:4636] 1.14e+14 8.74e+14 9.17e+14 8.09e+14 1.04e+14 ...
$ patient_category : chr [1:4636] "High Risk" "High Risk" "High Risk" "Low Risk" ...
$ hospital_region : chr [1:4636] "Central" "East" "West" "East" ...
$ churn : chr [1:4636] "No" "Yes" "No" "No" ...
date patient_visits staff_workload
Min. :2010-01-01 Min. :3.148e+11 Min. :2.423e+11
1st Qu.:2013-05-13 1st Qu.:2.805e+14 1st Qu.:2.512e+14
Median :2016-10-16 Median :3.402e+14 Median :3.068e+14
Mean :2016-10-25 Mean :3.162e+14 Mean :2.875e+14
3rd Qu.:2020-04-06 3rd Qu.:3.913e+14 3rd Qu.:3.540e+14
Max. :2023-09-09 Max. :5.912e+14 Max. :5.441e+14
avg_treatment_cost bed_occupancy_rate treatment_intensity
Min. :5.999e+10 Min. :5.618e+10 Min. :1.001e+11
1st Qu.:1.188e+14 1st Qu.:5.483e+14 1st Qu.:2.904e+14
Median :3.047e+14 Median :6.329e+14 Median :5.324e+14
Mean :3.755e+14 Mean :5.850e+14 Mean :5.195e+14
3rd Qu.:6.459e+14 3rd Qu.:7.050e+14 3rd Qu.:7.793e+14
Max. :9.994e+14 Max. :9.814e+14 Max. :9.999e+14
NA's :281
operational_cost num_procedures patient_satisfaction
Min. :8.000e+04 Min. :2.202e+11 Min. :1.000e+02
1st Qu.:1.606e+14 1st Qu.:1.990e+14 1st Qu.:6.461e+14
Median :1.933e+14 Median :3.326e+14 Median :7.335e+14
Mean :1.835e+14 Mean :3.322e+14 Mean :6.681e+14
3rd Qu.:2.233e+14 3rd Qu.:4.759e+14 3rd Qu.:8.081e+14
Max. :9.896e+14 Max. :9.981e+14 Max. :9.990e+14
NA's :327
efficiency_index clinical_noise revenue
Min. :1.011e+11 Min. :-9.996e+14 Min. :3.129e+10
1st Qu.:1.056e+14 1st Qu.:-2.436e+14 1st Qu.:2.181e+14
Median :1.111e+14 Median :-8.603e+12 Median :2.581e+14
Mean :1.400e+14 Mean : 1.522e+12 Mean :2.409e+14
3rd Qu.:1.168e+14 3rd Qu.: 2.563e+14 3rd Qu.:2.967e+14
Max. :9.998e+14 Max. : 9.998e+14 Max. :9.423e+14
profit patient_category hospital_region churn
Min. :1.137e+11 Length:4636 Length:4636 Length:4636
1st Qu.:1.087e+14 Class :character Class :character Class :character
Median :6.674e+14 Mode :character Mode :character Mode :character
Mean :5.086e+14
3rd Qu.:8.574e+14
Max. :1.000e+15
Data valid for regression. Rows available: 4043
# A tibble: 6 × 6
term estimate std.error statistic p.value significance
<chr> <dbl> <dbl> <dbl> <dbl> <chr>
1 (Intercept) 3.50e+14 3.07e+13 11.4 1.26e-29 "***"
2 patient_visits -1.12e- 1 5.24e- 2 -2.14 3.26e- 2 "*"
3 avg_treatment_cost 4.16e- 1 2.06e- 2 20.2 1.43e-85 "***"
4 operational_cost 9.94e- 2 8.13e- 2 1.22 2.22e- 1 ""
5 bed_occupancy_rate 2.13e- 2 2.98e- 2 0.713 4.76e- 1 ""
6 efficiency_index 3.91e- 2 3.43e- 2 1.14 2.55e- 1 ""
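As a quick sanity check on the table above, the `statistic` column can be reproduced as estimate divided by standard error. The sketch below is written in Python purely for illustration (the report itself was produced in R) and uses the rounded values as printed, so small discrepancies in the third significant digit are expected.

```python
# Estimates and standard errors copied from the regression table above
# (rounded to three significant figures, as printed).
coefficients = {
    "(Intercept)":        (3.50e14, 3.07e13),
    "patient_visits":     (-1.12e-1, 5.24e-2),
    "avg_treatment_cost": (4.16e-1, 2.06e-2),
    "operational_cost":   (9.94e-2, 8.13e-2),
    "bed_occupancy_rate": (2.13e-2, 2.98e-2),
    "efficiency_index":   (3.91e-2, 3.43e-2),
}

# t statistic = estimate / standard error
t_stats = {term: est / se for term, (est, se) in coefficients.items()}

for term, t in t_stats.items():
    print(f"{term:<20} t = {t:7.3f}")
```

Only `avg_treatment_cost` and, more weakly, `patient_visits` have statistics large enough in absolute value to be flagged as significant, which matches the `significance` column.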
Dataset RMSE MAE
1 Train 3.510826e+14 3.15377e+14
2 Test 3.537675e+14 3.16967e+14
Data valid. Rows available: 4262
No Yes
3927 335
Train set: 2984 rows
Test set: 1278 rows
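The row counts above are consistent with a stratified 70/30 split in which each class's training share is rounded up. Both the 70% proportion and the rounding rule are assumptions inferred from the numbers, not something the output states. A minimal Python sketch of that arithmetic:

```python
# Class counts before splitting, taken from the churn table above.
class_counts = {"No": 3927, "Yes": 335}

def ceil_fraction(n, numerator=7, denominator=10):
    """Return ceil(n * numerator / denominator) using integer arithmetic."""
    return -(-n * numerator // denominator)

# Assumed rule: take 70% of each class for training, rounding up.
train = {label: ceil_fraction(n) for label, n in class_counts.items()}
test = {label: n - train[label] for label, n in class_counts.items()}

print("train:", train, "total", sum(train.values()))
print("test: ", test, "total", sum(test.values()))
```

Under these assumptions the totals come to 2984 training rows and 1278 test rows, and the test set contains 1178 "No" and 100 "Yes" cases, which agrees with the reference counts in the confusion matrix.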
# A tibble: 7 × 6
term estimate std.error statistic p.value significance
<chr> <dbl> <dbl> <dbl> <dbl> <chr>
1 (Intercept) -2.58e+ 0 3.81e- 1 -6.76 1.38e-11 "***"
2 patient_visits -1.58e-16 5.69e-16 -0.278 7.81e- 1 ""
3 avg_treatment_cost 1.40e-15 2.17e-16 6.48 9.29e-11 "***"
4 operational_cost -4.01e-16 8.76e-16 -0.457 6.47e- 1 ""
5 bed_occupancy_rate -4.45e-17 3.32e-16 -0.134 8.94e- 1 ""
6 patient_satisfaction -4.90e-16 2.59e-16 -1.89 5.84e- 2 "."
7 efficiency_index -1.97e-16 3.98e-16 -0.495 6.21e- 1 ""
Confusion Matrix:
Prediction Reference Freq
1 No No 1178
2 Yes No 0
3 No Yes 100
4 Yes Yes 0
Evaluation Metrics:
Metric Value
Accuracy Accuracy 0.9217527
Kappa Kappa 0.0000000
Sensitivity Sensitivity 0.0000000
Specificity Specificity 1.0000000
Precision Precision NA
Recall Recall 0.0000000
F1 F1 NA
AUC 0.6508913
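The metrics above follow directly from the four confusion-matrix cells. The sketch below recomputes them in Python, treating "Yes" as the positive class (an assumption that matches the reported sensitivity of 0 and specificity of 1). The helper name `safe_div` is illustrative only.

```python
# Cell counts from the confusion matrix above, with "Yes" as the positive class.
tn = 1178  # predicted No,  reference No
fp = 0     # predicted Yes, reference No
fn = 100   # predicted No,  reference Yes
tp = 0     # predicted Yes, reference Yes
total = tn + fp + fn + tp

def safe_div(num, den):
    """Return num / den, or None when the denominator is zero (R prints NA)."""
    return num / den if den else None

accuracy = (tp + tn) / total
sensitivity = safe_div(tp, tp + fn)   # same as recall
specificity = safe_div(tn, tn + fp)
precision = safe_div(tp, tp + fp)     # 0 / 0, hence NA in the report

# Cohen's kappa: agreement beyond what the marginal totals alone would give.
expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total**2
kappa = (accuracy - expected) / (1 - expected)

print(f"accuracy={accuracy:.7f} kappa={kappa:.7f}")
print(f"sensitivity={sensitivity} specificity={specificity} precision={precision}")
```

Because the model never predicts "Yes", its accuracy equals the share of "No" cases in the test set (1178 of 1278). That is why Kappa is zero and precision and F1 are undefined, even though accuracy looks high.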
Clustering data clean. Observations: 4262 Variables: 5
# A tibble: 3 × 7
cluster patient_visits_mean avg_treatment_cost_mean operational_cost_mean
<fct> <dbl> <dbl> <dbl>
1 1 3.16e14 3.97e14 1.81e14
2 2 3.22e14 6.97e14 1.81e14
3 3 3.10e14 1.45e14 1.85e14
# ℹ 3 more variables: bed_occupancy_rate_mean <dbl>,
# patient_satisfaction_mean <dbl>, n_obs <int>
Data valid. Rows available: 4636
Dataset runs from: 14610 to 19609
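The two numbers in the line above are R `Date` values printed as raw integers, that is, days since 1970-01-01. A small Python sketch of the conversion (the function name `from_r_date_number` is hypothetical):

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)  # R stores Date values as days since this day

def from_r_date_number(days):
    """Convert a days-since-epoch integer to a calendar date."""
    return EPOCH + timedelta(days=days)

start = from_r_date_number(14610)
end = from_r_date_number(19609)
print(start.isoformat(), end.isoformat())  # 2010-01-01 2023-09-09
```

These dates match the minimum and maximum of the `date` column in the summary output.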
Augmented Dickey-Fuller Test
data: ts_profit
Dickey-Fuller = -15.974, Lag order = 16, p-value = 0.01
alternative hypothesis: stationary
The data are stationary.
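The reported lag order of 16 is consistent with the default used by `adf.test` in the R package tseries, which sets the lag to the truncated cube root of the series length minus one. The sketch below assumes that default was left unchanged and that the tested series has all 4636 observations.

```python
n_obs = 4636  # length of the profit series reported above

# Default lag order in tseries::adf.test: trunc((n - 1)^(1/3))
lag_order = int((n_obs - 1) ** (1 / 3))
print(lag_order)  # 16
```

Note also that `adf.test` interpolates its p-value from a table bounded below by 0.01, so the printed 0.01 is best read as "0.01 or smaller".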
Series: ts_train
ARIMA(5,1,0)
Coefficients:
ar1 ar2 ar3 ar4 ar5
-0.8504 -0.6615 -0.5032 -0.3423 -0.1622
s.e. 0.0162 0.0207 0.0218 0.0207 0.0162
sigma^2 = 153129: log likelihood = -27387.03
AIC=54786.06 AICc=54786.08 BIC=54823.37
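The information criteria above can be checked from the log likelihood. The sketch below assumes the parameter count is six (five AR coefficients plus the innovation variance) and backs out the approximate number of observations from the reported BIC. Rounding in the printed values makes that figure approximate.

```python
import math

log_likelihood = -27387.03  # reported by the ARIMA(5,1,0) summary
bic_reported = 54823.37     # reported by the ARIMA(5,1,0) summary
k = 5 + 1                   # five AR coefficients plus the innovation variance

# AIC = -2 * logLik + 2k
aic = -2 * log_likelihood + 2 * k
print(f"AIC = {aic:.2f}")

# BIC = -2 * logLik + k * ln(n), so n can be backed out approximately.
n_implied = math.exp((bic_reported + 2 * log_likelihood) / k)
print(f"Implied number of observations: about {n_implied:.0f}")
```

The AIC reproduces the reported 54786.06. The implied sample size is roughly 3,700, which would correspond to about 80% of the 4636 available rows being used for training. That split proportion is an inference, not something the output states.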
Training set error measures:
ME RMSE MAE MPE MAPE MASE
Training set 0.09331779 391.0008 345.7127 -506.767 547.346 0.8714264
ACF1
Training set -0.02368917
ME RMSE MAE MPE MAPE ACF1 Theil's U
Test set -78.23293 391.1101 385.1575 -552.4264 587.2661 0.01419953 1.771845
INTERPRETATION & NOTES:
1. The dataset is indexed by daily dates rather than months (the first observations are 2010-01-01 and 2010-01-02); profit has been rescaled for numerical stability.
2. The ADF test rejects the unit-root null (p-value = 0.01), so the series is treated as stationary and suitable for ARIMA modelling.
3. Time-series components (trend, seasonality, noise) can be examined with decompose.
4. The ARIMA order was selected automatically by auto.arima.
5. Forecast evaluation: MAE, RMSE, and MAPE summarise model performance.
6. The ARIMA forecast intervals indicate the uncertainty of the predictions.
7. Potential sources of uncertainty: outliers, external factors, structural changes.
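To make note 5 concrete, the sketch below computes MAE, RMSE, and MAPE on a handful of hypothetical values (not taken from the dataset). Two of the actual values are deliberately close to zero to illustrate one plausible reason a MAPE can reach several hundred percent while MAE and RMSE stay moderate, assuming the scaled profit series also contains values near zero.

```python
import math

# Hypothetical actual and forecast values (not taken from the dataset);
# two of the actual values are deliberately close to zero.
actual = [120.0, -80.0, 0.5, 200.0, -1.0]
forecast = [100.0, -60.0, 30.0, 180.0, 25.0]

errors = [a - f for a, f in zip(actual, forecast)]

# Mean absolute error
mae = sum(abs(e) for e in errors) / len(errors)
# Root mean squared error
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
# Mean absolute percentage error (undefined if any actual value is exactly 0)
mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / len(errors)

print(f"MAE={mae:.2f} RMSE={rmse:.2f} MAPE={mape:.1f}%")
```

In this toy example the absolute errors are all of similar size, yet the two near-zero actual values dominate the MAPE. For a series like this, MAE, RMSE, or MASE are usually the more informative measures.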
Students are required to:
Integrate the findings from all analyses performed.
Highlight the most important patterns, trends, or relationships found.
Translate the analytical results into meaningful insights.
Provide data-driven conclusions supported by evidence.
Offer actionable and realistic recommendations based on the analysis.