project stage 3

Author

Azra Ozcirpan and Ozge Yilmaz

Final Report

library(farff)
library(tidyverse)
Warning: package 'ggplot2' was built under R version 4.5.2
Warning: package 'purrr' was built under R version 4.5.2
Warning: package 'dplyr' was built under R version 4.5.2
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.1     ✔ readr     2.1.5
✔ forcats   1.0.1     ✔ stringr   1.5.2
✔ ggplot2   4.0.2     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidymodels)
── Attaching packages ────────────────────────────────────── tidymodels 1.4.1 ──
✔ broom        1.0.10     ✔ rsample      1.3.2 
✔ dials        1.4.2      ✔ tailor       0.1.0 
✔ infer        1.1.0      ✔ tune         2.0.1 
✔ modeldata    1.5.1      ✔ workflows    1.3.0 
✔ parsnip      1.4.1      ✔ workflowsets 1.1.1 
✔ recipes      1.3.2      ✔ yardstick    1.4.0 
Warning: package 'infer' was built under R version 4.5.2
Warning: package 'parsnip' was built under R version 4.5.2
Warning: package 'recipes' was built under R version 4.5.2
Warning: package 'rsample' was built under R version 4.5.2
Warning: package 'yardstick' was built under R version 4.5.2
── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
✖ scales::discard() masks purrr::discard()
✖ dplyr::filter()   masks stats::filter()
✖ recipes::fixed()  masks stringr::fixed()
✖ dplyr::lag()      masks stats::lag()
✖ yardstick::spec() masks readr::spec()
✖ recipes::step()   masks stats::step()
library(yardstick)

Economic Question

The purpose of this study is to investigate whether financial ratios can predict corporate bankruptcy. Bankruptcy prediction is an important topic in economics and finance because firm failures affect investors, creditors, employees, and overall economic stability.

Research Question:

Can financial ratios predict whether a firm will go bankrupt one year in advance?

Dataset

The analysis uses the Polish Company Bankruptcy Dataset obtained from the UCI Machine Learning Repository. The dataset contains financial information for firms together with a binary bankruptcy indicator.

To avoid time-series dependence, only the one-year subset of the data was used. After removing observations with missing values, the final dataset contained 3,194 firms.

The dependent variable is:

  • is_bankrupt (Yes/No)

The explanatory variables are:

  • Attr1: Profitability

  • Attr2: Leverage

  • Attr5: Liquidity

  • Attr7: Operating Performance

  • Attr10: Asset Efficiency

These variables were selected because they represent key dimensions of firm financial health.

# Load raw dataset (7,027 observations before cleaning)
bankruptcy_raw <- readARFF("1year.arff")
Parse with reader=readr : 1year.arff
header: 0.003000; preproc: 0.006000; data: 0.059000; postproc: 0.000000; total: 0.068000
# Stage 1 cleaned dataset: used for descriptive statistics only
# Missing values removed -> 3194 observations
data_clean <- bankruptcy_raw %>%
  drop_na() %>%
  mutate(class = as.numeric(as.character(class))) %>%
  rename(is_bankrupt = class)


glimpse(data_clean)
Rows: 3,194
Columns: 65
$ Attr1       <dbl> 0.200550, 0.009020, 0.266690, 0.067731, -0.029182, 0.02808…
$ Attr2       <dbl> 0.37951, 0.63202, 0.34994, 0.19885, 0.21131, 0.24231, 0.23…
$ Attr3       <dbl> 0.396410, 0.053735, 0.611470, 0.081562, 0.452640, 0.432240…
$ Attr4       <dbl> 2.04720, 1.12630, 3.02430, 2.95760, 7.57460, 3.01280, 2.47…
$ Attr5       <dbl> 32.3510, -37.8420, 43.0870, 90.6060, 57.8440, 47.9350, 8.1…
$ Attr6       <dbl> 0.388250, 0.000000, 0.559830, 0.212650, 0.010387, 0.021598…
$ Attr7       <dbl> 0.249760, 0.014434, 0.332070, 0.078063, -0.034653, 0.03972…
$ Attr8       <dbl> 1.33050, 0.58223, 1.85770, 4.02900, 3.73240, 3.10370, 2.54…
$ Attr9       <dbl> 1.13890, 1.33320, 1.12680, 1.25700, 1.02410, 1.01250, 1.05…
$ Attr10      <dbl> 0.50494, 0.36798, 0.65006, 0.80115, 0.78869, 0.75206, 0.61…
$ Attr11      <dbl> 0.249760, 0.043162, 0.332070, 0.078063, -0.034653, 0.03972…
$ Attr12      <dbl> 0.659800, 0.033921, 1.099300, 1.873600, -0.503330, 0.18500…
$ Attr13      <dbl> 0.166600, 0.038938, 0.120470, 0.310360, 0.004191, 0.044190…
$ Attr14      <dbl> 0.249760, 0.014434, 0.332070, 0.078063, -0.034653, 0.03972…
$ Attr15      <dbl> 497.42, 4443.70, 367.04, 926.03, 23292.00, 1059.30, 742.04…
$ Attr16      <dbl> 0.733780, 0.082138, 0.994440, 0.394150, 0.015671, 0.344580…
$ Attr17      <dbl> 2.63490, 1.58220, 2.85770, 5.02900, 4.73240, 4.12700, 4.17…
$ Attr18      <dbl> 0.249760, 0.014434, 0.332070, 0.078063, -0.034653, 0.03972…
$ Attr19      <dbl> 0.149420, 0.010827, 0.114960, 0.309120, -0.043861, 0.02102…
$ Attr20      <dbl> 43.3700, 36.6230, 38.1830, 44.4460, 105.3500, 36.2320, 59.…
$ Attr21      <dbl> 1.24790, 1.07520, 1.05810, 1.18480, 0.99083, 0.78628, 1.11…
$ Attr22      <dbl> 0.214020, 0.030778, 0.304710, 0.053730, -0.050624, 0.05495…
$ Attr23      <dbl> 0.119980, 0.006766, 0.092322, 0.268200, -0.036936, 0.01486…
$ Attr24      <dbl> 0.477060, 0.000222, 0.515860, 0.246040, 0.016014, 0.034470…
$ Attr25      <dbl> 0.50494, 0.34828, 0.65006, 0.80115, 0.78869, 0.75206, 0.61…
$ Attr26      <dbl> 0.604110, 0.073572, 0.807590, 0.342200, 0.041561, 0.296520…
$ Attr27      <dbl> 1.458200, 1.071400, 1.188500, 2.674400, -0.656220, 0.29448…
$ Attr28      <dbl> 1.761500, 0.103190, 7.072800, 0.093025, 0.945950, 1.224400…
$ Attr29      <dbl> 5.9443, 5.9479, 3.9412, 5.2684, 4.8827, 4.4729, 4.4070, 4.…
$ Attr30      <dbl> 0.117880, 0.474050, 0.088635, 0.780700, 0.119350, -0.03304…
$ Attr31      <dbl> 0.149420, 0.010827, 0.114960, 0.309120, -0.043861, 0.02102…
$ Attr32      <dbl> 94.140, 142.090, 43.006, 75.693, 32.575, 42.004, 47.948, 1…
$ Attr33      <dbl> 3.87720, 2.62840, 8.48710, 4.82210, 11.20500, 8.68960, 7.6…
$ Attr34      <dbl> 0.563930, 1.769700, 0.870750, 0.270210, -0.239570, 0.22679…
$ Attr35      <dbl> 0.214020, 0.240130, 0.304710, 0.053730, -0.050624, 0.05495…
$ Attr36      <dbl> 1.74100, 1.33320, 2.91610, 0.27897, 0.80603, 1.93410, 1.78…
$ Attr37      <dbl> 593.27000, 2.75380, 12.77300, 0.58833, 2.05990, 16.67000, …
$ Attr38      <dbl> 0.50591, 0.49344, 0.69793, 0.95834, 0.93115, 0.77962, 0.63…
$ Attr39      <dbl> 0.128040, 0.180110, 0.105480, 0.212760, -0.064076, 0.02908…
$ Attr40      <dbl> 0.662950, 0.072677, 0.325250, 0.048375, 3.250200, 1.428600…
$ Attr41      <dbl> 0.051402, 0.308650, 0.035883, 0.120970, -0.548790, 0.08069…
$ Attr42      <dbl> 0.128040, 0.023085, 0.105480, 0.212760, -0.064076, 0.02908…
$ Attr43      <dbl> 114.420, 122.740, 103.020, 175.190, 137.540, 65.719, 94.99…
$ Attr44      <dbl> 71.050, 86.122, 64.834, 130.740, 32.193, 29.487, 35.718, 3…
$ Attr45      <dbl> 1.009700, 0.067430, 0.882530, 2.202500, -0.127970, 0.14973…
$ Attr46      <dbl> 1.52250, 0.81192, 2.02390, 2.21950, 4.26240, 2.13940, 1.16…
$ Attr47      <dbl> 49.39400, 43.65400, 43.02300, 55.86800, 107.89000, 36.6860…
$ Attr48      <dbl> 0.185300, -0.006701, 0.288790, 0.053417, -0.088588, 0.0111…
$ Attr49      <dbl> 0.110850, -0.005026, 0.099972, 0.211520, -0.112130, 0.0059…
$ Attr50      <dbl> 2.04200, 0.75832, 2.61060, 0.61971, 2.46790, 2.67010, 2.24…
$ Attr51      <dbl> 0.378540, 0.425540, 0.302070, 0.041664, 0.068848, 0.214750…
$ Attr52      <dbl> 0.257920, 0.380460, 0.117830, 0.207380, 0.089246, 0.115080…
$ Attr53      <dbl> 2.24370, 0.70666, 7.51920, 0.91375, 1.64820, 2.13040, 1.32…
$ Attr54      <dbl> 2.24800, 0.94760, 8.07280, 1.09300, 1.94600, 2.20850, 1.37…
$ Attr55      <dbl> 348690.0000, 1.1263, 5340.0000, 15132.0000, 34549.0000, 12…
$ Attr56      <dbl> 0.121960, 0.180110, 0.112500, 0.204440, 0.023565, 0.012367…
$ Attr57      <dbl> 0.397180, 0.024512, 0.410250, 0.084542, -0.037001, 0.03734…
$ Attr58      <dbl> 0.87804, 0.84165, 0.88750, 0.79556, 0.97644, 0.98763, 0.94…
$ Attr59      <dbl> 0.001924, 0.340940, 0.073630, 0.196190, 0.180630, 0.036647…
$ Attr60      <dbl> 8.4160, 9.9665, 9.5593, 8.2122, 3.4646, 10.0740, 6.1579, 8…
$ Attr61      <dbl> 5.1372, 4.2382, 5.6298, 2.7917, 11.3380, 12.3780, 10.2190,…
$ Attr62      <dbl> 82.658, 116.500, 38.168, 60.218, 31.807, 41.485, 45.422, 1…
$ Attr63      <dbl> 4.41580, 3.13300, 9.56290, 6.06130, 11.47500, 8.79840, 8.0…
$ Attr64      <dbl> 7.42770, 2.56030, 33.41300, 0.28803, 1.65110, 5.35230, 3.8…
$ is_bankrupt <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
cat("Number of observations:", nrow(data_clean), "\n")
Number of observations: 3194 
cat("Number of variables:", ncol(data_clean), "\n")
Number of variables: 65 

Two datasets are used at different stages:

  • data_clean (3,194 observations, all NAs removed): used for descriptive statistics and probability analysis in Stage 1.
  • bankruptcy (all 7,027 raw observations, restricted to the five selected predictors and the outcome): used for model estimation in Stage 2. Rows with a missing value in any predictor that a given model uses are dropped automatically by glm() during fitting, as reported in each model summary.

This distinction is intentional: dropping every row that has a missing value in any of the 65 variables gives a fully observed financial profile for descriptive work, while requiring completeness only in the modeled predictors keeps more observations and reduces information loss in the predictive stage.
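The effect of the two cleaning rules can be illustrated on a toy data frame. The sketch below uses base R and made-up values (the data frame `toy` and its columns are hypothetical and not taken from the bankruptcy data) to show that requiring complete rows across all columns discards more observations than requiring completeness only in the modeled columns.

```r
# Toy data: a1 and a2 stand in for modeled predictors, a3 for an unused attribute
toy <- data.frame(
  a1 = c(0.1, 0.2, NA, 0.4, 0.5),
  a2 = c(1.0, 1.1, 1.2, 1.3, 1.4),
  a3 = c(NA, 5, 6, NA, 8)
)

# Rule 1: listwise deletion across every column (the data_clean approach)
all_complete <- toy[complete.cases(toy), ]

# Rule 2: require completeness only in the modeled columns (what glm() does)
model_cols     <- c("a1", "a2")
model_complete <- toy[complete.cases(toy[, model_cols]), ]

nrow(all_complete)    # rows surviving rule 1
nrow(model_complete)  # rows surviving rule 2
```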

Variable Selection

# Rename selected variables for readability
bankruptcy <- bankruptcy_raw %>%
  select(Attr1, Attr2, Attr5, Attr7, Attr10, class) %>%
  rename(
    profitability    = Attr1,
    leverage         = Attr2,
    liquidity        = Attr5,
    operating_perf   = Attr7,
    asset_efficiency = Attr10,
    bankrupt_status  = class
  ) %>%
  mutate(
    bankrupt_status = factor(bankrupt_status,
                             levels = c("0", "1"),
                             labels = c("No", "Yes"))
  )

The five explanatory variables represent key dimensions of firm financial health:

Variable          Attribute  Ratio definition
profitability     Attr1      Net profit / total assets (return on assets)
leverage          Attr2      Total liabilities / total assets
liquidity         Attr5      [(cash + short-term securities + receivables − short-term liabilities) / (operating expenses − depreciation)] × 365
operating_perf    Attr7      EBIT / total assets
asset_efficiency  Attr10     Equity / total assets
## Class Distribution
table(bankruptcy_raw$class)

   0    1 
6756  271 
cat("\nBankruptcy rate (raw dataset):",
    round(mean(as.numeric(as.character(bankruptcy_raw$class)), na.rm = TRUE), 4),
    "\n")

Bankruptcy rate (raw dataset): 0.0386 
cat("Bankruptcy rate (cleaned dataset):",
    round(mean(data_clean$is_bankrupt), 4), "\n")
Bankruptcy rate (cleaned dataset): 0.0094 

Bankrupt firms represent roughly 3.9% of the raw dataset (271 out of 7,027). In the cleaned dataset the rate falls to about 0.9% (roughly 30 of 3,194 firms), so listwise deletion removes bankrupt firms disproportionately and the cleaned sample understates bankruptcy risk. In the raw data the majority class outnumbers the minority class by about 25 to 1, and this severe class imbalance will be the central challenge for all classification models.
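The imbalance figures can be re-derived from the printed class counts alone. The following is a minimal sketch in which the counts and the cleaned-sample rate are typed in by hand from the output above rather than read from the data.

```r
# Class counts reported by table(bankruptcy_raw$class)
counts <- c(No = 6756, Yes = 271)

raw_rate        <- unname(counts["Yes"] / sum(counts))   # share of bankrupt firms
imbalance_ratio <- unname(counts["No"] / counts["Yes"])  # non-bankrupt per bankrupt firm

# Implied number of bankrupt firms in the cleaned data (rate 0.0094, n = 3194)
clean_bankrupt <- round(0.0094 * 3194)

round(raw_rate, 4)
round(imbalance_ratio, 1)
clean_bankrupt
```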

#Bankruptcy distribution in the raw dataset. Non-bankrupt firms vastly outnumber bankrupt firms, confirming severe class imbalance.
bankruptcy %>%
  ggplot(aes(x = bankrupt_status, fill = bankrupt_status)) +
  geom_bar(color = "black") +
  scale_fill_manual(values = c("No" = "cyan", "Yes" = "tomato")) +
  labs(
    title = "Bankruptcy Distribution",
    x     = "Bankruptcy Status",
    y     = "Count"
  ) +
  theme_minimal()

bankruptcy_data <- data_clean %>%
  select(is_bankrupt, Attr1, Attr2, Attr5, Attr7, Attr10) %>%
  mutate(
    is_bankrupt = factor(is_bankrupt,
                         levels = c(0, 1),
                         labels = c("No", "Yes"))
  )

bankruptcy_data %>%
  summarise(
    Mean_Profitability    = mean(Attr1),
    Mean_Leverage         = mean(Attr2),
    Mean_Liquidity        = mean(Attr5),
    Mean_OperatingPerf    = mean(Attr7),
    Mean_AssetEfficiency  = mean(Attr10),
    Observations          = n()
  )
  Mean_Profitability Mean_Leverage Mean_Liquidity Mean_OperatingPerf
1         0.08235786     0.5352492       192.1768         0.09801264
  Mean_AssetEfficiency Observations
1            0.4386079         3194
bankruptcy_data %>%
  summarise(
    Mean          = mean(Attr1,                  na.rm = TRUE),
    Median        = median(Attr1,                na.rm = TRUE),
    Std_Deviation = sd(Attr1,                    na.rm = TRUE),
    Q1            = quantile(Attr1, 0.25,        na.rm = TRUE),
    Q3            = quantile(Attr1, 0.75,        na.rm = TRUE),
    Minimum       = min(Attr1,                   na.rm = TRUE),
    Maximum       = max(Attr1,                   na.rm = TRUE)
  )
        Mean   Median Std_Deviation       Q1       Q3 Minimum Maximum
1 0.08235786 0.065399     0.1175964 0.019102 0.126595 -1.1533  1.5399

Profitability (Attr1) ranges from −1.15 to 1.54 with a mean of 0.08 and a median of 0.07. The gap between mean and median is small, but the wide range from minimum to maximum suggests the presence of outliers on both tails. Negative values indicate firms that are generating losses — a warning sign for financial distress. No outlier trimming or winsorization was applied in this analysis; this is acknowledged as a limitation.

## Comparing Bankrupt vs. Non-Bankrupt Firms
bankruptcy_raw %>%
  group_by(class) %>%
  summarise(
    avg_profitability    = mean(Attr1,  na.rm = TRUE),
    avg_leverage         = mean(Attr2,  na.rm = TRUE),
    avg_liquidity        = mean(Attr5,  na.rm = TRUE),
    avg_operating_perf   = mean(Attr7,  na.rm = TRUE),
    avg_asset_efficiency = mean(Attr10, na.rm = TRUE),
    n                    = n()
  )
# A tibble: 2 × 7
  class avg_profitability avg_leverage avg_liquidity avg_operating_perf
  <fct>             <dbl>        <dbl>         <dbl>              <dbl>
1 0                0.0444        0.490         -248.              0.334
2 1               -0.208         2.30          -636.             -0.193
# ℹ 2 more variables: avg_asset_efficiency <dbl>, n <int>

Bankrupt firms consistently show weaker financial characteristics:

  • Profitability: bankrupt firms average −0.21 vs. +0.04 for non-bankrupt firms, indicating that bankrupt firms are on average loss-making.
  • Leverage: bankrupt firms carry average leverage of 2.30, more than four times the non-bankrupt average of 0.49. A ratio above 1.0 means total liabilities exceed total assets — a direct sign of insolvency risk.
  • Liquidity: both groups show negative average liquidity, but bankrupt firms average −636 compared to −248 for non-bankrupt firms. The negative values reflect the specific construction of this liquidity measure in the dataset (working capital can be negative when current liabilities exceed current assets); what matters is that bankrupt firms are substantially more negative, indicating greater short-term funding pressure.
  • Operating performance and asset efficiency follow the same pattern — bankrupt firms score lower on both.

These differences support the hypothesis that financial ratios contain economically meaningful information about bankruptcy risk.

bankruptcy_data %>%
  ggplot(aes(x = Attr1)) +
  geom_histogram(fill = "pink", color = "red", bins = 30) +
  labs(
    title = "Distribution of Profitability (Attr1)",
    x     = "Profitability (Net Income / Total Assets)",
    y     = "Frequency"
  ) +
  theme_minimal()

The profitability distribution is not normal. Most firms cluster tightly just above zero (the interquartile range runs from 0.02 to 0.13), producing a sharp, narrow central peak that is much taller and thinner than a normal distribution would produce (leptokurtic). Extreme outliers extend in both directions: some firms report deep losses (down to −1.15) and a small number achieve high profitability (up to 1.54), creating heavy tails on both sides. Because the tails are roughly balanced, the mean (0.08) and the median (0.07) are close; the heavy tails mainly inflate the standard deviation, so the median and interquartile range are the more robust summaries of the typical firm. No outlier treatment was applied in this analysis, which is acknowledged as a limitation in Section 6.
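The "tall peak, heavy tails" description can be quantified with excess kurtosis, which is about 0 for a normal distribution and positive for leptokurtic ones. The sketch below defines a hypothetical helper, `excess_kurtosis()`, and applies it to simulated draws only; it was not part of the original analysis.

```r
# Sample excess kurtosis: about 0 for normal data, positive for heavy tails
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  z <- (x - mean(x)) / sqrt(mean((x - mean(x))^2))
  mean(z^4) - 3
}

set.seed(465)
normal_draws <- rnorm(1e5)        # reference: light tails
heavy_draws  <- rt(1e5, df = 3)   # simulated heavy-tailed stand-in

excess_kurtosis(normal_draws)
excess_kurtosis(heavy_draws)
```

On the report's data the corresponding call would be `excess_kurtosis(bankruptcy_data$Attr1)`.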

set.seed(465)
bankruptcy_split <- initial_split(bankruptcy, prop = 0.8)
bankruptcy_train <- training(bankruptcy_split)
bankruptcy_test  <- testing(bankruptcy_split)

cat("Training set:", nrow(bankruptcy_train), "observations\n")
Training set: 5621 observations
cat("Test set:    ", nrow(bankruptcy_test),  "observations\n")
Test set:     1406 observations

An 80/20 split was used. set.seed(465) ensures the split is reproducible. The training sample is used to estimate models; the test sample is reserved for evaluating predictive performance on unseen observations.
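With so few bankrupt firms, a purely random split can leave the training and test sets with slightly different bankruptcy shares. One way to avoid this is to sample within each class. The sketch below does so in base R on simulated data (the `firms` data frame is a hypothetical stand-in) and is not the split used in this report. rsample's `initial_split()` also accepts a `strata` argument, though its default pooling of small groups should be checked for a class this rare.

```r
set.seed(465)

# Simulated stand-in: 5,000 firms with roughly 4% bankruptcies (not the real data)
firms <- data.frame(
  id              = 1:5000,
  bankrupt_status = factor(ifelse(runif(5000) < 0.04, "Yes", "No"),
                           levels = c("No", "Yes"))
)

# Sample 80% of the row indices separately within each class
train_idx <- unlist(lapply(
  split(seq_len(nrow(firms)), firms$bankrupt_status),
  function(idx) idx[sample.int(length(idx), size = round(0.8 * length(idx)))]
), use.names = FALSE)

train <- firms[train_idx, ]
test  <- firms[-train_idx, ]

# Bankruptcy share in each part
mean(train$bankrupt_status == "Yes")
mean(test$bankrupt_status == "Yes")
```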

Model 1: Full Logistic Regression (5 Financial Ratios)

bankruptcy_model1 <- glm(
  bankrupt_status ~ profitability + leverage + liquidity +
                    operating_perf + asset_efficiency,
  data   = bankruptcy_train,
  family = binomial
)
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(bankruptcy_model1)

Call:
glm(formula = bankrupt_status ~ profitability + leverage + liquidity + 
    operating_perf + asset_efficiency, family = binomial, data = bankruptcy_train)

Coefficients:
                   Estimate Std. Error    z value Pr(>|z|)    
(Intercept)      -2.082e+15  9.083e+05 -2.292e+09   <2e-16 ***
profitability     2.078e+13  4.514e+05  4.604e+07   <2e-16 ***
leverage          1.954e+13  1.749e+05  1.117e+08   <2e-16 ***
liquidity         1.684e+08  2.164e+01  7.780e+06   <2e-16 ***
operating_perf   -3.828e+12  2.440e+05 -1.569e+07   <2e-16 ***
asset_efficiency  1.798e+12  6.619e+04  2.716e+07   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance:  1785.4  on 5610  degrees of freedom
Residual deviance: 14994.2  on 5605  degrees of freedom
  (10 observations deleted due to missingness)
AIC: 15006

Number of Fisher Scoring iterations: 11

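Important note on Model 1 (failed convergence): The model produces astronomically large coefficients (e.g., 2.08 × 10¹³) and the warning “fitted probabilities numerically 0 or 1 occurred.” This warning is often associated with complete or quasi-complete separation, where the predictors discriminate the classes so well that the maximum likelihood estimates drift toward ±∞. The output here does not fit that pattern cleanly, however. Under separation the residual deviance approaches zero and the standard errors become very large, whereas Model 1 reports a residual deviance (14,994) far above the null deviance (1,785) and standard errors that are tiny relative to the coefficients. A fit that is worse than the intercept-only model indicates that the Fisher scoring iterations diverged rather than reaching the maximum likelihood estimate, plausibly because of extreme values in some ratios (liquidity has group means in the hundreds, while the other ratios are of order one). Either way, the coefficients, standard errors, and p-values are numerical artifacts, formal significance testing is invalid, and the probability predictions are degenerate (essentially 0 or 1 for all observations). Model 1 is retained here for comparison purposes but should not be used for economic interpretation.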
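A few fields stored on every glm object help tell a healthy fit from a broken one. The sketch below defines a hypothetical helper, `glm_health()`, and runs it on simulated data; it also fits a deliberately separated toy example to show the near-zero residual deviance that true separation produces. None of this was run on the bankruptcy data.

```r
# Hypothetical helper: basic convergence diagnostics for a fitted glm
glm_health <- function(fit) {
  c(converged           = fit$converged,
    boundary            = fit$boundary,
    deviance_above_null = fit$deviance > fit$null.deviance)
}

# Well-behaved simulated logistic regression
set.seed(1)
x <- rnorm(500)
y <- rbinom(500, 1, plogis(-1 + 0.8 * x))
ok_fit <- glm(y ~ x, family = binomial)
glm_health(ok_fit)

# Perfectly separated toy data: y is 0 for x <= 5 and 1 for x >= 6
sep     <- data.frame(x = 1:10, y = rep(c(0, 1), each = 5))
sep_fit <- suppressWarnings(glm(y ~ x, data = sep, family = binomial))
sep_fit$deviance   # close to zero under true separation
```

Applied to the report's objects, `glm_health(bankruptcy_model1)` would be the analogous check.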

Model 2: Simple Logistic Regression (Profitability + Leverage)

bankruptcy_model2 <- glm(
  bankrupt_status ~ profitability + leverage,
  data   = bankruptcy_train,
  family = binomial
)
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(bankruptcy_model2)

Call:
glm(formula = bankrupt_status ~ profitability + leverage, family = binomial, 
    data = bankruptcy_train)

Coefficients:
               Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -3.398194   0.086679 -39.204  < 2e-16 ***
profitability -0.002224   0.011962  -0.186  0.85250    
leverage       0.258193   0.089344   2.890  0.00385 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1786  on 5618  degrees of freedom
Residual deviance: 1770  on 5616  degrees of freedom
  (2 observations deleted due to missingness)
AIC: 1776

Number of Fisher Scoring iterations: 8

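Model 2 produces finite, interpretable estimates, although glm() again warns that some fitted probabilities are numerically 0 or 1, most likely because a few observations have extreme ratio values: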

  • Profitability (coefficient: −0.002, p = 0.853): not statistically significant. While theory predicts lower profitability should raise bankruptcy risk, profitability loses its individual predictive contribution once leverage is in the model — likely due to collinearity between the two, or because the one-year profitability signal is too noisy.
  • Leverage (coefficient: 0.258, p = 0.004): positive and statistically significant. A one-unit increase in leverage (total liabilities / total assets) is associated with a 0.258 increase in the log-odds of bankruptcy. In odds-ratio terms: exp(0.258) ≈ 1.29, meaning each additional unit of leverage multiplies the odds of bankruptcy by roughly 1.3, holding profitability constant. This is consistent with financial distress theory — firms with heavier debt burdens are more vulnerable to insolvency.

The modest reduction from null deviance (1,786) to residual deviance (1,770) reflects the difficulty of predicting a rare event with only two predictors.
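The odds-ratio arithmetic for leverage can be reproduced from the printed estimate and standard error. The sketch below also adds an approximate 95% Wald interval, which was not reported in the original output and is shown only as an illustration of how the uncertainty carries over to the odds-ratio scale.

```r
# Leverage estimate and standard error copied from summary(bankruptcy_model2)
beta_leverage <- 0.258193
se_leverage   <- 0.089344

odds_ratio <- exp(beta_leverage)

# Approximate 95% Wald interval on the log-odds scale, then exponentiated
z  <- qnorm(0.975)
ci <- exp(beta_leverage + c(-1, 1) * z * se_leverage)

round(odds_ratio, 2)
round(ci, 2)
```

An interval whose lower bound lies above 1 is consistent with the significant positive coefficient reported for leverage.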

## Probability Analysis

mean_profitability <- mean(bankruptcy_train$profitability, na.rm = TRUE)

prob_scenarios <- tibble(
  profitability = mean_profitability,
  leverage      = c(0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50, 3.00)
)

prob_scenarios <- prob_scenarios %>%
  mutate(
    predicted_prob     = predict(bankruptcy_model2,
                                 newdata = prob_scenarios,
                                 type    = "response"),
    predicted_prob_pct = round(predicted_prob * 100, 3)
  )

prob_scenarios %>%
  select(leverage, predicted_prob_pct) %>%
  rename(
    `Leverage (liabilities/assets)`        = leverage,
    `Predicted Bankruptcy Probability (%)` = predicted_prob_pct
  )
# A tibble: 8 × 2
  `Leverage (liabilities/assets)` `Predicted Bankruptcy Probability (%)`
                            <dbl>                                  <dbl>
1                            0.25                                   3.44
2                            0.5                                    3.66
3                            0.75                                   3.90
4                            1                                      4.15
5                            1.5                                    4.69
6                            2                                      5.31
7                            2.5                                    5.99
8                            3                                      6.76

The table illustrates how predicted bankruptcy probability increases with leverage, holding profitability at its sample mean (≈ 0.04). A firm with leverage of 0.50 faces a predicted bankruptcy probability of 3.66%, while a firm with leverage of 2.00 faces 5.31% — nearly a 45% relative increase. Although the absolute probabilities remain low due to the rarity of bankruptcy in the sample (~3.9%), the monotonic increase across leverage levels confirms the economic relevance of the leverage coefficient estimated in Model 2.
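The scenario probabilities follow directly from the inverse-logit formula, p = 1 / (1 + exp(−(β₀ + β₁·profitability + β₂·leverage))). The sketch below recomputes them by hand from the printed coefficients. It assumes profitability is fixed at 0.04, the approximate mean quoted above, so the last decimals may differ slightly from predict().

```r
# Coefficients copied from summary(bankruptcy_model2)
b0 <- -3.398194   # intercept
b1 <- -0.002224   # profitability
b2 <-  0.258193   # leverage

# Assumption: profitability held at 0.04, the approximate training mean
profitability <- 0.04
leverage      <- c(0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50, 3.00)

log_odds <- b0 + b1 * profitability + b2 * leverage
prob_pct <- 100 * plogis(log_odds)   # inverse logit, in percent

round(prob_pct, 2)

# Relative increase in predicted risk from leverage 0.50 to 2.00
rel_increase <- prob_pct[leverage == 2] / prob_pct[leverage == 0.5] - 1
rel_increase
```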

# Model 1 predictions
m1_probs <- predict(bankruptcy_model1, bankruptcy_test, type = "response")
m1_pred  <- factor(ifelse(m1_probs > 0.5, "Yes", "No"), levels = c("No", "Yes"))

# Model 2 predictions
m2_probs <- predict(bankruptcy_model2, bankruptcy_test, type = "response")
m2_pred  <- factor(ifelse(m2_probs > 0.5, "Yes", "No"), levels = c("No", "Yes"))
# Confusion matrix — Model 1
cm1 <- table(Predicted = m1_pred, Actual = bankruptcy_test$bankrupt_status)
cat("Confusion Matrix — Model 1:\n"); print(cm1)
Confusion Matrix — Model 1:
         Actual
Predicted   No  Yes
      No  1342   62
      Yes    1    0
# Confusion matrix — Model 2
cm2 <- table(Predicted = m2_pred, Actual = bankruptcy_test$bankrupt_status)
cat("\nConfusion Matrix — Model 2:\n"); print(cm2)

Confusion Matrix — Model 2:
         Actual
Predicted   No  Yes
      No  1343   62
      Yes    0    0
# Helper function: compute metrics with "Yes" as the positive class
compute_metrics <- function(cm) {
  TP <- cm["Yes", "Yes"]
  TN <- cm["No",  "No"]
  FP <- cm["Yes", "No"]
  FN <- cm["No",  "Yes"]
  accuracy  <- (TP + TN) / sum(cm)
  precision <- if ((TP + FP) == 0) NA else TP / (TP + FP)
  recall    <- if ((TP + FN) == 0) NA else TP / (TP + FN)
  list(accuracy = accuracy, precision = precision, recall = recall)
}

m1_metrics <- compute_metrics(cm1)
m2_metrics <- compute_metrics(cm2)

tibble(
  Model     = c("Model 1: Full (5 ratios)",
                "Model 2: Simple (profitability + leverage)"),
  Accuracy  = round(c(m1_metrics$accuracy,  m2_metrics$accuracy),  3),
  Precision = c(round(m1_metrics$precision, 3),
                # Report the numeric value whenever precision is defined
                ifelse(is.na(m2_metrics$precision),
                       "NA (no positive predictions)",
                       round(m2_metrics$precision, 3))),
  Recall    = round(c(m1_metrics$recall,    m2_metrics$recall),    3)
)
# A tibble: 2 × 4
  Model                                      Accuracy Precision           Recall
  <chr>                                         <dbl> <chr>                <dbl>
1 Model 1: Full (5 ratios)                      0.955 0                        0
2 Model 2: Simple (profitability + leverage)    0.956 NA (no positive pr…      0

Interpreting these results honestly:

Both models achieve approximately 95.5–95.6% accuracy on the test set. However, a trivial “always predict No” baseline evaluated on the same 1,405 scored test firms achieves 1,343 / 1,405 ≈ 95.6% accuracy. Model 2 matches this baseline exactly, because it never predicts “Yes”, and Model 1 falls one observation short, because its single “Yes” prediction is wrong. (On the full raw dataset the same baseline would score 6,756 / 7,027 ≈ 96.1%.) Accuracy is therefore uninformative here: neither model improves on the trivial baseline at the 0.5 threshold.
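On the scored test set itself, the baseline comparison can be checked from the two confusion matrices. This is a minimal sketch with the counts typed in by hand from the output above; `accuracy()` here is a small local helper defined for the illustration.

```r
# Confusion matrices typed in from the test-set output (rows = Predicted, cols = Actual)
cm1 <- matrix(c(1342, 1, 62, 0), nrow = 2,
              dimnames = list(Predicted = c("No", "Yes"), Actual = c("No", "Yes")))
cm2 <- matrix(c(1343, 0, 62, 0), nrow = 2,
              dimnames = list(Predicted = c("No", "Yes"), Actual = c("No", "Yes")))

# Local helper: share of observations on the diagonal
accuracy <- function(cm) sum(diag(cm)) / sum(cm)

# "Always predict No" is correct whenever the firm is actually "No"
baseline <- sum(cm2[, "No"]) / sum(cm2)

round(c(model1 = accuracy(cm1), model2 = accuracy(cm2), baseline = baseline), 4)
```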

The economically important metrics are precision and recall for the positive (bankrupt) class:

  • Model 1 predicted “Yes” exactly once — for a firm that was actually not bankrupt (FP = 1, TP = 0). It missed all 62 actual bankruptcies. Recall = 0/62 = 0; Precision = 0/1 = 0.
  • Model 2 never predicted “Yes” at all. Recall = 0; Precision is undefined (NA), which is meaningfully different from Precision = 0: NA means the model never raised the alarm, while 0 would mean it raised the alarm but was always wrong. In practice, both outcomes are equivalent failures for a credit-risk or regulatory application.

In any real financial context — a bank, a regulator, or a credit rating agency — missing 62 out of 62 bankrupt firms is catastrophic. The asymmetric cost structure of bankruptcy prediction (false negatives are far more costly than false positives) means recall is the primary performance metric, and both models fail entirely on this dimension at the 0.5 threshold.

Why does this happen? Only about 3.9% of firms in the data are bankrupt, so a model with weak predictors produces fitted probabilities close to that base rate. Model 2 assigns probabilities of roughly 0.03–0.07 to almost all observations (see the probability table above), far below the 0.5 cutoff, and Model 1's degenerate fit places nearly every firm at a probability of essentially 0. Classifying at 0.5 therefore labels virtually every firm “No”. This is a structural consequence of class imbalance combined with a fixed 0.5 threshold, not a coding mistake.
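How the cutoff drives recall can be seen on simulated data. The sketch below fits a logistic regression to a made-up rare outcome (the variables `score` and `actual` are hypothetical, not the bankruptcy data) and evaluates recall and precision at several cutoffs. Lowering the cutoff can only keep or increase recall, usually at the cost of precision.

```r
set.seed(465)

# Simulated stand-in: rare outcome weakly driven by a single score (not the real data)
n      <- 5000
score  <- rnorm(n)
actual <- rbinom(n, 1, plogis(-3.4 + 0.8 * score))

fit   <- glm(actual ~ score, family = binomial)
probs <- fitted(fit)

# Recall and precision for the positive class at a given cutoff
metrics_at <- function(cutoff) {
  pred <- as.integer(probs > cutoff)
  tp <- sum(pred == 1 & actual == 1)
  fp <- sum(pred == 1 & actual == 0)
  fn <- sum(pred == 0 & actual == 1)
  c(cutoff    = cutoff,
    recall    = tp / (tp + fn),
    precision = if (tp + fp == 0) NA_real_ else tp / (tp + fp))
}

sweep <- t(sapply(c(0.5, 0.2, 0.1, 0.05), metrics_at))
sweep
```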

Cross-Validation

logistic_spec <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification")

set.seed(465)
bankruptcy_folds <- vfold_cv(bankruptcy_train, v = 5)
# IMPORTANT: yardstick's binary metrics default to the FIRST factor level
# as the "event" (positive class). Our factor levels are c("No","Yes"),
# so by default "No" is treated as the positive class.
# We must set event_level = "second" throughout to evaluate performance
# on the bankrupt (Yes) class, which is the economically relevant outcome.
cv_model1 <- fit_resamples(
  logistic_spec,
  bankrupt_status ~ profitability + leverage + liquidity +
                    operating_perf + asset_efficiency,
  resamples = bankruptcy_folds,
  metrics   = metric_set(accuracy, precision, recall),
  control   = control_resamples(event_level = "second")
)
→ A | warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
There were issues with some computations   A: x1
→ B | warning: While computing binary `precision()`, no predicted events were detected (i.e.
               `true_positive + false_positive = 0`).
               Precision is undefined in this case, and `NA` will be returned.
               Note that 40 true event(s) actually occurred for the problematic event level,
               Yes
→ C | warning: While computing binary `precision()`, no predicted events were detected (i.e.
               `true_positive + false_positive = 0`).
               Precision is undefined in this case, and `NA` will be returned.
               Note that 42 true event(s) actually occurred for the problematic event level,
               Yes
→ D | warning: While computing binary `precision()`, no predicted events were detected (i.e.
               `true_positive + false_positive = 0`).
               Precision is undefined in this case, and `NA` will be returned.
               Note that 30 true event(s) actually occurred for the problematic event level,
               Yes
→ E | warning: While computing binary `precision()`, no predicted events were detected (i.e.
               `true_positive + false_positive = 0`).
               Precision is undefined in this case, and `NA` will be returned.
               Note that 51 true event(s) actually occurred for the problematic event level,
               Yes
There were issues with some computations   A: x9   B: x1   C: x1   D: x1   E: x1
collect_metrics(cv_model1)
# A tibble: 3 × 6
  .metric   .estimator    mean     n  std_err .config        
  <chr>     <chr>        <dbl> <int>    <dbl> <chr>          
1 accuracy  binary     0.963       5  0.00306 pre0_mod0_post0
2 precision binary     0.667       1 NA       pre0_mod0_post0
3 recall    binary     0.00870     5  0.00870 pre0_mod0_post0
cv_model2 <- fit_resamples(
  logistic_spec,
  bankrupt_status ~ profitability + leverage,
  resamples = bankruptcy_folds,
  metrics   = metric_set(accuracy, precision, recall),
  control   = control_resamples(event_level = "second")
)
→ A | warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
→ B | warning: While computing binary `precision()`, no predicted events were detected (i.e.
               `true_positive + false_positive = 0`).
               Precision is undefined in this case, and `NA` will be returned.
               Note that 30 true event(s) actually occurred for the problematic event level,
               Yes
→ C | warning: While computing binary `precision()`, no predicted events were detected (i.e.
               `true_positive + false_positive = 0`).
               Precision is undefined in this case, and `NA` will be returned.
               Note that 51 true event(s) actually occurred for the problematic event level,
               Yes
collect_metrics(cv_model2)
# A tibble: 3 × 6
  .metric   .estimator    mean     n std_err .config        
  <chr>     <chr>        <dbl> <int>   <dbl> <chr>          
1 accuracy  binary     0.962       5 0.00311 pre0_mod0_post0
2 precision binary     0.167       3 0.167   pre0_mod0_post0
3 recall    binary     0.00870     5 0.00870 pre0_mod0_post0
full_metrics   <- collect_metrics(cv_model1) %>%
  mutate(model = "Full (5 ratios)")
simple_metrics <- collect_metrics(cv_model2) %>%
  mutate(model = "Simple (profitability + leverage)")

comparison <- bind_rows(full_metrics, simple_metrics) %>%
  filter(.metric %in% c("accuracy", "precision", "recall")) %>%
  select(model, .metric, mean, std_err)

comparison
# A tibble: 6 × 4
  model                             .metric      mean  std_err
  <chr>                             <chr>       <dbl>    <dbl>
1 Full (5 ratios)                   accuracy  0.963    0.00306
2 Full (5 ratios)                   precision 0.667   NA      
3 Full (5 ratios)                   recall    0.00870  0.00870
4 Simple (profitability + leverage) accuracy  0.962    0.00311
5 Simple (profitability + leverage) precision 0.167    0.167  
6 Simple (profitability + leverage) recall    0.00870  0.00870

Interpreting the cross-validation results — with corrected event level:

With event_level = "second", all metrics now evaluate performance on the bankrupt (Yes) class — the economically relevant outcome.

Cross-validation confirms the test-set findings. Both models achieve high cross-validated accuracy (~96%) but near-zero mean recall (0.0087) across the five folds. This consistency means the models’ failure to detect bankrupt firms is structural, not a fluke of one particular train/test split. The models are not overfitting; they predict the majority class for nearly every observation because, with a bankruptcy base rate of roughly 4%, the fitted probabilities almost never exceed the 0.5 threshold.

Note: Without event_level = "second", yardstick defaults to treating “No” as the positive class, which would produce misleadingly high recall (~1.0) because the models correctly predict non-bankruptcy for almost all firms. This is a common pitfall in imbalanced binary classification in R.
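The effect of the event level can be reproduced on a toy example. The sketch below assumes a made-up vector of 96 non-bankrupt and 4 bankrupt firms (not the project data) and a model that always predicts "No". It uses yardstick's recall_vec() with its event_level argument.

```r
library(yardstick)

# Hypothetical toy data, NOT the project data: 96 "No", 4 "Yes"
truth    <- factor(c(rep("No", 96), rep("Yes", 4)), levels = c("No", "Yes"))
# A model that never predicts bankruptcy
estimate <- factor(rep("No", 100), levels = c("No", "Yes"))

# Default event level: the first factor level ("No") is treated as the event
recall_no  <- recall_vec(truth, estimate, event_level = "first")
# Corrected event level: the second factor level ("Yes") is the event
recall_yes <- recall_vec(truth, estimate, event_level = "second")

cat("Recall with 'No' as event: ", recall_no,  "\n")
cat("Recall with 'Yes' as event:", recall_yes, "\n")
```

The same predictions give perfect recall under the default and zero recall once "Yes" is the event, which is the pattern described in the note.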

recall_full   <- comparison %>%
  filter(model == "Full (5 ratios)", .metric == "recall") %>%
  pull(mean)
recall_simple <- comparison %>%
  filter(model == "Simple (profitability + leverage)", .metric == "recall") %>%
  pull(mean)
se_full   <- comparison %>%
  filter(model == "Full (5 ratios)", .metric == "recall") %>%
  pull(std_err)
se_simple <- comparison %>%
  filter(model == "Simple (profitability + leverage)", .metric == "recall") %>%
  pull(std_err)

se_diff <- sqrt(se_full^2 + se_simple^2)
diff    <- recall_full - recall_simple

cat("Recall — Full model:   ", round(recall_full,   4), "\n")
Recall — Full model:    0.0087 
cat("Recall — Simple model: ", round(recall_simple, 4), "\n")
Recall — Simple model:  0.0087 
cat("Difference in recall:  ", round(diff,          4), "\n")
Difference in recall:   0 
cat("SE of difference:      ", round(se_diff,       4), "\n")
SE of difference:       0.0123 
cat("Difference / SE:       ", round(diff / se_diff, 2), "\n")
Difference / SE:        0 

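The two models have identical cross-validated recall (0.0087), so the difference is zero, far less than two standard errors (SE of the difference = 0.0123). This SE treats the two estimates as independent even though both models were evaluated on the same folds, so it is only a rough guide. Either way, neither model outperforms the other in detecting bankrupt firms: both are essentially equivalent in their failure to identify the minority class.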
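The standard error computed above combines the two models' errors as if they were independent. Because both models are scored on the same five folds, a paired comparison of per-fold differences is a common alternative. The sketch below uses made-up per-fold recall values (not the project's results). In the actual workflow the per-fold values could be retrieved with collect_metrics(cv_model1, summarize = FALSE).

```r
# Hypothetical per-fold recall values, for illustration only
recall_full_folds   <- c(0.04, 0.00, 0.00, 0.00, 0.00)
recall_simple_folds <- c(0.00, 0.04, 0.00, 0.00, 0.00)

# Paired differences: both models are evaluated on the same folds
fold_diff <- recall_full_folds - recall_simple_folds

mean_diff <- mean(fold_diff)
se_paired <- sd(fold_diff) / sqrt(length(fold_diff))

cat("Mean paired difference:", round(mean_diff, 4), "\n")
cat("Paired SE:             ", round(se_paired, 4), "\n")
```

With only five folds, either version of the standard error is imprecise, so the ratio is best read as a rough check rather than a formal test.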

Results

Summary of Model Performance

Both logistic regression models were trained on 80% of the data and evaluated on the remaining 20%. The key findings are:

Accuracy is misleading. Both models achieve ~95.5% test accuracy, which sounds impressive but falls below the 96.1% accuracy achievable by the trivial “always predict No” rule. Accuracy is not a valid performance criterion when classes are this imbalanced.

Recall is zero for both models. Neither model correctly identified a single bankrupt firm in the test set. Model 1 made one “Yes” prediction and was wrong (precision = 0). Model 2 never predicted bankruptcy at all, so its recall is 0 and its precision is undefined (no predicted events). This is the most important finding of the analysis.
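How precision and recall behave in these two cases follows directly from the confusion-matrix counts. The sketch below is a minimal base R illustration. The helper precision_recall() and the count n_bankrupt are hypothetical and are not taken from the analysis.

```r
# Precision and recall from confusion-matrix counts (event = "Yes")
precision_recall <- function(tp, fp, fn) {
  c(
    precision = if (tp + fp == 0) NA_real_ else tp / (tp + fp),
    recall    = if (tp + fn == 0) NA_real_ else tp / (tp + fn)
  )
}

# Hypothetical number of bankrupt firms in a test set
n_bankrupt <- 50

# Case 1: one "Yes" prediction, and it is wrong
m1 <- precision_recall(tp = 0, fp = 1, fn = n_bankrupt)
# Case 2: no "Yes" predictions at all
m2 <- precision_recall(tp = 0, fp = 0, fn = n_bankrupt)

print(m1)
print(m2)
```

Precision needs at least one predicted event to be defined, whereas recall only needs at least one actual event, which the test set has.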

Cross-validation confirms the pattern. Five-fold cross-validation — with the positive class correctly set to “Yes” (bankrupt) — produces near-zero recall in every fold, for both models. The problem is systemic, not a sampling artifact.

Model 2 is the preferred model — not because it achieves better predictive performance (both models fail equally at the 0.5 threshold), but because it produces statistically reliable, interpretable coefficients. Model 1 suffers from complete separation, making its coefficient estimates and standard errors meaningless for inference.

Answer to the Research Question

The original question asked: Can financial ratios predict whether a firm will go bankrupt one year in advance?

The answer is partially yes. Financial ratios are clearly associated with bankruptcy risk: descriptive analysis shows that bankrupt firms systematically differ from non-bankrupt firms on all five dimensions examined. In Model 2, leverage is a statistically significant predictor (p = 0.004), and its positive coefficient is consistent with financial distress theory — each unit increase in the debt-to-assets ratio multiplies the odds of bankruptcy by approximately 1.29 (exp(0.258) ≈ 1.29).

However, the selected financial ratios alone are not sufficient to reliably classify bankrupt firms in the test data. At the standard 0.5 decision threshold, both models miss every single actual bankruptcy. This does not mean financial ratios are uninformative — it means that five ratios in a simple logistic regression framework, applied to a severely imbalanced dataset without resampling or threshold adjustment, cannot deliver operationally useful predictions. The two problems are distinct: association (do financial ratios correlate with bankruptcy risk?) and classification (can a model correctly flag bankrupt firms?) require different tools and evaluation criteria.

Economic Interpretation

Coefficient Interpretation (Model 2)

Model 2 provides the only set of economically interpretable coefficients, because Model 1 suffers from complete separation.

Leverage (coefficient = 0.258, p = 0.004): Higher leverage, measured as total liabilities divided by total assets, significantly increases the log-odds of bankruptcy. In odds terms: exp(0.258) ≈ 1.29, meaning that a one-unit increase in the leverage ratio (for example, moving from 0.5 to 1.5, i.e., from liabilities equal to 50% of assets to liabilities equal to 150% of assets) is associated with odds of bankruptcy roughly 29% higher, holding profitability constant. This finding is consistent with the financial distress literature: firms with higher debt burdens face larger fixed interest payments, are more sensitive to revenue shocks, and have less financial slack to absorb adverse events.
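A one-unit change in leverage is large relative to typical values, so it can help to express the same coefficient for smaller changes. The sketch below uses only the coefficient reported above. The helper odds_multiplier() is a hypothetical name introduced for illustration.

```r
beta_leverage <- 0.258  # Model 2 leverage coefficient, as reported in the text

# Odds multiplier for a change of `delta` units in the leverage ratio
odds_multiplier <- function(delta, beta = beta_leverage) {
  exp(beta * delta)
}

or_one_unit   <- odds_multiplier(1)    # e.g. leverage 0.5 -> 1.5
or_tenth_unit <- odds_multiplier(0.1)  # e.g. leverage 0.5 -> 0.6

cat("Odds multiplier, +1.0 leverage:", round(or_one_unit,   2), "\n")
cat("Odds multiplier, +0.1 leverage:", round(or_tenth_unit, 3), "\n")
```

A 0.1 increase in leverage therefore corresponds to odds that are roughly 2.6% higher, holding profitability constant.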

Profitability (coefficient = −0.002, p = 0.853): Profitability is not statistically significant once leverage is controlled for. Two explanations are plausible. First, profitability and leverage are correlated: firms that lose money tend to accumulate debt, so leverage may absorb the profitability signal. Second, a single year of low profitability may not be sufficient to discriminate between transitory losses and structural financial decline at the one-year prediction horizon.

Policy and Business Implications

The finding that leverage is the dominant predictor has practical implications:

  • For creditors and banks: loan covenant frameworks that trigger review when a borrower’s leverage ratio exceeds a threshold are supported by this analysis. A leverage ratio approaching or exceeding 1.0 (liabilities ≥ assets) should be treated as a significant warning signal.
  • For regulators: monitoring aggregate leverage across sectors can serve as an early indicator of systemic financial stress, particularly in industries prone to cyclical downturns.
  • For investors: the disconnect between short-run profitability and bankruptcy risk suggests that earnings-based screens alone are insufficient for credit risk assessment. Balance sheet analysis — particularly the liability structure — provides incremental information.

Limitations and Reproducibility

Limitations

1. Severe class imbalance. Only ~3.9% of firms in the dataset experienced bankruptcy. At the standard 0.5 classification threshold, models trained on imbalanced data will overwhelmingly predict the majority class. Techniques such as SMOTE (Synthetic Minority Over-sampling Technique), undersampling, or a lowered decision threshold (e.g., 0.1 or 0.2) would likely produce non-zero recall. This analysis did not apply such corrections, which limits the practical usefulness of the classification output.

2. Restricted predictor set. Only five financial ratios were included. Bankruptcy is a multidimensional phenomenon influenced by firm size, industry sector, cash flow dynamics, management quality, and macroeconomic conditions. The dataset contains 64 financial attributes; using only five inevitably omits potentially useful information.

3. Listwise deletion of missing values. Observations with missing values in any variable were removed. While this simplifies the analysis, it reduces the effective sample size and may introduce selection bias if missing data are not random — for example, if struggling firms are systematically less likely to report certain financial figures.

4. Outliers not treated. The profitability variable has extreme values (range: −1.15 to 1.54), and some firms have leverage above 2.0, meaning liabilities are more than twice total assets (any value above 1.0 already implies negative book equity). No winsorization or trimming was applied. Outliers can disproportionately influence logistic regression coefficients.
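One possible treatment for the outlier issue is percentile winsorization. The sketch below is a minimal base R version applied to simulated values, not the project data. The function name winsorize() and the 1st/99th percentile cut-offs are assumptions chosen for illustration.

```r
# Clamp values to the [lower, upper] quantiles of the variable
winsorize <- function(x, lower = 0.01, upper = 0.99) {
  bounds <- quantile(x, probs = c(lower, upper), na.rm = TRUE, names = FALSE)
  pmin(pmax(x, bounds[1]), bounds[2])
}

set.seed(465)
# Simulated ratio with two extreme values appended (illustrative only)
x   <- c(rnorm(200, mean = 0.05, sd = 0.1), -1.15, 1.54)
x_w <- winsorize(x)

cat("Range before:", round(range(x),   2), "\n")
cat("Range after: ", round(range(x_w), 2), "\n")
```

To avoid leaking test-set information, the cut-offs would normally be estimated on the training data only and then applied to the test data.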

Reproducibility

The following steps were taken to ensure the analysis is fully reproducible:

  • All code is contained in a single Quarto document (.qmd), which integrates code, output, and narrative.
  • The dataset is loaded using a relative file path ("data/1year.arff"), so the project can be reproduced on any machine without path modification.
  • A fixed random seed (set.seed(465)) is applied before the train/test split and before cross-validation fold creation.
  • All packages are publicly available on CRAN. Package versions used in this analysis:
cat("R version:    ", R.version$major, ".", R.version$minor, "\n", sep = "")
R version:    4.5.1
cat("tidyverse:    ", as.character(packageVersion("tidyverse")),    "\n")
tidyverse:     2.0.0 
cat("tidymodels:   ", as.character(packageVersion("tidymodels")),   "\n")
tidymodels:    1.4.1 
cat("yardstick:    ", as.character(packageVersion("yardstick")),    "\n")
yardstick:     1.4.0 
cat("farff:        ", as.character(packageVersion("farff")),        "\n")
farff:         1.1.1 

AI Use Log

Tools Used

ChatGPT (OpenAI) was used.

Example Interaction

Exact prompt given to the AI:

“In R, when I use fit_resamples() with metric_set(accuracy, precision, recall) on a binary classification problem where the positive class is the second factor level, I get recall close to 1.0 even though my model never predicts the positive class. Why is this happening, and how do I fix it?”

How the output was used:

The AI explained that yardstick defaults to treating the first factor level as the positive (event) class in binary metrics. Since our factor levels are c("No", "Yes"), the default behavior evaluates recall for “No” (non-bankrupt), which is always near 1.0 because models almost never predict bankruptcy. The fix is to pass event_level = "second" to control_resamples().

Verification and modification:

The explanation was verified by reading the yardstick documentation for metric_set() and the tune documentation for control_resamples(), and by manually checking that changing event_level from the default to "second" changed the recall output from ~1.0 to near 0, confirming that the original output was measuring non-bankrupt recall. The fix was applied throughout the cross-validation section. The original incorrect output (recall ≈ 1.0) is documented in the cross-validation section with an explanation of why it was wrong.

Final Reflections

Improvement with More Time or Better Data

The most impactful improvement would be to address the class imbalance problem directly. Specifically:

  1. Lower the decision threshold from 0.5 to something closer to the empirical base rate (~0.04). A threshold of 0.10–0.15 would force the model to predict “Yes” more frequently, trading off precision for recall. In a bankruptcy prediction context, this trade-off is economically justified: the cost of a missed bankruptcy (false negative) is typically far higher than the cost of a false alarm (false positive).
  2. Apply SMOTE or undersampling to the training data to create a more balanced class distribution before model estimation.
  3. Use ensemble methods (Random Forest, Gradient Boosting) that capture non-linear relationships between financial ratios and bankruptcy risk and can be combined with class weights or resampling to address class imbalance.
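The first improvement, lowering the decision threshold, can be sketched in a few lines of base R. The probabilities and outcomes below are made up for illustration, and classify() and recall_at() are hypothetical helpers. In the actual workflow the probabilities would come from the .pred_Yes column returned by predict(..., type = "prob").

```r
# Turn predicted probabilities into class labels at a given threshold
classify <- function(prob_yes, threshold = 0.5) {
  factor(ifelse(prob_yes >= threshold, "Yes", "No"), levels = c("No", "Yes"))
}

# Hypothetical predicted probabilities and true outcomes
prob_yes <- c(0.02, 0.03, 0.04, 0.06, 0.12, 0.18, 0.11, 0.03)
truth    <- factor(c("No", "No", "No", "No", "Yes", "Yes", "No", "No"),
                   levels = c("No", "Yes"))

# Recall for the "Yes" class at a given threshold
recall_at <- function(threshold) {
  pred <- classify(prob_yes, threshold)
  sum(pred == "Yes" & truth == "Yes") / sum(truth == "Yes")
}

recall_050  <- recall_at(0.50)
recall_010  <- recall_at(0.10)
flagged_010 <- sum(classify(prob_yes, 0.10) == "Yes")

cat("Recall at 0.50:", recall_050, "\n")
cat("Recall at 0.10:", recall_010, "\n")
cat("Firms flagged at 0.10:", flagged_010, "\n")
```

In this toy example the lower threshold recovers both bankrupt firms at the cost of one false alarm, which is the precision-for-recall trade-off described in item 1.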

New Economic Question Inspired by This Analysis

This analysis raises the following question for future research:

Does incorporating macroeconomic variables — such as GDP growth, credit spreads, interest rates, and sector-level distress indicators — improve bankruptcy prediction beyond what firm-level financial ratios alone can achieve, and does this incremental gain vary across economic cycles?

This question matters because financial ratios are backward-looking (they reflect past performance), while macroeconomic conditions affect future cash flows and refinancing capacity. A model that combines balance-sheet signals with forward-looking economic indicators may provide more timely and accurate early warnings — particularly during periods of economic stress when firm-level leverage interacts with deteriorating credit conditions to sharply increase default probabilities.

Conclusion

This study investigated whether financial ratios can predict corporate bankruptcy one year in advance using the Polish Company Bankruptcy Dataset. The analysis combined descriptive statistics, probability analysis, and logistic regression modeling across two model specifications.

The key finding is that financial ratios, particularly leverage, are meaningfully associated with bankruptcy risk but are not sufficient for reliable classification under standard conditions. Leverage is a statistically significant predictor (p = 0.004): a firm with leverage of 2.00 faces a predicted bankruptcy probability of 5.31%, compared to 3.66% at leverage of 0.50, a relative increase of about 45%. However, at the standard 0.5 decision threshold, both models failed to identify a single bankrupt firm in the test set, reflecting the structural challenge posed by severe class imbalance (~3.9% bankruptcy rate).
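The predicted probabilities quoted above can be cross-checked against the leverage coefficient. The sketch below assumes profitability is held at the same value in both scenarios and uses the rounded coefficient 0.258, so small rounding differences from the reported 5.31% are expected.

```r
beta_leverage <- 0.258   # Model 2 leverage coefficient, as reported
p_low         <- 0.0366  # reported probability at leverage 0.50
p_high_report <- 0.0531  # reported probability at leverage 2.00

# Shift the log-odds by beta * (change in leverage), then convert back
p_high_check <- plogis(qlogis(p_low) + beta_leverage * (2.00 - 0.50))

# Relative increase implied by the two reported probabilities
rel_increase <- p_high_report / p_low - 1

cat("Implied probability at leverage 2.00:", round(p_high_check, 4), "\n")
cat("Relative increase (reported values): ", round(rel_increase, 3), "\n")
```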

These results suggest that while balance sheet analysis provides economically meaningful signals, practical bankruptcy prediction requires additional tools: threshold adjustment, resampling techniques such as SMOTE, or more complex ensemble models. Future research should explore whether incorporating macroeconomic variables alongside firm-level ratios improves early warning performance across different phases of the economic cycle.