1 Introduction

The aim of the InBillo project was to build an automated scoring technology that assesses the economic situation and cooperation risk of market participants, using general information available online and basic firm characteristics. The system was tested on tens of thousands of scoring attempts and marked around 10% of entities as untrusted.

1.1 Project goal

This report uses a reference sample of that dataset and applies association rule mining with the Apriori algorithm to find interpretable patterns related to customer trust. The sample is available on Poland’s Data Portal: https://dane.gov.pl/pl/dataset/3572,inbillo/resource/54309/table. The main goal of this project is to determine which combinations of non-score business attributes (e.g. age, size, legal form, VAT status, debtor indicator, subsidies, online footprint) are strongly associated with low customer trust.

1.2 Dataset

The sample dataset contains 1000 rows with firm attributes and scoring outputs such as:

Score components:

  • refer_scorerefer_score (main risk score),

  • refer_scorecustomer_trust (customer trust score),

  • refer_scoredevelopment_advance (technology advancement score),

  • refer_scoreorganization_maturity (organizational maturity score),

  • refer_scorepayment_morality (payment morality score).

Business attributes:

  • activity_firmy (firm active status),

  • company_dataemployment (employment category),

  • company_datalegal_datalegal_form (legal form),

  • company_dataestablishment_date (establishment date),

  • dluznicy_mr (debt amount),

  • dotacje_sudopsuma_swiadczen (subsidies),

  • whiteliststatusVat (status on the VAT white list),

  • wwwsocialmedia_list (social media presence list),

  • wwwtechnologies_list (used technology list).

1.3 Methodology

The Apriori algorithm works on sets of discrete items (transactions). The dataset, however, includes numeric scores and mixed formats, so each row has to be converted into a “basket” of categorical items.

The standard rule interest measures were used:

  • Support: frequency of the itemset/rule across all transactions (starting threshold: 0.05),

  • Confidence: conditional probability of the RHS given the LHS (starting threshold: 0.65),

  • Lift: confidence divided by the frequency of the RHS; lift > 1 means a positive association (starting threshold: 1.2).

These parameters can later be tuned to control the number and strength of rules.
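For reference, the three measures can be written out formally. The notation is an assumption introduced here: T is the set of transactions, and X and Y are disjoint itemsets forming the LHS and RHS of a rule.

```latex
\[
\operatorname{supp}(X) = \frac{\lvert \{\, t \in T : X \subseteq t \,\} \rvert}{\lvert T \rvert},
\qquad
\operatorname{conf}(X \Rightarrow Y) = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)},
\qquad
\operatorname{lift}(X \Rightarrow Y) = \frac{\operatorname{conf}(X \Rightarrow Y)}{\operatorname{supp}(Y)}
\]
```

With 1000 transactions, the starting support threshold of 0.05 corresponds to an itemset occurring in at least 50 transactions.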

2 Data loading and preprocessing

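library(readr)    # read_csv2()
library(stringr)  # str_trim(), str_replace_all(), ...
library(arules)   # transactions, apriori(), is.redundant()

DATA_PATH <- "inBillo.csv"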
df <- read_csv2(DATA_PATH, show_col_types = FALSE)

summary(df)
##      _id                date           refer_scorerefer_score
##  Length:1000        Length:1000        Min.   :  378         
##  Class :character   Class :character   1st Qu.:35298         
##  Mode  :character   Mode  :character   Median :38960         
##                                        Mean   :43743         
##                                        3rd Qu.:43225         
##                                        Max.   :99984         
##                                                              
##  refer_scorecustomer_trust refer_scoredevelopment_advance
##  Min.   :   72             Min.   :  100                 
##  1st Qu.:   72             1st Qu.:76495                 
##  Median :   72             Median :77504                 
##  Mean   :26396             Mean   :65904                 
##  3rd Qu.:57361             3rd Qu.:79931                 
##  Max.   :99999             Max.   :99999                 
##                                                          
##  refer_scoreorganization_maturity refer_scorepayment_morality
##  Min.   :    0                    Length:1000                
##  1st Qu.:  100                    Class :character           
##  Median :  100                    Mode  :character           
##  Mean   : 2868                                               
##  3rd Qu.:  100                                               
##  Max.   :99999                                               
##                                                              
##   wwwdomain         activity_firmy     company_dataemployment
##  Length:1000        Length:1000        Length:1000           
##  Class :character   Class :character   Class :character      
##  Mode  :character   Mode  :character   Mode  :character      
##                                                              
##                                                              
##                                                              
##                                                              
##  company_datalegal_datalegal_form company_dataestablishment_date
##  Length:1000                      Length:1000                   
##  Class :character                 Class :character              
##  Mode  :character                 Mode  :character              
##                                                                 
##                                                                 
##                                                                 
##                                                                 
##   dluznicy_mr        dotacje_sudopsuma_swiadczen whiteliststatusVat
##  Min.   :     1312   Length:1000                 Length:1000       
##  1st Qu.:   314098   Class :character            Class :character  
##  Median :  1182152   Mode  :character            Mode  :character  
##  Mean   :  8547942                                                 
##  3rd Qu.:  5340970                                                 
##  Max.   :469632239                                                 
##  NA's   :526                                                       
##  wwwsocialmedia_list wwwtechnologies_list
##  Length:1000         Length:1000         
##  Class :character    Class :character    
##  Mode  :character    Mode  :character    
##                                          
##                                          
##                                          
## 

The raw sample dataset contains heterogeneous data formats: dates and numeric values stored as strings, numeric scores, and list fields representing online presence. To enable association rule mining, dedicated parsers were implemented to standardize dates, numeric amounts and list fields into consistent formats. Continuous score variables were discretized using quantile-based binning to reduce noise and transform numerical indicators into categorical tokens suitable for transaction-based analysis.
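As an illustration of quantile-based binning, the following base-R sketch splits a score vector into terciles and turns each bin into an item token. The score values and the `score=` prefix are made up for the example, and the Low/Mid/High labels are assumed from the item names that appear later in the report.

```r
# Hypothetical scores (not taken from the dataset).
scores <- c(12, 35, 47, 58, 63, 71, 80, 88, 95)

# Tercile cut points: the 1/3 and 2/3 sample quantiles.
cuts <- quantile(scores, probs = c(1/3, 2/3), names = FALSE)

# Assign each value to a bin; intervals are right-closed by default.
bins <- cut(scores, breaks = c(-Inf, cuts, Inf),
            labels = c("Low", "Mid", "High"))

# Turn the bins into item tokens of the form "prefix=value".
tokens <- paste0("score=", bins)
print(table(bins))
print(tokens)
```

When many observations share the same value, a cut point can fall on that repeated value and the bins become unequal in size. This is presumably why refer_scorecustomer_trust=Low covers 610 of the 1000 firms later in the report.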

The categories were established as follows:

List-like features have also been extracted from the columns:

# Parsers
strip_ordinal <- function(s) stringr::str_replace_all(s, "(\\d+)(st|nd|rd|th)", "\\1")

parse_dt <- function(x){
  if (is.na(x) || stringr::str_trim(x) == "") return(as.POSIXct(NA))
  s <- strip_ordinal(as.character(x))
  s <- sub("\\.[0-9]+$", "", s)  # remove .000

  dt <- strptime(s, format = "%B %e %Y, %H:%M:%S", tz = "UTC")
  if (!is.na(dt[1])) return(as.POSIXct(dt))

  dt <- strptime(s, format = "%B %d %Y, %H:%M:%S", tz = "UTC")
  if (!is.na(dt[1])) return(as.POSIXct(dt))

  dt <- strptime(s, format = "%B %e %Y", tz = "UTC")
  if (!is.na(dt[1])) return(as.POSIXct(dt))

  dt <- strptime(s, format = "%B %d %Y", tz = "UTC")
  if (!is.na(dt[1])) return(as.POSIXct(dt))

  as.POSIXct(NA)
}

parse_amount <- function(x){
  if (is.na(x) || str_trim(x) == "") return(NA_real_)
  suppressWarnings(as.numeric(str_replace_all(as.character(x), ",", "")))
}

parse_list <- function(x){
  if (is.na(x)) return(character(0))
  s <- str_trim(as.character(x))
  if (s == "" || s == "[]" || tolower(s) %in% c("nan","none")) return(character(0))
  s <- str_remove_all(s, "\\[|\\]|\"|\'")
  parts <- str_trim(unlist(str_split(s, ",")))
  unique(parts[parts != ""])
}

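# Quantile binning
# Bin labels, matching the Low/Mid/High suffixes of the score items in the output
SCORE_LABELS <- c("Low", "Mid", "High")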
bin_quantile_3 <- function(v){
  v <- suppressWarnings(as.numeric(v))
  out <- rep(NA_character_, length(v))
  ok <- !is.na(v)
  if (!any(ok)) return(out)

  qs <- quantile(v[ok], probs = c(1/3, 2/3), na.rm = TRUE, names = FALSE)

  if (length(unique(qs)) < 2) {
    r <- rank(v[ok], ties.method = "average") / sum(ok)
    out[ok] <- as.character(cut(
      r,
      breaks = c(0, 1/3, 2/3, 1),
      labels = SCORE_LABELS,
      include.lowest = TRUE
    ))
    return(out)
  }

  out[ok] <- as.character(cut(
    v[ok],
    breaks = c(-Inf, qs[1], qs[2], Inf),
    labels = SCORE_LABELS,
    include.lowest = TRUE
  ))
  out
}

# Tokenizer
mk <- function(prefix, value){
  if (length(value) == 0) return(NA_character_)
  if (length(value) > 1) value <- value[[1]]
  value <- as.character(value)

  if (is.na(value) || str_trim(value) == "" || tolower(value) %in% c("nan","none")) {
    return(NA_character_)
  }
  paste0(prefix, "=", value)
}

# Lists
mk_list_items <- function(prefix, x){
  vals <- parse_list(x)
  if (length(vals) == 0) return(character(0))
  paste0(prefix, "=", vals)
}
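The effect of the list helpers can be shown with a small base-R sketch. It mirrors the same steps (strip brackets and quotes, split on commas, trim, deduplicate, prefix) without stringr. The raw string below is a hypothetical example of how a list field might be stored, not a value copied from the dataset.

```r
# Hypothetical raw value of a list-like column.
raw <- "['Facebook', 'LinkedIn', 'Facebook']"

# Strip brackets and both kinds of quotes.
clean <- gsub("\\[|\\]|\"|'", "", raw)

# Split on commas, trim whitespace, drop empty entries and duplicates.
parts <- trimws(strsplit(clean, ",", fixed = TRUE)[[1]])
parts <- unique(parts[parts != ""])

# Prefix each value so it becomes a distinct item in the basket.
items <- paste0("SM=", parts)
print(items)
```

Each list element becomes its own item, so a firm with several social media profiles contributes several SM= items to its transaction.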

3 Transaction building

After parsing and categorizing the selected features, each observation was transformed into a transaction consisting of categorical items representing firm characteristics. Each transaction aggregates activity status, firm size (employment category), legal form, VAT status, debtor and subsidy indicators, age category, score bins, and social media and technology presence. The resulting dataset contains 1000 transactions and 125 distinct items.

# Building transactions
tx_list <- lapply(seq_len(nrow(df)), function(i){
  row <- as.list(df[i, ])

  items <- c(
    mk("Activity", row$activity_firmy),
    mk("Size", row$company_dataemployment),
    mk("LegalForm", row$company_datalegal_datalegal_form),
    mk("VAT", row$whiteliststatusVat),

    mk("Debtor", row$Debtor),
    mk("Subsidy", row$Subsidy),
    mk("Age", row$Age),

    # bins as categorical
    unlist(lapply(score_cols, function(sc) mk(sc, row[[paste0(sc, "_bin")]]))),

    mk_list_items("SM", row$wwwsocialmedia_list),
    mk_list_items("TECH", row$wwwtechnologies_list)
  )

  items <- items[!is.na(items)]
  unique(items)
})

trans <- as(tx_list, "transactions")
summary(trans)
## transactions as itemMatrix in sparse format with
##  1000 rows (elements/itemsets/transactions) and
##  125 columns (items) and a density of 0.12132 
## 
## most frequent items:
##                     Activity=Aktywny refer_scoreorganization_maturity=Mid 
##                                 1000                                  953 
##                          Subsidy=Yes                           VAT=Czynny 
##                                  915                                  854 
##        refer_scorecustomer_trust=Low                              (Other) 
##                                  610                                10833 
## 
## element (itemset/transaction) length distribution:
## sizes
##  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  27 
##   3  18 173 164 137 113  94  81  70  59  46  14  16   6   2   3   1 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   10.00   13.00   15.00   15.16   17.00   27.00 
## 
## includes extended item information - examples:
##             labels
## 1 Activity=Aktywny
## 2         Age=0-3y
## 3       Age=10-20y

4 Exploratory Data Analysis - identifying frequent itemsets

Inspection shows that every company in the dataset is active, because the Activity=Aktywny item appears in all 1000 transactions. An item present in every transaction carries no information and would only produce trivial rules, so it is removed before mining.

Using a minimum support threshold of 5% and limiting the maximum itemset length to four, the Apriori algorithm identified 8147 frequent itemsets. The most frequent patterns are dominated by medium organizational maturity, receipt of subsidies, active VAT status, and low customer trust, reflecting common structural and financial characteristics across the analyzed firms.
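The idea behind dropping constant items and counting itemset support can be sketched in base R on a toy example. The baskets, item names and the 0.5 threshold are hypothetical, and the brute-force pair enumeration only illustrates what support means; it is not the level-wise search that Apriori performs.

```r
# Toy baskets (hypothetical items, not taken from the dataset).
baskets <- list(
  c("Activity=A", "VAT=Yes", "Size=Small"),
  c("Activity=A", "VAT=Yes", "Size=Large"),
  c("Activity=A", "VAT=No",  "Size=Small"),
  c("Activity=A", "VAT=Yes", "Size=Small")
)

# Logical incidence matrix: rows are baskets, columns are items.
items <- sort(unique(unlist(baskets)))
inc <- t(sapply(baskets, function(b) items %in% b))
colnames(inc) <- items

# Item frequency; drop items present in (almost) every basket.
freq <- colMeans(inc)
inc <- inc[, freq < 0.99, drop = FALSE]

# Support of every item pair, kept if it reaches the minimum support.
min_sup <- 0.5
pairs <- combn(colnames(inc), 2, simplify = FALSE)
pair_sup <- sapply(pairs, function(p) mean(inc[, p[1]] & inc[, p[2]]))
names(pair_sup) <- sapply(pairs, paste, collapse = " + ")
frequent_pairs <- pair_sup[pair_sup >= min_sup]
print(frequent_pairs)
```

Here Activity=A is dropped because it occurs in every basket, and only the pair Size=Small + VAT=Yes reaches the 0.5 threshold.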

# Remove constant items (mainly Activity=Aktywny)
trans <- trans[, itemFrequency(trans) < 0.99]
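# Thresholds from section 1.3; MAXLEN is the four-item limit stated above
MIN_SUP <- 0.05; MIN_CONF <- 0.65; MIN_LIFT <- 1.2; MAXLEN <- 4
# Identifying frequent itemsets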
freq_sets <- apriori(trans, parameter = list(supp = MIN_SUP, target = "frequent itemsets", maxlen = MAXLEN))
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##          NA    0.1    1 none FALSE            TRUE       5    0.05      1
##  maxlen            target  ext
##       4 frequent itemsets TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 50 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[124 item(s), 1000 transaction(s)] done [0.00s].
## sorting and recoding items ... [40 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4
##  done [0.01s].
## sorting transactions ... done [0.00s].
## writing ... [8147 set(s)] done [0.00s].
## creating S4 object  ... done [0.00s].
cat("Frequent itemsets:", length(freq_sets), "\n")
## Frequent itemsets: 8147
inspect(head(sort(freq_sets, by = "support", decreasing = TRUE), 15))
##      items                                   support count
## [1]  {refer_scoreorganization_maturity=Mid}    0.953   953
## [2]  {Subsidy=Yes}                             0.915   915
## [3]  {refer_scoreorganization_maturity=Mid,               
##       Subsidy=Yes}                             0.885   885
## [4]  {VAT=Czynny}                              0.854   854
## [5]  {refer_scoreorganization_maturity=Mid,               
##       VAT=Czynny}                              0.818   818
## [6]  {Subsidy=Yes,                                        
##       VAT=Czynny}                              0.810   810
## [7]  {refer_scoreorganization_maturity=Mid,               
##       Subsidy=Yes,                                        
##       VAT=Czynny}                              0.784   784
## [8]  {refer_scorecustomer_trust=Low}           0.610   610
## [9]  {refer_scorecustomer_trust=Low,                      
##       refer_scoreorganization_maturity=Mid}    0.573   573
## [10] {refer_scorecustomer_trust=Low,                      
##       Subsidy=Yes}                             0.553   553
## [11] {refer_scorecustomer_trust=Low,                      
##       refer_scoreorganization_maturity=Mid,               
##       Subsidy=Yes}                             0.532   532
## [12] {Debtor=Unknown}                          0.526   526
## [13] {TECH=Form}                               0.517   517
## [14] {refer_scorecustomer_trust=Low,                      
##       VAT=Czynny}                              0.514   514
## [15] {Debtor=Unknown,                                     
##       refer_scoreorganization_maturity=Mid}    0.501   501

5 Association rule mining

In the next step, association rules were generated using the Apriori algorithm with a minimum support of 5% and a minimum confidence of 65%, again restricting the maximum rule length to four items. Apriori returned 12647 rules at these thresholds; keeping only rules with a lift of at least 1.2 left 5873 rules. These were then filtered further by removing redundant rules in order to retain only the most informative and non-overlapping patterns.
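To make the three measures concrete, the sketch below recomputes them for the first rule in the listing that follows, {Debtor=Unknown, refer_scorerefer_score=High} => {refer_scorepayment_morality=High}. The counts for the rule and its LHS come from the count and coverage columns of that listing. The RHS count of 185 is not printed in the report; it is an assumption inferred from confidence / lift = 1 / 5.405405.

```r
n         <- 1000  # number of transactions
n_lhs_rhs <- 133   # transactions containing LHS and RHS (count column)
n_lhs     <- 133   # transactions containing the LHS (coverage 0.133 * n)
n_rhs     <- 185   # assumption: inferred from confidence / lift, not printed

support    <- n_lhs_rhs / n              # share of all transactions
confidence <- n_lhs_rhs / n_lhs          # share of LHS transactions
lift       <- confidence / (n_rhs / n)   # confidence relative to RHS frequency

# Check the rule against the starting thresholds from section 1.3.
passes <- support >= 0.05 && confidence >= 0.65 && lift >= 1.2

print(c(support = support, confidence = confidence, lift = lift))
print(passes)
```

Under that assumption, a lift of about 5.4 means that high payment morality is roughly 5.4 times more frequent among firms matching the LHS than in the sample as a whole.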

# Rule mining - Apriori
rules <- apriori(trans, parameter = list(supp = MIN_SUP, conf = MIN_CONF, maxlen = MAXLEN, target = "rules"))
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##        0.65    0.1    1 none FALSE            TRUE       5    0.05      1
##  maxlen target  ext
##       4  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 50 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[124 item(s), 1000 transaction(s)] done [0.00s].
## sorting and recoding items ... [40 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4
##  done [0.01s].
## writing ... [12647 rule(s)] done [0.00s].
## creating S4 object  ... done [0.01s].
rules <- subset(rules, lift >= MIN_LIFT)
rules <- sort(rules, by = "lift", decreasing = TRUE)

cat("Rules:", length(rules), "\n")
## Rules: 5873
inspect(head(rules, 20))
##      lhs                                                     rhs                                support confidence coverage     lift count
## [1]  {Debtor=Unknown,                                                                                                                     
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.133  1.0000000    0.133 5.405405   133
## [2]  {Age=20y+,                                                                                                                           
##       Debtor=Unknown,                                                                                                                     
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.078  1.0000000    0.078 5.405405    78
## [3]  {Debtor=Unknown,                                                                                                                     
##       refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.118  1.0000000    0.118 5.405405   118
## [4]  {LegalForm=SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ,                                                                                  
##       refer_scoredevelopment_advance=High,                                                                                                
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.050  1.0000000    0.050 5.405405    50
## [5]  {Debtor=Unknown,                                                                                                                     
##       refer_scoredevelopment_advance=High,                                                                                                
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.086  1.0000000    0.086 5.405405    86
## [6]  {Debtor=Unknown,                                                                                                                     
##       LegalForm=SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ,                                                                                  
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.073  1.0000000    0.073 5.405405    73
## [7]  {Age=10-20y,                                                                                                                         
##       Debtor=Unknown,                                                                                                                     
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.052  1.0000000    0.052 5.405405    52
## [8]  {Debtor=Unknown,                                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       TECH=Google Analytics}                              => {refer_scorepayment_morality=High}   0.067  1.0000000    0.067 5.405405    67
## [9]  {Debtor=Unknown,                                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       TECH=Google Tag Manager}                            => {refer_scorepayment_morality=High}   0.076  1.0000000    0.076 5.405405    76
## [10] {Debtor=Unknown,                                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       SM=Facebook}                                        => {refer_scorepayment_morality=High}   0.082  1.0000000    0.082 5.405405    82
## [11] {Debtor=Unknown,                                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       TECH=Form}                                          => {refer_scorepayment_morality=High}   0.073  1.0000000    0.073 5.405405    73
## [12] {Debtor=Unknown,                                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       VAT=Czynny}                                         => {refer_scorepayment_morality=High}   0.126  1.0000000    0.126 5.405405   126
## [13] {Debtor=Unknown,                                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       Subsidy=Yes}                                        => {refer_scorepayment_morality=High}   0.129  1.0000000    0.129 5.405405   129
## [14] {Debtor=Unknown,                                                                                                                     
##       refer_scoreorganization_maturity=Mid,                                                                                               
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.132  1.0000000    0.132 5.405405   132
## [15] {Age=20y+,                                                                                                                           
##       Debtor=Unknown,                                                                                                                     
##       refer_scorecustomer_trust=High}                     => {refer_scorepayment_morality=High}   0.083  0.9880952    0.084 5.341055    83
## [16] {LegalForm=SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ,                                                                                  
##       refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.065  0.9701493    0.067 5.244050    65
## [17] {refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       TECH=Google Analytics}                              => {refer_scorepayment_morality=High}   0.060  0.9677419    0.062 5.231037    60
## [18] {refer_scoredevelopment_advance=High,                                                                                                
##       refer_scoreorganization_maturity=Mid,                                                                                               
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.090  0.9677419    0.093 5.231037    90
## [19] {Age=20y+,                                                                                                                           
##       refer_scoredevelopment_advance=High,                                                                                                
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.059  0.9672131    0.061 5.228179    59
## [20] {refer_scorecustomer_trust=High,                                                                                                     
##       refer_scoredevelopment_advance=High,                                                                                                
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.080  0.9638554    0.083 5.210029    80

Removing redundant rules dropped 2908 of the 5873 rules, indicating high overlap between patterns within the dataset. The strongest rules predominantly link combinations of firm characteristics and high overall scores with high payment morality, which suggests a consistent relationship between organizational stability and financial reliability.
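By default, is.redundant() in arules flags a rule when a more general rule (same RHS, LHS a proper subset) has the same or higher confidence. The base-R sketch below illustrates that criterion on four rules from the listing above, all with RHS refer_scorepayment_morality=High. It is a simplified illustration, not the library's implementation, and it only compares the four rules given rather than the full rule set.

```r
# LHS itemsets and confidences of four rules taken from the listing above.
lhs <- list(
  c("Debtor=Unknown", "refer_scorerefer_score=High"),
  c("Age=20y+", "Debtor=Unknown", "refer_scorerefer_score=High"),
  c("refer_scorecustomer_trust=High", "refer_scorerefer_score=High"),
  c("refer_scorecustomer_trust=High", "refer_scorerefer_score=High", "VAT=Czynny")
)
conf <- c(1.0000000, 1.0000000, 0.9318182, 0.9508197)

# Rule i is redundant if some rule j has a strictly smaller LHS contained
# in the LHS of rule i and at least the same confidence.
is_redundant <- sapply(seq_along(lhs), function(i) {
  any(sapply(seq_along(lhs), function(j) {
    j != i &&
      length(lhs[[j]]) < length(lhs[[i]]) &&
      all(lhs[[j]] %in% lhs[[i]]) &&
      conf[j] >= conf[i]
  }))
})
print(is_redundant)
```

The second rule is flagged because adding Age=20y+ does not improve on the confidence of the first rule. The fourth rule is kept because adding VAT=Czynny raises confidence above that of the third. This matches which of these rules survive in the non-redundant listing.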

# Removing the most redundant rules
rules_nr <- rules[!is.redundant(rules)]
cat("Non-redundant rules:", length(rules_nr), "\n")
## Non-redundant rules: 2965
inspect(head(rules_nr, 20))
##      lhs                                                     rhs                                support confidence coverage     lift count
## [1]  {Debtor=Unknown,                                                                                                                     
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.133  1.0000000    0.133 5.405405   133
## [2]  {LegalForm=SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ,                                                                                  
##       refer_scoredevelopment_advance=High,                                                                                                
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.050  1.0000000    0.050 5.405405    50
## [3]  {Age=20y+,                                                                                                                           
##       Debtor=Unknown,                                                                                                                     
##       refer_scorecustomer_trust=High}                     => {refer_scorepayment_morality=High}   0.083  0.9880952    0.084 5.341055    83
## [4]  {LegalForm=SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ,                                                                                  
##       refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.065  0.9701493    0.067 5.244050    65
## [5]  {refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       TECH=Google Analytics}                              => {refer_scorepayment_morality=High}   0.060  0.9677419    0.062 5.231037    60
## [6]  {refer_scoredevelopment_advance=High,                                                                                                
##       refer_scoreorganization_maturity=Mid,                                                                                               
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.090  0.9677419    0.093 5.231037    90
## [7]  {Age=20y+,                                                                                                                           
##       refer_scoredevelopment_advance=High,                                                                                                
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.059  0.9672131    0.061 5.228179    59
## [8]  {refer_scorecustomer_trust=High,                                                                                                     
##       refer_scoredevelopment_advance=High,                                                                                                
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.080  0.9638554    0.083 5.210029    80
## [9]  {Age=20y+,                                                                                                                           
##       refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.073  0.9605263    0.076 5.192034    73
## [10] {refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       SM=Facebook}                                        => {refer_scorepayment_morality=High}   0.072  0.9600000    0.075 5.189189    72
## [11] {refer_scoredevelopment_advance=High,                                                                                                
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.090  0.9574468    0.094 5.175388    90
## [12] {refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       VAT=Czynny}                                         => {refer_scorepayment_morality=High}   0.116  0.9508197    0.122 5.139566   116
## [13] {refer_scorerefer_score=High,                                                                                                        
##       SM=Facebook,                                                                                                                        
##       TECH=Google Tag Manager}                            => {refer_scorepayment_morality=High}   0.057  0.9500000    0.060 5.135135    57
## [14] {refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       TECH=Google Tag Manager}                            => {refer_scorepayment_morality=High}   0.068  0.9444444    0.072 5.105105    68
## [15] {Age=20y+,                                                                                                                           
##       refer_scorerefer_score=High,                                                                                                        
##       TECH=Google Tag Manager}                            => {refer_scorepayment_morality=High}   0.050  0.9433962    0.053 5.099439    50
## [16] {refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High,                                                                                                        
##       Subsidy=Yes}                                        => {refer_scorepayment_morality=High}   0.119  0.9370079    0.127 5.064907   119
## [17] {refer_scorecustomer_trust=High,                                                                                                     
##       refer_scorerefer_score=High}                        => {refer_scorepayment_morality=High}   0.123  0.9318182    0.132 5.036855   123
## [18] {refer_scoreorganization_maturity=Mid,                                                                                               
##       refer_scorerefer_score=High,                                                                                                        
##       TECH=Google Tag Manager}                            => {refer_scorepayment_morality=High}   0.076  0.9268293    0.082 5.009888    76
## [19] {refer_scorerefer_score=High,                                                                                                        
##       SM=Facebook,                                                                                                                        
##       TECH=Google Analytics}                              => {refer_scorepayment_morality=High}   0.051  0.9107143    0.056 4.922780    51
## [20] {refer_scoreorganization_maturity=Mid,                                                                                               
##       refer_scorerefer_score=High,                                                                                                        
##       TECH=Google Analytics}                              => {refer_scorepayment_morality=High}   0.068  0.9066667    0.075 4.900901    68

However, this process still produced a large number of rules (2965 non-redundant rules), many of which involved relationships between different scoring components. To align the analysis with the research objective and to improve interpretability, the rule collection was narrowed down using domain-driven constraints.

Firstly, the analysis was restricted to rules predicting low customer trust (refer_scorecustomer_trust = Low) and only rules with a single-item right-hand side were kept. This step reduced the rule set to 170 rules, ensuring that each rule directly addresses the outcome of interest.

# Keeping only rules with single-item RHS, that indicate low customer trust
rules_nr_1rhs <- rules_nr[size(rhs(rules_nr)) == 1]
rules_trust_low <- subset(rules_nr_1rhs, rhs %pin% "refer_scorecustomer_trust=Low")
rules_trust_low <- sort(rules_trust_low, by = "lift", decreasing = TRUE)

cat("\nRules with RHS = refer_scorecustomer_trust=Low :", length(rules_trust_low), "\n")
## 
## Rules with RHS = refer_scorecustomer_trust=Low : 170
inspect(head(rules_trust_low, 10))
##      lhs                                                               rhs                             support confidence coverage     lift count
## [1]  {refer_scoredevelopment_advance=Mid,                                                                                                        
##       refer_scorepayment_morality=Mid,                                                                                                           
##       refer_scorerefer_score=Mid}                                   => {refer_scorecustomer_trust=Low}   0.087  0.9886364    0.088 1.620715    87
## [2]  {Debtor=Unknown,                                                                                                                            
##       refer_scorepayment_morality=Mid,                                                                                                           
##       refer_scorerefer_score=Mid}                                   => {refer_scorecustomer_trust=Low}   0.156  0.9873418    0.158 1.618593   156
## [3]  {Age=20y+,                                                                                                                                  
##       Debtor=Unknown,                                                                                                                            
##       refer_scorepayment_morality=Mid}                              => {refer_scorecustomer_trust=Low}   0.059  0.9833333    0.060 1.612022    59
## [4]  {Debtor=Unknown,                                                                                                                            
##       refer_scorerefer_score=Mid,                                                                                                                
##       SM=Facebook}                                                  => {refer_scorecustomer_trust=Low}   0.107  0.9816514    0.109 1.609265   107
## [5]  {Debtor=Unknown,                                                                                                                            
##       refer_scoredevelopment_advance=Mid,                                                                                                        
##       refer_scorerefer_score=Mid}                                   => {refer_scorecustomer_trust=Low}   0.091  0.9784946    0.093 1.604090    91
## [6]  {Debtor=Unknown,                                                                                                                            
##       refer_scorerefer_score=Mid,                                                                                                                
##       Size=1-10}                                                    => {refer_scorecustomer_trust=Low}   0.128  0.9696970    0.132 1.589667   128
## [7]  {Debtor=Unknown,                                                                                                                            
##       refer_scorerefer_score=Mid,                                                                                                                
##       TECH=Form}                                                    => {refer_scorecustomer_trust=Low}   0.121  0.9680000    0.125 1.586885   121
## [8]  {refer_scoredevelopment_advance=Low,                                                                                                        
##       refer_scorepayment_morality=Low,                                                                                                           
##       refer_scorerefer_score=High}                                  => {refer_scorecustomer_trust=Low}   0.150  0.9677419    0.155 1.586462   150
## [9]  {Debtor=Unknown,                                                                                                                            
##       refer_scorerefer_score=Mid,                                                                                                                
##       TECH=Google Tag Manager}                                      => {refer_scorecustomer_trust=Low}   0.110  0.9649123    0.114 1.581823   110
## [10] {LegalForm=OSOBA FIZYCZNA PROWADZĄCA DZIAŁALNOŚĆ GOSPODARCZĄ,                                                                               
##       refer_scorepayment_morality=Mid,                                                                                                           
##       refer_scorerefer_score=Mid}                                   => {refer_scorecustomer_trust=Low}   0.137  0.9647887    0.142 1.581621   137

Secondly, all rules containing score-based variables on the left-hand side were excluded to avoid uninformative “score results in score” relationships. This resulted in a concise set of 29 rules that link non-score business attributes, such as firm age, size, legal form, debtor status, VAT status and online footprint, to low customer trust.

# Exclude score-items from LHS to avoid "score --> score"
is_score_item <- function(x) grepl("^refer_score", x)

lhs_list <- LIST(lhs(rules_trust_low), decode = TRUE)
lhs_has_score <- vapply(lhs_list, function(items) any(is_score_item(items)), logical(1))

rules_trust_low_noscorelhs <- rules_trust_low[!lhs_has_score]
rules_trust_low_noscorelhs <- sort(rules_trust_low_noscorelhs, by = "lift", decreasing = TRUE)

cat("\nRules (trust=Low) with no score-items on LHS:", length(rules_trust_low_noscorelhs), "\n")
## 
## Rules (trust=Low) with no score-items on LHS: 29
inspect(head(rules_trust_low_noscorelhs, 20))
##      lhs                                                               rhs                             support confidence coverage     lift count
## [1]  {Age=3-10y,                                                                                                                                 
##       Debtor=Unknown,                                                                                                                            
##       Size=1-10}                                                    => {refer_scorecustomer_trust=Low}   0.069  0.9452055    0.073 1.549517    69
## [2]  {Age=3-10y,                                                                                                                                 
##       Debtor=Unknown,                                                                                                                            
##       TECH=Google Analytics}                                        => {refer_scorecustomer_trust=Low}   0.055  0.9016393    0.061 1.478097    55
## [3]  {Age=3-10y,                                                                                                                                 
##       Debtor=Unknown,                                                                                                                            
##       SM=Facebook}                                                  => {refer_scorecustomer_trust=Low}   0.066  0.8800000    0.075 1.442623    66
## [4]  {Age=3-10y,                                                                                                                                 
##       Size=1-10,                                                                                                                                 
##       SM=Facebook}                                                  => {refer_scorecustomer_trust=Low}   0.060  0.8695652    0.069 1.425517    60
## [5]  {Age=3-10y,                                                                                                                                 
##       Debtor=Unknown,                                                                                                                            
##       TECH=Google Tag Manager}                                      => {refer_scorecustomer_trust=Low}   0.066  0.8684211    0.076 1.423641    66
## [6]  {Age=3-10y,                                                                                                                                 
##       Size=1-10,                                                                                                                                 
##       VAT=Czynny}                                                   => {refer_scorecustomer_trust=Low}   0.093  0.8611111    0.108 1.411658    93
## [7]  {Debtor=Unknown,                                                                                                                            
##       Size=1-10,                                                                                                                                 
##       TECH=Form}                                                    => {refer_scorecustomer_trust=Low}   0.128  0.8533333    0.150 1.398907   128
## [8]  {Debtor=Unknown,                                                                                                                            
##       LegalForm=OSOBA FIZYCZNA PROWADZĄCA DZIAŁALNOŚĆ GOSPODARCZĄ,                                                                               
##       Size=1-10}                                                    => {refer_scorecustomer_trust=Low}   0.147  0.8497110    0.173 1.392969   147
## [9]  {Age=3-10y,                                                                                                                                 
##       Size=1-10,                                                                                                                                 
##       TECH=Form}                                                    => {refer_scorecustomer_trust=Low}   0.067  0.8481013    0.079 1.390330    67
## [10] {Age=3-10y,                                                                                                                                 
##       Debtor=Unknown,                                                                                                                            
##       TECH=Form}                                                    => {refer_scorecustomer_trust=Low}   0.070  0.8433735    0.083 1.382579    70
## [11] {Age=3-10y,                                                                                                                                 
##       Debtor=Unknown}                                               => {refer_scorecustomer_trust=Low}   0.113  0.8432836    0.134 1.382432   113
## [12] {Debtor=Unknown,                                                                                                                            
##       Size=1-10,                                                                                                                                 
##       VAT=Czynny}                                                   => {refer_scorecustomer_trust=Low}   0.186  0.8340807    0.223 1.367345   186
## [13] {Age=3-10y,                                                                                                                                 
##       Size=1-10}                                                    => {refer_scorecustomer_trust=Low}   0.110  0.8333333    0.132 1.366120   110
## [14] {Debtor=Unknown,                                                                                                                            
##       Size=1-10,                                                                                                                                 
##       TECH=Google Tag Manager}                                      => {refer_scorecustomer_trust=Low}   0.109  0.8257576    0.132 1.353701   109
## [15] {Age=3-10y,                                                                                                                                 
##       LegalForm=OSOBA FIZYCZNA PROWADZĄCA DZIAŁALNOŚĆ GOSPODARCZĄ,                                                                               
##       VAT=Czynny}                                                   => {refer_scorecustomer_trust=Low}   0.052  0.8253968    0.063 1.353110    52
## [16] {Age=3-10y,                                                                                                                                 
##       LegalForm=OSOBA FIZYCZNA PROWADZĄCA DZIAŁALNOŚĆ GOSPODARCZĄ}  => {refer_scorecustomer_trust=Low}   0.056  0.8235294    0.068 1.350048    56
## [17] {Debtor=Unknown,                                                                                                                            
##       Size=1-10,                                                                                                                                 
##       SM=Facebook}                                                  => {refer_scorecustomer_trust=Low}   0.107  0.8230769    0.130 1.349306   107
## [18] {Debtor=Unknown,                                                                                                                            
##       Size=1-10}                                                    => {refer_scorecustomer_trust=Low}   0.196  0.8200837    0.239 1.344399   196
## [19] {Debtor=Unknown,                                                                                                                            
##       LegalForm=OSOBA FIZYCZNA PROWADZĄCA DZIAŁALNOŚĆ GOSPODARCZĄ,                                                                               
##       VAT=Czynny}                                                   => {refer_scorecustomer_trust=Low}   0.174  0.8093023    0.215 1.326725   174
## [20] {Debtor=Unknown,                                                                                                                            
##       LegalForm=OSOBA FIZYCZNA PROWADZĄCA DZIAŁALNOŚĆ GOSPODARCZĄ,                                                                               
##       TECH=Google Tag Manager}                                      => {refer_scorecustomer_trust=Low}   0.104  0.8062016    0.129 1.321642   104
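The columns of the table above are linked by the standard definitions confidence = support / coverage and lift = confidence / support(RHS). As a minimal sketch, the base R snippet below recomputes these quantities for rule [1], using only values copied from the output and the sample size of 1000 rows. The variable names are illustrative and not part of the original analysis.

```r
# Values copied from rule [1] of the table above:
# {Age=3-10y, Debtor=Unknown, Size=1-10} => {refer_scorecustomer_trust=Low}
n_rows     <- 1000      # sample size stated in the dataset description
rule_count <- 69        # rows that match both LHS and RHS
coverage   <- 0.073     # share of rows that match the LHS
lift       <- 1.549517  # lift reported by inspect()

support    <- rule_count / n_rows   # share of rows matching LHS and RHS
confidence <- support / coverage    # P(RHS | LHS)

# Lift compares confidence with the overall share of low-trust rows,
# so that share can be backed out from the reported lift.
rhs_support <- confidence / lift

cat(sprintf("support = %.3f, confidence = %.4f\n", support, confidence))
cat(sprintf("implied share of low-trust rows = %.3f\n", rhs_support))
```

Under these definitions, the implied share of low-trust rows in the sample is about 0.61. A lift of roughly 1.55 therefore corresponds to a confidence of about 0.95 against a baseline of about 0.61.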

The strongest rules indicate that relatively young firms (3–10 years of operation) with unknown debtor status are particularly associated with low customer trust, especially when they also have a small number of employees (1-10); the top rule reaches a confidence of about 0.95 and a lift of about 1.55. The frequent occurrence of the Unknown value of debtor status in the final rules could simply reflect missing information, but it could also indicate limited transparency regarding the debt burden. These patterns suggest that, in young and small firms, a lack of verifiable information about financial obligations tends to coincide with lower trustworthiness.
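To check how often each attribute appears, the left-hand sides of the 20 rules printed above can be tabulated. The sketch below transcribes those item sets by hand into a plain list (the short variable names are illustrative) and counts item occurrences with base R only.

```r
# Item labels as they appear in the printed rules
age  <- "Age=3-10y"
deb  <- "Debtor=Unknown"
size <- "Size=1-10"
vat  <- "VAT=Czynny"
fb   <- "SM=Facebook"
frm  <- "TECH=Form"
ga   <- "TECH=Google Analytics"
gtm  <- "TECH=Google Tag Manager"
jdg  <- "LegalForm=OSOBA FIZYCZNA PROWADZĄCA DZIAŁALNOŚĆ GOSPODARCZĄ"

# LHS of rules [1]..[20], transcribed in the order shown in the output
lhs_top20 <- list(
  c(age, deb, size), c(age, deb, ga),   c(age, deb, fb),   c(age, size, fb),
  c(age, deb, gtm),  c(age, size, vat), c(deb, size, frm), c(deb, jdg, size),
  c(age, size, frm), c(age, deb, frm),  c(age, deb),       c(deb, size, vat),
  c(age, size),      c(deb, size, gtm), c(age, jdg, vat),  c(age, jdg),
  c(deb, size, fb),  c(deb, size),      c(deb, jdg, vat),  c(deb, jdg, gtm)
)

# Number of rules (out of 20) whose LHS contains each item
item_counts <- sort(table(unlist(lhs_top20)), decreasing = TRUE)
print(item_counts)
```

In this transcription, Debtor=Unknown appears in 14 of the 20 rules, Age=3-10y in 12 and Size=1-10 in 11, while each online-footprint item appears in at most 3. With the actual rule object, `itemFrequency(lhs(rules_trust_low_noscorelhs), type = "absolute")` from arules should give the corresponding counts over all 29 rules.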

Additionally, several rules highlight the role of online presence: firms whose detected online footprint includes basic digital tools, such as Facebook or simple web forms (TECH=Form), show an increased likelihood of low customer trust when combined with the attributes above. This may reflect limited reputation or lower levels of digital maturity, which could affect how potential customers perceive reliability.

6 Conclusions

Overall, the results suggest that low customer trust is not driven by a single factor but by specific combinations of structural and behavioral attributes. This supports the idea that trust-related risk emerges from the interaction of firm age, number of employees, financial transparency and digital footprint rather than from isolated characteristics.