Analytical Minds Project

Author

Favour Adekunle and Thanh Nguyen

QR CODE

#install.packages("qrcode")
library(qrcode)

qr <- qr_code("https://rpubs.com/Favour_Adekunle/1426667")

plot(qr)

PROJECT TITLE:

Identifying the Key Drivers of Credit Card Debt and Segmenting Customers by Financial Behavior

Group Name:

Analytical Minds

Group Members:

  • Favour Adekunle

  • Thanh Nguyen

Introduction

Credit cards have become one of the most widely used financial tools for everyday transactions, online purchases, emergency spending, and short-term access to credit. They provide convenience and financial flexibility for customers, while also serving as an important revenue source for banks and financial institutions. However, improper credit card usage can lead to increasing debt balances, missed payments, and financial stress for customers.

Understanding the factors that contribute to credit card debt is important for both consumers and financial institutions. Customers can benefit from insights into spending and repayment behavior, while banks can use data-driven findings to improve customer risk assessment, product design, and financial support strategies.

This project focuses on analyzing customer credit card behavior using the Credit Card Customers Dataset (CC GENERAL) from Kaggle. The dataset contains information on customer balances, purchases, cash advances, credit limits, payments, and usage frequency. By examining these financial behaviors, the project aims to identify the key drivers of credit card debt and group customers into meaningful segments based on their financial patterns.

The project will apply data wrangling, visualization, regression modeling, clustering, and interactive reporting in R using methods covered in class. The results of this project can provide practical recommendations for improving credit management, reducing excessive debt, and understanding different customer financial profiles.

Project Objective

The main objective of this project is to determine which customer financial behaviors are most associated with higher credit card balances and to classify customers into groups based on their spending, borrowing, and repayment patterns.

Data Description

This project uses the Credit Card Customers Dataset (CC GENERAL) obtained from Kaggle. The dataset contains information on customer credit card usage behavior and financial activity. It includes variables related to balances, purchases, payments, cash advances, credit limits, and transaction frequency. The dataset is appropriate for this project because it provides customer-level financial data that can be used to analyze the main factors associated with credit card debt and spending behavior. Each row in the dataset represents one credit card customer and the dataset contains 8,950 observations (customers).

Key Variables

Variable                  Description
CUST_ID                   Unique customer identification number
BALANCE                   Current outstanding balance on the credit card
PURCHASES                 Total amount of purchases made
ONEOFF_PURCHASES          Amount spent on one-time purchases
INSTALLMENTS_PURCHASES    Amount spent through installment purchases
CASH_ADVANCE              Cash withdrawn using the credit card
CREDIT_LIMIT              Maximum available credit limit
PAYMENTS                  Total payments made by the customer
MINIMUM_PAYMENTS          Minimum required payment
PURCHASES_FREQUENCY       Frequency of purchases
CASH_ADVANCE_FREQUENCY    Frequency of cash advance usage
PRC_FULL_PAYMENT          Percentage of full payments made
TENURE                    Number of months as a customer

Target Variable

The target (dependent) variable for the regression analysis is BALANCE. This is used as a measure of customer credit card debt.

Variables for Clustering

To group customers into similar financial behavior segments, the clustering analysis will use selected independent variables such as:

  • PURCHASES

  • CASH_ADVANCE

  • PAYMENTS

  • CREDIT_LIMIT

  • PURCHASES_FREQUENCY

  • TENURE

(BALANCE will be excluded from clustering as required in the project checklist.)

Importing the Data

#install.packages("tidyverse")
library(tidyverse)

# Import the dataset
cc <- read.csv("CC GENERAL.csv")

# Inspecting the dataset
str(cc)
'data.frame':   8950 obs. of  18 variables:
 $ CUST_ID                         : chr  "C10001" "C10002" "C10003" "C10004" ...
 $ BALANCE                         : num  40.9 3202.5 2495.1 1666.7 817.7 ...
 $ BALANCE_FREQUENCY               : num  0.818 0.909 1 0.636 1 ...
 $ PURCHASES                       : num  95.4 0 773.2 1499 16 ...
 $ ONEOFF_PURCHASES                : num  0 0 773 1499 16 ...
 $ INSTALLMENTS_PURCHASES          : num  95.4 0 0 0 0 ...
 $ CASH_ADVANCE                    : num  0 6443 0 206 0 ...
 $ PURCHASES_FREQUENCY             : num  0.1667 0 1 0.0833 0.0833 ...
 $ ONEOFF_PURCHASES_FREQUENCY      : num  0 0 1 0.0833 0.0833 ...
 $ PURCHASES_INSTALLMENTS_FREQUENCY: num  0.0833 0 0 0 0 ...
 $ CASH_ADVANCE_FREQUENCY          : num  0 0.25 0 0.0833 0 ...
 $ CASH_ADVANCE_TRX                : int  0 4 0 1 0 0 0 0 0 0 ...
 $ PURCHASES_TRX                   : int  2 0 12 1 1 8 64 12 5 3 ...
 $ CREDIT_LIMIT                    : num  1000 7000 7500 7500 1200 1800 13500 2300 7000 11000 ...
 $ PAYMENTS                        : num  202 4103 622 0 678 ...
 $ MINIMUM_PAYMENTS                : num  140 1072 627 NA 245 ...
 $ PRC_FULL_PAYMENT                : num  0 0.222 0 0 0 ...
 $ TENURE                          : int  12 12 12 12 12 12 12 12 12 12 ...
summary(cc)
   CUST_ID             BALANCE        BALANCE_FREQUENCY   PURCHASES       
 Length:8950        Min.   :    0.0   Min.   :0.0000    Min.   :    0.00  
 Class :character   1st Qu.:  128.3   1st Qu.:0.8889    1st Qu.:   39.63  
 Mode  :character   Median :  873.4   Median :1.0000    Median :  361.28  
                    Mean   : 1564.5   Mean   :0.8773    Mean   : 1003.21  
                    3rd Qu.: 2054.1   3rd Qu.:1.0000    3rd Qu.: 1110.13  
                    Max.   :19043.1   Max.   :1.0000    Max.   :49039.57  
                                                                          
 ONEOFF_PURCHASES  INSTALLMENTS_PURCHASES  CASH_ADVANCE     PURCHASES_FREQUENCY
 Min.   :    0.0   Min.   :    0.0        Min.   :    0.0   Min.   :0.00000    
 1st Qu.:    0.0   1st Qu.:    0.0        1st Qu.:    0.0   1st Qu.:0.08333    
 Median :   38.0   Median :   89.0        Median :    0.0   Median :0.50000    
 Mean   :  592.4   Mean   :  411.1        Mean   :  978.9   Mean   :0.49035    
 3rd Qu.:  577.4   3rd Qu.:  468.6        3rd Qu.: 1113.8   3rd Qu.:0.91667    
 Max.   :40761.2   Max.   :22500.0        Max.   :47137.2   Max.   :1.00000    
                                                                               
 ONEOFF_PURCHASES_FREQUENCY PURCHASES_INSTALLMENTS_FREQUENCY
 Min.   :0.00000            Min.   :0.0000                  
 1st Qu.:0.00000            1st Qu.:0.0000                  
 Median :0.08333            Median :0.1667                  
 Mean   :0.20246            Mean   :0.3644                  
 3rd Qu.:0.30000            3rd Qu.:0.7500                  
 Max.   :1.00000            Max.   :1.0000                  
                                                            
 CASH_ADVANCE_FREQUENCY CASH_ADVANCE_TRX  PURCHASES_TRX     CREDIT_LIMIT  
 Min.   :0.0000         Min.   :  0.000   Min.   :  0.00   Min.   :   50  
 1st Qu.:0.0000         1st Qu.:  0.000   1st Qu.:  1.00   1st Qu.: 1600  
 Median :0.0000         Median :  0.000   Median :  7.00   Median : 3000  
 Mean   :0.1351         Mean   :  3.249   Mean   : 14.71   Mean   : 4494  
 3rd Qu.:0.2222         3rd Qu.:  4.000   3rd Qu.: 17.00   3rd Qu.: 6500  
 Max.   :1.5000         Max.   :123.000   Max.   :358.00   Max.   :30000  
                                                           NA's   :1      
    PAYMENTS       MINIMUM_PAYMENTS    PRC_FULL_PAYMENT     TENURE     
 Min.   :    0.0   Min.   :    0.019   Min.   :0.0000   Min.   : 6.00  
 1st Qu.:  383.3   1st Qu.:  169.124   1st Qu.:0.0000   1st Qu.:12.00  
 Median :  856.9   Median :  312.344   Median :0.0000   Median :12.00  
 Mean   : 1733.1   Mean   :  864.207   Mean   :0.1537   Mean   :11.52  
 3rd Qu.: 1901.1   3rd Qu.:  825.485   3rd Qu.:0.1429   3rd Qu.:12.00  
 Max.   :50721.5   Max.   :76406.208   Max.   :1.0000   Max.   :12.00  
                   NA's   :313                                         
dim(cc)
[1] 8950   18
colnames(cc)
 [1] "CUST_ID"                          "BALANCE"                         
 [3] "BALANCE_FREQUENCY"                "PURCHASES"                       
 [5] "ONEOFF_PURCHASES"                 "INSTALLMENTS_PURCHASES"          
 [7] "CASH_ADVANCE"                     "PURCHASES_FREQUENCY"             
 [9] "ONEOFF_PURCHASES_FREQUENCY"       "PURCHASES_INSTALLMENTS_FREQUENCY"
[11] "CASH_ADVANCE_FREQUENCY"           "CASH_ADVANCE_TRX"                
[13] "PURCHASES_TRX"                    "CREDIT_LIMIT"                    
[15] "PAYMENTS"                         "MINIMUM_PAYMENTS"                
[17] "PRC_FULL_PAYMENT"                 "TENURE"                          

The dataset contains 8,950 customer observations and 18 financial variables related to balances, purchases, payments, cash advances, and credit limits. Initial inspection shows that most variables are numeric and suitable for quantitative analysis. Missing values were detected in MINIMUM_PAYMENTS (313 missing values) and CREDIT_LIMIT (1 missing value), which require cleaning before regression and clustering.
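The missing-value counts reported by summary() can also be checked directly. The following is a minimal sketch on a small made-up data frame (the object `toy` and its values are illustrative assumptions, not rows from the project data), using only base R:

```r
# Toy data frame standing in for `cc`; values are invented for illustration.
toy <- data.frame(
  CREDIT_LIMIT     = c(1000, NA, 7500, 1200),
  MINIMUM_PAYMENTS = c(140, 1072, NA, NA),
  PAYMENTS         = c(202, 4103, 622, 678)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Number of rows with no missing value in any column
n_complete <- sum(complete.cases(toy))
print(n_complete)
```

Applied to the project data, the same idea gives a quick consistency check: 8,950 - 8,636 = 314 rows are dropped during cleaning, which equals 313 + 1, so no single row appears to be missing both variables.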

Data wrangling and cleaning

# Create cleaned project dataset
cc_clean <- cc |>
  select(CUST_ID, BALANCE, PURCHASES, CASH_ADVANCE,
         CREDIT_LIMIT, PAYMENTS, MINIMUM_PAYMENTS,
         PURCHASES_FREQUENCY, PRC_FULL_PAYMENT, TENURE) |>
  filter(!is.na(CREDIT_LIMIT),
         !is.na(MINIMUM_PAYMENTS)) |>
  mutate(
    debt_ratio = BALANCE / CREDIT_LIMIT,
    payment_ratio = PAYMENTS / BALANCE
  )

# Inspect cleaned data
str(cc_clean)
'data.frame':   8636 obs. of  12 variables:
 $ CUST_ID            : chr  "C10001" "C10002" "C10003" "C10005" ...
 $ BALANCE            : num  40.9 3202.5 2495.1 817.7 1809.8 ...
 $ PURCHASES          : num  95.4 0 773.2 16 1333.3 ...
 $ CASH_ADVANCE       : num  0 6443 0 0 0 ...
 $ CREDIT_LIMIT       : num  1000 7000 7500 1200 1800 13500 2300 7000 11000 1200 ...
 $ PAYMENTS           : num  202 4103 622 678 1400 ...
 $ MINIMUM_PAYMENTS   : num  140 1072 627 245 2407 ...
 $ PURCHASES_FREQUENCY: num  0.1667 0 1 0.0833 0.6667 ...
 $ PRC_FULL_PAYMENT   : num  0 0.222 0 0 0 ...
 $ TENURE             : int  12 12 12 12 12 12 12 12 12 12 ...
 $ debt_ratio         : num  0.0409 0.4575 0.3327 0.6814 1.0055 ...
 $ payment_ratio      : num  4.934 1.281 0.249 0.83 0.774 ...
summary(cc_clean)
   CUST_ID             BALANCE          PURCHASES         CASH_ADVANCE    
 Length:8636        Min.   :    0.0   Min.   :    0.00   Min.   :    0.0  
 Class :character   1st Qu.:  148.1   1st Qu.:   43.37   1st Qu.:    0.0  
 Mode  :character   Median :  916.9   Median :  375.40   Median :    0.0  
                    Mean   : 1601.2   Mean   : 1025.43   Mean   :  994.2  
                    3rd Qu.: 2105.2   3rd Qu.: 1145.98   3rd Qu.: 1132.4  
                    Max.   :19043.1   Max.   :49039.57   Max.   :47137.2  
  CREDIT_LIMIT      PAYMENTS        MINIMUM_PAYMENTS    PURCHASES_FREQUENCY
 Min.   :   50   Min.   :    0.05   Min.   :    0.019   Min.   :0.00000    
 1st Qu.: 1600   1st Qu.:  418.56   1st Qu.:  169.164   1st Qu.:0.08333    
 Median : 3000   Median :  896.68   Median :  312.452   Median :0.50000    
 Mean   : 4522   Mean   : 1784.48   Mean   :  864.305   Mean   :0.49600    
 3rd Qu.: 6500   3rd Qu.: 1951.14   3rd Qu.:  825.496   3rd Qu.:0.91667    
 Max.   :30000   Max.   :50721.48   Max.   :76406.208   Max.   :1.00000    
 PRC_FULL_PAYMENT     TENURE        debt_ratio       payment_ratio      
 Min.   :0.0000   Min.   : 6.00   Min.   : 0.00000   Min.   :0.0009695  
 1st Qu.:0.0000   1st Qu.:12.00   1st Qu.: 0.04741   1st Qu.:0.3515823  
 Median :0.0000   Median :12.00   Median : 0.31826   Median :1.4828256  
 Mean   :0.1593   Mean   :11.53   Mean   : 0.39772   Mean   :      Inf  
 3rd Qu.:0.1667   3rd Qu.:12.00   3rd Qu.: 0.72575   3rd Qu.:7.7078825  
 Max.   :1.0000   Max.   :12.00   Max.   :15.90995   Max.   :      Inf  
dim(cc_clean)
[1] 8636   12

After selecting relevant variables and removing missing values, the cleaned dataset contains 8,636 customers and 12 variables. Two additional variables were created using mutate(): debt_ratio, which measures balance relative to credit limit, and payment_ratio, which measures payments relative to balance. These derived variables help evaluate customer credit utilization and repayment behavior.

### Correcting the infinite values ###

cc_clean <- cc_clean |>
  mutate(
    payment_ratio = case_when(
      BALANCE == 0 ~ NA_real_,  # typed NA keeps the column numeric on older dplyr versions
      TRUE ~ PAYMENTS / BALANCE
    )
  )

summary(cc_clean$payment_ratio)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
0.000e+00 3.500e-01 1.480e+00 1.958e+02 7.700e+00 1.186e+06         6 

IMPORTANT OBSERVATION:

One of the newly created variables, payment_ratio, contained infinite values even though rows with missing (NA) values had already been removed. Because payment_ratio = PAYMENTS / BALANCE, any customer with BALANCE = 0 produces a division by zero, which R returns as Inf rather than raising an error. If these infinite values are kept, they can distort plot axes, cause regression to fail or give biased results, and badly distort the distances used in clustering. For customers with a zero balance, payment_ratio was therefore replaced with a missing value (NA). After this correction, only 6 customers have an undefined ratio, and they are recorded as NA. The ratio is still extremely right-skewed (maximum of about 1.19 million against a median of 1.48), which is consistent with a few balances being very close to zero.
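The division-by-zero behavior described here is easy to reproduce. The sketch below uses made-up vectors (`payments` and `balance` are illustrative, not project data) and base R's ifelse() as a stand-in for the case_when() call used in the report:

```r
# Invented values: two zero balances and one ordinary balance
payments <- c(200, 0, 500)
balance  <- c(0, 0, 1000)

# Unguarded division: R returns Inf for 200/0 and NaN for 0/0
raw_ratio <- payments / balance
print(raw_ratio)  # Inf NaN 0.5

# Guarded division: zero balances become NA instead
safe_ratio <- ifelse(balance == 0, NA_real_, payments / balance)
print(safe_ratio)  # NA NA 0.5
```

Functions such as mean() then need na.rm = TRUE to skip the missing ratios.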

### Now we have the fully cleaned project dataset ###
summary(cc_clean)
   CUST_ID             BALANCE          PURCHASES         CASH_ADVANCE    
 Length:8636        Min.   :    0.0   Min.   :    0.00   Min.   :    0.0  
 Class :character   1st Qu.:  148.1   1st Qu.:   43.37   1st Qu.:    0.0  
 Mode  :character   Median :  916.9   Median :  375.40   Median :    0.0  
                    Mean   : 1601.2   Mean   : 1025.43   Mean   :  994.2  
                    3rd Qu.: 2105.2   3rd Qu.: 1145.98   3rd Qu.: 1132.4  
                    Max.   :19043.1   Max.   :49039.57   Max.   :47137.2  
                                                                          
  CREDIT_LIMIT      PAYMENTS        MINIMUM_PAYMENTS    PURCHASES_FREQUENCY
 Min.   :   50   Min.   :    0.05   Min.   :    0.019   Min.   :0.00000    
 1st Qu.: 1600   1st Qu.:  418.56   1st Qu.:  169.164   1st Qu.:0.08333    
 Median : 3000   Median :  896.68   Median :  312.452   Median :0.50000    
 Mean   : 4522   Mean   : 1784.48   Mean   :  864.305   Mean   :0.49600    
 3rd Qu.: 6500   3rd Qu.: 1951.14   3rd Qu.:  825.496   3rd Qu.:0.91667    
 Max.   :30000   Max.   :50721.48   Max.   :76406.208   Max.   :1.00000    
                                                                           
 PRC_FULL_PAYMENT     TENURE        debt_ratio       payment_ratio      
 Min.   :0.0000   Min.   : 6.00   Min.   : 0.00000   Min.   :0.000e+00  
 1st Qu.:0.0000   1st Qu.:12.00   1st Qu.: 0.04741   1st Qu.:3.500e-01  
 Median :0.0000   Median :12.00   Median : 0.31826   Median :1.480e+00  
 Mean   :0.1593   Mean   :11.53   Mean   : 0.39772   Mean   :1.958e+02  
 3rd Qu.:0.1667   3rd Qu.:12.00   3rd Qu.: 0.72575   3rd Qu.:7.700e+00  
 Max.   :1.0000   Max.   :12.00   Max.   :15.90995   Max.   :1.186e+06  
                                                     NA's   :6          
dim(cc_clean)
[1] 8636   12
colnames(cc_clean)
 [1] "CUST_ID"             "BALANCE"             "PURCHASES"          
 [4] "CASH_ADVANCE"        "CREDIT_LIMIT"        "PAYMENTS"           
 [7] "MINIMUM_PAYMENTS"    "PURCHASES_FREQUENCY" "PRC_FULL_PAYMENT"   
[10] "TENURE"              "debt_ratio"          "payment_ratio"      

After data cleaning and transformation, the final working dataset (cc_clean) contains 8,636 customer observations and 12 variables and will be used for all subsequent analyses.

Data Visualizations

(This helps us to see patterns in the data).

# Figure 1: Scatter Plot: Purchases vs Credit Card Balance
plot1 <- ggplot(data = cc_clean) +
  geom_point(aes(x = PURCHASES, y = BALANCE)) +
  labs(
    title = "Purchases vs Credit Card Balance",
    x = "Purchases",
    y = "Balance"
  ) +
  theme_bw()

plot1

Figure 1: Purchases vs Credit Card Balance

(This figure helps determine whether customer spending is associated with debt accumulation, which directly supports the project objective of identifying the key drivers of credit card debt).

The scatter plot shows the relationship between customer purchases and credit card balance. Most customers are concentrated at lower purchase levels and lower balances, indicating that many customers have moderate spending and relatively low debt. There are also several outliers with very high purchase amounts or high balances. Overall, the relationship appears weakly positive, suggesting that higher purchases may contribute to higher credit card balances for some customers.
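A visual impression such as "weakly positive" can be backed by a correlation coefficient. The sketch below uses simulated data (the vectors `purchases` and `balance` and the way they are generated are assumptions for illustration only), so its numbers say nothing about the project dataset:

```r
set.seed(1)

# Simulated, right-skewed stand-ins for the real columns
purchases <- rexp(200, rate = 1 / 1000)
balance   <- 0.1 * purchases + rexp(200, rate = 1 / 1500)

# Pearson measures linear association; Spearman is rank-based and
# less sensitive to the extreme outliers typical of spending data
pearson  <- cor(purchases, balance)
spearman <- cor(purchases, balance, method = "spearman")

print(c(pearson = pearson, spearman = spearman))
```

On the project data the equivalent call would be cor(cc_clean$PURCHASES, cc_clean$BALANCE).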

# Figure 2: Scatter Plot with Trend Line: Payments vs Balance
plot2 <- ggplot(data = cc_clean) +
  geom_point(aes(x = PAYMENTS, y = BALANCE)) +
  geom_smooth(aes(x = PAYMENTS, y = BALANCE)) +
  labs(
    title = "Payments vs Credit Card Balance",
    x = "Payments",
    y = "Balance"
  )

plot2

Figure 2: Payments vs Credit Card Balance

(This figure helps explain repayment behavior and shows that payment amount alone may not indicate low debt, since customers with high balances often make higher payments).

The scatter plot shows the relationship between customer payments and credit card balance. Most customers are concentrated at lower payment amounts and lower balances. The smooth trend line shows a positive relationship, indicating that customers with higher balances also tend to make higher payments. This means that larger payments are often associated with customers carrying more debt rather than automatically reducing balances.

#Figure 3: Scatter Plot with Trend Line: Cash Advance vs Balance
plot3 <- ggplot(data = cc_clean) +
  geom_point(aes(x = CASH_ADVANCE, y = BALANCE)) +
  geom_smooth(aes(x = CASH_ADVANCE, y = BALANCE)) +
  labs(
    title = "Cash Advance vs Credit Card Balance",
    x = "Cash Advance",
    y = "Balance"
  ) +
  theme_bw()

plot3

Figure 3: Cash Advance vs Credit Card Balance

(This figure helps identify risky borrowing behavior. Cash advances often carry extra fees and interest, so frequent or large withdrawals may contribute to higher debt).

The scatter plot shows the relationship between cash advance usage and credit card balance. Most customers have low cash advance amounts and lower balances. The smooth trend line slopes upward, indicating a positive relationship between cash advance and balance. Customers who withdraw larger cash advances tend to carry higher credit card debt.

#Figure 4: Histogram: Distribution of Credit Card Balance
plot4 <- ggplot(data = cc_clean) +
  geom_histogram(aes(x = BALANCE), bins = 30) +
  labs(
    title = "Distribution of Credit Card Balance",
    x = "Balance",
    y = "Count"
  ) 

plot4

Figure 4: Distribution of Credit Card Balance

(This figure shows that credit card debt is not evenly distributed. Most customers manage lower balances, while a smaller group may represent higher debt risk).

The histogram shows the distribution of customer credit card balances. Most customers have relatively low balances, with the highest concentration near zero to moderate debt levels. The distribution is strongly right-skewed, meaning a small number of customers carry very high balances.
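Right skew also shows up numerically as a mean well above the median. The summary of cc_clean reports a mean BALANCE of 1601.2 against a median of 916.9. A minimal sketch with an invented vector (`toy_balance` is illustrative only) shows the same pattern:

```r
# Invented balances: mostly small values plus one very large one
toy_balance <- c(0, 50, 120, 300, 900, 1500, 2500, 19000)

toy_mean   <- mean(toy_balance)
toy_median <- median(toy_balance)

print(c(mean = toy_mean, median = toy_median))

# Values reported by summary(cc_clean) for BALANCE
reported_mean   <- 1601.2
reported_median <- 916.9
print(reported_mean / reported_median)
```

A single large value pulls the mean upward while the median barely moves, which is why the median is often the better "typical customer" summary for skewed balances.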

#Figure 5: Histogram: Distribution of Purchases
plot5 <- ggplot(data = cc_clean) +
  geom_histogram(aes(x = PURCHASES), bins = 30) +
  labs(
    title = "Distribution of Customer Purchases",
    x = "Purchases",
    y = "Count"
  ) +
  theme_bw()

plot5

Figure 5: Distribution of Customer Purchases

(This figure helps identify customer spending patterns and suggests that a small group of high spenders may significantly influence overall purchase totals).

The histogram shows the distribution of customer purchases. Most customers have low to moderate purchase amounts, while a small number of customers make very large purchases. The distribution is strongly right-skewed, indicating that high spending is concentrated among a few customers.

# Figure 6: Boxplot: Balance by Tenure
plot6 <- ggplot(data = cc_clean) +
  geom_boxplot(aes(x = factor(TENURE),
                   y = BALANCE,
                   fill = factor(TENURE))) +
  labs(
    title = "Credit Card Balance by Customer Tenure",
    x = "Tenure (in months)",
    y = "Balance",
    fill = "Tenure"
  )

plot6

Figure 6: Credit Card Balance by Customer Tenure

(This figure helps evaluate whether the length of customer relationship is associated with debt accumulation and balance variability).

This boxplot compares balances by tenure. Customers with shorter tenure generally have lower balances, while customers with longer tenure show higher typical balances and a wider spread. This suggests that debt may increase over time for some customers, although the tenure groups differ greatly in size (see Figure 7).

# Figure 7: Bar Chart: Number of Customers by Tenure
plot7 <- ggplot(data = cc_clean) +
  geom_bar(aes(x = factor(TENURE),
               fill = factor(TENURE))) +
  labs(
    title = "Number of Customers by Tenure",
    x = "Tenure (in months)",
    y = "Count",
    fill = "Tenure"
  )

plot7

Figure 7: Number of Customers by Tenure

(This figure helps understand customer composition and explains why tenure 12 may strongly influence overall results).

The bar chart shows the number of customers in each tenure category. The largest proportion of customers have a tenure of 12 months, while much smaller numbers are observed in the shorter tenure groups. This indicates that most customers in the dataset are long-term customers.

#Figure 8: Scatter Plot: Credit Limit vs Balance
plot8 <- ggplot(data = cc_clean) +
  geom_point(aes(x = CREDIT_LIMIT, y = BALANCE)) +
  geom_smooth(aes(x = CREDIT_LIMIT, y = BALANCE)) +
  labs(
    title = "Credit Limit vs Credit Card Balance",
    x = "Credit Limit",
    y = "Balance"
  )

plot8

Figure 8: Credit Limit vs Credit Card Balance

(This figure helps evaluate whether access to larger credit limits contributes to higher debt balances and whether borrowing behavior changes at higher limit levels).

The scatter plot shows the relationship between credit limit and credit card balance. The trend line initially rises, indicating that customers with higher credit limits tend to carry higher balances. At very high credit limits, the relationship levels off and slightly declines. This suggests that moderate increases in credit limit are associated with higher debt, while the highest-limit customers may manage balances more effectively. Because few customers have very high limits, the trend in that range is less certain.

#Figure 9: Scatter Plot: Debt Ratio vs Balance
plot9 <- ggplot(data = cc_clean) +
  geom_point(aes(x = debt_ratio, y = BALANCE)) +
  geom_smooth(aes(x = debt_ratio, y = BALANCE)) +
  labs(
    title = "Debt Ratio vs Credit Card Balance",
    x = "Debt Ratio",
    y = "Balance"
  )

plot9

Figure 9: Debt Ratio vs Credit Card Balance

(This figure helps evaluate credit utilization behavior and identifies customers who may be overextended relative to their available credit).

The scatter plot shows the relationship between debt ratio and credit card balance. Most customers are concentrated at lower debt ratios, indicating balances below their credit limits and moderate credit utilization. Customers with higher balances are mostly observed at low to moderate debt ratios. A small number of extreme debt ratio values, where the balance far exceeds the credit limit, appear as outliers.
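A debt ratio above 1 means the balance exceeds the credit limit, so such customers can be flagged directly. The following is a minimal base R sketch; the first three rows use rounded values shown in the str() output above, while the fourth row and the `over_limit` column name are invented for illustration:

```r
toy <- data.frame(
  BALANCE      = c(40.9, 3202.5, 1809.8, 800),
  CREDIT_LIMIT = c(1000, 7000, 1800, 50)
)

# Same definition as in the data wrangling step
toy$debt_ratio <- toy$BALANCE / toy$CREDIT_LIMIT

# TRUE when the balance is larger than the credit limit
toy$over_limit <- toy$debt_ratio > 1

print(toy)
n_over_limit <- sum(toy$over_limit)
print(n_over_limit)
```

The same comparison on cc_clean would count how many customers sit in the outlier region of Figure 9.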

#Figure 10: Combined Dashboard Using Patchwork
library(patchwork)

(plot1 + plot2) /
  (plot3 + plot8)

Figure 10: Combined Financial Behavior Dashboard

(This figure gives a quick summary of the most important customer behaviors related to credit card debt and helps compare multiple drivers in one view).

The combined dashboard summarizes four major relationships with credit card balance. Purchases show a weak positive relationship with balance. Payments display a positive relationship, indicating customers with larger balances often make larger payments. Cash advance has a clear positive relationship with balance, suggesting borrowing cash is associated with higher debt. Credit limit also shows a positive relationship at moderate levels before flattening at higher limits. In summary, this dashboard shows four things:

  • Spending more can increase debt.

  • People with bigger debt often pay more.

  • Taking cash from the card is linked to more debt.

  • Bigger credit limits can lead to bigger balances.

Data Modeling (Regression)

(This helps to measure and predict relationships in the data).

A multiple linear regression model was fitted to predict credit card balance (BALANCE) using PURCHASES, CASH_ADVANCE, PAYMENTS, CREDIT_LIMIT, and TENURE.

## Data Modeling ##
fit1 <- lm(BALANCE ~ PURCHASES + CASH_ADVANCE +
             PAYMENTS + CREDIT_LIMIT + TENURE,
           data = cc_clean)

summary(fit1)

Call:
lm(formula = BALANCE ~ PURCHASES + CASH_ADVANCE + PAYMENTS + 
    CREDIT_LIMIT + TENURE, data = cc_clean)

Residuals:
     Min       1Q   Median       3Q      Max 
-10761.6   -854.0   -209.5    672.8  14126.6 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -7.264e+02  1.545e+02  -4.702 2.62e-06 ***
PURCHASES     1.364e-01  1.164e-02  11.719  < 2e-16 ***
CASH_ADVANCE  4.461e-01  1.076e-02  41.454  < 2e-16 ***
PAYMENTS     -1.064e-01  9.577e-03 -11.109  < 2e-16 ***
CREDIT_LIMIT  2.314e-01  5.433e-03  42.594  < 2e-16 ***
TENURE        7.695e+01  1.343e+01   5.730 1.04e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1596 on 8630 degrees of freedom
Multiple R-squared:  0.4206,    Adjusted R-squared:  0.4203 
F-statistic:  1253 on 5 and 8630 DF,  p-value: < 2.2e-16

The overall regression model is statistically significant (F-statistic p < 2.2e-16), indicating that the selected predictors jointly explain customer credit card balances. The model has an Adjusted R-squared of 0.4203, meaning approximately 42% of the variation in balances is explained by the included variables.
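The adjusted R-squared can be recovered from the other numbers in the model summary. This is a small worked check that assumes the usual formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), and uses the rounded values printed above:

```r
r2 <- 0.4206  # Multiple R-squared from summary(fit1)
n  <- 8636    # observations in cc_clean
p  <- 5       # predictors in the model

# n - p - 1 = 8630 matches the residual degrees of freedom in the output
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(round(adj_r2, 4))  # 0.4203
```

With this many observations and only five predictors, the adjustment is tiny, which is why the two values are nearly identical.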

All predictors were statistically significant (p < 0.001). Because the model is fitted to observational data, the coefficients describe associations rather than causal effects. Holding the other variables constant:

  • PURCHASES: a one-unit increase is associated with a balance about 0.136 units higher.

  • CASH_ADVANCE: a one-unit increase is associated with a balance about 0.446 units higher, the largest coefficient among the dollar-valued predictors.

  • PAYMENTS: a one-unit increase is associated with a balance about 0.106 units lower.

  • CREDIT_LIMIT: a one-unit increase is associated with a balance about 0.231 units higher.

  • TENURE: each additional month is associated with a balance about 76.95 units higher, suggesting that longer-term customers tend to carry higher debt balances.

Hence, the results suggest that cash advance usage, credit limit, purchases, and longer tenure are important drivers of higher credit card debt, while higher payments help reduce debt balances.
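To make the coefficients concrete, they can be plugged into the regression equation by hand. The sketch below copies the rounded estimates from the summary output; the customer profile in `customer` is hypothetical and chosen only for illustration:

```r
# Rounded estimates copied from summary(fit1)
coefs <- c(
  intercept    = -726.4,
  PURCHASES    = 0.1364,
  CASH_ADVANCE = 0.4461,
  PAYMENTS     = -0.1064,
  CREDIT_LIMIT = 0.2314,
  TENURE       = 76.95
)

# Hypothetical customer (not a row from the dataset)
customer <- c(
  PURCHASES    = 1000,
  CASH_ADVANCE = 500,
  PAYMENTS     = 800,
  CREDIT_LIMIT = 5000,
  TENURE       = 12
)

# Predicted balance = intercept + sum of coefficient * value
predicted <- coefs[["intercept"]] + sum(coefs[names(customer)] * customer)
print(predicted)

# Expected difference in balance for 1,000 more in cash advances
cash_effect <- coefs[["CASH_ADVANCE"]] * 1000
print(cash_effect)
```

With the fitted model object available, predict(fit1, newdata = ...) gives the same kind of prediction using the unrounded coefficients.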

Clustering Analysis

After removing the target (dependent) variable BALANCE, customers are clustered using only independent financial behavior variables. This helps identify groups such as low-risk customers, high-spending customers, heavy borrowers, and disciplined payers. The clustering variables are PURCHASES (spending), CASH_ADVANCE (borrowing), PAYMENTS (repayment), CREDIT_LIMIT (available credit), and TENURE (customer duration).
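These variables are measured on very different scales (purchase amounts reach tens of thousands, while tenure ranges from 6 to 12), so distance-based clustering is usually run on standardized values. The sketch below shows what scale() does, using an invented data frame (`toy` is illustrative only):

```r
toy <- data.frame(
  PURCHASES = c(100, 500, 2000, 40000),
  TENURE    = c(6, 10, 12, 12)
)

# scale() centers each column to mean 0 and rescales it to standard deviation 1
scaled <- scale(toy)

# The same result computed by hand for one column
manual <- (toy$PURCHASES - mean(toy$PURCHASES)) / sd(toy$PURCHASES)

print(round(colMeans(scaled), 10))
print(apply(scaled, 2, sd))
```

Without this step, Euclidean distances would be dominated by the variables with the largest raw values.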

## Prepare data for clustering ##
#install.packages("cluster")
#install.packages("NbClust")

library(tidyverse)
library(cluster)
library(NbClust)

cc_cluster <- cc_clean |>
  select(PURCHASES, CASH_ADVANCE,
         PAYMENTS, CREDIT_LIMIT,
         TENURE)

cc_scale <- scale(cc_cluster)

## Clustering Analysis ##
set.seed(123)

sample_rows <- sample(1:nrow(cc_scale), 1000)

cc_scale_sample <- cc_scale[sample_rows, ]

number_cluster_estimate <- NbClust(
  cc_scale_sample,
  distance = "euclidean",
  min.nc = 2,
  max.nc = 6,
  method = "kmeans"
)

*** : The Hubert index is a graphical method of determining the number of clusters.
                In the plot of Hubert index, we seek a significant knee that corresponds to a 
                significant increase of the value of the measure i.e the significant peak in Hubert
                index second differences plot. 
 

*** : The D index is a graphical method of determining the number of clusters. 
                In the plot of D index, we seek a significant knee (the significant peak in Dindex
                second differences plot) that corresponds to a significant increase of the value of
                the measure. 
 
******************************************************************* 
* Among all indices:                                                
* 8 proposed 2 as the best number of clusters 
* 2 proposed 3 as the best number of clusters 
* 8 proposed 4 as the best number of clusters 
* 4 proposed 5 as the best number of clusters 
* 2 proposed 6 as the best number of clusters 

                   ***** Conclusion *****                            
 
* According to the majority rule, the best number of clusters is  2 
 
 
******************************************************************* 
number_cluster_estimate$Best.nc
                    KL       CH Hartigan     CCC    Scott      Marriot   TrCovW
Number_clusters 5.0000   5.0000   4.0000  6.0000    4.000 4.000000e+00      4.0
Value_Index     4.8602 431.5909 157.6825 14.1016 1864.343 2.411849e+14 182447.2
                  TraceW Friedman   Rubin Cindex     DB Silhouette   Duda
Number_clusters   4.0000   4.0000  5.0000  5.000 6.0000     2.0000 2.0000
Value_Index     416.6359   6.0241 -0.1992  0.086 1.0896     0.4827 0.7854
                PseudoT2  Beale Ratkowsky     Ball PtBiserial   Frey McClain
Number_clusters   2.0000 2.0000    4.0000   3.0000     4.0000 2.0000  2.0000
Value_Index      70.5094 0.8505    0.3623 896.7652     0.5754 4.4186  0.1773
                 Dunn Hubert SDindex Dindex   SDbw
Number_clusters 3.000      0  2.0000      0 2.0000
Value_Index     0.011      0  2.9611      0 2.0365

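Following NbClust's majority rule, customers are divided into two major financial behavior groups. The vote was tied, however: 8 indices proposed 2 clusters and 8 proposed 4, so a four-cluster solution is also supported by these indices.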
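The vote summary can be re-tallied from the Number_clusters row of the Best.nc output. In the sketch below the values are typed in by hand from the printed table (the vector name `best_k` is an assumption for illustration); a value of 0 marks the two graphical indices, Hubert and Dindex, which do not cast a vote:

```r
# Number_clusters row copied from number_cluster_estimate$Best.nc
best_k <- c(
  KL = 5, CH = 5, Hartigan = 4, CCC = 6, Scott = 4, Marriot = 4,
  TrCovW = 4, TraceW = 4, Friedman = 4, Rubin = 5, Cindex = 5, DB = 6,
  Silhouette = 2, Duda = 2, PseudoT2 = 2, Beale = 2, Ratkowsky = 4,
  Ball = 3, PtBiserial = 4, Frey = 2, McClain = 2, Dunn = 3,
  Hubert = 0, SDindex = 2, Dindex = 0, SDbw = 2
)

# Count votes per candidate number of clusters, ignoring the 0 entries
votes <- table(best_k[best_k > 0])
print(votes)

# Candidates that share the highest vote count
top <- names(votes)[votes == max(votes)]
print(top)
```

With the NbClust object in memory, the same tally could be taken from number_cluster_estimate$Best.nc["Number_clusters", ] instead of a hand-typed vector.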

pam_fit <- pam(cc_scale, k = 2)

pam_fit
Medoids:
       ID  PURCHASES CASH_ADVANCE    PAYMENTS CREDIT_LIMIT    TENURE
[1,] 3399 -0.2517151   -0.3063550 -0.40032831   -0.5525986 0.3551601
[2,] 6239  0.1580752    0.2083768 -0.03564832    0.9504456 0.3551601
Clustering vector:
   [1] 1 2 2 1 1 2 1 2 2 1 1 1 2 1 2 1 1 2 1 1 2 1 2 2 1 1 1 2 2 2 2 2 1 2 1 2 2
  [38] 2 2 1 1 1 1 2 1 1 1 2 1 1 1 2 1 2 2 1 2 1 1 1 1 2 1 2 2 2 1 2 2 2 1 1 2 1
  [75] 2 2 1 2 2 2 2 1 2 2 2 1 1 1 2 1 1 2 1 1 1 2 1 2 1 2 2 1 1 2 1 2 1 2 2 1 2
 [112] 2 1 1 2 1 1 2 1 1 2 1 2 1 1 2 2 2 2 2 1 2 2 2 2 2 1 2 1 2 1 1 2 1 1 1 2 2
 [149] 2 1 1 2 2 2 1 2 1 1 2 1 1 2 2 1 2 1 1 1 1 2 1 2 2 2 2 2 2 1 1 2 2 1 1 2 2
 [186] 1 2 2 2 1 1 1 2 1 2 2 2 1 2 2 1 1 2 1 2 1 1 2 1 1 2 2 1 1 1 2 1 1 1 2 1 2
 [223] 2 2 1 1 2 2 1 2 2 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 2 2 2 1 2 1 1 2 1
 [260] 1 2 2 2 2 1 1 1 2 2 2 1 1 1 2 2 1 2 1 2 1 1 2 1 1 2 1 2 1 1 2 2 2 2 2 1 2
 [297] 1 1 2 1 2 2 2 1 2 1 2 1 2 1 2 1 1 2 1 2 1 2 1 2 2 2 2 2 2 2 1 1 1 1 2 2 1
 [334] 1 1 2 2 2 2 2 2 1 2 2 1 1 1 2 1 1 1 2 2 1 2 1 1 1 2 2 2 1 2 2 1 2 1 1 1 2
 [371] 2 2 1 2 2 1 2 2 2 2 2 2 2 2 1 1 2 2 2 2 1 2 2 2 2 1 2 1 2 1 2 1 2 1 2 1 1
 [408] 2 2 1 2 1 1 1 2 2 2 1 2 2 1 1 2 2 1 1 1 1 2 1 2 2 2 2 1 2 1 1 2 2 2 1 2 2
 [445] 2 2 1 2 2 1 2 2 2 2 2 2 1 2 2 1 1 2 1 1 2 2 1 1 1 1 2 1 2 1 2 2 2 2 2 1 2
 [482] 1 2 1 1 2 2 1 2 2 2 1 1 1 1 2 2 2 2 2 1 2 1 2 2 2 2 2 2 1 2 1 2 2 2 2 2 1
 [519] 1 2 2 1 2 1 1 2 1 2 2 1 2 2 1 1 2 2 1 2 2 2 2 1 2 2 2 2 2 1 2 1 2 2 2 1 1
 [556] 1 1 1 2 1 1 2 1 2 2 2 1 2 2 2 1 2 2 2 2 1 2 2 1 1 2 2 1 2 1 2 1 1 1 2 2 1
 [593] 1 2 2 2 2 1 2 1 2 2 1 2 2 2 2 2 2 2 1 1 1 2 2 2 2 1 2 1 2 2 2 1 2 2 1 2 2
 [630] 1 2 2 1 2 1 2 2 1 2 1 2 1 1 2 1 2 1 1 2 2 2 1 2 1 2 2 1 2 1 2 1 1 1 2 1 1
 [667] 1 2 2 2 1 2 2 1 2 2 1 2 1 2 2 1 2 1 2 2 2 1 2 1 1 1 2 2 2 1 1 1 2 1 2 1 2
 [704] 1 1 2 1 1 2 2 1 2 1 2 2 2 2 1 1 2 1 2 2 1 1 1 1 2 2 2 2 1 1 1 2 2 1 1 2 1
 [741] 2 1 1 2 1 2 2 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 2 2 2 2 1 1 1 1 1 1 1 1 2 1
 [778] 1 1 2 1 1 2 2 1 1 1 1 1 1 1 1 2 2 2 2 1 1 2 1 1 2 2 2 2 2 2 1 2 1 1 1 1 2
 [815] 1 2 2 1 2 1 1 1 1 2 2 2 2 1 2 1 1 1 1 2 1 2 2 2 2 1 1 2 2 2 1 1 1 2 2 2 1
 [852] 2 1 1 1 2 2 1 2 1 2 2 2 1 2 1 1 1 1 1 2 1 2 2 2 2 1 1 1 1 1 1 1 1 2 1 2 1
 [889] 2 1 1 2 1 2 1 1 1 1 1 1 2 1 1 1 2 2 1 1 2 2 1 2 1 1 1 2 2 2 1 2 1 1 1 1 1
 [926] 1 2 1 1 1 1 1 2 2 2 1 2 1 2 1 1 1 2 1 2 1 1 1 1 1 1 1 2 2 2 2 2 2 1 2 1 1
 [963] 2 2 1 2 1 1 1 2 2 1 2 2 1 1 2 2 1 2 1 2 1 1 1 1 2 1 1 1 1 2 2 1 2 1 1 2 1
[1000] 1 1 2 2 1 1 1 2 1 2 1 1 1 2 1 1 1 2 1 2 2 1 1 1 2 2 2 2 1 1 2 2 1 1 2 2 2
[1037] 1 2 2 1 2 2 1 2 1 2 1 2 2 2 1 2 2 1 1 1 1 2 1 2 1 2 1 2 2 1 1 2 1 1 2 1 1
[1074] 1 1 1 2 1 2 1 1 1 1 1 2 2 1 2 1 2 1 1 1 1 2 2 2 1 1 1 1 2 1 2 2 1 2 1 1 2
[1111] 2 2 1 2 1 2 1 1 2 2 1 1 2 2 1 2 2 2 2 2 1 1 2 2 1 1 2 2 2 2 1 1 1 1 1 1 2
[1148] 1 1 1 2 1 2 1 2 1 1 1 2 2 1 1 1 2 1 2 2 1 2 1 1 1 2 2 2 1 2 1 2 2 1 2 1 1
[1185] 1 2 1 1 1 1 1 2 1 1 2 2 2 2 1 2 2 2 1 2 1 1 2 1 1 2 2 2 1 2 2 1 2 2 2 1 2
[1222] 2 1 2 1 1 1 2 2 1 2 2 1 1 1 2 2 1 1 1 1 2 2 2 1 1 1 2 1 2 2 2 1 2 1 2 2 2
[1259] 2 2 1 2 1 1 1 2 2 2 2 2 1 1 1 2 1 2 1 2 2 2 1 2 2 2 1 2 2 2 2 2 2 1 1 1 1
[1296] 2 1 1 1 1 1 2 1 2 1 1 1 2 1 1 1 2 1 2 2 1 2 1 2 2 2 1 2 2 2 2 1 2 2 1 2 2
[1333] 2 1 1 2 1 2 1 2 2 2 2 2 1 2 1 2 1 1 2 2 2 2 1 2 1 1 2 2 1 2 2 1 1 1 2 2 2
[1370] 2 2 2 2 2 2 1 1 1 2 1 2 1 2 1 2 2 2 2 1 2 2 1 2 2 2 2 1 1 2 2 1 1 1 1 1 2
[1407] 1 1 1 1 2 2 2 2 1 1 1 1 2 1 1 2 2 1 1 2 2 2 1 1 2 2 2 2 2 1 1 1 1 1 2 2 2
[1444] 1 2 1 2 2 1 1 1 1 2 1 2 1 2 1 2 1 1 1 2 1 1 1 2 2 2 2 2 1 1 1 1 2 2 1 2 1
[1481] 2 2 1 2 2 1 1 2 2 1 2 2 1 1 2 1 2 1 1 1 2 2 1 1 1 1 1 1 1 2 2 1 1 2 2 1 1
[1518] 2 1 2 2 2 2 2 2 1 1 2 2 1 1 1 2 2 1 2 2 2 1 1 2 2 2 2 1 2 1 1 2 1 2 2 2 2
[1555] 2 1 2 2 2 2 2 1 1 2 2 1 2 2 1 1 2 1 2 2 1 1 1 2 2 2 1 2 1 2 2 2 1 1 1 1 1
[1592] 2 2 2 2 1 2 1 1 1 2 2 2 1 2 2 2 2 1 2 2 2 2 2 1 1 1 2 2 1 2 1 1 2 1 1 2 1
[1629] 1 2 2 1 1 2 2 1 2 1 2 1 2 1 2 2 2 2 1 1 1 2 2 2 1 2 2 2 2 2 1 1 2 2 1 1 1
[1666] 1 2 1 2 1 2 1 2 2 1 2 2 1 2 2 1 1 2 2 1 2 1 1 1 1 2 1 2 1 1 2 2 1 1 1 2 1
[1703] 2 1 1 1 1 1 1 1 2 2 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 2 1 2 2 2 2 2 2 2 2
[1740] 1 1 2 2 1 2 1 1 1 1 1 2 1 1 1 1 2 2 2 2 1 1 1 1 2 1 1 2 2 1 2 2 1 1 1 1 2
[1777] 1 2 1 1 1 2 2 2 2 2 2 1 2 1 2 2 2 1 1 1 2 1 1 1 2 2 2 2 2 1 2 2 2 1 1 2 2
[1814] 1 2 1 2 2 2 2 1 1 2 2 2 1 1 1 1 1 1 1 2 1 1 2 2 2 1 1 1 2 2 2 1 2 1 1 1 1
[1851] 1 2 2 2 1 1 2 1 2 1 2 2 2 2 1 1 1 2 2 2 1 1 2 1 2 2 2 1 2 1 1 1 2 2 1 2 1
[1888] 2 1 1 2 1 1 1 2 1 2 1 1 2 2 1 2 2 1 2 1 2 1 1 2 2 1 2 2 2 2 1 2 2 1 1 1 2
[1925] 2 1 2 2 1 1 2 2 1 1 1 2 2 2 1 1 1 2 2 2 1 2 1 1 2 1 2 2 2 1 1 2 1 1 2 1 1
[1962] 2 1 2 2 1 2 1 1 1 2 2 1 2 1 2 2 2 2 2 1 1 2 2 2 2 2 1 2 1 2 2 2 1 1 1 2 2
[1999] 2 1 1 1 2 2 1 1 1 1 1 2 2 2 2 1 1 1 2 1 1 1 1 2 1 1 2 2 1 2 2 1 1 1 1 2 2
[2036] 1 2 2 2 1 2 2 1 1 2 2 1 2 1 2 1 1 1 1 2 1 1 2 1 2 2 1 1 1 1 1 2 1 2 2 1 2
[2073] 1 1 1 2 1 1 2 1 1 2 2 1 1 1 1 2 1 2 1 2 1 1 1 1 2 2 2 2 1 2 1 2 1 2 2 2 2
[2110] 2 2 2 1 2 1 1 1 1 1 1 1 1 2 2 2 1 1 2 1 1 2 2 2 1 2 1 1 2 1 1 1 2 2 2 2 1
[2147] 2 1 1 2 2 2 1 2 1 2 1 2 1 2 1 1 1 2 2 1 1 1 2 2 1 1 1 2 1 1 2 1 1 1 2 1 2
[2184] 2 1 1 1 1 2 1 1 1 1 2 2 1 1 2 2 1 2 2 2 1 1 2 2 1 1 1 1 1 1 1 2 1 1 2 1 1
[2221] 2 2 2 2 1 2 1 1 2 2 2 2 2 2 2 2 2 1 1 2 2 1 1 2 1 2 1 1 2 1 1 1 2 2 2 1 1
[2258] 2 1 2 2 1 2 1 2 2 1 2 1 2 2 1 1 2 1 2 2 1 2 1 2 1 2 2 1 1 2 1 1 2 1 2 2 1
[2295] 2 1 1 2 1 2 1 2 1 2 1 2 2 1 2 1 1 2 2 1 1 2 2 1 1 2 1 2 2 2 1 1 1 2 1 2 2
[2332] 2 2 1 2 2 2 2 2 1 1 1 1 2 2 1 1 2 1 2 2 2 1 2 1 2 1 2 2 2 2 1 2 2 2 2 1 2
[2369] 1 1 2 1 2 1 2 2 1 2 2 2 2 1 1 1 2 1 1 1 2 2 2 1 2 1 1 1 2 1 1 1 1 1 1 2 2
[2406] 1 1 2 2 1 2 2 1 2 1 2 2 2 1 2 1 1 2 1 1 1 1 2 2 1 1 2 1 1 2 1 2 1 2 1 1 1
[2443] 2 2 2 1 1 2 2 1 1 1 1 1 2 2 2 2 2 1 1 2 1 2 1 2 2 1 2 2 1 1 1 1 1 1 1 1 2
[2480] 2 2 2 1 2 1 2 1 1 1 1 1 1 2 2 1 2 1 1 2 2 2 1 2 2 1 1 1 1 1 2 1 1 1 1 1 2
[2517] 2 2 2 1 1 2 1 1 2 1 2 2 2 1 2 2 1 1 2 2 1 2 1 1 2 2 2 1 1 2 1 2 1 1 1 1 1
[2554] 2 2 2 2 2 2 2 2 1 2 2 1 2 2 2 1 2 2 2 2 2 1 2 1 2 2 1 1 2 2 1 2 1 2 1 1 1
[2591] 2 1 1 2 1 1 2 1 2 2 1 2 2 1 1 2 1 2 1 1 2 2 2 2 1 1 1 1 1 1 1 2 2 1 2 2 2
[2628] 1 1 2 2 2 1 1 2 2 2 1 1 2 2 2 1 2 2 2 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 2 2
[2665] 2 2 1 2 1 1 1 1 1 1 1 1 2 2 1 1 2 1 2 2 1 2 1 2 1 2 2 1 1 1 2 2 1 1 1 2 2
[2702] 1 1 1 2 1 2 2 1 1 2 2 2 1 1 1 2 1 2 1 1 2 1 2 2 1 2 2 1 1 2 1 1 2 1 1 1 1
[2739] 1 1 2 1 1 2 1 1 2 2 1 2 1 2 2 2 2 2 1 1 1 1 1 2 2 1 1 2 1 1 1 2 2 2 1 1 1
[2776] 1 1 2 1 1 2 1 2 1 1 1 2 2 1 1 2 1 1 1 2 2 1 2 1 1 2 1 1 1 1 2 1 2 1 1 1 2
[2813] 2 1 1 1 1 1 2 1 1 1 2 1 1 2 1 2 2 1 1 1 2 2 1 2 1 1 2 1 1 2 2 1 1 2 2 2 2
[2850] 1 1 2 1 2 2 1 1 2 1 2 2 1 2 2 1 1 2 1 1 2 1 1 1 1 1 1 2 1 2 1 1 2 2 2 1 1
[2887] 2 2 1 1 1 1 2 1 2 1 1 1 1 1 1 2 1 2 1 2 1 2 2 1 1 1 2 2 1 1 2 2 1 2 2 2 1
[2924] 2 1 1 2 1 2 1 1 1 2 1 1 1 2 2 2 1 2 1 1 2 2 1 1 2 1 2 1 1 2 1 1 1 1 1 2 1
[2961] 2 1 2 1 1 1 1 1 2 2 1 1 2 1 1 2 2 1 2 1 1 1 2 1 2 1 2 1 2 2 1 1 1 2 2 1 1
[2998] 2 2 1 1 2 1 2 2 2 1 2 1 1 1 2 1 2 1 1 2 2 2 2 1 1 2 1 2 1 1 1 1 2 2 2 2 2
[3035] 1 2 2 1 2 1 1 1 2 2 1 1 1 1 1 2 1 2 1 1 1 2 2 1 2 2 1 2 1 1 1 1 2 2 1 1 2
[3072] 2 2 2 2 1 2 2 1 1 1 2 1 1 1 1 2 2 1 2 1 1 1 1 1 1 2 2 2 1 2 1 1 2 2 2 2 2
[3109] 1 2 1 2 2 1 2 2 2 2 1 1 1 1 1 1 2 1 1 2 1 2 1 2 2 1 1 2 2 2 2 1 2 2 1 2 1
[3146] 2 2 2 1 2 2 2 1 2 2 1 2 1 2 1 2 1 1 2 1 1 1 2 2 1 2 1 2 1 1 2 2 1 1 1 1 1
[3183] 2 1 1 1 1 1 2 2 2 1 2 1 2 2 1 2 1 2 2 2 1 1 1 2 1 1 1 1 2 1 2 1 2 1 1 1 2
[3220] 1 2 2 2 1 1 2 2 1 1 1 2 1 2 2 2 1 1 2 2 1 2 1 2 1 1 1 2 2 1 1 1 1 2 2 2 1
[3257] 2 1 1 1 1 2 1 1 1 1 1 2 2 1 1 2 2 1 1 1 2 1 2 2 2 1 2 2 2 2 1 2 2 2 1 1 1
[3294] 1 2 1 1 2 1 2 1 1 1 1 2 1 1 1 1 1 1 2 1 2 1 1 1 1 1 2 1 2 1 1 2 2 1 2 1 1
[3331] 1 2 1 1 1 2 1 2 1 1 1 1 2 1 2 1 1 2 2 1 1 2 1 2 1 1 2 1 2 2 1 1 2 2 2 1 1
[3368] 2 2 2 2 2 1 2 1 2 2 1 2 1 1 2 2 1 1 1 2 1 1 1 2 1 2 1 2 1 1 1 1 2 1 2 2 1
[3405] 2 1 1 1 2 1 2 2 2 2 1 1 2 2 1 2 2 1 2 1 2 2 2 2 2 1 1 1 2 1 2 1 1 1 2 1 1
[3442] 2 2 2 2 1 2 1 2 1 1 1 1 1 1 2 2 1 1 1 1 1 2 2 2 1 1 2 1 1 1 2 2 1 2 1 1 1
[3479] 2 2 1 1 2 1 2 2 1 1 2 1 1 2 2 2 1 1 1 1 2 1 1 1 2 2 1 2 2 2 2 1 2 1 2 2 2
[3516] 1 2 1 1 2 2 1 1 2 2 2 2 2 1 1 1 2 2 1 2 1 2 2 1 2 1 1 2 1 2 2 1 1 1 1 1 1
[3553] 2 1 2 1 2 2 2 2 2 1 2 1 2 2 2 1 2 1 2 2 1 2 2 1 2 2 2 1 1 1 2 1 1 2 2 2 1
[3590] 1 1 2 2 1 1 2 1 1 1 2 2 1 2 2 2 1 1 1 2 2 2 1 2 1 2 1 2 1 2 2 1 1 2 2 1 1
[3627] 1 1 2 1 2 2 1 2 2 1 1 1 2 2 1 2 1 1 1 2 1 2 1 1 2 1 2 2 2 1 2 1 2 1 2 2 1
[3664] 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 2 1 1 1 2 1 1 1 2 2 2 1 1
[3701] 2 1 2 1 1 2 1 1 1 1 2 1 1 2 1 2 2 1 1 2 2 1 2 1 2 1 2 1 2 2 1 2 2 1 1 2 1
[3738] 2 1 1 2 1 2 2 2 2 1 2 1 1 1 1 2 1 1 1 2 1 1 1 2 2 2 2 1 2 1 1 2 1 1 2 1 1
[3775] 1 1 2 2 1 1 2 1 1 1 1 2 1 2 1 1 1 2 2 1 2 1 1 1 1 2 1 2 1 2 1 1 1 1 1 2 2
[3812] 2 2 2 2 1 2 1 1 1 1 1 2 2 2 1 2 1 2 1 1 2 2 1 1 1 1 1 1 2 1 1 1 1 2 2 2 1
[3849] 1 1 1 2 2 1 2 2 1 2 2 1 1 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 1 1 2 1 2 2 2 2 1
[3886] 2 2 1 1 2 1 1 1 2 2 1 2 1 2 1 2 1 1 1 2 1 2 1 1 1 2 1 1 1 1 2 1 1 1 1 1 1
[3923] 2 1 1 1 1 2 2 2 1 1 1 1 2 1 1 1 2 1 1 2 1 2 2 1 1 2 1 2 1 1 2 1 2 2 1 1 1
[3960] 1 1 2 1 1 2 1 2 1 2 2 1 2 1 1 2 2 2 2 2 1 1 2 2 2 2 2 1 2 1 2 1 1 2 1 1 1
[3997] 2 1 2 2 2 1 2 1 1 1 2 2 1 1 1 2 1 1 1 1 1 1 1 2 1 2 1 1 2 1 1 1 1 1 2 1 1
[4034] 2 1 2 2 1 1 1 1 2 2 1 2 1 2 2 2 2 1 1 1 1 1 1 1 1 2 1 1 1 2 2 1 2 1 2 2 1
[4071] 1 1 2 1 1 2 1 1 1 1 1 2 1 2 2 1 1 2 1 1 1 1 1 2 1 2 1 1 2 2 1 2 1 1 2 1 1
[4108] 1 2 1 1 1 2 1 1 2 2 1 2 1 2 1 1 2 2 2 1 1 1 1 1 1 2 1 2 1 2 2 1 2 1 1 1 1
[4145] 1 1 1 2 2 2 2 2 1 2 1 1 1 1 2 1 1 2 1 2 2 1 1 2 2 1 1 1 1 1 2 2 1 2 1 1 1
[4182] 2 1 1 2 1 1 1 2 1 1 2 1 2 2 1 2 1 1 2 2 2 1 2 2 1 1 2 1 1 1 1 1 2 2 2 1 1
[4219] 2 1 1 2 1 1 2 1 2 1 2 2 2 1 2 1 1 1 1 2 1 2 1 1 1 1 1 1 2 2 1 1 1 1 2 1 1
[4256] 2 1 1 2 1 2 1 2 2 2 2 1 1 2 1 1 2 1 1 2 2 2 2 1 2 2 2 2 2 1 1 2 1 2 2 1 1
[4293] 2 2 1 1 1 1 1 1 2 1 1 2 1 2 2 1 1 1 2 1 1 2 2 2 1 1 1 1 1 1 1 2 1 2 2 1 2
[4330] 2 2 2 1 1 2 1 1 2 2 1 1 2 1 1 1 1 1 1 1 2 2 2 1 1 2 2 2 2 2 2 1 1 1 2 2 2
[4367] 2 1 2 1 2 1 2 2 1 1 1 1 1 2 2 1 2 2 2 1 1 1 1 2 1 2 1 1 1 2 1 1 1 2 1 2 2
[4404] 2 1 2 2 2 1 1 1 2 1 1 2 1 2 1 2 1 1 1 1 2 1 1 1 1 2 2 2 2 1 2 2 2 1 2 2 2
[4441] 1 1 1 1 1 1 2 2 1 2 2 1 1 1 1 1 1 2 2 1 1 2 2 1 1 2 2 2 1 1 1 1 1 1 2 2 1
[4478] 2 1 2 1 2 1 2 1 2 1 2 1 1 1 1 1 2 1 1 1 2 1 2 1 2 1 1 2 1 2 2 1 1 2 2 1 1
[4515] 2 1 1 2 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 2 1 1
[4552] 2 1 2 2 2 1 2 1 2 2 1 2 1 2 2 2 2 1 2 1 1 1 1 1 2 1 1 1 1 1 2 2 2 2 1 1 1
[4589] 1 2 2 1 2 1 2 2 1 1 2 2 2 2 1 1 2 2 1 1 2 1 1 2 1 1 1 2 1 2 1 2 1 1 1 2 1
[4626] 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 2 2 1 1 1 1 1 2 1 1 2 2 1 1 1 2 1 1 2 1 1 1
[4663] 1 1 1 1 2 1 1 2 2 2 1 2 2 1 1 1 1 2 1 1 1 1 2 1 1 1 2 1 1 1 1 1 2 1 1 1 1
[4700] 2 1 1 2 1 1 1 1 2 1 1 2 2 1 2 1 1 1 1 1 2 1 2 1 1 2 1 1 1 1 1 1 1 1 2 1 1
[4737] 1 1 1 2 2 2 2 1 1 1 2 2 2 2 2 1 1 2 1 1 1 1 1 2 1 2 2 2 1 2 1 1 1 2 1 1 2
[4774] 1 1 2 2 1 1 1 1 1 1 1 1 1 2 1 2 2 1 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1
[4811] 2 1 2 1 1 1 2 2 1 1 1 2 2 1 2 1 2 1 1 1 1 2 1 1 1 2 1 1 2 1 2 1 2 2 2 2 1
[4848] 1 2 1 1 2 2 1 2 2 2 2 2 1 2 2 1 1 1 2 2 1 1 1 2 2 1 1 2 1 2 2 1 1 1 1 1 2
[4885] 1 2 1 2 2 1 1 2 1 1 1 1 2 1 2 1 1 1 1 1 1 2 2 1 1 1 1 1 2 1 1 2 1 1 1 2 2
[4922] 1 2 1 2 2 1 1 1 1 1 2 1 1 1 2 1 2 1 1 2 1 2 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1
[4959] 1 1 1 1 1 1 2 1 1 2 1 2 2 2 1 2 2 1 1 1 1 1 1 1 2 1 1 1 1 2 2 1 2 1 1 1 1
[4996] 1 2 1 1 1 2 2 2 1 2 1 1 1 1 1 1 2 2 1 2 1 1 1 1 1 1 1 1 2 2 1 1 2 2 1 1 2
[5033] 1 2 1 1 2 2 1 1 2 2 1 1 1 2 2 1 2 1 1 1 2 2 1 2 1 1 1 1 2 2 1 1 2 2 2 1 1
[5070] 2 1 1 1 2 1 2 1 1 1 1 1 2 2 1 1 2 2 1 1 1 2 1 1 2 1 1 1 2 2 2 1 2 2 1 1 1
[5107] 2 2 2 1 1 1 1 2 1 2 2 1 1 1 2 2 2 1 1 1 1 1 1 2 1 1 1 1 1 2 2 2 2 1 1 1 2
[5144] 1 1 2 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 2 1 2 2 2 1 1 2 1 2 2 2 1 1 2 1 1
[5181] 1 1 1 2 1 2 1 1 2 1 1 1 2 1 1 2 1 1 1 2 1 1 2 1 1 1 2 1 1 1 2 1 2 2 1 1 1
[5218] 1 1 1 2 1 2 1 2 2 2 1 2 2 1 1 1 2 2 2 1 1 1 1 1 1 2 2 2 2 2 1 2 1 2 1 1 1
[5255] 1 1 1 1 1 1 1 2 2 1 1 1 1 1 2 2 2 1 2 1 2 2 2 1 1 1 1 1 2 1 1 1 1 1 2 1 1
[5292] 1 1 2 1 2 2 1 2 1 1 1 1 1 1 1 1 2 2 2 2 1 2 1 2 2 2 1 2 1 2 2 1 1 1 2 1 1
[5329] 2 1 1 1 2 2 2 2 2 2 1 1 1 1 1 1 2 1 2 1 1 1 2 2 1 2 1 1 2 1 1 2 1 1 1 2 2
[5366] 1 2 2 1 1 1 1 2 2 1 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 1 1 2 1 2 1 1 1 1 2 2 1
[5403] 1 1 2 2 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 2 1 2 1 2 2 1 1 2 1 1 2 2 1 1 2 2 1
[5440] 1 2 1 1 1 1 1 1 1 2 1 1 2 1 1 2 1 1 1 1 1 1 1 2 2 1 1 2 1 2 2 1 1 1 1 2 2
[5477] 1 1 1 2 1 1 1 2 2 1 2 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1
[5514] 2 1 1 1 1 2 1 2 2 2 1 2 1 2 1 1 2 1 1 2 2 1 1 2 2 1 2 1 1 1 1 1 2 2 1 2 2
[5551] 1 1 1 2 1 1 2 2 1 2 2 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 2 2 2 1 1 2 1 1 1 1 2
[5588] 2 1 2 1 1 1 2 2 2 1 1 1 2 2 1 2 1 1 2 2 1 2 1 2 1 1 1 1 1 2 2 1 2 1 1 2 1
[5625] 1 2 2 1 2 1 1 2 1 1 1 2 2 1 1 1 2 2 2 1 1 1 2 2 1 2 2 1 2 1 2 1 1 2 2 1 2
[5662] 1 1 1 2 2 2 1 1 1 2 1 1 1 2 2 2 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 2 1 2 2
[5699] 2 2 1 1 2 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 2 1 1 1 1 1 2 1 2 1 1 1 1
[5736] 1 2 1 1 1 1 2 2 2 2 2 1 1 1 2 1 2 2 1 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 1 1
[5773] 2 1 1 2 1 2 1 2 2 1 1 1 2 2 1 1 2 2 1 1 1 1 1 2 2 1 1 2 1 1 1 2 2 1 2 2 1
[5810] 2 1 2 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 2 2 2
[5847] 2 1 2 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 2 2 1 2 1 1 1 1 1 2 1 2 2 2 1 1 1 1 1
[5884] 2 1 1 1 1 1 2 2 1 2 1 1 2 2 1 2 1 1 1 1 1 1 2 1 1 1 1 2 2 1 1 2 1 1 2 2 1
[5921] 1 1 2 1 2 1 1 2 1 2 2 1 1 1 2 1 1 2 1 1 1 1 1 2 1 1 1 1 2 1 2 1 1 2 1 1 1
[5958] 2 1 1 1 2 1 1 1 1 1 1 2 1 2 1 2 1 1 1 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2
[5995] 1 1 1 1 1 1 1 2 1 1 2 1 2 1 1 1 2 2 2 1 1 2 1 2 2 1 1 1 2 1 1 1 1 1 1 1 2
[6032] 2 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 2 1 1 2 2 1 1 1 2 1 1 1 1 1 2 1 2 1 2 1 2
[6069] 1 1 1 1 2 2 1 2 2 2 2 2 2 1 1 2 2 1 2 2 1 1 1 1 1 2 2 2 1 1 1 2 1 1 2 1 2
[6106] 1 2 1 2 1 1 1 1 2 1 1 2 2 2 1 2 1 2 2 1 2 1 1 1 1 1 1 1 2 1 2 1 1 1 1 2 2
[6143] 1 1 1 2 1 1 2 1 1 1 1 1 1 2 1 1 1 2 1 2 1 2 1 1 2 1 2 1 2 1 1 2 1 1 1 1 2
[6180] 1 1 2 1 2 1 2 1 1 1 1 1 1 2 1 2 1 2 2 1 1 1 1 1 1 1 1 1 2 1 1 2 1 2 1 2 1
[6217] 2 2 2 2 1 2 1 2 1 1 1 1 1 1 2 1 2 1 1 1 1 1 2 1 1 1 2 1 1 1 2 1 1 1 1 1 1
[6254] 1 1 1 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 2 1 1 2 2 1 1 2 1 2 1 2 1 1 2 1 1 1 1
[6291] 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1
[6328] 2 1 1 1 1 2 1 1 1 2 1 2 2 1 1 2 1 1 1 2 1 2 1 2 1 1 1 1 1 1 1 1 2 1 2 1 1
[6365] 1 1 1 2 2 2 1 1 1 1 1 2 1 1 1 2 1 2 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1
[6402] 1 1 1 1 2 1 2 1 1 2 1 1 1 2 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1
[6439] 1 1 1 1 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 2 1 1 2 1 1 2 1 1
[6476] 1 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 1 2 1 1 2 1 2 1 1 1 1 1 1 1 2 1 1 1 2 1 2
[6513] 1 1 1 1 1 1 1 1 1 2 2 1 2 1 2 1 2 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 1
[6550] 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 2 2 1 1
[6587] 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 1 1 1 1
[6624] 1 1 1 1 2 1 1 1 1 1 2 2 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1
[6661] 1 1 1 1 2 1 1 1 2 1 2 1 2 2 2 1 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 2 1 1 1 2 1
[6698] 1 1 2 1 1 1 1 2 1 1 2 1 1 1 1 1 1 2 1 1 1 2 2 1 1 1 1 1 1 1 1 2 1 1 2 1 1
[6735] 2 1 1 1 1 2 2 1 1 1 2 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1
[6772] 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 2 1 1 1 2 1 1
[6809] 1 1 1 1 2 2 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1
[6846] 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[6883] 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 2 1 2 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[6920] 1 1 1 2 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 2 1 2 1 2
[6957] 1 1 1 1 1 1 1 2 1 1 2 1 2 1 2 1 2 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 2 1 1
[6994] 2 1 1 1 2 1 2 1 1 2 1 2 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2
[7031] 2 1 1 1 1 1 1 2 1 1 2 1 1 2 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[7068] 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 2 1 1 2 2
[7105] 1 1 1 1 2 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2
[7142] 1 1 2 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[7179] 1 1 1 1 1 1 1 1 1 1 2 2 2 1 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1
[7216] 1 2 1 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1
[7253] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 2 1 1 1 1 1 2 1
[7290] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1
[7327] 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1 1 1 2 1 2 1 1 1 1 1
[7364] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[7401] 1 1 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 2 1 1 1 1 1 1 1 1
[7438] 1 1 1 2 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1
[7475] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 2 1 1 1 1
[7512] 1 1 1 1 1 2 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1
[7549] 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 2 2 2 2 2 1 1 2 1 1 1 1 1 1 1 2 1 1 1 1 1 1
[7586] 1 1 1 2 1 2 1 1 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1
[7623] 1 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 2 2 2 1 1 1 1 1 1 2
[7660] 1 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1
[7697] 1 1 1 1 1 1 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1
[7734] 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 2 1
[7771] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 2 1 1 2 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1
[7808] 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[7845] 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1
[7882] 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1
[7919] 1 1 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 2 1 1 1 2
[7956] 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 2 1
[7993] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1
[8030] 1 2 1 1 2 1 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1
[8067] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1
[8104] 1 1 1 1 2 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1
[8141] 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 2 1 1
[8178] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1
[8215] 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 2
[8252] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[8289] 1 2 2 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[8326] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[8363] 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 1 1
[8400] 1 1 2 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1
[8437] 2 1 2 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1
[8474] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1 1
[8511] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[8548] 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[8585] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[8622] 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1
Objective function:
   build     swap 
1.394259 1.350500 

Available components:
 [1] "medoids"    "id.med"     "clustering" "objective"  "isolation" 
 [6] "clusinfo"   "silinfo"    "diss"       "call"       "data"      

Cluster 1

The PAM (Partitioning Around Medoids) medoid for Cluster 1 has standardized values below average for purchases, cash advance, payments, and credit limit. Cluster 1 therefore contains customers with lower spending activity, lower borrowing, smaller credit limits, and lower repayment amounts. In short, Cluster 1 comprises smaller users who spend less, borrow less, and have lower limits.

Cluster 2

The PAM (Partitioning Around Medoids) medoid for Cluster 2 has standardized values above average for purchases, cash advance, and especially credit limit, while its payments value is close to average. Cluster 2 contains more financially active customers with higher spending, greater access to credit, and stronger borrowing activity. In short, these are larger users who spend more, borrow more, and have higher limits.
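The medoid values printed by pam_fit are on the standardized scale, which can be hard to read. A minimal sketch of one way to look at the medoid customers in their original units is shown below. It uses hypothetical toy data (toy, toy_scale, and toy_fit are illustrative names, not objects from this project) and relies on the id.med component listed under "Available components".

```r
library(cluster)

# Hypothetical toy data standing in for cc_clean (two well-separated groups)
set.seed(1)
toy <- data.frame(
  PURCHASES    = c(rnorm(20, 500, 50),   rnorm(20, 2000, 100)),
  CREDIT_LIMIT = c(rnorm(20, 2400, 200), rnorm(20, 8400, 400))
)

# Standardize, then fit PAM with two clusters, mirroring the steps above
toy_scale <- scale(toy)
toy_fit <- pam(toy_scale, k = 2)

# Medoids on the standardized scale (what print(pam_fit) shows)
toy_fit$medoids

# The same medoid rows in original units, looked up by row index
toy[toy_fit$id.med, ]

# Cluster sizes and within-cluster dissimilarity summaries
toy_fit$clusinfo
```

Applied to this project, cc_clean[pam_fit$id.med, ] would presumably return the two representative customers in currency units, assuming cc_clean and cc_scale share the same row order.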

Cluster Summary

## Cluster summary table ##

cc_cluster_result <- cc_clean |>
  mutate(cluster = pam_fit$clustering)

cluster_summary <- cc_cluster_result |>
  group_by(cluster) |>
  summarize(
    avg_purchases = mean(PURCHASES),
    avg_cash_advance = mean(CASH_ADVANCE),
    avg_payments = mean(PAYMENTS),
    avg_credit_limit = mean(CREDIT_LIMIT),
    avg_tenure = mean(TENURE)
  )

knitr::kable(
  cluster_summary,
  caption = "Cluster Summary Statistics"
)
Cluster Summary Statistics

cluster  avg_purchases  avg_cash_advance  avg_payments  avg_credit_limit  avg_tenure
      1       532.2875          442.2861      906.2455          2381.365    11.42536
      2      1916.8168         1991.7390     3371.9207          8391.545    11.73147

The cluster summary table provides a clearer understanding of the two customer groups identified by the PAM clustering model.

Cluster 1: Lower Activity Customers

Customers in Cluster 1 have:

  • Average purchases of 532 currency units

  • Average cash advance of 442 currency units

  • Average payments of 906 currency units

  • Average credit limit of 2,381 currency units

Cluster 2: Higher Activity Customers

Customers in Cluster 2 have:

  • Average purchases of 1,917 currency units

  • Average cash advance of 1,992 currency units

  • Average payments of 3,372 currency units

  • Average credit limit of 8,392 currency units

This segmentation can help banks design different products, repayment strategies, and risk monitoring systems for each customer group.
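To make the gap between the two groups concrete, the averages can be expressed as ratios. The sketch below retypes the values from the Cluster Summary Statistics table, and the names cluster_means and ratio are only illustrative.

```r
# Cluster means copied by hand from the Cluster Summary Statistics table
cluster_means <- data.frame(
  cluster          = c(1, 2),
  avg_purchases    = c(532.2875, 1916.8168),
  avg_cash_advance = c(442.2861, 1991.7390),
  avg_payments     = c(906.2455, 3371.9207),
  avg_credit_limit = c(2381.365, 8391.545),
  avg_tenure       = c(11.42536, 11.73147)
)

# Ratio of Cluster 2 to Cluster 1 for each average (cluster column dropped)
ratio <- unlist(cluster_means[2, -1] / cluster_means[1, -1])
round(ratio, 2)
```

The monetary averages in Cluster 2 are roughly 3.5 to 4.5 times those in Cluster 1, while average tenure is nearly identical. This suggests the split reflects the scale of financial activity rather than how long customers have held their accounts.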

Interactive Figures

1. Purchases vs Balance

The interactive scatter plot shows the relationship between purchases and balance. Most customers have lower purchases and lower balances, while a smaller number of customers have very high purchases and balances. The interactive feature allows us to hover over and zoom in on specific observations, and to explore outliers and crowded regions more clearly than in a static chart.

## Figure 1: Purchases vs Balance ##
#install.packages("plotly")
library(plotly)

p1 <- ggplot(data = cc_clean) +
  geom_point(aes(x = PURCHASES, y = BALANCE)) +
  labs(
    title = "Interactive Purchases vs Balance",
    x = "Purchases",
    y = "Balance"
  )

ggplotly(p1)

2. Cash Advance vs Balance

The interactive scatter plot shows the relationship between cash advance and balance. Most customers have low cash advance usage, while a smaller group uses large cash advances and tends to have higher balances. The interactive view helps identify extreme borrowers more clearly.

## Figure 2: Cash Advance vs Balance ##
p2 <- ggplot(data = cc_clean) +
  geom_point(aes(x = CASH_ADVANCE, y = BALANCE)) +
  labs(
    title = "Interactive Cash Advance vs Balance",
    x = "Cash Advance",
    y = "Balance"
  )

ggplotly(p2)
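Because most points sit in the low-value corner of both figures, one possible refinement is to make the points semi-transparent and color them by cluster. The sketch below is an assumption about how that could look, not part of the original analysis. It uses hypothetical toy data in place of cc_cluster_result, and p_sketch is an illustrative name.

```r
library(ggplot2)

# Hypothetical toy data standing in for cc_cluster_result
set.seed(42)
toy <- data.frame(
  PURCHASES = rexp(300, rate = 1 / 1000),
  BALANCE   = rexp(300, rate = 1 / 1500),
  cluster   = sample(1:2, 300, replace = TRUE)
)

p_sketch <- ggplot(data = toy) +
  geom_point(
    aes(x = PURCHASES, y = BALANCE, color = factor(cluster)),
    alpha = 0.4  # semi-transparent points make dense regions easier to read
  ) +
  labs(
    title = "Purchases vs Balance by Cluster (toy data)",
    x = "Purchases",
    y = "Balance",
    color = "Cluster"
  )

p_sketch
```

With plotly loaded as in Figure 1, passing p_sketch to ggplotly() would make this version interactive as well.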

Interactive Table

## Interactive Table ##
#install.packages("DT")
library(DT)

datatable(
  cc_cluster_result |>
    select(CUST_ID, cluster, PURCHASES,
           CASH_ADVANCE, PAYMENTS,
           CREDIT_LIMIT) |>
    head(50)
)

The interactive table displays customer IDs, assigned clusters, and the major financial variables (purchases, cash advance, payments, and credit limit) for the first 50 customers. It allows us to sort columns, search for customers, and browse multiple pages, which makes it easier to inspect customer segments in detail.

Conclusion

This project analyzed customer credit card behavior using the CC GENERAL dataset to identify the key drivers of credit card debt and segment customers into meaningful groups. The regression analysis showed that cash advance usage, purchases, credit limit, and tenure were positively associated with higher balances, while higher payments were associated with lower balances. The clustering analysis identified two main customer groups:

  • Cluster 1: Lower activity customers with lower spending, borrowing, payments, and credit limits.

  • Cluster 2: Higher activity customers with larger spending, borrowing, payments, and credit limits.

These findings show that debt behavior differs substantially across customers and can be described using financial activity variables.

Recommendations

For Banks / Financial Institutions

  1. Monitor customers with high cash advance usage, as they are more likely to carry higher debt balances.

  2. Offer tailored products for different customer clusters:

    • Standard cards for Cluster 1

    • Premium rewards cards for Cluster 2

  3. Encourage larger and more consistent payments to reduce balances.

  4. Use cluster segmentation for targeted marketing and risk management.

For Customers

  1. Limit unnecessary cash advances due to their association with higher debt.

  2. Make timely payments above the minimum payment when possible.

  3. Use available credit responsibly to avoid excessive balances.

Acknowledgement

The findings presented in this project are exclusive to this course. They were not presented in any other course in this or previous semesters, and will not be presented in any other course during this semester.

KSU MAP

#install.packages("leaflet")
library(leaflet)
leaflet() %>%
  addTiles() %>%
  addMarkers(
    lng=-84.5184,
    lat=33.9384,
    popup="KSU Marietta"
  )

Student Emails

Favour Adekunle - fadekun1@students.kennesaw.edu

Thanh Nguyen - tnguy480@students.kennesaw.edu