Clusterization of Customer Segments (Part I)

Exploratory Data Analaysis (EDA) of the raw dataset

library(magrittr)
library(dplyr)
library(tidyverse)
library(ggplot2)
raw<-read.csv(file = "D:/МО II/WalletUpCRM.csv")
raw<-data.frame(raw)
head(raw)

Reviewing the structure of the data

dim(raw)
[1] 10750    28

The dataset includes data for 10 750 customers and their multiple characteristics, which we will use for the clusterization.

str(raw)
'data.frame':   10750 obs. of  28 variables:
 $ Age                  : int  40 37 71 53 40 32 72 29 55 38 ...
 $ Income               : int  79623 71616 78524 69938 74244 75738 74816 72334 77992 73186 ...
 $ HouseholdSize        : int  2 5 1 3 1 7 5 3 2 3 ...
 $ CityAreaSize         : int  454686 452465 456594 456594 452004 456143 463071 456844 455962 456032 ...
 $ MeanCityIncome       : int  90668 156742 52484 118422 36227 91296 106445 43453 156272 46232 ...
 $ MeanCityHousePrize   : int  1849978 1849599 1849953 1849302 1849247 1849755 1849963 1849975 1849634 1849832 ...
 $ MeanCityHouseHoldSize: int  3 5 3 4 6 3 2 6 3 2 ...
 $ MeanCitySqFtPrice    : int  3813 5264 3405 2141 2160 4686 4012 3897 2433 3410 ...
 $ NumberCars           : int  2 2 1 0 1 1 2 3 3 0 ...
 $ InternetTrafficVolume: int  58 36 57 71 39 52 62 41 65 67 ...
 $ MortageVolume        : int  430299 378228 282232 394235 350471 331681 350163 436763 216903 250300 ...
 $ AccountSpending      : num  938 1128 931 1002 1171 ...
 $ CreditCardSpending   : num  1418 694 1281 1135 1109 ...
 $ HelpHotlineTime      : num  3.22 3.44 2.47 5.9 5.1 ...
 $ CustomerSince        : int  36 36 36 36 37 36 37 36 36 36 ...
 $ GrocerySpending      : num  433 594 574 561 309 ...
 $ StockVolume          : num  1118 1392 1117 1354 1037 ...
 $ CreditVolume         : num  809 803 791 780 804 ...
 $ NASDAQInvest         : num  1488 1504 1500 1496 1500 ...
 $ USAXSFundInvest      : num  476 490 465 488 450 ...
 $ BranchVisits         : int  3 4 3 4 3 4 3 3 4 4 ...
 $ AppLogins            : int  10 19 14 14 16 13 11 16 20 8 ...
 $ ATMVisits            : int  9 8 8 8 9 8 9 7 8 7 ...
 $ TimeOnlineBanking    : num  71.5 67.1 58.9 61 67.2 ...
 $ ServiceFees          : num  41.8 52.2 54.6 41.8 57.1 ...
 $ SocialMediaInter     : int  27 25 28 34 34 24 21 32 28 24 ...
 $ Bitcoins             : num  0.0032 0.0037 0.0136 0.0016 0.0075 0.003 0.0076 0.0055 0.0036 0.0026 ...
 $ NFTs                 : int  2 1 1 1 0 3 2 4 3 1 ...

The data is all numeric. Every column is in the right data type. Some attribbutes like income or mortgage are usually decimal numbers, but for this project they will be analyzed as integers as we do not have further specific information

summary(raw)
      Age            Income       HouseholdSize   CityAreaSize    MeanCityIncome   MeanCityHousePrize
 Min.   :18.00   Min.   : 35202   Min.   :1.00   Min.   : 61613   Min.   : 35372   Min.   : 125011   
 1st Qu.:23.00   1st Qu.: 42803   1st Qu.:2.00   1st Qu.:121704   1st Qu.:116253   1st Qu.: 444817   
 Median :30.00   Median : 71268   Median :3.00   Median :450100   Median :140458   Median : 614601   
 Mean   :35.16   Mean   : 84700   Mean   :2.81   Mean   :372196   Mean   :163706   Mean   : 942505   
 3rd Qu.:45.00   3rd Qu.:125870   3rd Qu.:4.00   3rd Qu.:459418   3rd Qu.:235000   3rd Qu.:1849915   
 Max.   :74.00   Max.   :181863   Max.   :8.00   Max.   :708729   Max.   :286996   Max.   :1850000   
 MeanCityHouseHoldSize MeanCitySqFtPrice   NumberCars    InternetTrafficVolume MortageVolume    AccountSpending 
 Min.   :1.000         Min.   :1871      Min.   :0.000   Min.   :  6.00        Min.   : 14898   Min.   : 500.0  
 1st Qu.:2.000         1st Qu.:2627      1st Qu.:1.000   1st Qu.: 45.00        1st Qu.:120462   1st Qu.: 560.1  
 Median :3.000         Median :5778      Median :1.000   Median : 60.00        Median :232414   Median : 898.0  
 Mean   :3.023         Mean   :5318      Mean   :1.384   Mean   : 67.57        Mean   :202824   Mean   :1275.7  
 3rd Qu.:4.000         3rd Qu.:6741      3rd Qu.:2.000   3rd Qu.: 86.00        3rd Qu.:287298   3rd Qu.:1647.6  
 Max.   :8.000         Max.   :9886      Max.   :4.000   Max.   :118.00        Max.   :605846   Max.   :4257.1  
 CreditCardSpending HelpHotlineTime     CustomerSince   GrocerySpending   StockVolume    CreditVolume   
 Min.   : 501.1     Min.   : 0.006058   Min.   : 0.00   Min.   : 150.1   Min.   : 388   Min.   : 117.3  
 1st Qu.: 651.1     1st Qu.: 4.577513   1st Qu.: 3.00   1st Qu.: 293.4   1st Qu.:1059   1st Qu.: 161.7  
 Median : 785.7     Median : 8.774480   Median :11.00   Median : 426.5   Median :1537   Median : 802.7  
 Mean   :1013.3     Mean   :12.816409   Mean   :19.25   Mean   : 535.8   Mean   :2142   Mean   :1330.1  
 3rd Qu.:1451.4     3rd Qu.:16.593896   3rd Qu.:36.00   3rd Qu.: 627.9   3rd Qu.:2505   3rd Qu.:2488.0  
 Max.   :2041.9     Max.   :60.754994   Max.   :74.00   Max.   :1253.5   Max.   :5738   Max.   :3532.3  
  NASDAQInvest    USAXSFundInvest    BranchVisits      AppLogins       ATMVisits      TimeOnlineBanking
 Min.   : 228.4   Min.   :  69.95   Min.   : 0.000   Min.   :  1.0   Min.   : 0.000   Min.   : 22.77   
 1st Qu.: 401.4   1st Qu.: 149.80   1st Qu.: 2.000   1st Qu.: 18.0   1st Qu.: 3.000   1st Qu.: 69.28   
 Median :1498.1   Median : 313.82   Median : 3.000   Median : 64.0   Median : 5.000   Median : 88.26   
 Mean   :1828.5   Mean   : 761.65   Mean   : 3.913   Mean   : 55.7   Mean   : 4.928   Mean   :113.88   
 3rd Qu.:3056.0   3rd Qu.:1060.23   3rd Qu.: 5.000   3rd Qu.: 82.0   3rd Qu.: 7.000   3rd Qu.:152.92   
 Max.   :4532.4   Max.   :3396.61   Max.   :20.000   Max.   :130.0   Max.   :11.000   Max.   :232.21   
  ServiceFees       SocialMediaInter    Bitcoins           NFTs       
 Min.   :  0.1343   Min.   : 0.00    Min.   :0.0000   Min.   : 0.000  
 1st Qu.: 17.8442   1st Qu.: 5.00    1st Qu.:0.0005   1st Qu.: 1.000  
 Median : 27.2386   Median :16.00    Median :0.0998   Median : 3.000  
 Mean   : 40.9382   Mean   :19.03    Mean   :0.1937   Mean   : 3.317  
 3rd Qu.: 50.1652   3rd Qu.:31.00    3rd Qu.:0.4005   3rd Qu.: 4.000  
 Max.   :124.2613   Max.   :60.00    Max.   :0.6014   Max.   :12.000  

The above shown code provides general exploratory summary of the raw dataset. By these general statistics, I can overview the mean values and the distribution of each column.

Check for NULL values

any(is.null(raw))
[1] FALSE

Check by each column

sapply(raw, function(x) any(is.null(x)))
                  Age                Income         HouseholdSize          CityAreaSize        MeanCityIncome 
                FALSE                 FALSE                 FALSE                 FALSE                 FALSE 
   MeanCityHousePrize MeanCityHouseHoldSize     MeanCitySqFtPrice            NumberCars InternetTrafficVolume 
                FALSE                 FALSE                 FALSE                 FALSE                 FALSE 
        MortageVolume       AccountSpending    CreditCardSpending       HelpHotlineTime         CustomerSince 
                FALSE                 FALSE                 FALSE                 FALSE                 FALSE 
      GrocerySpending           StockVolume          CreditVolume          NASDAQInvest       USAXSFundInvest 
                FALSE                 FALSE                 FALSE                 FALSE                 FALSE 
         BranchVisits             AppLogins             ATMVisits     TimeOnlineBanking           ServiceFees 
                FALSE                 FALSE                 FALSE                 FALSE                 FALSE 
     SocialMediaInter              Bitcoins                  NFTs 
                FALSE                 FALSE                 FALSE 

There are no missing values in the dataset, which assists in quicker EDA, clustering validation and modelling in the next steps.

###Correlation analysis

library(modelsummary)
datasummary_correlation(raw, fmt = 2, title = "Correlation Matrix")
tinytable_u4kzjf08goy0o54wu1rg
Correlation Matrix
Age Income HouseholdSize CityAreaSize MeanCityIncome MeanCityHousePrize MeanCityHouseHoldSize MeanCitySqFtPrice NumberCars InternetTrafficVolume MortageVolume AccountSpending CreditCardSpending HelpHotlineTime CustomerSince GrocerySpending StockVolume CreditVolume NASDAQInvest USAXSFundInvest BranchVisits AppLogins ATMVisits TimeOnlineBanking ServiceFees SocialMediaInter Bitcoins NFTs
Age 1 . . . . . . . . . . . . . . . . . . . . . . . . . . .
Income .66 1 . . . . . . . . . . . . . . . . . . . . . . . . . .
HouseholdSize .28 .15 1 . . . . . . . . . . . . . . . . . . . . . . . . .
CityAreaSize -.26 -.59 .10 1 . . . . . . . . . . . . . . . . . . . . . . . .
MeanCityIncome -.34 -.49 -.08 .66 1 . . . . . . . . . . . . . . . . . . . . . . .
MeanCityHousePrize -.13 -.35 .17 .73 .43 1 . . . . . . . . . . . . . . . . . . . . . .
MeanCityHouseHoldSize .24 .07 .20 .08 -.20 .14 1 . . . . . . . . . . . . . . . . . . . . .
MeanCitySqFtPrice -.10 -.02 .05 .58 .73 .56 -.10 1 . . . . . . . . . . . . . . . . . . . .
NumberCars .48 .57 .19 -.42 -.42 -.10 .18 -.12 1 . . . . . . . . . . . . . . . . . . .
InternetTrafficVolume -.31 -.59 -.08 .67 .45 .05 -.05 .12 -.60 1 . . . . . . . . . . . . . . . . . .
MortageVolume .16 .35 -.10 -.53 -.66 -.31 -.04 -.58 .11 -.26 1 . . . . . . . . . . . . . . . . .
AccountSpending .73 .93 .22 -.52 -.40 -.30 .15 .04 .64 -.61 .10 1 . . . . . . . . . . . . . . . .
CreditCardSpending .45 .54 .08 -.75 -.41 -.33 .07 -.26 .59 -.83 .09 .66 1 . . . . . . . . . . . . . . .
HelpHotlineTime .47 .77 .05 -.69 -.32 -.49 -.03 -.04 .50 -.62 .10 .80 .72 1 . . . . . . . . . . . . . .
CustomerSince .80 .81 .30 -.43 -.45 -.01 .25 -.03 .68 -.70 .18 .89 .70 .64 1 . . . . . . . . . . . . .
GrocerySpending .76 .90 .22 -.55 -.45 -.39 .17 -.07 .62 -.55 .11 .96 .66 .79 .87 1 . . . . . . . . . . . .
StockVolume .33 .29 -.09 -.17 -.04 -.62 -.13 -.17 -.15 .41 .27 .19 -.17 .19 -.01 .27 1 . . . . . . . . . . .
CreditVolume -.26 -.52 -.07 .80 .80 .32 -.14 .55 -.57 .82 -.46 -.51 -.74 -.52 -.57 -.51 .33 1 . . . . . . . . . .
NASDAQInvest .61 .45 .07 -.13 -.09 -.49 .03 -.11 .05 .28 .17 .43 -.01 .29 .29 .51 .92 .25 1 . . . . . . . . .
USAXSFundInvest -.03 -.31 -.05 -.14 -.27 -.60 .09 -.68 -.17 .45 .03 -.25 -.09 -.16 -.33 -.10 .40 .01 .33 1 . . . . . . . .
BranchVisits .45 .75 .11 -.55 -.32 -.19 .04 .07 .55 -.71 .12 .77 .67 .71 .71 .72 -.06 -.56 .09 -.39 1 . . . . . . .
AppLogins -.36 -.31 -.19 .39 .36 -.25 -.21 .15 -.56 .83 -.09 -.42 -.76 -.36 -.69 -.40 .59 .71 .37 .34 -.51 1 . . . . . .
ATMVisits .10 -.44 .24 .53 .14 .56 .32 .01 -.02 .20 -.34 -.26 -.13 -.44 .02 -.22 -.37 .17 -.17 .15 -.31 -.26 1 . . . . .
TimeOnlineBanking -.29 -.26 -.27 .03 .06 -.54 -.24 -.27 -.51 .70 .26 -.43 -.56 -.27 -.64 -.35 .73 .45 .45 .57 -.49 .87 -.35 1 . . . .
ServiceFees .37 .04 .01 .18 .23 -.30 -.03 .01 -.21 .56 -.04 .05 -.26 -.01 -.04 .14 .86 .57 .88 .39 -.22 .53 .01 .55 1 . . .
SocialMediaInter .09 -.33 -.03 .48 .32 -.00 -.03 .00 -.44 .78 .02 -.37 -.60 -.44 -.34 -.29 .67 .74 .62 .38 -.54 .63 .20 .64 .82 1 . .
Bitcoins -.03 .02 -.08 .14 -.06 -.47 -.06 -.15 -.32 .69 .16 -.10 -.59 -.16 -.37 -.04 .75 .40 .61 .51 -.34 .86 -.27 .85 .59 .60 1 .
NFTs -.01 -.20 -.09 .37 .31 -.23 -.12 .07 -.45 .80 -.03 -.27 -.63 -.29 -.43 -.21 .76 .72 .65 .40 -.46 .83 -.08 .79 .79 .83 .80 1

There is a strong positive correlation between Grocery Spending x Income (0.9). Logically, a strong positive correlation with a coefficient of 0.92 can also be observed between Stock Volume x NasdaqInvest. Investments made in Bitcoin and NFT are correlated with a coefficient of 0.8, which may indicate that similar types of financial assets are preferred by one type of customer. However, strong or weak correlation coefficients do not equal causation between the analysed characteristics. This is another reason why we want to build the clustering model based on only behavioral attributes and ignore demographics like age and income. The customers’ behaviour can hide interesting insights or segments, which can not be observed with simple EDA or correlation analysis.

Distribution of customers based on age and other attributes

In the following code chunk, I built multiple visualizations in order to compare customers once they are divided in age groups. First I had to create the age bins, after that I inserted 4 bar charts.

data_age_bins <- raw %>%
  mutate(age_group = case_when(
    Age <= 26 ~ "18-26",
    Age <= 34 ~ "26-34",
    Age <= 41 ~ "33-41",
    Age <= 49 ~ "41-49",
    Age <= 56 ~ "48-56",
    Age <= 63 ~ "56-63",
    Age >= 64 ~ "Over 64"
  ))
head(data_age_bins$age_group)
[1] "33-41"   "33-41"   "Over 64" "48-56"   "33-41"   "26-34"  
# Count of people in each age bin.
ggplot(data_age_bins, aes(x = age_group)) +
  geom_bar(fill = "skyblue", color = "darkblue") +
  theme_minimal() +
  labs(title = "Distribution of Age Groups", x = "Age Group", y = "Count")

# We have the most customers in the youngest age group, 18-25, with nearly 4,000 people. Then there are those aged 56-63 with about 1,600. And the fewest are pensioners over 64.

# Age Group x Income
ggplot(data_age_bins, aes(x = age_group, y=Income)) +
  geom_col(fill = "skyblue", color = "darkblue") +
  theme_minimal() +
  labs(title = "Distribution of Income", x = "Age Group", y = "Income") +
  scale_y_continuous(labels = function(x) format(round(x), big.mark = ",", scientific = FALSE))

# The two groups 26 - 34 and 56 - 63 have the highest income, exceeding $200,000. People in the middle age group from 40 to 55 years old, as well as retirees, have lower incomes.

# Age Group x Mortgage Volume
ggplot(data_age_bins, aes(x = age_group, y= MortageVolume)) +
  geom_col(fill = "skyblue", color = "darkblue") +
  theme_minimal() +
  labs(title = "Distribution of Mortgage", x = "Age Group", y = "Mortgage")+
  scale_y_continuous(labels = function(x) format(round(x), big.mark = ",", scientific = FALSE))

# Young adults between 26 and 34 have the highest mortgage debt at over $800,000,000 in total value.

# Age Group x ATM Transactions
ggplot(data_age_bins, aes(x = age_group, y= ATMVisits)) +
  geom_col(fill = "skyblue", color = "darkblue") +
  theme_minimal() +
  labs(title = "Distribution of ATM Visits", x = "Age Group", y = "ATM Visits")+
  scale_y_continuous(labels = function(x) format(round(x), big.mark = ",", scientific = FALSE))

# From this last graph, we can conclude that young customers most often use ATMs to withdraw cash.

Outliers detection

Even though I work with clean dataset, it is still very important to check for outliers in the data. These are too high or low values based on the interquartile range of the attribute. I picked the reviewed attributes on random choice. The detection of outliers is crucial, because if there are identified ones, eventually the final ML clustering model should be picked with such consideration. I used boxplots to quickly review.

boxplot(raw$MortageVolume, ylab = "Mortgage")

boxplot(raw$AccountSpending, ylab= "AccountSpending")

boxplot(raw$GrocerySpending, ylab= "GrocerySpending")

boxplot(raw$USAXSFundInvest, ylab= "USAXS Fund Invest")

boxplot(raw$ServiceFees, ylab= "ServiceFees")

In the dataset containing the behavioral characteristics of customers, anomalies can be found in some variables. If this data is not removed, but kept in order to preserve the information, then it is good to choose a clustering model, which copes better with the treatment of anomalies. For example, k-means is quite sensitive to outliers and it is possible that K-medoids is more suitable. In the next notebooks, the project will continue in preparing the data for model building, validation of cluster formation and optimal number of k-clusters. Once I find this information, the testing between different models can initiate. Each model will be valuated with specific metrics. In the end, we will analyze the clustered dataset and present the customer segments.

LS0tDQp0aXRsZTogIkVEQSINCm91dHB1dDogaHRtbF9ub3RlYm9vaw0KLS0tDQoNCiMgIENsdXN0ZXJpemF0aW9uIG9mIEN1c3RvbWVyIFNlZ21lbnRzIChQYXJ0IEkpDQojIyBFeHBsb3JhdG9yeSBEYXRhIEFuYWxheXNpcyAoRURBKSBvZiB0aGUgcmF3IGRhdGFzZXQNCg0KYGBge3J9DQpsaWJyYXJ5KG1hZ3JpdHRyKQ0KbGlicmFyeShkcGx5cikNCmxpYnJhcnkodGlkeXZlcnNlKQ0KbGlicmFyeShnZ3Bsb3QyKQ0KYGBgDQoNCmBgYHtyfQ0KcmF3PC1yZWFkLmNzdihmaWxlID0gIkQ6L9Cc0J4gSUkvV2FsbGV0VXBDUk0uY3N2IikNCnJhdzwtZGF0YS5mcmFtZShyYXcpDQpoZWFkKHJhdykNCmBgYA0KUmV2aWV3aW5nIHRoZSBzdHJ1Y3R1cmUgb2YgdGhlIGRhdGENCmBgYHtyfQ0KZGltKHJhdykNCmBgYA0KVGhlIGRhdGFzZXQgaW5jbHVkZXMgZGF0YSBmb3IgMTAgNzUwIGN1c3RvbWVycyBhbmQgdGhlaXIgbXVsdGlwbGUgY2hhcmFjdGVyaXN0aWNzLCB3aGljaCB3ZSB3aWxsIHVzZSBmb3IgdGhlIGNsdXN0ZXJpemF0aW9uLg0KYGBge3J9DQpzdHIocmF3KQ0KYGBgDQpUaGUgZGF0YSBpcyBhbGwgbnVtZXJpYy4gRXZlcnkgY29sdW1uIGlzIGluIHRoZSByaWdodCBkYXRhIHR5cGUuIFNvbWUgYXR0cmliYnV0ZXMgbGlrZSBpbmNvbWUgb3IgbW9ydGdhZ2UgYXJlIHVzdWFsbHkgZGVjaW1hbCBudW1iZXJzLCBidXQgZm9yIHRoaXMgcHJvamVjdCB0aGV5IHdpbGwgYmUgYW5hbHl6ZWQgYXMgaW50ZWdlcnMgYXMgd2UgZG8gbm90IGhhdmUgZnVydGhlciBzcGVjaWZpYyBpbmZvcm1hdGlvbg0KYGBge3J9DQpzdW1tYXJ5KHJhdykNCmBgYA0KVGhlIGFib3ZlIHNob3duIGNvZGUgcHJvdmlkZXMgZ2VuZXJhbCBleHBsb3JhdG9yeSBzdW1tYXJ5IG9mIHRoZSByYXcgZGF0YXNldC4gQnkgdGhlc2UgZ2VuZXJhbCBzdGF0aXN0aWNzLCBJIGNhbiBvdmVydmlldyB0aGUgbWVhbiB2YWx1ZXMgYW5kIHRoZSBkaXN0cmlidXRpb24gb2YgZWFjaCBjb2x1bW4uDQoNCiMjIyBDaGVjayBmb3IgTlVMTCB2YWx1ZXMNCmBgYHtyfQ0KYW55KGlzLm51bGwocmF3KSkNCmBgYA0KQ2hlY2sgYnkgZWFjaCBjb2x1bW4NCmBgYHtyfQ0Kc2FwcGx5KHJhdywgZnVuY3Rpb24oeCkgYW55KGlzLm51bGwoeCkpKQ0KYGBgDQpUaGVyZSBhcmUgbm8gbWlzc2luZyB2YWx1ZXMgaW4gdGhlIGRhdGFzZXQsIHdoaWNoIGFzc2lzdHMgaW4gcXVpY2tlciBFREEsIGNsdXN0ZXJpbmcgdmFsaWRhdGlvbiBhbmQgbW9kZWxsaW5nIGluIHRoZSBuZXh0IHN0ZXBzLg0KDQojIyNDb3JyZWxhdGlvbiBhbmFseXNpcw0KYGBge3J9DQpsaWJyYXJ5KG1vZGVsc3VtbWFyeSkNCmRhdGFzdW1tYXJ5X2NvcnJlbGF0aW9uKHJhdywgZm10ID0gMiwgdGl0bGUgPSAiQ29ycmVsYXRpb24gTWF0cml4IikNCmBgYA0KVGhlcmUgaXMgYSBzdHJvbmcgcG9zaXRpdmUgY29ycmVsYXRpb24gYmV0d2VlbiBHcm9jZXJ5IFNwZW5kaW5nIHggSW5jb21lICgwLjkpLiBMb2dpY2FsbHksIGEgc3Ryb25nIHBvc2l0aXZlIGNvcnJlbGF0aW9uIHdpdGggYSBjb2VmZmljaWVudCBvZiAwLjkyIGNhbiBhbHNvIGJlIG9ic2VydmVkIGJldHdlZW4gU3RvY2sgVm9sdW1lIHggTmFzZGFxSW52ZXN0LiBJbnZlc3RtZW50cyBtYWRlIGluIEJpdGNvaW4gYW5kIE5GVCBhcmUgY29ycmVsYXRlZCB3aXRoIGEgY29lZmZpY2llbnQgb2YgMC44LCB3aGljaCBtYXkgaW5kaWNhdGUgdGhhdCBzaW1pbGFyIHR5cGVzIG9mIGZpbmFuY2lhbCBhc3NldHMgYXJlIHByZWZlcnJlZCBieSBvbmUgdHlwZSBvZiBjdXN0b21lci4NCkhvd2V2ZXIsIHN0cm9uZyBvciB3ZWFrIGNvcnJlbGF0aW9uIGNvZWZmaWNpZW50cyBkbyBub3QgZXF1YWwgY2F1c2F0aW9uIGJldHdlZW4gdGhlIGFuYWx5c2VkIGNoYXJhY3RlcmlzdGljcy4gVGhpcyBpcyBhbm90aGVyIHJlYXNvbiB3aHkgd2Ugd2FudCB0byBidWlsZCB0aGUgY2x1c3RlcmluZyBtb2RlbCBiYXNlZCBvbiBvbmx5IGJlaGF2aW9yYWwgYXR0cmlidXRlcyBhbmQgaWdub3JlIGRlbW9ncmFwaGljcyBsaWtlIGFnZSBhbmQgaW5jb21lLiBUaGUgY3VzdG9tZXJzJyBiZWhhdmlvdXIgY2FuIGhpZGUgaW50ZXJlc3RpbmcgaW5zaWdodHMgb3Igc2VnbWVudHMsIHdoaWNoIGNhbiBub3QgYmUgb2JzZXJ2ZWQgd2l0aCBzaW1wbGUgRURBIG9yIGNvcnJlbGF0aW9uIGFuYWx5c2lzLg0KDQojIyMgRGlzdHJpYnV0aW9uIG9mIGN1c3RvbWVycyBiYXNlZCBvbiBhZ2UgYW5kIG90aGVyIGF0dHJpYnV0ZXMNCkluIHRoZSBmb2xsb3dpbmcgY29kZSBjaHVuaywgSSBidWlsdCBtdWx0aXBsZSB2aXN1YWxpemF0aW9ucyBpbiBvcmRlciB0byBjb21wYXJlIGN1c3RvbWVycyBvbmNlIHRoZXkgYXJlIGRpdmlkZWQgaW4gYWdlIGdyb3Vwcy4gRmlyc3QgSSBoYWQgdG8gY3JlYXRlIHRoZSBhZ2UgYmlucywgYWZ0ZXIgdGhhdCBJIGluc2VydGVkIDQgYmFyIGNoYXJ0cy4NCmBgYHtyfQ0KZGF0YV9hZ2VfYmlucyA8LSByYXcgJT4lDQogIG11dGF0ZShhZ2VfZ3JvdXAgPSBjYXNlX3doZW4oDQogICAgQWdlIDw9IDI2IH4gIjE4LTI2IiwNCiAgICBBZ2UgPD0gMzQgfiAiMjYtMzQiLA0KICAgIEFnZSA8PSA0MSB+ICIzMy00MSIsDQogICAgQWdlIDw9IDQ5IH4gIjQxLTQ5IiwNCiAgICBBZ2UgPD0gNTYgfiAiNDgtNTYiLA0KICAgIEFnZSA8PSA2MyB+ICI1Ni02MyIsDQogICAgQWdlID49IDY0IH4gIk92ZXIgNjQiDQogICkpDQpoZWFkKGRhdGFfYWdlX2JpbnMkYWdlX2dyb3VwKQ0KIyBDb3VudCBvZiBwZW9wbGUgaW4gZWFjaCBhZ2UgYmluLg0KZ2dwbG90KGRhdGFfYWdlX2JpbnMsIGFlcyh4ID0gYWdlX2dyb3VwKSkgKw0KICBnZW9tX2JhcihmaWxsID0gInNreWJsdWUiLCBjb2xvciA9ICJkYXJrYmx1ZSIpICsNCiAgdGhlbWVfbWluaW1hbCgpICsNCiAgbGFicyh0aXRsZSA9ICJEaXN0cmlidXRpb24gb2YgQWdlIEdyb3VwcyIsIHggPSAiQWdlIEdyb3VwIiwgeSA9ICJDb3VudCIpDQojIFdlIGhhdmUgdGhlIG1vc3QgY3VzdG9tZXJzIGluIHRoZSB5b3VuZ2VzdCBhZ2UgZ3JvdXAsIDE4LTI1LCB3aXRoIG5lYXJseSA0LDAwMCBwZW9wbGUuIFRoZW4gdGhlcmUgYXJlIHRob3NlIGFnZWQgNTYtNjMgd2l0aCBhYm91dCAxLDYwMC4gQW5kIHRoZSBmZXdlc3QgYXJlIHBlbnNpb25lcnMgb3ZlciA2NC4NCg0KIyBBZ2UgR3JvdXAgeCBJbmNvbWUNCmdncGxvdChkYXRhX2FnZV9iaW5zLCBhZXMoeCA9IGFnZV9ncm91cCwgeT1JbmNvbWUpKSArDQogIGdlb21fY29sKGZpbGwgPSAic2t5Ymx1ZSIsIGNvbG9yID0gImRhcmtibHVlIikgKw0KICB0aGVtZV9taW5pbWFsKCkgKw0KICBsYWJzKHRpdGxlID0gIkRpc3RyaWJ1dGlvbiBvZiBJbmNvbWUiLCB4ID0gIkFnZSBHcm91cCIsIHkgPSAiSW5jb21lIikgKw0KICBzY2FsZV95X2NvbnRpbnVvdXMobGFiZWxzID0gZnVuY3Rpb24oeCkgZm9ybWF0KHJvdW5kKHgpLCBiaWcubWFyayA9ICIsIiwgc2NpZW50aWZpYyA9IEZBTFNFKSkNCiMgVGhlIHR3byBncm91cHMgMjYgLSAzNCBhbmQgNTYgLSA2MyBoYXZlIHRoZSBoaWdoZXN0IGluY29tZSwgZXhjZWVkaW5nICQyMDAsMDAwLiBQZW9wbGUgaW4gdGhlIG1pZGRsZSBhZ2UgZ3JvdXAgZnJvbSA0MCB0byA1NSB5ZWFycyBvbGQsIGFzIHdlbGwgYXMgcmV0aXJlZXMsIGhhdmUgbG93ZXIgaW5jb21lcy4NCg0KIyBBZ2UgR3JvdXAgeCBNb3J0Z2FnZSBWb2x1bWUNCmdncGxvdChkYXRhX2FnZV9iaW5zLCBhZXMoeCA9IGFnZV9ncm91cCwgeT0gTW9ydGFnZVZvbHVtZSkpICsNCiAgZ2VvbV9jb2woZmlsbCA9ICJza3libHVlIiwgY29sb3IgPSAiZGFya2JsdWUiKSArDQogIHRoZW1lX21pbmltYWwoKSArDQogIGxhYnModGl0bGUgPSAiRGlzdHJpYnV0aW9uIG9mIE1vcnRnYWdlIiwgeCA9ICJBZ2UgR3JvdXAiLCB5ID0gIk1vcnRnYWdlIikrDQogIHNjYWxlX3lfY29udGludW91cyhsYWJlbHMgPSBmdW5jdGlvbih4KSBmb3JtYXQocm91bmQoeCksIGJpZy5tYXJrID0gIiwiLCBzY2llbnRpZmljID0gRkFMU0UpKQ0KIyBZb3VuZyBhZHVsdHMgYmV0d2VlbiAyNiBhbmQgMzQgaGF2ZSB0aGUgaGlnaGVzdCBtb3J0Z2FnZSBkZWJ0IGF0IG92ZXIgJDgwMCwwMDAsMDAwIGluIHRvdGFsIHZhbHVlLg0KDQojIEFnZSBHcm91cCB4IEFUTSBUcmFuc2FjdGlvbnMNCmdncGxvdChkYXRhX2FnZV9iaW5zLCBhZXMoeCA9IGFnZV9ncm91cCwgeT0gQVRNVmlzaXRzKSkgKw0KICBnZW9tX2NvbChmaWxsID0gInNreWJsdWUiLCBjb2xvciA9ICJkYXJrYmx1ZSIpICsNCiAgdGhlbWVfbWluaW1hbCgpICsNCiAgbGFicyh0aXRsZSA9ICJEaXN0cmlidXRpb24gb2YgQVRNIFZpc2l0cyIsIHggPSAiQWdlIEdyb3VwIiwgeSA9ICJBVE0gVmlzaXRzIikrDQogIHNjYWxlX3lfY29udGludW91cyhsYWJlbHMgPSBmdW5jdGlvbih4KSBmb3JtYXQocm91bmQoeCksIGJpZy5tYXJrID0gIiwiLCBzY2llbnRpZmljID0gRkFMU0UpKQ0KIyBGcm9tIHRoaXMgbGFzdCBncmFwaCwgd2UgY2FuIGNvbmNsdWRlIHRoYXQgeW91bmcgY3VzdG9tZXJzIG1vc3Qgb2Z0ZW4gdXNlIEFUTXMgdG8gd2l0aGRyYXcgY2FzaC4NCmBgYA0KDQojIyMgT3V0bGllcnMgZGV0ZWN0aW9uDQpFdmVuIHRob3VnaCBJIHdvcmsgd2l0aCBjbGVhbiBkYXRhc2V0LCBpdCBpcyBzdGlsbCB2ZXJ5IGltcG9ydGFudCB0byBjaGVjayBmb3Igb3V0bGllcnMgaW4gdGhlIGRhdGEuIFRoZXNlIGFyZSB0b28gaGlnaCBvciBsb3cgdmFsdWVzIGJhc2VkIG9uIHRoZSBpbnRlcnF1YXJ0aWxlIHJhbmdlIG9mIHRoZSBhdHRyaWJ1dGUuIEkgcGlja2VkIHRoZSByZXZpZXdlZCBhdHRyaWJ1dGVzIG9uIHJhbmRvbSBjaG9pY2UuIFRoZSBkZXRlY3Rpb24gb2Ygb3V0bGllcnMgaXMgY3J1Y2lhbCwgYmVjYXVzZSBpZiB0aGVyZSBhcmUgaWRlbnRpZmllZCBvbmVzLCBldmVudHVhbGx5IHRoZSBmaW5hbCBNTCBjbHVzdGVyaW5nIG1vZGVsIHNob3VsZCBiZSBwaWNrZWQgd2l0aCBzdWNoIGNvbnNpZGVyYXRpb24uIEkgdXNlZCBib3hwbG90cyB0byBxdWlja2x5IHJldmlldy4NCmBgYHtyfQ0KYm94cGxvdChyYXckTW9ydGFnZVZvbHVtZSwgeWxhYiA9ICJNb3J0Z2FnZSIpDQpib3hwbG90KHJhdyRBY2NvdW50U3BlbmRpbmcsIHlsYWI9ICJBY2NvdW50U3BlbmRpbmciKQ0KYm94cGxvdChyYXckR3JvY2VyeVNwZW5kaW5nLCB5bGFiPSAiR3JvY2VyeVNwZW5kaW5nIikNCmJveHBsb3QocmF3JFVTQVhTRnVuZEludmVzdCwgeWxhYj0gIlVTQVhTIEZ1bmQgSW52ZXN0IikNCmJveHBsb3QocmF3JFNlcnZpY2VGZWVzLCB5bGFiPSAiU2VydmljZUZlZXMiKQ0KYGBgDQpJbiB0aGUgZGF0YXNldCBjb250YWluaW5nIHRoZSBiZWhhdmlvcmFsIGNoYXJhY3RlcmlzdGljcyBvZiBjdXN0b21lcnMsIGFub21hbGllcyBjYW4gYmUgZm91bmQgaW4gc29tZSB2YXJpYWJsZXMuIElmIHRoaXMgZGF0YSBpcyBub3QgcmVtb3ZlZCwgYnV0IGtlcHQgaW4gb3JkZXIgdG8gcHJlc2VydmUgdGhlIGluZm9ybWF0aW9uLCB0aGVuIGl0IGlzIGdvb2QgdG8gY2hvb3NlIGEgY2x1c3RlcmluZyBtb2RlbCwgd2hpY2ggY29wZXMgYmV0dGVyIHdpdGggdGhlIHRyZWF0bWVudCBvZiBhbm9tYWxpZXMuIEZvciBleGFtcGxlLCBrLW1lYW5zIGlzIHF1aXRlIHNlbnNpdGl2ZSB0byBvdXRsaWVycyBhbmQgaXQgaXMgcG9zc2libGUgdGhhdCBLLW1lZG9pZHMgaXMgbW9yZSBzdWl0YWJsZS4NCkluIHRoZSBuZXh0IG5vdGVib29rcywgdGhlIHByb2plY3Qgd2lsbCBjb250aW51ZSBpbiBwcmVwYXJpbmcgdGhlIGRhdGEgZm9yIG1vZGVsIGJ1aWxkaW5nLCB2YWxpZGF0aW9uIG9mIGNsdXN0ZXIgZm9ybWF0aW9uIGFuZCBvcHRpbWFsIG51bWJlciBvZiBrLWNsdXN0ZXJzLiBPbmNlIEkgZmluZCB0aGlzIGluZm9ybWF0aW9uLCB0aGUgdGVzdGluZyBiZXR3ZWVuIGRpZmZlcmVudCBtb2RlbHMgY2FuIGluaXRpYXRlLiBFYWNoIG1vZGVsIHdpbGwgYmUgdmFsdWF0ZWQgd2l0aCBzcGVjaWZpYyBtZXRyaWNzLiBJbiB0aGUgZW5kLCB3ZSB3aWxsIGFuYWx5emUgdGhlIGNsdXN0ZXJlZCBkYXRhc2V0IGFuZCBwcmVzZW50IHRoZSBjdXN0b21lciBzZWdtZW50cy4NCg0K