Identifying Consumer Segments

Loading required packages and dataset…

Loading required package: grid

Examine the structure of the bank data frame

'data.frame':   4521 obs. of  17 variables:
 $ age      : int  30 33 35 30 59 35 36 39 41 43 ...
 $ job      : chr  "unemployed" "services" "management" "management" ...
 $ marital  : chr  "married" "married" "single" "married" ...
 $ education: chr  "primary" "secondary" "tertiary" "tertiary" ...
 $ default  : chr  "no" "no" "no" "no" ...
 $ balance  : int  1787 4789 1350 1476 0 747 307 147 221 -88 ...
 $ housing  : chr  "no" "yes" "yes" "yes" ...
 $ loan     : chr  "no" "yes" "no" "yes" ...
 $ contact  : chr  "cellular" "cellular" "cellular" "unknown" ...
 $ day      : int  19 11 16 3 5 23 14 6 14 17 ...
 $ month    : chr  "oct" "may" "apr" "jun" ...
 $ duration : int  79 220 185 199 226 141 341 151 57 313 ...
 $ campaign : int  1 1 1 4 1 2 1 2 2 1 ...
 $ pdays    : int  -1 339 330 -1 -1 176 330 -1 -1 147 ...
 $ previous : int  0 4 1 0 0 3 2 0 0 2 ...
 $ poutcome : chr  "unknown" "failure" "failure" "unknown" ...
 $ response : chr  "no" "no" "no" "no" ...
NULL

Print the first few rows, then tabulate job, marital, education, default, housing, and loan, counting missing values


       admin.   blue-collar  entrepreneur     housemaid    management       retired 
          478           946           168           112           969           230 
self-employed      services       student    technician    unemployed       unknown 
          183           417            84           768           128            38 
         <NA> 
            0 

divorced  married   single     <NA> 
     528     2797     1196        0 

  primary secondary  tertiary   unknown      <NA> 
      678      2306      1350       187         0 

  no  yes <NA> 
4445   76    0 

  no  yes <NA> 
1962 2559    0 

  no  yes <NA> 
3830  691    0 
               jobtype
job             White Collar Blue Collar Other/Unknown <NA>
  admin.                 478           0             0    0
  blue-collar              0         946             0    0
  entrepreneur           168           0             0    0
  housemaid                0           0           112    0
  management             969           0             0    0
  retired                  0           0           230    0
  self-employed          183           0             0    0
  services                 0         417             0    0
  student                  0           0            84    0
  technician               0         768             0    0
  unemployed               0           0           128    0
  unknown                  0           0            38    0
  <NA>                     0           0             0    0
           bluecollar
whitecollar    0    1
          0  592 2131
          1 1798    0
jobtype
 White Collar   Blue Collar Other/Unknown 
         1798          2131           592 
        married
divorced    0    1
       0 1196 2797
       1  528    0
marital
Divorced  Married   Single 
     528     2797     1196 
, , tertiary = 0

       secondary
primary    0    1
      0  187 2306
      1  678    0

, , tertiary = 1

       secondary
primary    0    1
      0 1350    0
      1    0    0
education
  Primary Secondary  Tertiary   Unknown 
      678      2306      1350       187 
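
The cross-tabulations above check that each recoded variable agrees with its source column. A minimal sketch of how such a recode might be written is given here, assuming the white-collar and blue-collar groupings shown in the job-by-jobtype table; the toy `job` vector and the helper names `white_jobs` and `blue_jobs` are illustrative and do not come from the original script.

```r
# Toy job vector standing in for bank$job (values taken from the table above)
job <- c("admin.", "blue-collar", "technician", "student", "management", "unknown")

# Groupings mirror the job-by-jobtype cross-tabulation (helper names are hypothetical)
white_jobs <- c("admin.", "entrepreneur", "management", "self-employed")
blue_jobs  <- c("blue-collar", "services", "technician")

# Binary indicator variables
whitecollar <- as.numeric(job %in% white_jobs)
bluecollar  <- as.numeric(job %in% blue_jobs)

# Combine the indicators into one factor: 1 = white, 2 = blue, 3 = other/unknown
jobtype <- factor(ifelse(whitecollar == 1, 1, ifelse(bluecollar == 1, 2, 3)),
                  levels = 1:3,
                  labels = c("White Collar", "Blue Collar", "Other/Unknown"))

# Cross-tabulate to confirm every job lands in exactly one group
print(table(job, jobtype, useNA = "always"))
```

The same pattern would apply to the marital and education indicators.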
'data.frame':   3705 obs. of  16 variables:
 $ response   : chr  "no" "no" "no" "no" ...
 $ age        : int  30 30 59 39 41 39 43 36 20 40 ...
 $ jobtype    : Factor w/ 3 levels "White Collar",..: 3 1 2 2 1 2 1 2 3 1 ...
 $ marital    : Factor w/ 3 levels "Divorced","Married",..: 2 2 2 2 2 2 2 2 3 2 ...
 $ education  : Factor w/ 4 levels "Primary","Secondary",..: 1 3 2 2 3 2 2 3 2 3 ...
 $ default    : chr  "no" "no" "no" "no" ...
 $ balance    : int  1787 1476 0 147 221 9374 264 1109 502 194 ...
 $ housing    : chr  "no" "yes" "yes" "yes" ...
 $ loan       : chr  "no" "yes" "no" "no" ...
 $ whitecollar: num  0 1 0 0 1 0 1 0 0 1 ...
 $ bluecollar : num  0 0 1 1 0 1 0 1 0 0 ...
 $ divorced   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ married    : num  1 1 1 1 1 1 1 1 0 1 ...
 $ primary    : num  1 0 0 0 0 0 0 0 0 0 ...
 $ secondary  : num  0 0 1 1 0 1 1 0 1 0 ...
 $ tertiary   : num  0 1 0 0 1 0 0 1 0 1 ...
NULL

null device 
          1 

Examine the cluster solution results: look for an average silhouette width > 0.5 and for the last big jump in average silhouette width

Provide a single summary plot for the clustering solutions

Select a clustering solution and examine it

From the silhouette plot, the first five of the eight clusters appear to be large and well-defined
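
The search over clustering solutions can be sketched as follows. This is only an illustration on hypothetical toy data, assuming `pam()` from the `cluster` package; the transcript does not show which clustering call or distance measure was actually used. For simplicity the sketch picks the k with the largest average silhouette width rather than looking for the last big jump.

```r
library(cluster)  # pam(); a recommended package shipped with R

set.seed(1)
# Hypothetical toy data: three well-separated groups in two dimensions
toy <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 6, sd = 0.3), ncol = 2))

# Average silhouette width for solutions with k = 2..6 clusters
ks <- 2:6
avg_width <- sapply(ks, function(k) pam(toy, k = k)$silinfo$avg.width)
names(avg_width) <- ks
print(round(avg_width, 3))

# Candidate solution: the k with the largest average silhouette width
best_k <- ks[which.max(avg_width)]
print(best_k)
```

Plotting `avg_width` against `ks` gives a single summary view of all candidate solutions.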

Look at demographics across the clusters/segments. age Age in years. Examine the relationship between age and response to the promotion

cluster: A
[1] 45.9057
---------------------------------------------------------------------- 
cluster: B
[1] 41.6633
---------------------------------------------------------------------- 
cluster: C
[1] 43.04649
---------------------------------------------------------------------- 
cluster: D
[1] 40.03083
---------------------------------------------------------------------- 
cluster: E
[1] 32.29765
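
Output in this layout is what `by()` prints for a numeric variable split by a grouping factor. A minimal sketch, assuming a hypothetical toy data frame in place of the working bank data:

```r
# Hypothetical toy data; the real analysis uses the bank working data frame
toy <- data.frame(cluster = factor(c("A", "A", "B", "B", "B")),
                  age     = c(50, 40, 30, 35, 40))

# Mean age within each cluster
age_by_cluster <- with(toy, by(age, cluster, mean))
print(age_by_cluster)
```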

Lattice plot: responders tend to be older

Level of education (unknown, secondary, primary, tertiary)

       education
cluster Primary Secondary Tertiary Unknown
      A     509         0        0       0
      B       0         0      594       0
      C       0       826        0      56
      D       0       485        0      34
      E       0       354        0      29

Table of job status using jobtype

       jobtype
cluster White Collar Blue Collar Other/Unknown
      A           70         313           126
      B          470          81            43
      C            0         768           114
      D          488           0            31
      E            0         314            69

marital Marital status (divorced, married, single)

       marital
cluster Divorced Married Single
      A        0     449     60
      B        0     594      0
      C        0     882      0
      D        0     380    139
      E        0       0    383

Look at bank client history across the clusters/segments. default Has credit in default? (yes, no)

       default
cluster  no yes
      A 502   7
      B 590   4
      C 868  14
      D 508  11
      E 372  11

balance Average yearly balance (in Euros)

cluster: A
[1] 1446.106
---------------------------------------------------------------------- 
cluster: B
[1] 1882.47
---------------------------------------------------------------------- 
cluster: C
[1] 1273.861
---------------------------------------------------------------------- 
cluster: D
[1] 1331.609
---------------------------------------------------------------------- 
cluster: E
[1] 1018.661

Lattice plot of balance

null device 
          1 

housing Has housing loan? (yes, no)

       housing
cluster  no yes
      A 225 284
      B 310 284
      C 322 560
      D 208 311
      E 194 189

loan Has personal loan? (yes, no)

response Response to term deposit offer (yes, no)

       response
cluster  no yes
      A 473  36
      B 535  59
      C 824  58
      D 489  30
      E 333  50

Plot mosaic Response to Term Deposit Offer

Compute the percentage of yes responses to the term deposit offer


Percentage Responses

 A 7.1
 B 9.9
 C 6.6
 D 5.8
 E 13.1
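
The percentages above follow directly from the cluster-by-response table. A short sketch of the arithmetic, with the counts copied from that table (the object names `counts` and `pct_yes` are illustrative):

```r
# Response counts by cluster, copied from the cluster-by-response table
counts <- matrix(c(473, 36,
                   535, 59,
                   824, 58,
                   489, 30,
                   333, 50),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(cluster = LETTERS[1:5],
                                 response = c("no", "yes")))

# Percentage of "yes" responses within each cluster, rounded to one decimal
pct_yes <- round(100 * counts[, "yes"] / rowSums(counts), 1)
print(pct_yes)
```

For example, cluster E has 50 yes responses out of 383 customers, or about 13.1 percent.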

Note the percentage of the customers receiving offers for the first time falling into each of the clusters/segments


   1    2    3    4    5    6    7    8 
13.7 16.0 23.8 14.0 10.3  3.9  8.1 10.1 
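
As a cross-check, the first five shares can be reproduced from the cluster sizes in the tables above. This sketch assumes that segments 1 to 5 correspond to clusters A to E and that the denominator is the 3705 observations in the working data frame; both assumptions are consistent with the figures shown but are not stated in the output.

```r
# Cluster sizes: row totals of the cluster tables above
sizes <- c(A = 509, B = 594, C = 882, D = 519, E = 383)
n_customers <- 3705  # observations in the working data frame

# Share of customers in each segment, in percent
share <- round(100 * sizes / n_customers, 1)
print(share)
```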
