Identifying Consumer Segments
Loading required packages and dataset…
Examine the structure of the bank data frame
'data.frame': 4521 obs. of 17 variables:
$ age : int 30 33 35 30 59 35 36 39 41 43 ...
$ job : chr "unemployed" "services" "management" "management" ...
$ marital : chr "married" "married" "single" "married" ...
$ education: chr "primary" "secondary" "tertiary" "tertiary" ...
$ default : chr "no" "no" "no" "no" ...
$ balance : int 1787 4789 1350 1476 0 747 307 147 221 -88 ...
$ housing : chr "no" "yes" "yes" "yes" ...
$ loan : chr "no" "yes" "no" "yes" ...
$ contact : chr "cellular" "cellular" "cellular" "unknown" ...
$ day : int 19 11 16 3 5 23 14 6 14 17 ...
$ month : chr "oct" "may" "apr" "jun" ...
$ duration : int 79 220 185 199 226 141 341 151 57 313 ...
$ campaign : int 1 1 1 4 1 2 1 2 2 1 ...
$ pdays : int -1 339 330 -1 -1 176 330 -1 -1 147 ...
$ previous : int 0 4 1 0 0 3 2 0 0 2 ...
$ poutcome : chr "unknown" "failure" "failure" "unknown" ...
$ response : chr "no" "no" "no" "no" ...
NULL
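The structure listing above could be produced along the following lines. This is a minimal sketch, not the original script: it assumes the bank data is stored as a semicolon-delimited text file, and it uses a small inline stand-in (three rows and four of the 17 columns, copied from the listing) so that it runs without the file.

```r
# Inline stand-in for the bank file; values copied from the str() listing.
# The semicolon delimiter is an assumption about the file format.
bank_text <- 'age;job;marital;balance
30;"unemployed";"married";1787
33;"services";"married";4789
35;"management";"single";1350'

# With a real file this would be read.csv("bank.csv", sep = ";"),
# where the file name "bank.csv" is hypothetical.
bank <- read.csv(text = bank_text, sep = ";", stringsAsFactors = FALSE)

# Print the structure: one line per variable with its type and first values.
str(bank)
```

Keeping `stringsAsFactors = FALSE` leaves the categorical variables as character vectors, which matches the `chr` columns shown in the listing.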
Printing first few rows
Frequency tables for the categorical variables (missing values counted as <NA>)
job
       admin.   blue-collar  entrepreneur     housemaid    management       retired
          478           946           168           112           969           230
self-employed      services       student    technician    unemployed       unknown
          183           417            84           768           128            38
         <NA>
            0
marital
divorced  married   single     <NA>
     528     2797     1196        0
education
  primary secondary  tertiary   unknown      <NA>
      678      2306      1350       187         0
default
  no  yes <NA>
4445   76    0
housing
  no  yes <NA>
1962 2559    0
loan
  no  yes <NA>
3830  691    0
               jobtype
job             White Collar Blue Collar Other/Unknown <NA>
  admin.                 478           0             0    0
  blue-collar              0         946             0    0
  entrepreneur           168           0             0    0
  housemaid                0           0           112    0
  management             969           0             0    0
  retired                  0           0           230    0
  self-employed          183           0             0    0
  services                 0         417             0    0
  student                  0           0            84    0
  technician               0         768             0    0
  unemployed               0           0           128    0
  unknown                  0           0            38    0
  <NA>                     0           0             0    0
           bluecollar
whitecollar    0    1
          0  592 2131
          1 1798    0
jobtype
 White Collar   Blue Collar Other/Unknown
         1798          2131           592
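The job recoding summarized in the tables above can be sketched as follows. The grouping of job categories into white collar, blue collar, and other is read directly off the cross-tabulation; the variable names `white`, `blue`, and `jobtype_code` are illustrative, and the original script may have built the factor differently.

```r
# A few example job values (taken from the categories in the table above).
job <- c("admin.", "blue-collar", "technician", "retired",
         "management", "unknown")

# Category groupings as they appear in the job-by-jobtype cross-tabulation.
white <- c("admin.", "entrepreneur", "management", "self-employed")
blue  <- c("blue-collar", "services", "technician")

# Binary indicator variables (numeric 0/1).
whitecollar <- ifelse(job %in% white, 1, 0)
bluecollar  <- ifelse(job %in% blue, 1, 0)

# Three-level factor: anything neither white nor blue collar is Other/Unknown.
jobtype_code <- ifelse(whitecollar == 1, 1, ifelse(bluecollar == 1, 2, 3))
jobtype <- factor(jobtype_code, levels = 1:3,
                  labels = c("White Collar", "Blue Collar", "Other/Unknown"))

# Check the recoding against the original categories.
print(table(job, jobtype))
```

Cross-tabulating the original variable against the recoded one, as in the last line, is a quick way to confirm that every category landed in exactly one group.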
        married
divorced    0    1
       0 1196 2797
       1  528    0
marital
Divorced  Married   Single
     528     2797     1196
, , tertiary = 0

       secondary
primary    0    1
      0  187 2306
      1  678    0

, , tertiary = 1

       secondary
primary    0    1
      0 1350    0
      1    0    0

education
  Primary Secondary  Tertiary   Unknown
      678      2306      1350       187
'data.frame': 3705 obs. of 16 variables:
$ response : chr "no" "no" "no" "no" ...
$ age : int 30 30 59 39 41 39 43 36 20 40 ...
$ jobtype : Factor w/ 3 levels "White Collar",..: 3 1 2 2 1 2 1 2 3 1 ...
$ marital : Factor w/ 3 levels "Divorced","Married",..: 2 2 2 2 2 2 2 2 3 2 ...
$ education : Factor w/ 4 levels "Primary","Secondary",..: 1 3 2 2 3 2 2 3 2 3 ...
$ default : chr "no" "no" "no" "no" ...
$ balance : int 1787 1476 0 147 221 9374 264 1109 502 194 ...
$ housing : chr "no" "yes" "yes" "yes" ...
$ loan : chr "no" "yes" "no" "no" ...
$ whitecollar: num 0 1 0 0 1 0 1 0 0 1 ...
$ bluecollar : num 0 0 1 1 0 1 0 1 0 0 ...
$ divorced : num 0 0 0 0 0 0 0 0 0 0 ...
$ married : num 1 1 1 1 1 1 1 1 0 1 ...
$ primary : num 1 0 0 0 0 0 0 0 0 0 ...
$ secondary : num 0 0 1 1 0 1 1 0 1 0 ...
$ tertiary : num 0 1 0 0 1 0 0 1 0 1 ...
NULL
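The working data frame above has 3705 of the original 4521 rows. A minimal sketch of one way to obtain such a subset follows, assuming that first-time customers are those with `previous == 0`. That filter is an inference from the listings (the first ages and balances in the subset match the rows where `previous` is 0), not something the output states, and the name `bankwork` is hypothetical.

```r
# First five rows of the relevant columns, copied from the first str() listing.
bank <- data.frame(
  response = c("no", "no", "no", "no", "no"),
  age      = c(30, 33, 35, 30, 59),
  balance  = c(1787, 4789, 1350, 1476, 0),
  previous = c(0, 4, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Assumed filter: keep customers with no previous contacts,
# and keep only the variables needed for segmentation.
keep <- c("response", "age", "balance")
bankwork <- bank[bank$previous == 0, keep]

print(bankwork)
```

On these five rows the filter keeps ages 30, 30, 59 and balances 1787, 1476, 0, which are the leading values shown in the second listing.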
Examine the cluster solution results: look for an average silhouette width greater than 0.5, and for the last big jump in average silhouette width
Provide a single summary plot for the clustering solutions
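The search over clustering solutions can be sketched as below. This is a hedged illustration only: it assumes partitioning around medoids (`pam`) on a Gower dissimilarity (`daisy`), both from the `cluster` package, and neither choice is confirmed by the output. The data frame `toy` is simulated and stands in for the segmentation variables.

```r
library(cluster)  # daisy() and pam()

# Simulated stand-in data with two obvious groups (not the bank data).
set.seed(1)
toy <- data.frame(
  age      = c(rnorm(20, mean = 30, sd = 3), rnorm(20, mean = 55, sd = 3)),
  married  = rep(c(0, 1), each = 20),
  tertiary = rep(c(1, 0), each = 20)
)

# Pairwise dissimilarities between customers.
d <- daisy(toy, metric = "gower")

# Fit one solution per candidate number of clusters and record
# the average silhouette width of each.
ks <- 2:5
avg_width <- sapply(ks, function(k) {
  pam(d, k = k, diss = TRUE)$silinfo$avg.width
})
names(avg_width) <- ks
print(round(avg_width, 3))

# Candidate with the largest average silhouette width.
best_k <- ks[which.max(avg_width)]
print(best_k)
```

Picking the maximum is only one rule. The note above instead looks for a width above 0.5 together with the last big jump, so plotting `avg_width` against `ks` and judging by eye may be closer to the intended procedure.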

Select a clustering solution and examine it


From the silhouette plot, the first five of the seven clusters appear to be large and well-defined
Look at demographics across the clusters/segments
age Age in years
Examine the relationship between age and response to promotion
cluster: A
[1] 45.9057
----------------------------------------------------------------------
cluster: B
[1] 41.6633
----------------------------------------------------------------------
cluster: C
[1] 43.04649
----------------------------------------------------------------------
cluster: D
[1] 40.03083
----------------------------------------------------------------------
cluster: E
[1] 32.29765
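Per-cluster means in the layout above are what `by()` prints when a numeric variable is summarized within the levels of a grouping factor. A minimal sketch on made-up values follows; the data frame `seg` and its contents are hypothetical.

```r
# Made-up segment assignments and ages (not the bank data).
seg <- data.frame(
  cluster = factor(c("A", "A", "B", "B", "B")),
  age     = c(40, 50, 30, 35, 40)
)

# Mean age within each cluster, printed in the "cluster: A" block layout.
age_by_cluster <- with(seg, by(age, cluster, mean))
print(age_by_cluster)

# The same means as a plain named vector, which is easier to reuse.
age_means <- tapply(seg$age, seg$cluster, mean)
print(age_means)
```

The `tapply()` form returns the same numbers without the block formatting, which is convenient when the means feed into a later table or plot.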
Lattice plot of age by response: responders tend to be older

Level of education (unknown, secondary, primary, tertiary)
       education
cluster Primary Secondary Tertiary Unknown
      A     509         0        0       0
      B       0         0      594       0
      C       0       826        0      56
      D       0       485        0      34
      E       0       354        0      29
Table of job status using jobtype
       jobtype
cluster White Collar Blue Collar Other/Unknown
      A           70         313           126
      B          470          81            43
      C            0         768           114
      D          488           0            31
      E            0         314            69
       marital
cluster Divorced Married Single
      A        0     449     60
      B        0     594      0
      C        0     882      0
      D        0     380    139
      E        0       0    383
Look at bank client history across the clusters/segments
default Has credit in default? (yes, no)
       default
cluster  no yes
      A 502   7
      B 590   4
      C 868  14
      D 508  11
      E 372  11
balance Average yearly balance (in Euros)
cluster: A
[1] 1446.106
----------------------------------------------------------------------
cluster: B
[1] 1882.47
----------------------------------------------------------------------
cluster: C
[1] 1273.861
----------------------------------------------------------------------
cluster: D
[1] 1331.609
----------------------------------------------------------------------
cluster: E
[1] 1018.661
Lattice plot of balance by response
housing Has housing loan? (yes, no)
       housing
cluster  no yes
      A 225 284
      B 310 284
      C 322 560
      D 208 311
      E 194 189
loan Has personal loan? (yes, no)
response Response to term deposit offer (yes, no)
       response
cluster  no yes
      A 473  36
      B 535  59
      C 824  58
      D 489  30
      E 333  50
Mosaic plot: Response to Term Deposit Offer

Compute the percentage of yes responses to the term deposit offer
Percentage Responses
A 7.1
B 9.9
C 6.6
D 5.8
E 13.1
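The percentages above follow from the cluster-by-response counts: yes responses divided by the cluster total, times 100, rounded to one decimal. A short sketch that reproduces them from the counts in the response table (the object names are illustrative):

```r
# Counts copied from the cluster-by-response table.
response_counts <- matrix(
  c(473, 36,
    535, 59,
    824, 58,
    489, 30,
    333, 50),
  ncol = 2, byrow = TRUE,
  dimnames = list(cluster = LETTERS[1:5], response = c("no", "yes"))
)

# Share of yes responses within each cluster, as a percentage.
pct_yes <- round(100 * response_counts[, "yes"] / rowSums(response_counts), 1)
print(pct_yes)
```

For cluster A, for example, 36 of 509 customers responded yes, and 100 * 36 / 509 rounds to 7.1.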
Note the percentage of customers receiving offers for the first time who fall into each of the clusters/segments
   1    2    3    4    5    6    7    8
13.7 16.0 23.8 14.0 10.3  3.9  8.1 10.1
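The first five shares can be checked against the cluster sizes implied by the response table (row totals for A through E) and the 3705 customers in the working data frame. Matching clusters A to E with columns 1 to 5 is an inference from the agreement of the numbers; the sizes of the remaining clusters are not shown in the output, so they are left out of this sketch.

```r
# Cluster sizes: row totals of the cluster-by-response table.
cluster_size <- c(A = 473 + 36, B = 535 + 59, C = 824 + 58,
                  D = 489 + 30, E = 333 + 50)

# Number of first-time customers, from the working data frame listing.
n_customers <- 3705

# Percentage of all first-time customers falling into each cluster.
pct_share <- round(100 * cluster_size / n_customers, 1)
print(pct_share)
```

With a vector of cluster assignments at hand, `round(100 * prop.table(table(assignments)), 1)` would give the full set of shares in one step; `assignments` here is a hypothetical name.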
