First of all, let’s take a look at the structure of our dataset:
## 'data.frame': 40841 obs. of 18 variables:
## $ X : int 0 1 2 5 6 7 8 9 10 11 ...
## $ age : int 58 44 33 35 28 42 58 43 41 29 ...
## $ job : Factor w/ 12 levels "admin.","blue-collar",..: 5 11 3 5 5 3 7 11 1 1 ...
## $ marital : Factor w/ 3 levels "divorced","married",..: 2 3 2 2 3 1 2 3 1 3 ...
## $ education : Factor w/ 3 levels "primary","secondary",..: 3 2 2 3 3 3 1 2 2 2 ...
## $ default : Factor w/ 2 levels "no","yes": 1 1 1 1 1 2 1 1 1 1 ...
## $ balance : int 2143 29 2 231 447 2 121 593 270 390 ...
## $ housing : Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 2 ...
## $ loan : Factor w/ 2 levels "no","yes": 1 1 2 1 2 1 1 1 1 1 ...
## $ day : int 5 5 5 5 5 5 5 5 5 5 ...
## $ month : Factor w/ 12 levels "apr","aug","dec",..: 9 9 9 9 9 9 9 9 9 9 ...
## $ duration : num 4.35 2.52 1.27 2.32 3.62 6.33 0.83 0.92 3.7 2.28 ...
## $ campaign : int 1 1 1 1 1 1 1 1 1 1 ...
## $ pdays : int -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
## $ previous : int 0 0 0 0 0 0 0 0 0 0 ...
## $ poutcome : Factor w/ 3 levels "failure","success",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ response : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ response_binary: int 0 0 0 0 0 0 0 0 0 0 ...
The output above shows that we have 40 841 observations of 18 variables. These variables are:
X - Row ID
age - Age of the client
job - Type of job of the client
marital - Marital status of the client
education - Education level of the client (categorical)
default - Does the client have credit in default?
balance - Balance level of the client
housing - Does the client have a housing loan?
loan - Does the client have a personal loan? (categorical)
day - Last contact day of the month
month - Last contact month of the year
duration - Last contact duration, in minutes
campaign - Number of contacts performed during this campaign for this client
pdays - Number of days that passed after the client was last contacted in a previous campaign (-1 appears to mark clients with no previous contact)
previous - Number of contacts performed before this campaign for this client
poutcome - Outcome of the previous marketing campaign
response - Has the client subscribed to a term deposit?
response_binary - The response encoded as a number (apparently 0 for "no" and 1 for "yes")
Secondly, let’s see how many of our clients have subscribed to a term deposit:
The barplot shows that only 4 639 of the 40 841 clients (about 11.4%) actually subscribed to the term deposit. Further exploratory analysis is therefore needed to understand which customer segments our subscribers come from. This will help us define our target audience and explore it in more depth. The results will also provide the background on which further policies for improving subscription rates can be based.
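As a minimal sketch, the counts behind this barplot and the corresponding shares can be reproduced in base R from the two totals quoted above. The original code is not shown in this report, so the data frame name `bank` used in the comments is an assumption.

```r
# Totals quoted in the text: 4 639 subscribers among 40 841 clients.
n_total <- 40841
n_yes <- 4639

# Named vector of counts per response; with the full data this would be
# table(bank$response), where `bank` is an assumed name for the data frame.
counts <- c(no = n_total - n_yes, yes = n_yes)

# Share of each response in percent, rounded to one decimal place.
shares <- round(100 * prop.table(counts), 1)

print(counts)
print(shares)
# barplot(counts) would draw the corresponding barplot.
```

Looking at shares rather than raw counts makes the imbalance between the two groups explicit.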
This analysis is divided into four sections, according to the variables examined in each section:
1. Socio-demographics
Variables explored: age, education, marital, job
2. Previous marketing campaign
Variables explored: poutcome, previous, pdays
3. Current marketing campaign
Variables explored: campaign, duration
4. Bank history of a client
Variables explored: balance, loan, housing, default
Main focus of the section: to explore the socio-demographic profile of the bank’s customers.
| Response | Mean Age |
|---|---|
| Yes | 41.46971 |
| No | 40.70366 |
The boxplot and the estimated means above show that the age of subscribers differs little from the age of non-subscribers: both groups are about 41 years old on average. This suggests that the bank already targets a particular age group. People of this age are often more aware of their finances and understand how important it is to save money wisely. They are therefore more likely to use bank services for saving, and contacting them gives us a better chance of gaining subscriptions.
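A table of group means like the one above can be computed with `tapply()`. The sketch below uses a small made-up data frame, so the values are illustrative only and do not come from the real dataset; the name `bank` is an assumption.

```r
# Toy data with the same column names as in the dataset (values are invented).
bank <- data.frame(
  age = c(58, 44, 33, 35, 28, 42),
  response = factor(c("no", "no", "yes", "no", "yes", "yes"))
)

# Mean and median age within each response group.
mean_age <- tapply(bank$age, bank$response, mean)
median_age <- tapply(bank$age, bank$response, median)

print(mean_age)
print(median_age)
# boxplot(age ~ response, data = bank) would draw the matching boxplot.
```

Reporting the median next to the mean is a cheap safeguard, because a few very old or very young clients can shift the mean noticeably.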
The barplots above show that most of the contacted clients have only secondary education, so in the first barplot this category also has the largest number of subscribers. The second barplot, however, shows what percentage of the contacted people in each category subscribed to the term deposit, and here it is clear that people with tertiary education subscribed more often.
The first barplot shows that most of the contacted customers were married. However, the second plot shows that single and divorced clients subscribed to a term deposit more often than married ones when the bank contacted them. The difference in percentages is small, but it exists. One possible reason is that divorced and single customers met this bank’s deposit requirements more often. Therefore, an improvement strategy should be created to attract more married customers.
The first barplot shows that our clients include many technicians, service workers, managers and blue-collar workers. However, the percentage of people who subscribed to term deposits in these categories is rather low. By contrast, students and retired people subscribed more often. Therefore, an improvement strategy should be created to attract more customers from the other categories.
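The difference between the two kinds of barplot (absolute counts versus the share of subscribers within each category) can be sketched as follows. The data are made up to mimic the pattern described above, and `bank` is an assumed name for the data frame.

```r
# Toy data: secondary education is the largest group (values are invented).
bank <- data.frame(
  education = factor(c(rep("primary", 2), rep("secondary", 6), rep("tertiary", 2))),
  response = factor(
    c("no", "no", "yes", "yes", "no", "no", "no", "no", "yes", "no"),
    levels = c("no", "yes")
  )
)

# First barplot: number of clients per category and response.
counts <- table(bank$education, bank$response)

# Second barplot: share of subscribers within each category (rows sum to 1).
rates <- prop.table(counts, margin = 1)[, "yes"]

print(counts)
print(round(100 * rates, 1))
```

In this toy example the secondary group has the most subscribers in absolute terms, but the tertiary group has the highest subscription rate. The same two-step comparison applies to any categorical variable in the report.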
Main focus of the section: to explore actions and outcomes of the previous marketing campaign in relation to the current one.
The first barplot shows that for many clients the outcome of the previous marketing campaign is unknown. This might be because most of our clients are new and did not take part in the previous campaign. However, the percentages in the second plot show that customers for whom the previous marketing campaign was a success tend to subscribe more often.
| Response | Mean number of contacts |
|---|---|
| Yes | 1.0002156 |
| No | 0.3645931 |
The boxplots and the estimated means above show that clients who eventually subscribed to term deposits were, on average, contacted more often before this campaign. This probably helped them build a trusting relationship with the bank, which resulted in a subscription during the recent marketing campaign. It can be a signal for the bank to keep communicating with its clients in order to retain them.
| Response | Mean number of days |
|---|---|
| Yes | 59.76439 |
| No | 28.72234 |
The boxplots and the estimated means above show that clients who eventually subscribed to term deposits were, on average, contacted after a longer gap (more days had passed since the last contact from the previous marketing campaign). Therefore, it is probably better to wait a while before contacting a client again with a new offer, so as not to appear intrusive and not to overload him or her with several offers in a short period of time.
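One caveat is worth checking here. In the structure output, `pdays` equals -1 for the first rows, and in the public UCI Bank Marketing data this value marks clients who were never contacted before. Assuming the same convention holds in this file, a plain mean mixes the -1 placeholders with real day counts. The sketch below, on invented values with an assumed data frame name `bank`, compares the plain mean with the mean over previously contacted clients only.

```r
# Toy data: -1 stands for "no previous contact" (assumed convention).
bank <- data.frame(
  pdays = c(-1, -1, -1, 90, 180, -1, 120, -1),
  response = factor(c("no", "no", "no", "no", "yes", "yes", "yes", "no"))
)

# Plain mean per group, with the -1 placeholders included.
mean_all <- tapply(bank$pdays, bank$response, mean)

# Mean per group among previously contacted clients only.
contacted <- bank[bank$pdays != -1, ]
mean_contacted <- tapply(contacted$pdays, contacted$response, mean)

# Share of previously contacted clients in each group.
share_contacted <- tapply(bank$pdays != -1, bank$response, mean)

print(mean_all)
print(mean_contacted)
print(share_contacted)
```

If the two sets of means differ strongly on the real data, part of the gap in the table may reflect how many clients in each group had any previous contact, not how long the bank waited.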
Main focus of the section: to explore actions during the current marketing campaign in relation to final subscription rates.
| Response | Mean number of contacts |
|---|---|
| Yes | 2.145506 |
| No | 2.854704 |
The boxplots and the estimated means above show no large difference between subscribers and non-subscribers in the number of contacts made during this marketing campaign. Nevertheless, clients who did not subscribe to a term deposit were contacted slightly more often. This appears counterintuitive: according to common sense and business logic, contacting a client more often gives a company more chances to build a stronger bond with the client or to persuade him or her that its product is exactly what the client needs. However, there are exceptions, and because the difference in the number of contacts is small, these exceptions could account for it. Still, some quality assessment of the operators’ work is needed to see whether they communicate with clients correctly.
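The comparison above is made by eye. One possible way to check whether such a gap exceeds chance variation is a Welch two-sample t-test; this is a suggested check, not a method used in the original analysis. The data below are simulated purely for illustration, and `bank` is an assumed name for the data frame.

```r
set.seed(1)

# Simulated contact counts (at least 1 per client); not the real data.
bank <- data.frame(
  campaign = c(rpois(200, 1.1) + 1, rpois(800, 1.9) + 1),
  response = factor(rep(c("yes", "no"), times = c(200, 800)))
)

# Welch t-test: compares the group means without assuming equal variances.
test <- t.test(campaign ~ response, data = bank)

print(test$estimate)  # mean number of contacts in each group
print(test$p.value)   # a small p-value indicates a difference beyond chance
```

Because contact counts are skewed, a rank-based alternative such as `wilcox.test(campaign ~ response, data = bank)` may be a useful cross-check. With tens of thousands of observations even small differences can be statistically significant, so the size of the gap still has to be judged in business terms.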
| Response | Mean duration |
|---|---|
| Yes | 9.128474 |
| No | 3.691365 |
The boxplots and the estimated means above show that the last contact with clients who eventually subscribed to a term deposit was generally longer. Therefore, if operators explain the deposit in more detail (which takes more time), subscription rates will probably be higher, because the operators will manage to build trust with the clients and to promote the product well.
Main focus of the section: to explore the bank history of a client in relation to the outcome of the current marketing campaign.
| Response | Mean balance levels |
|---|---|
| Yes | 1406.263 |
| No | 1031.403 |
The boxplots and the estimated means above show that the balance level of subscribers is somewhat higher. There is some logic in this result: it is irrational to keep a large amount of money at home, so people start to use bank services to store it.
The barplots above show that clients who do not have a personal loan tend to subscribe to a term deposit more frequently. This might be because a personal loan is already a burden for a client, and he or she wants to deal with it first before engaging in any other bank activities. Additionally, common sense suggests that a person who takes out a personal loan needs money and therefore has nothing to put into a deposit.
From these two barplots we can conclude that people who do not have a housing loan tend to subscribe to a term deposit more frequently. The logic here is the same as with personal loans.
The first barplot shows that most of the contacted clients do not have any credit in default. This is probably because banks prefer not to continue working with clients who fail to meet their loan obligations. Nevertheless, there are some exceptions. The second barplot shows that clients who have credit in default subscribe to term deposits less often, probably because they still have to meet their credit obligations and are not ready for any new offers.
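For the yes/no variables in this section, the barplot comparison could be complemented with a chi-squared test of independence. This is a suggested check, not part of the original analysis. The 2x2 table below uses hypothetical counts, not figures from the dataset; with the real data it could be built with `table(bank$loan, bank$response)`, where `bank` is an assumed data frame name.

```r
# Hypothetical counts: rows = personal loan, columns = response.
tab <- matrix(
  c(700, 300,
    340,  60),
  nrow = 2, byrow = TRUE,
  dimnames = list(loan = c("no", "yes"), response = c("no", "yes"))
)

# Subscription rate within each loan group.
rates <- prop.table(tab, margin = 1)[, "yes"]

# Chi-squared test of independence (Yates' correction is applied to 2x2 tables).
result <- chisq.test(tab)

print(round(100 * rates, 1))
print(result$p.value)
```

A small p-value would indicate that the subscription rate really differs between the two groups. It would not show why, so the explanations offered above remain hypotheses.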