Data set preparation and inspection
First six observations of the newly created data sets
| Client | Region | Customer type | Product | Status | Deal size | Expected time |
|---|---|---|---|---|---|---|
| Acesonzap | Austria | Existing Customer | BI | Lost | 26000 | 1 |
| Applex | Austria | Existing Customer | BI | Won | 74000 | 2 |
| bam-hex | Austria | New Customer | BI | Lost | 21000 | 2 |
| Basecare | Germany | New Customer | ERP | Won | 111000 | 1 |
| Biozumzap | Austria | Existing Customer | BI | Lost | 214000 | 3 |
| Blacklax | Germany | New Customer | ERP | Lost | 5000 | 2 |
Contingency tables (column proportions of Lost and Won)
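| Status | Existing Customer | New Customer |
|---|---|---|
| Lost | 0.58 | 0.7 |
| Won | 0.42 | 0.3 |

| Status | Austria | Germany | Switzerland |
|---|---|---|---|
| Lost | 0.77 | 0.44 | 0.57 |
| Won | 0.23 | 0.56 | 0.43 |

| Status | BI | ERP |
|---|---|---|
| Lost | 0.77 | 0.46 |
| Won | 0.23 | 0.54 |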
Create a sales revenue forecast for 10-18
I first decided to forecast the wins (1) and losses (0) for 10-18 by running a logistic regression on the previous months, because the revenue depends on the probability of winning a deal.
Variables considered for the analysis:
- Customer type: Existing customer vs New customer
- Region: Austria vs Germany and Switzerland
- Product: BI vs ERP
- Expected time: time spent until the transaction
- Deal size: might also play a role, since a buyer may base the decision on the amount of money to be spent
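Written out, these variables enter a standard logistic regression. The equation below is a sketch under assumptions: the symbols $\beta_0,\dots,\beta_6$ are introduced here for illustration, and Existing Customer, Austria and BI are taken as the reference levels, which is what the coefficient names in the results suggest.

```latex
\[
\log\frac{p}{1-p}
  = \beta_0
  + \beta_1\,\mathrm{NewCustomer}
  + \beta_2\,\mathrm{Germany}
  + \beta_3\,\mathrm{Switzerland}
  + \beta_4\,\mathrm{ERP}
  + \beta_5\,\mathrm{exp\_time}
  + \beta_6\,\mathrm{Dealsize},
\qquad p = P(\mathrm{Status} = \mathrm{Won})
\]
```

Each indicator equals 1 when the deal has that level and 0 otherwise, so a positive coefficient raises the odds of winning relative to the reference level.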
Logistic regression results
| | Status |
|---|---|
| Constant | -1.640* (0.892) |
| Customer.TypeNew Customer | -0.528 (0.603) |
| RegionGermany | 1.287* (0.691) |
| RegionSwitzerland | 0.824 (0.747) |
| ProductERP | 1.224** (0.583) |
| exp_time | -0.066 (0.270) |
| Dealsize | 0.00000 (0.00000) |
| N | 63 |
| Log Likelihood | -35.224 |
| Akaike Inf. Crit. | 84.449 |

Notes: Standard errors in parentheses.
***Significant at the 1 percent level.
**Significant at the 5 percent level.
*Significant at the 10 percent level.
Summary
In a nutshell, the results suggest that:
- Customer type does not play a significant role in winning or losing a contract, even though I suspected that a contract is more likely to be won when the client negotiates with an existing customer
- The client wins significantly more often in Germany than in Austria (p-value < 0.1), and there is no significant difference in the probability of winning between Austria and Switzerland
- The probability of selling an ERP product is significantly higher than the probability of selling a BI product (p-value < 0.05)
- Expected time and deal size do not play a significant role in the probability of winning or losing a deal
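The significance statements above can be cross-checked from the reported coefficients and standard errors. The sketch below assumes the stars come from the usual Wald z-test (coefficient divided by standard error, compared against a standard normal), and it also converts each coefficient into an odds ratio. The function name `wald_summary` is made up for this illustration.

```python
import math

# Coefficients and standard errors copied from the regression table.
# Dealsize is left out: both its coefficient and its standard error are
# printed as 0.00000, so no ratio can be formed from the rounded values.
estimates = {
    "Constant": (-1.640, 0.892),
    "Customer.TypeNew Customer": (-0.528, 0.603),
    "RegionGermany": (1.287, 0.691),
    "RegionSwitzerland": (0.824, 0.747),
    "ProductERP": (1.224, 0.583),
    "exp_time": (-0.066, 0.270),
}


def wald_summary(coef, se):
    """Return (odds ratio, z statistic, two-sided normal p-value)."""
    z = coef / se
    # P(|Z| > |z|) for a standard normal equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return math.exp(coef), z, p_value


summary = {name: wald_summary(c, s) for name, (c, s) in estimates.items()}

for name, (odds_ratio, z, p_value) in summary.items():
    print(f"{name:28s} OR={odds_ratio:6.3f}  z={z:6.3f}  p={p_value:.3f}")
```

Read this way, the odds of winning in Germany are roughly 3.6 times the odds in Austria, and the odds for ERP roughly 3.4 times those for BI, holding the other variables fixed.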
Forecast for 10-18
I built a confidence interval for each deal’s win probability in 10-18 using a t-distribution with 11 degrees of freedom (n - 1, with n = 12 deals).
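A minimal sketch of such an interval follows. It assumes a two-sided 95% level and an interval that is symmetric on the probability scale; the reported bounds are symmetric around the point estimates, but the report states neither the level nor how the standard errors were obtained. The helper `t_interval` and the input values are hypothetical.

```python
# Two-sided 95% critical value of the t-distribution with 11 degrees of
# freedom (standard table value). The 95% level is an assumption; the
# report does not state which level it used.
T_CRIT_DF11 = 2.201


def t_interval(p_hat, std_error, t_crit=T_CRIT_DF11):
    """Symmetric interval p_hat +/- t_crit * std_error, clipped to [0, 1]."""
    half_width = t_crit * std_error
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical inputs chosen for illustration only: a predicted win
# probability of 0.294 with a standard error of 0.0834.
lower, upper = t_interval(0.294, 0.0834)
print(f"interval: ({lower:.3f}, {upper:.3f})")
```

Clipping keeps the bounds inside [0, 1]; an interval built on the logit scale and transformed back would avoid the need for clipping but would no longer be symmetric.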
| Client | Deal size | Region | Customer type | Product | Expected time | Lower bound | Win probability | Upper bound | Predicted status | Win (0/1) | Forecast revenue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Toolcompany | 50000 | Austria | New Customer | ERP | 1 | 0.110 | 0.294 | 0.477 | Lost | 0 | 0 |
| tanteemma | 77000 | Germany | Existing Customer | ERP | 2 | 0.538 | 0.720 | 0.902 | Won | 1 | 77000 |
| Swups | 80000 | Switzerland | Existing Customer | ERP | 1 | 0.386 | 0.635 | 0.885 | Won | 1 | 80000 |
| Superdoopa | 33000 | Austria | New Customer | ERP | 1 | 0.100 | 0.285 | 0.469 | Lost | 0 | 0 |
| fixidea | 222000 | Germany | New Customer | BI | 2 | 0.143 | 0.396 | 0.648 | Lost | 0 | 0 |
| Tortenmacher | 44000 | Austria | Existing Customer | ERP | 0 | 0.149 | 0.426 | 0.703 | Lost | 0 | 0 |
| tobado | 55000 | Austria | Existing Customer | ERP | 0 | 0.158 | 0.433 | 0.708 | Lost | 0 | 0 |
| Rumsdidums | 300000 | Switzerland | Existing Customer | BI | 0 | 0.102 | 0.495 | 0.889 | Lost | 0 | 0 |
| MaxSteel | 22000 | Austria | New Customer | ERP | 0 | 0.073 | 0.292 | 0.511 | Lost | 0 | 0 |
| beatmeat | 111000 | Germany | Existing Customer | BI | 0 | 0.164 | 0.486 | 0.807 | Lost | 0 | 0 |
| Ausdiemaus | 99000 | Switzerland | New Customer | BI | 0 | 0.014 | 0.253 | 0.493 | Lost | 0 | 0 |
| Arrow | 200000 | Austria | Existing Customer | BI | 0 | 0.033 | 0.248 | 0.464 | Lost | 0 | 0 |
- The client could still lose the deal with “Swups”: its confidence interval (0.386, 0.885) extends well below 0.5
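The remark about Swups can be checked for every deal in the forecast table. The sketch below totals the revenue of the deals predicted as won and lists the deals whose interval straddles 0.5. The 0.5 cut-off is an assumption: it reproduces the Won and Lost labels in the table but is not stated in the report.

```python
# (client, deal size, lower bound, win probability, upper bound),
# copied from the forecast table.
deals = [
    ("Toolcompany", 50000, 0.110, 0.294, 0.477),
    ("tanteemma", 77000, 0.538, 0.720, 0.902),
    ("Swups", 80000, 0.386, 0.635, 0.885),
    ("Superdoopa", 33000, 0.100, 0.285, 0.469),
    ("fixidea", 222000, 0.143, 0.396, 0.648),
    ("Tortenmacher", 44000, 0.149, 0.426, 0.703),
    ("tobado", 55000, 0.158, 0.433, 0.708),
    ("Rumsdidums", 300000, 0.102, 0.495, 0.889),
    ("MaxSteel", 22000, 0.073, 0.292, 0.511),
    ("beatmeat", 111000, 0.164, 0.486, 0.807),
    ("Ausdiemaus", 99000, 0.014, 0.253, 0.493),
    ("Arrow", 200000, 0.033, 0.248, 0.464),
]

THRESHOLD = 0.5  # assumed cut-off between a predicted loss and a predicted win

# Revenue counted only for deals whose point estimate exceeds the cut-off.
forecast_revenue = sum(size for _, size, _, prob, _ in deals if prob > THRESHOLD)

# Deals whose interval contains the cut-off: the prediction could go either way.
uncertain = [name for name, _, low, _, high in deals if low < THRESHOLD < high]

print("forecast revenue:", forecast_revenue)
print("intervals containing 0.5:", ", ".join(uncertain))
```

Only tanteemma has an interval lying entirely above 0.5, so most of the forecast rests on deals whose outcome the model cannot call with confidence.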
Probability of winning opportunities for different criteria in a certain month
- Deal size and expected time are set to their mean values in the previous data.
- All combinations of factor levels are reported below.
- Limitation: the client itself might also play a role here (heterogeneity effect). However, with only a few observations, I cannot control for it (e.g., I could not include it as a fixed effect).
| Client | Deal size | Region | Customer type | Product | Expected time | Win probability | Predicted status |
|---|---|---|---|---|---|---|---|
| A | 98619.05 | Austria | Existing Customer | BI | 2 | 0.181 | Lost |
| A | 98619.05 | Germany | Existing Customer | BI | 2 | 0.445 | Lost |
| A | 98619.05 | Switzerland | Existing Customer | BI | 2 | 0.335 | Lost |
| A | 98619.05 | Austria | New Customer | BI | 2 | 0.115 | Lost |
| A | 98619.05 | Germany | New Customer | BI | 2 | 0.321 | Lost |
| A | 98619.05 | Switzerland | New Customer | BI | 2 | 0.229 | Lost |
| A | 98619.05 | Austria | Existing Customer | ERP | 2 | 0.429 | Lost |
| A | 98619.05 | Germany | Existing Customer | ERP | 2 | 0.732 | Won |
| A | 98619.05 | Switzerland | Existing Customer | ERP | 2 | 0.632 | Won |
| A | 98619.05 | Austria | New Customer | ERP | 2 | 0.307 | Lost |
| A | 98619.05 | Germany | New Customer | ERP | 2 | 0.616 | Won |
| A | 98619.05 | Switzerland | New Customer | ERP | 2 | 0.503 | Won |
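The scenario probabilities can be approximately reproduced from the reported coefficients by applying the inverse logit to each combination of factor levels. This is a sketch under one loud assumption: the Dealsize coefficient is printed as 0.00000, so the value `2.68e-6` used here is back-solved from the reported probabilities and is not a reported estimate. The names `COEF`, `DEALSIZE_COEF_ASSUMED` and `win_probability` are hypothetical.

```python
import math
from itertools import product

# Rounded coefficients from the regression table. Austria, Existing Customer
# and BI are the reference levels and therefore contribute nothing.
COEF = {
    "Constant": -1.640,
    "New Customer": -0.528,
    "Germany": 1.287,
    "Switzerland": 0.824,
    "ERP": 1.224,
    "exp_time": -0.066,
}

# ASSUMPTION: back-solved from the reported probabilities, not a reported value.
DEALSIZE_COEF_ASSUMED = 2.68e-6


def win_probability(region, customer, prod, exp_time, dealsize):
    """Inverse logit of the linear predictor for one deal profile."""
    eta = COEF["Constant"]
    eta += COEF["exp_time"] * exp_time + DEALSIZE_COEF_ASSUMED * dealsize
    eta += COEF.get(region, 0.0) + COEF.get(customer, 0.0) + COEF.get(prod, 0.0)
    return 1.0 / (1.0 + math.exp(-eta))


regions = ["Austria", "Germany", "Switzerland"]
customers = ["Existing Customer", "New Customer"]
products = ["BI", "ERP"]

grid = {}
for prod, customer, region in product(products, customers, regions):
    p = win_probability(region, customer, prod, exp_time=2, dealsize=98619.05)
    grid[(region, customer, prod)] = p
    status = "Won" if p > 0.5 else "Lost"
    print(f"{region:12s} {customer:18s} {prod:4s} {p:.3f} {status}")
```

Because the inputs are rounded, the third decimal can differ slightly from the table. Switzerland / New Customer / ERP sits barely above 0.5, so its "Won" label is sensitive to small changes in the estimates.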
Alternative forecasting methods
kNN algorithm for forecasting

## Confusion Matrix and Statistics
##
##           Reference
## Prediction Lost Won
##       Lost    9   5
##       Won     0   1
##
## Accuracy : 0.6667
## 95% CI : (0.3838, 0.8818)
## No Information Rate : 0.6
## P-Value [Acc > NIR] : 0.40322
##
## Kappa : 0.1935
## Mcnemar's Test P-Value : 0.07364
##
## Sensitivity : 1.0000
## Specificity : 0.1667
## Pos Pred Value : 0.6429
## Neg Pred Value : 1.0000
## Prevalence : 0.6000
## Detection Rate : 0.6000
## Detection Prevalence : 0.9333
## Balanced Accuracy : 0.5833
##
## 'Positive' Class : Lost
##
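The headline statistics in this output follow directly from the four cell counts. The sketch below recomputes them with 'Lost' as the positive class, as in the output; the variable names are chosen for this illustration.

```python
# Counts from the confusion matrix: rows are predictions, columns are the
# reference labels, and 'Lost' is the positive class.
tp = 9  # predicted Lost, actually Lost
fp = 5  # predicted Lost, actually Won
fn = 0  # predicted Won, actually Lost
tn = 1  # predicted Won, actually Won
n = tp + fp + fn + tn

accuracy = (tp + tn) / n
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
pos_pred_value = tp / (tp + fp)
neg_pred_value = tn / (tn + fn)
prevalence = (tp + fn) / n
balanced_accuracy = (sensitivity + specificity) / 2

# Cohen's kappa: observed agreement corrected for the agreement expected
# by chance from the row and column totals.
expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
kappa = (accuracy - expected) / (1 - expected)

for label, value in [
    ("Accuracy", accuracy),
    ("Kappa", kappa),
    ("Sensitivity", sensitivity),
    ("Specificity", specificity),
    ("Pos Pred Value", pos_pred_value),
    ("Neg Pred Value", neg_pred_value),
    ("Balanced Accuracy", balanced_accuracy),
]:
    print(f"{label:18s} {value:.4f}")
```

The low specificity shows that the kNN classifier predicts "Lost" for 14 of the 15 test deals, which is why its accuracy (0.6667) is barely above the no-information rate of 0.6.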
Time-series analysis (ARIMA)

Further methods
- Bayesian model for binary Markov chains
- Hidden Markov Models for Time Series