| Item | Variable Name | Definition | Theoretical Effect |
|---|---|---|---|
| 1 | Index | Unique identifier (key) for a distinct service contract. | None |
| 2 | Area | Location where the service was sold. | Unknown effect |
| 3 | Service Approach | S is a particular service model. All other services are labeled not S. | The theory is that S will be much more effective than anything else we offer. |
| 4 | Channel Approach | We sell services largely through other companies. This reflects two different channel models. | Unknown effect |
| 5 | Renewal Flag | The key variable: whether or not a contract renewed. A contract that did not renew is now expired and represents lost business. | N/A - Variable to predict |
| 6 | Contract Expiration Date | The date the contract expired or, if it was renewed, the date it will expire. | Unknown effect |
| 7 | Contract Length | Annual or multi-year renewal. | Multi-year contracts may be more likely to renew. |
| 8 | Sales Category | Different categories of clients. | Unknown effect |
| 9 | Seller Unique ID | Unique identifier for the actual seller of the service on behalf of the company. | Contracts sold by certain sellers may be more likely to renew. |
| 10 | Contract Value Category | Buckets for contract value based on net (i.e., discounted) price. | Discounted contracts may be more likely to renew. |
| 11 | Contract Line Category | Buckets based on items on the contract. Less than 10 is very small, 10-49 is small, 50-99 is medium, 100-999 is large, and greater than 999 is very large. | Contracts with more items may be more likely to renew. |
| 12 | Discount Category | Buckets representing the amount of discount applied from the catalogue price of the contract. Discounts are earned based on loyalty, length of contract, system configuration, and geography. | Discounted contracts may be more likely to renew. |
| 13 | Multiple Services | Flag to call out whether there is more than one service approach on the contract. When there is more than one, the highest-value service is called out in the Service Approach column. | Unknown effect |
| 14 | Item Count | Count of the items under service on the contract. | Contracts with more items may be more likely to renew. |
| 15 | Cost | Catalogue cost for the service requested on the items of the contract. | Unknown effect |
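
The Contract Line Category buckets defined in the table can be sketched in R with `cut()`. This is only a sketch: it assumes the buckets are derived from the Item Count column (the sample rows shown later are consistent with that, but the source does not say so), and `line_category` is a hypothetical helper name, not part of the original analysis.

```r
# Hypothetical helper: map item counts to the line-category buckets
# described in the data dictionary (assumes counts are whole numbers).
line_category <- function(item_count) {
  cut(
    item_count,
    # Right-closed intervals: (-Inf, 9], (9, 49], (49, 99], (99, 999], (999, Inf)
    breaks = c(-Inf, 9, 49, 99, 999, Inf),
    labels = c("VERY SMALL", "SMALL", "MEDIUM", "LARGE", "VERY LARGE")
  )
}

# Item counts taken from the six sample rows of the data set
sample_item_counts <- c(38, 14, 11, 18, 17, 9)
sample_categories <- as.character(line_category(sample_item_counts))
print(sample_categories)
```

Comparing the result with the `CONTRACT_LINE_CATEGORY` column is a quick way to check whether the two columns agree on the full data set.
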

| INDEX | AREA | SERVICE_APPROACH | CHANNEL_APPROACH |
|---|---|---|---|
| Min.: 1 | USA: 67978 | NOT S: 32817 | FIRST: 43042 |
| 1st Qu.: 22932 | CANADA: 6679 | S: 58909 | SECOND: 48684 |
| Median: 45864 | US_CANADA_OTHER: 5019 | | |
| Mean: 45864 | BRASIL: 3059 | | |
| 3rd Qu.: 68795 | CANSAC: 2878 | | |
| Max.: 91726 | MCO: 2672 | | |
| | (Other): 3441 | | |

| RENEWAL_FLAG | CONTRACT_EXPIRATION | CONTRACT_LENGTH | SALES_CATEGORY |
|---|---|---|---|
| Min.: 0.000 | 7/31/2015: 4475 | ANNUAL: 75975 | COM: 60259 |
| 1st Qu.: 0.000 | 12/31/2015: 2843 | MULTI-YEAR: 15751 | ENT: 7025 |
| Median: 0.000 | 12/31/2016: 1938 | | OTHER: 6562 |
| Mean: 0.319 | 3/31/2016: 1470 | | PS: 17571 |
| 3rd Qu.: 1.000 | 10/31/2016: 1425 | | SMB: 1 |
| Max.: 1.000 | 9/30/2015: 1230 | | SP: 308 |
| | (Other): 78345 | | |

| SELLER_UNIQUE_ID | CONTRACT_VALUE_CATEGORY | CONTRACT_LINE_CATEGORY |
|---|---|---|
| 54: 8773 | <10K: 80500 | LARGE: 2601 |
| 4139: 3387 | >250K: 441 | MEDIUM: 2190 |
| #VALUE!: 3026 | 100K-250K: 826 | SMALL: 12182 |
| 47796: 1712 | 10K-25K: 6045 | VERY LARGE: 262 |
| 29543: 1521 | 25K-50K: 2551 | VERY SMALL: 74491 |
| 15617: 1367 | 50K-100K: 1363 | |
| (Other): 71940 | | |

| DISCOUNT_CATEGORY | MULTIPLE_SERVICES | ITEM_COUNT | COST |
|---|---|---|---|
| #DIV/0!: 146 | NO: 91009 | Min.: 1.00 | $101.00: 955 |
| LARGE: 23028 | YES: 717 | 1st Qu.: 1.00 | $69.00: 902 |
| MEDIUM: 42578 | | Median: 2.00 | $119.00: 730 |
| NO DISCOUNT: 16029 | | Mean: 22.71 | $71.00: 638 |
| VERY LARGE: 9945 | | 3rd Qu.: 6.00 | $203.00: 502 |
| | | Max.: 84184.00 | $100.00: 457 |
| | | | (Other): 87542 |

| SALES_STRATEGY |
|---|
| GCS: 71126 |
| NOT GCS: 20600 |

| INDEX | AREA | SERVICE_APPROACH | CHANNEL_APPROACH | RENEWAL_FLAG | CONTRACT_EXPIRATION | CONTRACT_LENGTH | SALES_CATEGORY | SELLER_UNIQUE_ID | CONTRACT_VALUE_CATEGORY | CONTRACT_LINE_CATEGORY | DISCOUNT_CATEGORY | MULTIPLE_SERVICES | ITEM_COUNT | COST | SALES_STRATEGY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | USA | S | FIRST | 1 | 12/29/2015 | ANNUAL | ENT | 54 | 25K-50K | SMALL | LARGE | NO | 38 | $63,958.00 | NOT GCS |
| 2 | USA | S | FIRST | 1 | 9/30/2016 | ANNUAL | PS | 47796 | <10K | SMALL | VERY LARGE | NO | 14 | $18,966.00 | NOT GCS |
| 3 | USA | S | SECOND | 0 | 7/31/2015 | ANNUAL | COM | 1611 | <10K | SMALL | NO DISCOUNT | NO | 11 | $5,518.00 | GCS |
| 4 | USA | S | SECOND | 0 | 9/30/2015 | ANNUAL | PS | -989 | 25K-50K | SMALL | LARGE | NO | 18 | $62,150.00 | NOT GCS |
| 5 | USA | S | FIRST | 0 | 1/19/2016 | ANNUAL | ENT | 24116 | 25K-50K | SMALL | NO DISCOUNT | NO | 17 | $18,870.56 | NOT GCS |
| 6 | USA | S | SECOND | 1 | 12/22/2016 | ANNUAL | COM | 25276 | <10K | VERY SMALL | MEDIUM | NO | 9 | $10,682.00 | GCS |
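
The summary above shows spreadsheet error strings (`#VALUE!`, `#DIV/0!`) among the factor levels, and both `COST` and `CONTRACT_EXPIRATION` are stored as formatted text. A minimal cleaning sketch in base R might look like the following. It assumes the error strings should be treated as missing values, which the source does not state. The toy rows are copied from the sample, with two error tokens substituted in for illustration, and the names `contracts` and `excel_errors` are hypothetical.

```r
# Toy rows based on the sample above; the second row's seller ID and
# discount category are replaced with error tokens for illustration only.
contracts <- data.frame(
  SELLER_UNIQUE_ID    = c("54", "#VALUE!", "1611"),
  DISCOUNT_CATEGORY   = c("LARGE", "#DIV/0!", "NO DISCOUNT"),
  CONTRACT_EXPIRATION = c("12/29/2015", "9/30/2016", "7/31/2015"),
  COST                = c("$63,958.00", "$18,966.00", "$5,518.00"),
  stringsAsFactors    = FALSE
)

# Assumption: spreadsheet error strings carry no information, so mark them NA.
excel_errors <- c("#VALUE!", "#DIV/0!")
contracts[] <- lapply(contracts, function(column) {
  column[column %in% excel_errors] <- NA
  column
})

# Strip the dollar sign and thousands separators, then convert to numeric.
contracts$COST <- as.numeric(gsub("[$,]", "", contracts$COST))

# Parse month/day/year text into Date values.
contracts$CONTRACT_EXPIRATION <- as.Date(contracts$CONTRACT_EXPIRATION,
                                         format = "%m/%d/%Y")

str(contracts)
```

Whether rows with missing values are then dropped, imputed, or given their own category is a separate modeling decision.
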

Our data set contains recent service contracts with Company X, some of which were renewed and some of which were not. Our goal is to predict, from the attributes in the data set, whether a service contract will be renewed. A renewal is considered a success.
There are 91,726 rows of data, each representing one service contract with Company X.
We have 15 potential predictor variables and one response variable (“RENEWAL_FLAG”) that indicates whether the contract was renewed.
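
Before fitting any model, one simple way to check the theoretical effects listed in the data dictionary is to compare renewal rates across the levels of a predictor. The sketch below does this on the six sample rows only, so the numbers it prints are illustrative and say nothing about the full data set. It assumes that a `RENEWAL_FLAG` of 1 marks a renewed contract, and `renewal_rate_by` is a hypothetical helper name.

```r
# The six sample rows shown earlier (only the columns needed here).
sample_rows <- data.frame(
  CHANNEL_APPROACH = c("FIRST", "FIRST", "SECOND", "SECOND", "FIRST", "SECOND"),
  SALES_CATEGORY   = c("ENT", "PS", "COM", "PS", "ENT", "COM"),
  RENEWAL_FLAG     = c(1, 1, 0, 0, 0, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: share of renewed contracts within each level of a
# predictor (assumes RENEWAL_FLAG is coded 1 = renewed, 0 = not renewed).
renewal_rate_by <- function(data, predictor) {
  tapply(data$RENEWAL_FLAG, data[[predictor]], mean)
}

overall_rate <- mean(sample_rows$RENEWAL_FLAG)
rate_by_channel <- renewal_rate_by(sample_rows, "CHANNEL_APPROACH")

print(overall_rate)
print(rate_by_channel)
```

The same helper can be pointed at any categorical column, for example `renewal_rate_by(sample_rows, "SALES_CATEGORY")`.
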
The transformed data set can be seen here:
https://raw.githubusercontent.com/spsstudent15/2016-02-621-W2/master/621-FP-Transformed-Data.csv