Analysis steps

Step 1: Load the data and split it into two subsets, with about 70% for training and 30% for testing

setwd("C:\\Users\\Yang\\Desktop\\Business Data mining\\R\\Case\\Conv")
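# stringsAsFactors = TRUE keeps Account.Category a factor (needed in R >= 4.0)
Conv <- read.csv("Conv.csv", stringsAsFactors = TRUE)
set.seed(1234)  # fix the random seed so the split is reproducible
# Assign each row to group 1 (training, p = 0.7) or group 2 (test, p = 0.3)
SampleID <- sample(2, nrow(Conv), replace = TRUE, prob = c(0.7, 0.3))
trainData <- Conv[SampleID == 1, ]
testData <- Conv[SampleID == 2, ]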
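
Because sample() with prob assigns each row independently, the resulting split is only approximately 70/30. The following is a minimal sketch for checking the realised proportions. It uses a hypothetical toy data frame (toy, 1000 rows) as a stand-in for Conv.csv, which is not reproduced here, and the exact-split variant at the end is an alternative shown for comparison, not the method used in this analysis.

```r
# Hypothetical stand-in for Conv; the real data comes from Conv.csv
toy <- data.frame(id = 1:1000)

set.seed(1234)
# Same splitting approach as in Step 1: each row is independently assigned
# to group 1 (p = 0.7) or group 2 (p = 0.3)
SampleID <- sample(2, nrow(toy), replace = TRUE, prob = c(0.7, 0.3))
toyTrain <- toy[SampleID == 1, , drop = FALSE]
toyTest <- toy[SampleID == 2, , drop = FALSE]

# Realised proportions: close to, but generally not exactly, 0.7 and 0.3
split_share <- prop.table(table(SampleID))
print(split_share)

# Alternative (an assumption, not the method used above): an exact 70/30 split
trainIdx <- sample(nrow(toy), size = round(0.7 * nrow(toy)))
exactTrain <- toy[trainIdx, , drop = FALSE]
exactTest <- toy[-trainIdx, , drop = FALSE]
```

Every row lands in exactly one of the two subsets in both variants; only the second guarantees the subset sizes.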

Step 2: Build the decision tree and plot the tree

library(party)
# Note: the tree is fitted on the full data set (Conv), not only on trainData
Conv_ctree <- ctree(Account.Category ~ Sales.2006 + Spending.with.Convs.in.2006 +
                      Normalized.Fortune.Reputation.Index,
                    data = Conv)
plot(Conv_ctree)

Conclusion: Based on the plotted tree, we may conclude that: 1) if the client’s fortune reputation index is higher than 5.8, the client is highly likely to be in category A or B; 2) if the client’s fortune reputation index is less than or equal to 5.8, its spending with the Conv company is less than 11.9 million dollars, and its total sales are less than 12120.54 million dollars, the tree assigns it to category C.
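
The two rules read off the tree can be written as a small function. This is only a sketch of the rules stated in the conclusion above, not the fitted model: the function name tree_rule and its argument names are hypothetical, the amounts are assumed to be in millions of dollars as in the text, and any branch of the tree that the conclusion does not describe returns NA.

```r
# Hypothetical helper encoding only the two rules described in the conclusion
tree_rule <- function(reputation_index, spending_with_conv, sales) {
  if (reputation_index > 5.8) {
    return("A or B")  # rule 1: reputation index above 5.8
  }
  if (spending_with_conv < 11.9 && sales < 12120.54) {
    return("C")       # rule 2: low index, low spending, low sales
  }
  NA_character_       # branch not covered by the stated rules
}

# Example calls with made-up client values
tree_rule(7.2, 5, 3000)
tree_rule(4.0, 5, 3000)
tree_rule(4.0, 20, 3000)
```

For actual predictions, predict() on the fitted Conv_ctree object remains the reference; the helper only makes the stated thresholds explicit.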

Step 3: Predict the test data

testPred <- predict(Conv_ctree, newdata = testData)
table(testPred, testData$Account.Category)

##         
## testPred  A  B  C
##        A  2  0  0
##        B  0 15  0
##        C  0  1 76

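Conclusion: According to the prediction table, the predictions are highly accurate: only one of the 94 test cases is misclassified (a category B client predicted as C). Because the tree was fitted on all of Conv, the test rows were also used for training, so this result likely overstates performance on unseen clients.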
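
The accuracy behind this conclusion can be computed directly from the table. The sketch below re-enters the counts shown in the output above by hand, so it runs without the original data; the object names conf_mat, accuracy, recall and precision are illustrative choices.

```r
# Confusion matrix copied from the table above (rows: predicted, columns: actual)
conf_mat <- matrix(c(2,  0,  0,
                     0, 15,  0,
                     0,  1, 76),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(testPred = c("A", "B", "C"),
                                   actual = c("A", "B", "C")))

# Overall accuracy: correctly classified cases (diagonal) over all cases
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# Per-class recall: correct predictions divided by the actual class size
recall <- diag(conf_mat) / colSums(conf_mat)

# Per-class precision: correct predictions divided by the predicted class size
precision <- diag(conf_mat) / rowSums(conf_mat)

round(accuracy, 4)
round(recall, 4)
round(precision, 4)
```

The accuracy works out to 93/94, about 0.989. The single error lowers the recall for category B to 15/16 and the precision for category C to 76/77. With the live objects, the same calculation applies to table(testPred, testData$Account.Category).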

Conv customer categorical analysis using decision tree

Yang

September 11, 2015

Purpose