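Background and Purpose of the COD Refusal Model

The overall acceptance rate for COD (cash on delivery) orders is currently 80%, which means a share of customers refuse delivery. Each refusal incurs extra costs, and together they add up to a considerable loss.

One idea, then, is to identify the customers likely to refuse a COD order before it ships, and to take measures that reduce the loss.

This idea is feasible.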

COD Modeling

View the variables used and a sample of the data

kable(as.data.frame(colnames(B_New_cod_sample)))
colnames(B_New_cod_sample)
用户性别
app1
device
ip_num
frist_visit
下单day
付款day
label
DT::datatable(B_New_cod_sample[1:100,])

View the proportions of refused and accepted orders

pct(New_cod_sample$label)
Count Percentage
0 88727 88.26
1 11798 11.74
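
`pct` appears to be a project helper whose source is not shown here. A minimal base-R sketch that produces a table of the same shape, assuming the helper simply tabulates counts and percentage shares (the name `pct_sketch` is hypothetical, and the label vector is rebuilt from the counts reported above):

```r
# Hypothetical stand-in for the report's pct() helper: counts and
# percentage share of each distinct value in a vector.
pct_sketch <- function(x) {
  counts <- table(x)
  data.frame(
    Count = as.vector(counts),
    Percentage = round(100 * as.vector(counts) / sum(counts), 2),
    row.names = names(counts)
  )
}

# Rebuild a label vector with the class counts reported above.
label <- rep(c(0, 1), times = c(88727, 11798))
res <- pct_sketch(label)
print(res)
```

On the real data the call would be `pct_sketch(New_cod_sample$label)`.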

Examine the relationship between each feature and refusal

  1. View by phone version
A1 <- gbpct(New_cod_sample$app1,New_cod_sample$label)

# Two panels side by side; `new = TRUE` is dropped because it only triggers a
# "calling par(new=TRUE) with no plot" warning when no plot exists yet.
op1 <- par(mfrow = c(1, 2))
New_cod_sample$app1 <- as.factor(New_cod_sample$app1)
New_cod_sample$label <- as.factor(New_cod_sample$label)

plot(New_cod_sample$app1, New_cod_sample$label, 
     ylab="Good-Bad", xlab="category", 
     main="app1")
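
`gbpct` is likewise a helper that the report does not define. One plausible reading, given here only as an assumption, is a per-category breakdown of the two label values. The sketch below (hypothetical name `gbpct_sketch`, toy data invented for illustration) builds such a table with `table` and `prop.table`; it assumes both label values `0` and `1` occur in the data.

```r
# Hypothetical stand-in for gbpct(): per-category counts of each label
# value and the share of label 1 within the category.
gbpct_sketch <- function(x, y) {
  tab <- table(x, y)                      # rows: categories, columns: labels
  shares <- prop.table(tab, margin = 1)   # row-wise proportions
  data.frame(
    category = rownames(tab),
    n0 = as.vector(tab[, "0"]),
    n1 = as.vector(tab[, "1"]),
    pct1 = round(100 * as.vector(shares[, "1"]), 2)
  )
}

# Toy data, invented purely for illustration (not taken from the report).
x <- c("a", "a", "a", "a", "b", "b", "b", "b")
y <- c(0, 0, 0, 1, 0, 1, 1, 1)
out <- gbpct_sketch(x, y)
print(out)
```

The same row-wise shares are what the `plot(factor, factor)` calls in this section draw as a spine plot.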

  2. View by state
A1 <- gbpct(New_cod_sample$州,New_cod_sample$label)

New_cod_sample$州 <- as.factor(New_cod_sample$州)

plot(New_cod_sample$州, New_cod_sample$label, 
     ylab="Good-Bad", xlab="category", 
     main="state")

3. View by gender

A1 <- gbpct(New_cod_sample$gender,New_cod_sample$label)

New_cod_sample$gender <- as.factor(New_cod_sample$gender)

plot(New_cod_sample$gender, New_cod_sample$label, 
     ylab="Good-Bad", xlab="category", 
     main="gender")

4. Number of IPs used versus refusal

A1 <- gbpct(New_cod_sample$ip_num,New_cod_sample$label)

New_cod_sample$ip_num <- as.factor(New_cod_sample$ip_num)

plot(New_cod_sample$ip_num, New_cod_sample$label, 
     ylab="Good-Bad", xlab="category", 
     main="quantity of ip")

5. Order time versus refusal

A1 <- gbpct(New_cod_sample$下单day,New_cod_sample$label)

New_cod_sample$下单day <- as.factor(New_cod_sample$下单day)

plot(New_cod_sample$下单day, New_cod_sample$label, 
     ylab="Good-Bad", xlab="category", 
     main="Relationship between order time and refusal")

6. Payment date versus refusal

A1 <- gbpct(New_cod_sample$付款day,New_cod_sample$label)

New_cod_sample$付款day <- as.factor(New_cod_sample$付款day)

plot(New_cod_sample$付款day, New_cod_sample$label, 
     ylab="Good-Bad", xlab="category", 
     main="Relationship between payment date and rejection")

Modeling

names(Train) <- c('state','gender','app','device','ip_num','fristvisit','orderday','payday','label')
names(Test) <- c('state','gender','app','device','ip_num','fristvisit','orderday','payday','label')

xgboost

Mlr_Modle(train = Train[,-1],test = Test[,-1],model = l$class[42])
## [1] "Evaluating model performance"

## $KS
## [1] 0.3853
## 
## $AUC
## [1] 0.7605
## 
## $Gini
## [1] 0.5211
## 
## $pic
## TableGrob (2 x 2) "arrange": 4 grobs
##       z     cells    name           grob
## pks   1 (1-1,1-1) arrange gtable[layout]
## plift 2 (1-1,2-2) arrange gtable[layout]
## proc  3 (2-2,1-1) arrange gtable[layout]
## ppr   4 (2-2,2-2) arrange gtable[layout]
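
The internals of `Mlr_Modle` are not shown, so the following is only a sketch of how the three reported metrics are commonly computed from predicted scores and 0/1 labels. It uses base R only; `eval_sketch` and the toy scores are hypothetical. The reported Gini of 0.5211 agrees with 2 × 0.7605 − 1 up to rounding, which matches the usual definition Gini = 2·AUC − 1 used below.

```r
# Sketch: KS, AUC and Gini from predicted scores and 0/1 labels (base R).
eval_sketch <- function(score, label) {
  pos <- score[label == 1]
  neg <- score[label == 0]
  # AUC via the rank-sum (Mann-Whitney) identity; ties get average ranks.
  r <- rank(c(pos, neg))
  n_pos <- length(pos)
  n_neg <- length(neg)
  auc <- (sum(r[seq_len(n_pos)]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
  # KS: largest gap between the empirical score CDFs of the two classes.
  grid <- sort(unique(score))
  ks <- max(abs(ecdf(pos)(grid) - ecdf(neg)(grid)))
  list(KS = ks, AUC = auc, Gini = 2 * auc - 1)
}

# Toy scores, invented for illustration only (not the report's data).
score <- c(0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9)
label <- c(0, 0, 1, 0, 1, 0, 1, 1)
m <- eval_sketch(score, label)
print(m)
```

With real predictions, `score` would be the predicted probability of label 1 on the test set and `label` the observed outcome.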