Fraud Behavior Analysis

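Because we have started a new KPI-based partnership with third-party channels, we need to analyze the ad quality of those channels promptly, so that we can quickly decide whether to deduct payment or adjust the KPI.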

load(file = '/Users/milin/advertising/d.Rdata')

library(ggplot2)
library(ggthemes)
## Warning: package 'ggthemes' was built under R version 3.4.4
library(plotly)
library(tidyverse)

Three fraud techniques

Using the Affle channel as an example

  1. No click activity at all
head(Affle[[1]] %>% arrange(adid))
##   click_time.x activity_kind             network_name
## 1         <NA>          <NA>                     <NA>
## 2         <NA>          <NA>                     <NA>
## 3   1527154078       session                    Affle
## 4   1527296932 reattribution Adwords Display Installs
## 5   1527296932         event Adwords Display Installs
## 6   1527154078         event                    Affle
##                                   adid installed_at install_time
## 1 0010671f-4f15-4537-9718-42b9723fa6a5         <NA>   1527302419
## 2 0076347c-9722-4f7a-b625-054012885ba0         <NA>   1527545071
## 3 00b7b867-b8d5-45ae-82ea-e3d473cf0b11   1527154121   1527154078
## 4 00b7b867-b8d5-45ae-82ea-e3d473cf0b11   1527154121   1527154078
## 5 00b7b867-b8d5-45ae-82ea-e3d473cf0b11   1527154121   1527154078
## 6 00b7b867-b8d5-45ae-82ea-e3d473cf0b11   1527154121   1527154078
  2. The interval between the last click and the install is 0
head(Affle[[2]] %>% filter(install_time-click_time1==0))[,c(1,2,3,4,5)]
## # A tibble: 6 x 5
##   adid            install_time        click_time1         action1 network1
##   <chr>           <dttm>              <dttm>              <chr>   <chr>   
## 1 00014760-951a-… 2018-05-25 09:48:15 2018-05-25 09:48:15 click   Affle   
## 2 000f17e6-db13-… 2018-05-25 11:17:32 2018-05-25 11:17:32 click   Affle   
## 3 000f17e6-db13-… 2018-05-25 11:17:32 2018-05-25 11:17:32 click   Affle   
## 4 000f17e6-db13-… 2018-05-25 11:17:32 2018-05-25 11:17:32 click   Solo    
## 5 000f17e6-db13-… 2018-05-25 11:17:32 2018-05-25 11:17:32 click   Solo    
## 6 0015e82d-b646-… 2018-05-27 02:58:27 2018-05-27 02:58:27 click   Affle
  3. The install is attributed to this channel, but the last click before the install came from a different channel
head(Affle[[2]] %>% filter(network1 !='Affle'))[,c(1,2,3,4,5)]
## # A tibble: 6 x 5
##   adid            install_time        click_time1         action1 network1
##   <chr>           <dttm>              <dttm>              <chr>   <chr>   
## 1 000f17e6-db13-… 2018-05-25 11:17:32 2018-05-25 11:17:32 click   Solo    
## 2 000f17e6-db13-… 2018-05-25 11:17:32 2018-05-25 11:17:32 click   Solo    
## 3 002f8ca5-0e89-… 2018-05-29 08:54:39 2018-05-29 08:54:38 click   Glispa  
## 4 00679b72-11bc-… 2018-05-27 10:48:31 2018-05-27 10:48:30 click   Glispa  
## 5 007be953-8422-… 2018-05-29 10:34:11 2018-05-29 10:33:58 click   Glispa  
## 6 00cb2fcf-cdda-… 2018-05-27 10:42:19 2018-05-27 10:42:17 click   Glispa
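
The three checks above can be combined into a single per-install flag. The following is a minimal sketch, not the code used to produce the results in this report. The toy data frame `installs` and the helper `flag_fraud()` are hypothetical; only the column names `install_time`, `click_time1`, and `network1` are taken from the output above, and a missing `click_time1` is assumed to mean that no click was recorded.

```r
library(dplyr)

# Hypothetical toy data; column names follow the Affle[[2]] output above.
installs <- tibble(
  adid = c("a", "b", "c", "d"),
  install_time = as.POSIXct(
    c("2018-05-25 09:48:15", "2018-05-25 11:17:32",
      "2018-05-29 08:54:39", "2018-05-27 10:00:00"), tz = "UTC"),
  click_time1 = as.POSIXct(
    c(NA, "2018-05-25 11:17:32",
      "2018-05-29 08:54:38", "2018-05-27 09:58:00"), tz = "UTC"),
  network1 = c(NA, "Affle", "Glispa", "Affle")
)

# Hypothetical helper: adds one logical column per check and a combined flag.
flag_fraud <- function(df, channel) {
  df %>%
    mutate(
      no_click      = is.na(click_time1),                        # check 1
      zero_gap      = !no_click & install_time == click_time1,   # check 2
      other_channel = !no_click & network1 != channel,           # check 3
      suspicious    = no_click | zero_gap | other_channel
    )
}

flagged <- flag_fraud(installs, channel = "Affle")

# Share of installs that trip at least one check
mean(flagged$suspicious)
```

Keeping the three checks as separate columns makes it possible to see which technique dominates for a given channel before collapsing them into one ratio.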

Channels analyzed so far

  • Affle (data from 5/18 to 5/31)

  • Mobvista (data from 5/18 to 5/31)

  • Appnext (data from 5/18 to 5/31)

  • Glispa (data from 5/18 to 5/31)

  • RTBdemand (data from 5/18 to 5/31)

  • Startapp (data from 5/18 to 5/31)

  • Uc ads (data from 5/18 to 5/31)

  • volo (data from 5/18 to 5/31)

  • Solo (data from 5/18 to 5/31)

The analysis results are as follows

d
##         app       bad Quantity
## 1     Affle 0.9200088     9126
## 2  Mobvista 0.9991789    30446
## 3   Appnext 0.9995073    11457
## 4    Glispa 0.9984424    11556
## 5 RTBdemand 0.7400582     5155
## 6  Startapp 0.8432224     6219
## 7     Ucads 0.9990526     2111
## 8      volo 0.9667147     7041
## 9      Solo 1.0000000     8194
# Bar chart of the flagged ("bad") share per channel
p <- ggplot(d, aes(x = app, y = bad)) + geom_bar(stat = 'identity')
p <- p + theme_gdocs() + xlab('Channel') + ylab('Fraud ratio')
ggplotly(p)
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`
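
Because the channels differ considerably in volume, it can help to rank them and to compute a volume-weighted overall ratio. The following is a minimal sketch, assuming `bad` is the share of flagged installs per channel and `Quantity` is the number of installs analyzed (the report does not define these columns explicitly). The data frame is re-entered from the table above so the snippet is self-contained, and the names `ranked` and `overall_bad` are illustrative.

```r
# Values copied from the printed table `d` above.
d <- data.frame(
  app = c("Affle", "Mobvista", "Appnext", "Glispa", "RTBdemand",
          "Startapp", "Ucads", "volo", "Solo"),
  bad = c(0.9200088, 0.9991789, 0.9995073, 0.9984424, 0.7400582,
          0.8432224, 0.9990526, 0.9667147, 1.0000000),
  Quantity = c(9126, 30446, 11457, 11556, 5155, 6219, 2111, 7041, 8194),
  stringsAsFactors = FALSE
)

# Channels ordered from highest to lowest flagged share
ranked <- d[order(-d$bad), ]

# Overall flagged share, weighting each channel by its volume
overall_bad <- weighted.mean(d$bad, d$Quantity)

ranked
overall_bad
```

A weighted mean is used here so that a small channel such as Ucads does not count as much as a large one such as Mobvista.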