June 19, 2017

Optimized Cost per Click in Taobao Display Advertising Zhu et al. (2017)

Problem Definition

  • Advertiser demand:
    • ROI
    • Gaining quality traffic
  • Platform ecology index
    • GMV (gross merchandise volume)
      • GMV does not include discounts, costs involved and returns of products.

Modeling ( Buyer Side )

  • expected GMV for single click is \(p(c | u, a) \times v_a\)
  • Assumes that the cost of a click paid by the advertiser is \(b_a\)
    • \(roi_{(u, a)} = \frac{p(c|u,a) \times v_a}{b_a}\)
    • \(roi_a = \frac{v_a \times \sum_{u}{n_u \times p(c | u,a)}}{b_a \times \sum_{u}{n_u}} = \frac{E_u[p(c|u,a)] \times v_a}{b_a}\)
    • \(roi_a\) should keep unchanged or be improved (ROI constraint)

Notations

  • Conversion: \(p(c|u, a)\) where \(u\) is a user and \(a\) is a clicked ad
  • \(v_a\) is the predicted pay-per-buy(PPB) by consumers

Modeling ( Buyer Side )

  • ROI constraint implies:
    • Raise the bid if \(\frac{p(c|u,a)}{E_u[p(c|u,a)]} \geq 1\)
    • Depress the bid if \(\frac{p(c|u,a)}{E_u[p(c|u,a)]} < 1\)

Modeling ( Seller Side )

\[max_{b^*_1, ..., b^*_n} f(k,b^*_k)\] such that

  • \(k = argmax_i {pctr_i \times b^*_i}\) (seller side)
  • \(l(b^*_i) \leq b^*_i \leq u(b^*_i) \forall i \in A\) (buyer side)

Model Estimation

  • Gai et al. (2017)
  • Calibration
    • The estimation is still biased

Model Evaluation

Group AUC

The data is grouped by user and position of ad spot

\[GAUC = \frac{\sum_{u,p}{w_{(u,p)} \times AUC_{u,p}}}{\sum_{(u,p)}{w_{(u,p)}}}\]

  • \(w_{(u,p)}\) is proportional to impression times or click times

Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction Gai et al. (2017)

Formulation

\[p(y = 1 | x) = g(\sum_{j=1}^m {\sigma(u_j^T x) \times \eta(w_j^T x)})\]

  • \(\Theta = \left\{u_1, ..., um, w_1, ..., w_m\right\} \in \mathbb{R}^{d \times 2m}\) is the model parameters
  • \(\sigma\) is the dividing function, divides feature space into \(m\)(hyper-parameter) different regions
  • \(\eta\) is the fitting function, gives prediction in each region
  • \(g\) ensures our model to satisfy the definition of probability

Formulation

\[p(y=1|x) = \sum_{i=1}^m \frac{exp(u^T_i x)}{\sum_{j=1}^m{exp(u^T_j x)}} \times \frac{1}{1 + exp(-w_i^T x)}\]

Special Case

  • \(\sigma\) is softmax
  • \(\eta\) is sigmoid

Objective Function

\[argmin_{\Theta} f(\Theta) = loss(\Theta) + \lambda \left\lVert \Theta \right\rVert_{2,1} + \beta \left\lVert \Theta \right\rVert_1\]

Regularization

  • \(loss(\Theta)\) : logloss ( negative log likelihood)
  • \(\left\lVert \Theta \right\rVert_{2,1} = \sum_{i} \sqrt{\sum_j{\theta^2_{i,j}}}\): feature selection
  • \(\left\lVert \Theta \right\rVert_1 = \sum_{i,j} \left|\theta_{i,j}\right|\): sparsity

Optimization

  • replace gradient with descent direction (It might be subgradient?)
  • Find direction first
  • Use approximated inverse-Hessian matrix to compute the step size. (Proposed by LBFGS)
  • LBFGS

Computation

Data Structure (Common Feature Trick)

  • In a page view, the user feature is the same with several ad features…

Experienments

Only compare to LR…..

Reference

Gai, K., X. Zhu, H. Li, K. Liu, and Z. Wang. 2017. “Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction.” ArXiv E-Prints, April.

Zhu, Han, Junqi Jin, Chang Tan, Fei Pan, Yifan Zeng, Han Li, and Kun Gai. 2017. “Optimized Cost Per Click in Taobao Display Advertising.” CoRR abs/1703.02091. http://arxiv.org/abs/1703.02091.