Introduction

Our company recommends ads to users.
We have some features of the users (ip, website, gender, …).
The recommended ad is picked by a model.
The log of shown ads and clicks is used to train the model.
We have about 100,000,000 instances per day.
We have about 10,000 features, all of them categorical.

Model

Take logistic regression as an example:

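\[ P(\text{is\_click} \mid \text{ad}, \text{website}, \text{ip}, \ldots) = \operatorname{logit}^{-1}\left(b + w_{\text{ad}} \cdot \text{ad} + w_{\text{website}} \cdot \text{website} + w_{\text{ip}} \cdot \text{ip} + \cdots\right) \]

Here \(\operatorname{logit}^{-1}(z) = 1 / (1 + e^{-z})\) is the logistic (inverse logit) function, which maps the linear score to a probability.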
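As a rough illustration of the formula above, the following sketch assumes each categorical feature is one-hot encoded, so each term reduces to looking up one weight per (feature, value) pair. The function names, weights, and feature values are invented for illustration and are not taken from any real model.

```python
import math

def sigmoid(z):
    """Inverse logit: maps a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_click(bias, weights, instance):
    """Estimate P(is_click | instance) for one-hot categorical features.

    weights:  dict mapping (feature_name, feature_value) -> weight
    instance: dict mapping feature_name -> feature_value
    Unseen (feature, value) pairs contribute a weight of 0.
    """
    score = bias + sum(weights.get((name, value), 0.0)
                       for name, value in instance.items())
    return sigmoid(score)

# Hypothetical weights, for illustration only.
weights = {
    ("ad", "ad_1"): 0.8,
    ("ad", "ad_2"): -0.3,
    ("website", "news.example"): 0.2,
    ("ip", "10.0.0.1"): -0.1,
}
instance = {"ad": "ad_1", "website": "news.example", "ip": "10.0.0.1"}
p = predict_click(-2.0, weights, instance)
```

With one-hot encoding, the 10,000 categorical features expand into one weight per distinct value, which is why a sparse lookup like the dictionary above is a natural representation.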

We can't control website, ip, …
We can control the ad.
For 99% of instances, the ad is picked to maximize the predicted click probability according to the existing model.
For the other 1% of instances, can we design a way to select the ad so that the model's estimates become better?
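The 99%/1% split described above could be sketched as follows. This is only a minimal illustration of the setup, not a proposed answer: the exploration branch draws an ad uniformly at random as a placeholder for whatever selection design is eventually chosen, and `choose_ad`, `predict`, and the toy scores are hypothetical names and values.

```python
import random

def choose_ad(context, ads, predict, rng, explore_rate=0.01):
    """Return (ad, explored) for one instance.

    predict(context, ad) -> estimated click probability (hypothetical callable).
    With probability explore_rate the ad is drawn uniformly at random,
    a placeholder for a better exploration design. Otherwise the ad with
    the highest predicted click probability is returned.
    """
    if rng.random() < explore_rate:
        return rng.choice(ads), True
    return max(ads, key=lambda ad: predict(context, ad)), False

# Toy usage with made-up scores standing in for the existing model.
toy_scores = {"ad_1": 0.05, "ad_2": 0.12, "ad_3": 0.08}
rng = random.Random(0)
picks = [choose_ad({}, list(toy_scores), lambda ctx, ad: toy_scores[ad], rng)
         for _ in range(10_000)]
explored_share = sum(explored for _, explored in picks) / len(picks)
```

Returning the `explored` flag lets the log record which instances came from the exploration slice, so they can be treated separately when the model is retrained.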