February 27, 2017

Slides

Terminology

  • Out of fold prediction
  • Lv1, Lv2, Lv3, …

Stacked Generalization

Data Splitting

  1. Decide the K-Fold split and the subsets in advance
  2. Each member trains Lv1 models independently
    • Step 1, 2, and 3
  3. Collect everyone's predictions
    • Prediction of Model 1.1 on Fold 2
    • Prediction of Model 1.2 on Fold 1
    • Prediction on Test Dataset
  4. Run Stacked Generalization
  5. New Submission
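
The five steps above can be sketched in plain Python. This is only an illustration under assumptions, not the pipeline used in the talk: the toy data, the two Lv1 models (`fit_line`, `fit_mean`), and the Lv2 model (a single blend weight chosen by grid search) are all hypothetical stand-ins. With two folds, the model trained on Fold 1 predicts Fold 2 and vice versa, and both also predict the test set.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Hypothetical Lv1 model A: least-squares line y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    b = my - a * mx
    return lambda x: a * x + b

def fit_mean(xs, ys):
    """Hypothetical Lv1 model B: always predicts the training mean."""
    m = mean(ys)
    return lambda x: m

def mse(preds, ys):
    return mean((p - y) ** 2 for p, y in zip(preds, ys))

def out_of_fold(fit, xs, ys, folds, test_xs):
    """Return out-of-fold predictions for every training row, plus test
    predictions averaged over the per-fold models."""
    oof = [None] * len(xs)
    test_pred = [0.0] * len(test_xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:                      # predict only rows the model never saw
            oof[i] = model(xs[i])
        for j, x in enumerate(test_xs):
            test_pred[j] += model(x) / len(folds)
    return oof, test_pred

def blend(w, pa, pb):
    return [w * a + (1 - w) * b for a, b in zip(pa, pb)]

# Toy data (an assumption for the sketch): y = 2x + 1 plus noise
random.seed(0)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + random.gauss(0, 1) for x in xs]
test_xs = [20.0, 21.0, 22.0]

# Step 1: fix the folds in advance
idx = list(range(len(xs)))
random.shuffle(idx)
folds = [idx[:10], idx[10:]]

# Steps 2-3: Lv1 training and collection of out-of-fold and test predictions
oof_a, test_a = out_of_fold(fit_line, xs, ys, folds, test_xs)
oof_b, test_b = out_of_fold(fit_mean, xs, ys, folds, test_xs)

# Step 4: Lv2 model fitted on the out-of-fold predictions only
grid = [i / 20 for i in range(21)]
best_w = min(grid, key=lambda w: mse(blend(w, oof_a, oof_b), ys))

# Step 5: apply the Lv2 model to the Lv1 test predictions
submission = blend(best_w, test_a, test_b)
```

The Lv2 model never sees a prediction that an Lv1 model made on its own training rows, which is the point of using out-of-fold predictions.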

How to do Local CV on Stacked Generalization?

  1. Fold 1, Fold 2, Fold 3
  2. Local CV
    • Lv 1 Model:
      • Fold 1 + Fold 2 ==> Fold 3 and evaluate
      • Fold 1 + Fold 3 ==> Fold 2 and evaluate
      • Fold 2 + Fold 3 ==> Fold 1 and evaluate
    • Lv 2 Model:
      • Fold 1 + Fold 2 ==> Fold 3 with Stacked Generalization and evaluate
      • Fold 1 + Fold 3 ==> Fold 2 with Stacked Generalization and evaluate
      • Fold 2 + Fold 3 ==> Fold 1 with Stacked Generalization and evaluate
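
One possible reading of the Lv 2 procedure above, sketched with hypothetical toy models: for each held-out fold, the Lv 2 model is fitted on out-of-fold Lv 1 predictions built from the two remaining folds only, then scored on the held-out fold. Retraining the Lv 1 models on both remaining folds before predicting the held-out fold is an assumption of this sketch, as are the names `fit_line`, `fit_mean`, and the grid-searched blend weight used as the Lv 2 model.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Hypothetical Lv1 model A: least-squares line."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return lambda x: a * (x - mx) + my

def fit_mean(xs, ys):
    """Hypothetical Lv1 model B: constant training mean."""
    m = mean(ys)
    return lambda x: m

LV1_FITTERS = [fit_line, fit_mean]

def mse(preds, ys):
    return mean((p - y) ** 2 for p, y in zip(preds, ys))

def lv1_predict(train_rows, target_rows):
    """Train every Lv1 model on train_rows; return one prediction list per model."""
    xs = [x for x, _ in train_rows]
    ys = [y for _, y in train_rows]
    models = [fit(xs, ys) for fit in LV1_FITTERS]
    return [[m(x) for x, _ in target_rows] for m in models]

def blend(w, pa, pb):
    return [w * a + (1 - w) * b for a, b in zip(pa, pb)]

def stacked_cv_score(folds, held_out):
    """Score the stacked model on folds[held_out] using only the other folds."""
    inner = [f for i, f in enumerate(folds) if i != held_out]
    # Out-of-fold Lv1 predictions inside the remaining folds
    oof_a, oof_b, oof_y = [], [], []
    for j, val_rows in enumerate(inner):
        train_rows = [r for i, f in enumerate(inner) if i != j for r in f]
        pa, pb = lv1_predict(train_rows, val_rows)
        oof_a += pa
        oof_b += pb
        oof_y += [y for _, y in val_rows]
    # Lv2: blend weight fitted on the inner out-of-fold predictions
    grid = [i / 20 for i in range(21)]
    w = min(grid, key=lambda c: mse(blend(c, oof_a, oof_b), oof_y))
    # Lv1 retrained on all remaining folds, then stacked and evaluated
    inner_rows = [r for f in inner for r in f]
    pa, pb = lv1_predict(inner_rows, folds[held_out])
    return mse(blend(w, pa, pb), [y for _, y in folds[held_out]])

# Toy data (an assumption for the sketch), split into Fold 1, Fold 2, Fold 3
random.seed(0)
rows = [(float(i), 2.0 * i + 1.0 + random.gauss(0, 1)) for i in range(30)]
random.shuffle(rows)
folds = [rows[0:10], rows[10:20], rows[20:30]]

scores = [stacked_cv_score(folds, k) for k in range(3)]
local_cv = mean(scores)
```

Holding one fold out leaves two folds for building Lv 2 training data, which may be the motivation for needing one more fold than the number of levels.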

Summary

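  • Decide the folds in advance; use at least 3
    • For higher levels, take local CV into account: number of folds = Lv + 1 (?)
  • Everyone uses the same fold split
  • Each member works separately: builds models, submits, and compares local CV vs. the Public Leaderboard on their own
    • The CV split may need to be reassigned
  • Team up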
    • Stacked Generalization…
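
One way to make sure every team member ends up with the same fold split is to derive the fold from a stable row identifier rather than from a random shuffle. The sketch below is an assumption, not the method from the talk: `fold_of` and `split` are hypothetical helpers, and sharing a fixed index file would work as well.

```python
import hashlib

def fold_of(row_id, n_folds=3):
    """Map a row id to a fold. The result depends only on the id, so it is
    the same on every machine and for every row ordering."""
    digest = hashlib.md5(str(row_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % n_folds

def split(row_ids, n_folds=3):
    """Group row ids into n_folds lists according to fold_of."""
    folds = [[] for _ in range(n_folds)]
    for rid in row_ids:
        folds[fold_of(rid, n_folds)].append(rid)
    return folds

folds = split(range(100))
```

Fold sizes are only approximately equal with this approach, which is usually acceptable when the dataset is large.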

Lessons from the Criteo Competition

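  • Think clearly about your own advantage
    • Everyone already knows all of the techniques above
    • Computing power
    • Algorithm implementation
  • Algorithm implementation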
    • Dropout + FTPRL
  • At the time, we used none of the Kaggle techniques above

How To Win?

  • Learn the techniques above
  • Have more domain knowledge than the other competitors

Homework

Next Homework

  • Read a paper and report on it
  • Survey the winners of similar competitions
  • Lv 1 modeling