August 26, 2018

Summary

  • What is Embedding Layer
  • Related work of Embedding Layer in KDD 2018

Embedding Layer

  • A common layer used for categorical features
  • The name comes from Keras: https://keras.io/layers/embeddings/
    • Possible aliases: representation, factorization machine, factor analysis, …

Example

  • Categorical feature: nominal or ordinal measurement.
    • Nominal: sex
    • Ordinal: GPA
  • Sex Input: Female, Male, Unknown
  • Common dummy (one-hot) coding:
    • Female: \((1, 0, 0)\)
    • Male: \((0, 1, 0)\)
    • Unknown: \((0, 0, 1)\)
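
A minimal sketch of the dummy coding above, in plain Python. The names `LEVELS` and `one_hot` are hypothetical and chosen only for illustration; the level order follows the slide.

```python
# Levels of the categorical feature, in the order used on the slide.
LEVELS = ["Female", "Male", "Unknown"]


def one_hot(value, levels=LEVELS):
    """Return the dummy (one-hot) coding of `value` as a tuple of 0/1."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return tuple(1 if level == value else 0 for level in levels)


for level in LEVELS:
    print(level, one_hot(level))
```

Each level gets a vector with a single 1, so the vector length grows with the number of levels.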

Example

  • Embedding Layer: a \((k, \text{number of levels})\) matrix, e.g. with \(k = 5\) and 3 levels:

\[\begin{array}{ccc} .1 & .2 & .3 \\ .2 & .3 & .4 \\ .3 & .4 & .5 \\ .4 & .5 & .6 \\ .5 & .6 & .7 \\ \end{array}\]

  • Maps the categorical feature into a \(k\)-dimensional vector.
    • If the categorical features are essential to your data, use the embedding layer as the first layer of the deep learning model.
  • Real example: word embeddings
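
The mapping can be sketched in plain Python, assuming the example matrix from the slide and the level order Female, Male, Unknown. The names `E`, `embed`, and `embed_via_one_hot` are hypothetical. In a real model the matrix entries are learned during training rather than fixed by hand.

```python
# The (k, number of levels) matrix from the slide: k = 5 rows, 3 columns.
E = [
    [0.1, 0.2, 0.3],
    [0.2, 0.3, 0.4],
    [0.3, 0.4, 0.5],
    [0.4, 0.5, 0.6],
    [0.5, 0.6, 0.7],
]
LEVELS = ["Female", "Male", "Unknown"]


def embed(value):
    """Look up the k-dimensional vector of `value`: column j of E."""
    j = LEVELS.index(value)
    return [row[j] for row in E]


def embed_via_one_hot(value):
    """Same vector, computed as the matrix-vector product of E and the one-hot code."""
    x = [1.0 if level == value else 0.0 for level in LEVELS]
    return [sum(e * xi for e, xi in zip(row, x)) for row in E]


print(embed("Male"))
print(embed_via_one_hot("Male"))
```

Both functions return the same vector: multiplying the matrix by a one-hot code simply selects one column, which is why an embedding layer can be implemented as a table lookup.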

Trend

  • The embedding layer has become a research topic in its own right.
    • Network Embedding
    • Interpretation
    • Others
  • In KDD 2018, 22 papers have "embedding" in the title: 18 in the research track and 4 in the applied data science track.
  • In KDD 2018, 8 papers have "representation" in the title: 5 in the research track and 3 in the applied data science track.

Network Embedding (NE)

Source: Cui et al. (A survey on network embedding, 2017)

Many papers in the KDD 2018 research track are related to NE

Liu et al. (Content to node, 2018a)

  • Issues of "network embedding":
    • Content and structure representations are learned separately and require post-processing to combine (suboptimal).
    • Structure information requires a neighborhood scope, which is hard to decide for a complex problem.
  • Proposed solution: network embedding based on a sequence-to-sequence model
    • content seq. –> node seq.

Representation / Feature Extraction Is a Main Research Topic of the Applied DS Track

  • Chen et al. (Scalable optimization for embedding highly-dynamic and recency-sensitive data, 2018b)
  • Liang et al. (Dynamic embeddings for user profiling in twitter, 2018)
  • Grbovic and Cheng (Real-time personalization using embeddings for search ranking at airbnb, 2018)
  • Bai et al. (Scalable query n-gram embedding for improving matching and relevance in sponsored search, 2018)
  • Wang et al. (Billion-scale commodity embedding for e-commerce recommendation in alibaba, 2018)

All Papers of KDD 2018 (@arbor.ee.ntu.edu.tw is required)

Some Pictures

Thank You for Listening

Reference

Bai, Xiao; Ordentlich, Erik; Zhang, Yuanyuan; Feng, Andy; Ratnaparkhi, Adwait; Somvanshi, Reena; Tjahjadi, Aldi [Scalable query n-gram embedding for improving matching and relevance in sponsored search; 2018]: Scalable query n-gram embedding for improving matching and relevance in sponsored search, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 52–61, available at: http://doi.acm.org/10.1145/3219819.3219897.

Chen, Hongxu; Yin, Hongzhi; Wang, Weiqing; Wang, Hao; Nguyen, Quoc Viet Hung; Li, Xue [PME; 2018a]: PME: Projected metric embedding on heterogeneous networks for link prediction, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 1177–1186, available at: http://doi.acm.org/10.1145/3219819.3219986.

Chen, Xumin; Cui, Peng; Yi, Lingling; Yang, Shiqiang [Scalable optimization for embedding highly-dynamic and recency-sensitive data; 2018b]: Scalable optimization for embedding highly-dynamic and recency-sensitive data, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 130–138, available at: http://doi.acm.org/10.1145/3219819.3219898.

Cui, Peng; Wang, Xiao; Pei, Jian; Zhu, Wenwu [A survey on network embedding; 2017]: A survey on network embedding, in: CoRR, vol. abs/1711.08752, available at: http://arxiv.org/abs/1711.08752.

Donnat, Claire; Zitnik, Marinka; Hallac, David; Leskovec, Jure [Learning structural node embeddings via diffusion wavelets; 2018]: Learning structural node embeddings via diffusion wavelets, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 1320–1329, available at: http://doi.acm.org/10.1145/3219819.3220025.

Gao, Hongchang; Huang, Heng [Self-paced network embedding; 2018]: Self-paced network embedding, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 1406–1415, available at: http://doi.acm.org/10.1145/3219819.3220041.

Grbovic, Mihajlo; Cheng, Haibin [Real-time personalization using embeddings for search ranking at airbnb; 2018]: Real-time personalization using embeddings for search ranking at airbnb, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 311–320, available at: http://doi.acm.org/10.1145/3219819.3219885.

Liang, Shangsong; Zhang, Xiangliang; Ren, Zhaochun; Kanoulas, Evangelos [Dynamic embeddings for user profiling in twitter; 2018]: Dynamic embeddings for user profiling in twitter, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 1764–1773, available at: http://doi.acm.org/10.1145/3219819.3220043.

Liu, Jie; He, Zhicheng; Wei, Lai; Huang, Yalou [Content to node; 2018a]: Content to node: Self-translation network embedding, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 1794–1802, available at: http://doi.acm.org/10.1145/3219819.3219988.

Liu, Ninghao; Huang, Xiao; Li, Jundong; Hu, Xia [On interpretation of network embedding via taxonomy induction; 2018b]: On interpretation of network embedding via taxonomy induction, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 1812–1820, available at: http://doi.acm.org/10.1145/3219819.3220001.

Liu, Zemin; Zheng, Vincent W.; Zhao, Zhou; Li, Zhao; Yang, Hongxia; Wu, Minghui; Ying, Jing [Interactive paths embedding for semantic proximity search on heterogeneous graphs; 2018c]: Interactive paths embedding for semantic proximity search on heterogeneous graphs, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 1860–1869, available at: http://doi.acm.org/10.1145/3219819.3219953.

Ma, Jianxin; Cui, Peng; Wang, Xiao; Zhu, Wenwu [Hierarchical taxonomy aware network embedding; 2018]: Hierarchical taxonomy aware network embedding, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 1920–1929, available at: http://doi.acm.org/10.1145/3219819.3220062.

Shi, Yu; Zhu, Qi; Guo, Fang; Zhang, Chao; Han, Jiawei [Easing embedding learning by comprehensive transcription of heterogeneous information networks; 2018]: Easing embedding learning by comprehensive transcription of heterogeneous information networks, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 2190–2199, available at: http://doi.acm.org/10.1145/3219819.3220006.

Tu, Ke; Cui, Peng; Wang, Xiao; Yu, Philip S.; Zhu, Wenwu [Deep recursive network embedding with regular equivalence; 2018]: Deep recursive network embedding with regular equivalence, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 2357–2366, available at: http://doi.acm.org/10.1145/3219819.3220068.

Wang, Jizhe; Huang, Pipei; Zhao, Huan; Zhang, Zhibo; Zhao, Binqiang; Lee, Dik Lun [Billion-scale commodity embedding for e-commerce recommendation in alibaba; 2018]: Billion-scale commodity embedding for e-commerce recommendation in alibaba, in: Proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, KDD ’18. New York, NY, USA: ACM, pp. 839–848, available at: http://doi.acm.org/10.1145/3219819.3219869.