Data Science Project: Swiftkey Text Prediction

Azrul Syaffq
25 April 2016

Concept

  • Prediction Algorithm
    • Efficient Modelling - Markov Chain / Katz back-off model
    • Cleaned/Compressed Datasets - 40k data
  • Instructions
    • Input - Sentence (truncate the last 1~4 words)
    • Output - Top frequency words and Wordcloud visualization
  • Experience of Application
    • User Interface (UI) - Shiny Apps

Prediction Method

  • Markov Chain
    A mathematical system that undergoes transitions from one state to another on a state space. In an n-gram model, the next word depends only on the preceding few words.
  • Katz back-off
    A generative n-gram language model that estimates the conditional probability of a word given its history, backing off to a shorter history when the longer n-gram has not been observed.
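
The back-off idea can be sketched in a few lines of base R. This is a simplified illustration, not the app's actual code: it uses raw counts and omits the discounting step that full Katz back-off applies, and the toy corpus and the function names count_ngrams and predict_next are assumptions made for the example.

```r
# Toy corpus (assumed); the app's real cleaned dataset is not shown here.
corpus <- c("i love data science",
            "i love text prediction",
            "we love data")

# Count n-grams; each key has the form "context|next_word".
count_ngrams <- function(sentences, n) {
  keys <- character(0)
  for (s in sentences) {
    w <- strsplit(tolower(trimws(s)), "\\s+")[[1]]
    w <- w[nzchar(w)]
    if (length(w) < n) next
    for (i in seq_len(length(w) - n + 1)) {
      gram <- w[i:(i + n - 1)]
      context <- paste(head(gram, n - 1), collapse = " ")
      keys <- c(keys, paste(context, gram[n], sep = "|"))
    }
  }
  table(keys)
}

# models[[1]] = unigrams, models[[2]] = bigrams, models[[3]] = trigrams
models <- lapply(1:3, function(n) count_ngrams(corpus, n))

# Try the longest context first, then back off to shorter ones.
predict_next <- function(models, history, top = 3) {
  w <- strsplit(tolower(trimws(history)), "\\s+")[[1]]
  w <- w[nzchar(w)]
  for (n in length(models):1) {
    k <- n - 1                       # number of context words needed
    if (length(w) < k) next
    context <- paste(tail(w, k), collapse = " ")
    tab <- models[[n]]
    hit <- tab[startsWith(names(tab), paste0(context, "|"))]
    if (length(hit) > 0) {
      hit <- sort(hit, decreasing = TRUE)
      return(head(sub("^.*\\|", "", names(hit)), top))
    }
  }
  character(0)
}

print(predict_next(models, "we love"))   # trigram context found
print(predict_next(models, "you love"))  # backs off to the bigram "love"
```

Keying the counts by "context|next_word" keeps the lookup to a simple prefix match, which is enough for a small example; a larger model would more likely use an indexed table.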

Instructions

Input

Users key in text in the text field. The system extracts the last 1 to 3 words and uses them as input to the prediction algorithm.
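
A minimal sketch of this extraction step in base R follows. The cleaning rules (lower-casing, dropping everything except letters and apostrophes) and the function name last_words are assumptions for illustration; the app's actual preprocessing is not described here.

```r
# Return up to max_words trailing words of the typed text.
last_words <- function(text, max_words = 3) {
  # Replace anything that is not a letter, apostrophe, or space with a space.
  cleaned <- tolower(gsub("[^[:alpha:]' ]", " ", text))
  words <- strsplit(trimws(cleaned), "\\s+")[[1]]
  words <- words[nzchar(words)]      # drop empty tokens
  tail(words, max_words)
}

print(last_words("I would like to, eat!"))
print(last_words("Hello"))
```

When fewer than three words have been typed, the function simply returns what is available, so the predictor can start from a one- or two-word context.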

Output

  • Word prediction
  • Wordcloud
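
The two outputs could be produced along these lines. The candidate counts are made-up values for illustration, top_candidates is a hypothetical helper, and the word cloud is only drawn if the wordcloud package happens to be installed.

```r
# Hypothetical candidate counts; real values would come from the n-gram model.
candidates <- c(data = 12, science = 7, mining = 5, set = 3, base = 1)

# Rank candidates by frequency and keep the top few.
top_candidates <- function(freq, top = 3) {
  freq <- sort(freq, decreasing = TRUE)
  ranked <- data.frame(word = names(freq),
                       freq = as.numeric(freq),
                       stringsAsFactors = FALSE)
  head(ranked, top)
}

print(top_candidates(candidates))

# Optional visualization: word size reflects frequency.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(words = names(candidates), freq = candidates,
                       min.freq = 1, random.order = FALSE)
}
```

Setting random.order = FALSE places the most frequent words first, which keeps the top prediction near the centre of the cloud.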

Experience of App

App Structure

Built with R, RStudio, and Shiny.

User Interface

[Figure: UI design]