Coursera Data Science Capstone Project

Next Word Prediction Application

Howard Tsang
8th September 2017

This is an introduction to my application for predicting the next word you may type.

The application is the capstone project for the Coursera Data Science specialization, taught by professors at Johns Hopkins University in cooperation with SwiftKey.

Objective

  • Prediction Algorithm

    A Markov chain is the primary modeling algorithm.

    The tm package is used to clean and compress the raw text.

  • Instructions

    The user's input sentence is truncated to its last 1 to 3 words.

    The application outputs the top two to five predicted words.

  • Experience of Application

    The user interface is built with Shiny.

Prediction Algorithm

  • Markov Chain

    A Markov chain model represents each word as a state with transition probabilities to other words; for example, word “B” may have a 30% chance of transitioning to word “A”.

    A Markov chain model assumes that the probability of the next word depends only on the previous k words. It can thus describe systems that follow a chain of linked events, where what happens next depends only on the current state of the system.
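
The idea of words as states with transition probabilities can be sketched in a few lines. This is only an illustration in Python, not the application's own code (which is built in R); the function names `build_transitions` and `transition_probabilities` and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict


def build_transitions(tokens, k=1):
    """Count how often each word follows each k-word state."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - k):
        state = tuple(tokens[i:i + k])      # the previous k words
        counts[state][tokens[i + k]] += 1   # the word that followed them
    return counts


def transition_probabilities(counts, state):
    """Turn the raw follower counts of one state into probabilities."""
    followers = counts.get(tuple(state), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}                           # state never seen in the corpus
    return {word: n / total for word, n in followers.items()}


# Toy corpus, made up for this example.
tokens = "i like tea i like coffee i drink tea".split()
counts = build_transitions(tokens, k=1)
probs = transition_probabilities(counts, ["like"])
# "like" is followed once by "tea" and once by "coffee",
# so each transition gets probability 0.5.
```

With k=1 each state is a single word; a larger k makes the state the previous k words, which matches the assumption described above.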

Instructions

Users type a sentence into the app's text field and set the number of next words the system will suggest. The app truncates the sentence to its last 1 to 3 words and uses them as input to the prediction algorithm.

The next-word predictions are shown in the Output panel.
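
The flow described above (truncate the input, look up the context, return the top suggestions) might look roughly like the sketch below. It is written in Python for illustration and is not the app's actual code. The names `build_model` and `predict_next` are hypothetical, and trying the longest context first before falling back to shorter ones is an assumption; the description above only states that the last 1 to 3 words are used.

```python
from collections import Counter, defaultdict

MAX_CONTEXT = 3  # the app uses the last 1 to 3 words of the sentence


def build_model(tokens, max_context=MAX_CONTEXT):
    """Count next words for every context of 1..max_context words."""
    model = defaultdict(Counter)
    for k in range(1, max_context + 1):
        for i in range(len(tokens) - k):
            model[tuple(tokens[i:i + k])][tokens[i + k]] += 1
    return model


def predict_next(model, sentence, n=3, max_context=MAX_CONTEXT):
    """Return up to n suggested next words for the given sentence."""
    words = sentence.lower().split()
    # Assumption: try the longest available context first, then shorten it.
    for k in range(min(max_context, len(words)), 0, -1):
        context = tuple(words[-k:])         # truncate to the last k words
        if context in model:
            return [word for word, _ in model[context].most_common(n)]
    return []                               # nothing matched


# Toy corpus, made up for this example.
model = build_model("i like tea i like coffee i like tea".split())
suggestions = predict_next(model, "Today I like", n=2)
# The three-word context is unseen, so the lookup falls back to
# ("i", "like"), which was followed by "tea" twice and "coffee" once.
```

Here `n` plays the role of the number of suggestions the user selects in the app.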

Next?

Give it a try here!