Coursera Data Science Capstone Project
Howard Tsang
8th September 2017
This is an introduction to my application for predicting the next word you may type.
The application is the capstone project for the Coursera Data Science specialization, taught by professors at Johns Hopkins University in cooperation with SwiftKey.
Prediction Algorithm
A Markov chain is the primary modeling algorithm.
The tm package is used to clean and compress the raw text.
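The kind of cleaning mentioned above can be sketched as follows. The application itself does this in R with the tm package; this is only a Python illustration, and the specific steps (lowercasing, dropping digits and punctuation, collapsing whitespace) as well as the function name clean_text are assumptions, not the project's actual pipeline.

```python
import re

def clean_text(raw: str) -> str:
    """Normalize raw text before counting word sequences (hypothetical helper)."""
    text = raw.lower()                      # fold everything to lower case
    text = re.sub(r"[0-9]+", " ", text)     # drop digits
    text = re.sub(r"[^a-z'\s]", " ", text)  # drop punctuation, keep apostrophes
    text = re.sub(r"\s+", " ", text)        # collapse runs of whitespace
    return text.strip()

print(clean_text("Hello,   World! It's 2017."))  # prints: hello world it's
```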
Instructions
The user's input sentence is truncated to its last 1 to 3 words.
The application outputs the top two to five predicted words.
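The two steps above can be sketched in Python as follows. The function names last_words and top_suggestions, and the use of a Counter of candidate counts, are assumptions for illustration; the real application is written in R and may rank candidates differently.

```python
from collections import Counter

def last_words(sentence: str, max_words: int = 3) -> tuple:
    """Keep only the last 1 to 3 words of the input (hypothetical helper)."""
    max_words = max(1, min(max_words, 3))  # clamp to the 1..3 range
    words = sentence.lower().split()
    return tuple(words[-max_words:])

def top_suggestions(candidates: Counter, n: int = 3) -> list:
    """Return the n most frequent candidate words, with n clamped to 2..5."""
    n = max(2, min(n, 5))
    return [word for word, _ in candidates.most_common(n)]

context = last_words("I would like to go to the")
# Made-up counts of words seen after this context, for illustration only.
candidates = Counter({"store": 5, "park": 3, "beach": 2, "moon": 1})
print(context)                         # ('go', 'to', 'the')
print(top_suggestions(candidates, 2))  # ['store', 'park']
```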
Experience of Application
The user interface is built with Shiny.
Markov Chain
A Markov chain model represents each word as a state with transition probabilities. For example, word “B” has a 30% chance of transitioning to word “A”.
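A minimal sketch of how such transition probabilities could be estimated from text, assuming a first-order chain and simple relative frequencies. The toy corpus and the function name transition_probabilities are made up for illustration and are not taken from the application.

```python
from collections import Counter, defaultdict

def transition_probabilities(words):
    """Estimate P(next word | current word) from adjacent word pairs."""
    counts = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1  # count each observed transition
    probs = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        probs[current] = {w: c / total for w, c in followers.items()}
    return probs

# Toy corpus: "the" is followed by "cat" twice and by "mat" once.
corpus = "the cat sat on the mat the cat ran".split()
probs = transition_probabilities(corpus)
print(probs["the"])
```

Here "the" moves to "cat" with probability 2/3 and to "mat" with probability 1/3, which is the same kind of statement as the 30% example above.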
A Markov chain model assumes that the probability of the next word depends only on the previous k words. It can therefore describe systems that follow a chain of linked events, where what happens next depends only on the current state of the system.
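In symbols, the assumption above can be written as follows, where w_n is the next word and k is the number of previous words used (1 to 3 in this application). The second line, which estimates the probability as a ratio of counts C(...) observed in the training text, is a common choice and is an assumption here, not a statement of the application's exact method.

```latex
P(w_n \mid w_1, \dots, w_{n-1}) \approx P(w_n \mid w_{n-k}, \dots, w_{n-1})

\hat{P}(w_n \mid w_{n-k}, \dots, w_{n-1}) = \frac{C(w_{n-k}, \dots, w_{n-1}, w_n)}{C(w_{n-k}, \dots, w_{n-1})}
```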
Users type a sentence into the app's text field and set the number of next-word suggestions the system will return. The app keeps only the last 1 to 3 words as input to the prediction algorithm.
The next-word prediction is shown in the Output panel.
Give it a try here!