Coursera Data Science Capstone Project: SwiftKey Input Prediction

Roy Wang
July 14th, 2016

Instructions

The goal of this project is to create a data product that showcases a next-word prediction algorithm. We developed a Shiny app that accepts a phrase (multiple words) in a text box and outputs a prediction of the next word. This presentation covers:

  • Algorithm used to make the prediction

  • Instructions for the app

  • Tools and resources

Algorithm used to make the prediction

The main prediction algorithm of this app is an n-gram model. An n-gram model looks at the last n-1 words to predict the next word. The process was as follows:

  • Cleaned the data (conversion to lowercase; removal of punctuation, links, extra white space, and English stop words)
  • Tokenized the words
  • Aggregated bigram, trigram, and quadgram term frequency matrices into frequency dictionaries
  • Predicted the next word from the text input using these dictionaries
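
The steps above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the app's actual code. The function names, the toy corpus, and the back-off rule (try the longest matching context first, then shorter ones) are assumptions made for illustration, and the word-filtering step is omitted.

```python
import re
from collections import Counter, defaultdict

def clean(text):
    """Lowercase, drop links and punctuation, collapse white space."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove links
    text = re.sub(r"[^a-z' ]+", " ", text)  # keep only letters, apostrophes, spaces
    return re.sub(r"\s+", " ", text).strip()

def tokenize(text):
    """Split cleaned text into word tokens."""
    return clean(text).split()

def build_dictionaries(lines, max_n=4):
    """For n = 2..max_n, map each (n-1)-word context to counts of the next word."""
    dictionaries = {n: defaultdict(Counter) for n in range(2, max_n + 1)}
    for line in lines:
        tokens = tokenize(line)
        for n in range(2, max_n + 1):
            for i in range(len(tokens) - n + 1):
                context = tuple(tokens[i:i + n - 1])
                dictionaries[n][context][tokens[i + n - 1]] += 1
    return dictionaries

def predict_next(phrase, dictionaries, max_n=4):
    """Return the most frequent next word, backing off to shorter contexts."""
    tokens = tokenize(phrase)
    for n in range(max_n, 1, -1):
        if len(tokens) < n - 1:
            continue  # not enough input words for this n-gram order
        context = tuple(tokens[-(n - 1):])
        candidates = dictionaries[n].get(context)
        if candidates:
            return candidates.most_common(1)[0][0]
    return None  # no context was ever seen in the corpus

# Toy corpus (hypothetical); the real app was built from a much larger text set.
corpus = [
    "I love New York City",
    "I love New York pizza",
    "We love New York City",
]
dictionaries = build_dictionaries(corpus)
print(predict_next("They really love new york", dictionaries))
```

With this toy corpus, the quadgram context "love new york" is followed by "city" twice and "pizza" once, so the sketch predicts "city". An input whose words never appear in the corpus returns no prediction.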

Instructions for the app

The app's UI has three parts:

  • Phrase input: enter text into the text area

  • Input display: shows the words that were entered

  • Prediction: displays the predicted next word for the input

Tools and resources