NXT Word Prediction App

  • Data Science Capstone Project
  • April 2015

Introduction

The Coursera Data Science Capstone project was run in partnership with SwiftKey. SwiftKey builds smart-keyboard predictive text models that make it easier for people to type on their mobile devices. The goals of our project:

  • Use the provided data files containing Twitter, blog, and news text in multiple languages (only the English data was used)
  • Sample the relevant data intelligently
  • Tokenize the text and filter out profanity
  • Analyze word frequencies and word pairs in more depth
  • Build n-gram models
  • Build and evaluate a variety of predictive models
  • Build a data product to showcase our predictive algorithm
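
The tokenizing, profanity filtering, and n-gram counting steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the function names, the regular expression, the sample sentences, and the tiny placeholder blocklist are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical placeholder blocklist; the project's real profanity list is not given.
BLOCKED_WORDS = {"darn", "heck"}


def tokenize(text):
    """Lowercase the text and extract word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def remove_blocked(tokens, blocked=BLOCKED_WORDS):
    """Drop every token that appears in the blocklist."""
    return [t for t in tokens if t not in blocked]


def count_ngrams(lines, n):
    """Count all n-grams of length n, line by line, after filtering."""
    counts = Counter()
    for line in lines:
        tokens = remove_blocked(tokenize(line))
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Made-up sample lines standing in for the sampled Twitter, blog, and news text.
sample = ["This is so much fun", "This is so much better", "Heck this is so cool"]
unigrams = count_ngrams(sample, 1)
bigrams = count_ngrams(sample, 2)
```

Counting per line keeps n-grams from spanning unrelated tweets or sentences, which is one reasonable design choice among several.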

NXT Word Prediction Language Modeling

  • Used the Stupid Backoff language model
  • Also tried Kneser-Ney smoothing and Interpolated Kneser-Ney smoothing
  • Interpolated Kneser-Ney smoothing performed better than modified Kneser-Ney smoothing
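
For readers unfamiliar with Stupid Backoff, here is a minimal Python sketch of the scoring idea: use the relative frequency of the longest matching n-gram, and multiply by a fixed factor each time the context has to be shortened. The backoff factor of 0.4 follows the value suggested in the original Stupid Backoff paper (Brants et al., 2007); whether the app uses the same value is an assumption, as are the function names, the trigram limit, and the toy corpus.

```python
from collections import Counter

ALPHA = 0.4  # backoff factor from Brants et al. (2007); assumed, not taken from the app


def build_counts(sentences, max_n=3):
    """Count every n-gram of length 1..max_n in a single Counter."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def score(word, context, counts, total_unigrams):
    """Stupid Backoff score of `word` following `context` (a sequence of words)."""
    context = tuple(context)
    factor = 1.0
    while context:
        full = counts[context + (word,)]
        if full > 0:
            # The n-gram was seen: use its relative frequency, discounted by any backoffs.
            return factor * full / counts[context]
        context = context[1:]  # drop the oldest word and back off
        factor *= ALPHA
    return factor * counts[(word,)] / total_unigrams


def predict(text, counts, max_n=3, top_k=3):
    """Return the top_k highest-scoring next words for the given text."""
    tokens = text.lower().split()
    context = tokens[-(max_n - 1):] if max_n > 1 else []
    vocab = [gram[0] for gram in counts if len(gram) == 1]
    total = sum(counts[(w,)] for w in vocab)
    # Sort by descending score, breaking ties alphabetically for a stable result.
    return sorted(vocab, key=lambda w: (-score(w, context, counts, total), w))[:top_k]


# Toy corpus made up for illustration only.
corpus = ["this is so much fun", "this is so much better", "so much fun today"]
counts = build_counts(corpus)
top3 = predict("This is so much", counts)
```

The scores are not normalized probabilities, which is what makes the method cheap: no discounting or normalization pass over the counts is required.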

NXT Word Prediction App - Instructions

Using the NXT Word Prediction App is as easy as 1 2 …

  • When you first start the App, it shows the text “This is so much” and its top 3 picks for the next word.
  • The whole App takes about 5-7 seconds to load.
  • You can then press the Reset button and enter your own text, and the App will predict its top 3 picks for the next word.
  • Or you can press the Reset button, select the number of Words to Generate, and press the Generate button to see the results.

Next Step - NXT Word App Description

The Shiny App has one text field for entering one or more words. Based on the prediction model developed, it then shows the top 3 picks for the next word.

Here is the link to the App