Next Word Prediction Application

John Letteboer
May 17 2019

Introduction

Final application submission for the Johns Hopkins Coursera Data Science Capstone

  • The Next Word Prediction Application was built from the SwiftKey datasets of Twitter, news, and blog text
  • It applies an N-gram language model to the input sentence and tries to “guess” the next word
  • The prediction model is a 5-gram (pentagram) Stupid Backoff model
  • One limitation: I was only able to use 10% of the data
  • The application shows the top 8 candidate words
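
As a rough illustration of the N-gram counting that such a model relies on, the sketch below counts every N-gram of order 1 to 5 in a tiny toy corpus. It is written in Python for illustration only and is not the application's own code. The function names (`ngrams`, `count_ngrams`), the whitespace tokenizer, and the three example sentences are all assumptions made for this sketch.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples found in `tokens`."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(sentences, max_n=5):
    """Count every n-gram of order 1..max_n (hypothetical helper)."""
    counts = Counter()
    for sentence in sentences:
        # Assumption: lowercase + whitespace split stands in for real cleaning.
        tokens = sentence.lower().split()
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    return counts


# Toy corpus, invented for this example.
corpus = ["thanks for the follow", "thanks for the help", "thanks for coming"]
counts = count_ngrams(corpus)

print(counts[("thanks", "for")])         # 3
print(counts[("thanks", "for", "the")])  # 2
```

The ratio of these two counts (2/3) is the relative frequency with which "the" follows "thanks for" in the toy corpus, which is the kind of quantity an N-gram predictor ranks candidates by.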

How to use the application

The application is straightforward: the user only needs to enter a string of text, and the application will try to predict the next word.

[Figure: screenshot of the application]

The Stupid Backoff model

  • N-gram model with “Stupid Backoff” (Brants et al., 2007)
  • Checks whether the highest-order N-gram (in this case n=5) has been seen. If not, it backs off to a lower-order model (n=4, then n=3, then n=2)
    [Figure: backoff flow diagram]
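
The backoff steps above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the application's actual implementation. The names `build_counts` and `predict`, the whitespace tokenizer, and the toy corpus are hypothetical. The backoff factor of 0.4 is the value suggested by Brants et al. (2007); the source slides do not say which factor the application uses.

```python
from collections import Counter

ALPHA = 0.4  # backoff factor suggested by Brants et al. (2007); assumed here


def build_counts(sentences, max_n=5):
    """Count all n-grams of order 1..max_n (hypothetical helper)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def predict(text, counts, max_n=5, top_k=8, alpha=ALPHA):
    """Rank next-word candidates with a Stupid Backoff style score."""
    tokens = text.lower().split()
    max_ctx = min(max_n - 1, len(tokens))  # longest usable context
    scores = {}
    # Longest context first (n=5), then shorter ones down to bigrams (n=2).
    for ctx_len in range(max_ctx, 0, -1):
        context = tuple(tokens[-ctx_len:])
        seen = counts[context]
        if seen == 0:
            continue  # context never observed: back off further
        # Each backoff step multiplies the score by alpha.
        penalty = alpha ** (max_ctx - ctx_len)
        # Linear scan for brevity; a real table would be indexed by context.
        for gram, c in counts.items():
            if len(gram) == ctx_len + 1 and gram[:ctx_len] == context:
                # setdefault keeps the score from the highest order that matched.
                scores.setdefault(gram[-1], penalty * c / seen)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy corpus, invented for this example.
corpus = [
    "thanks for the follow",
    "thanks for the help",
    "thanks for the follow back",
    "see you at the party",
]
counts = build_counts(corpus)
print([word for word, _ in predict("thanks for the", counts)])
# ['follow', 'help', 'party']
```

In this toy run, "follow" and "help" are scored from the 4-gram level, while "party" is only reachable through the bigram "the party" and is therefore discounted by two backoff steps. The scores are relative frequencies scaled by the backoff factor rather than normalized probabilities, which is what distinguishes Stupid Backoff from smoothing methods that renormalize.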

References