Word Prediction Application - Coursera Capstone

friedoutkombii
Dec 2017

Introduction

This app was created as the final project for the Capstone course, part of the 10-module Data Science Specialisation through Coursera. The aim was to build an application that predicts the most likely next word given an input of one or more words. The backend of this app is built on n-grams created from preprocessed English blog & Twitter text samples.

The app accepts written input and quickly gives its top 3 predictions. The prediction method used is the Stupid Backoff method applied across uni-, bi-, tri- and quad-grams.

I have kept my design simple so we can focus on the word prediction itself and not the polish. I hope you enjoy my application.

The Application

[Screenshot of the application]

Access & Use

  • The application can be accessed here
  • To use it, simply type one or more words into the console.
  • The app returns its top 3 guesses for the most likely next word.

Method

  • The underlying app is built on n-grams (from unigrams to quad-grams).
  • A Stupid Backoff algorithm is used because I aimed for quick compute times.
  • With this algorithm, we fall back to the (n-1)-gram model's prediction if the probability of the word occurring at level n is 0.
  • The "Stupid" comes from the hardcoded multiplication of the score by 0.4 each time we jump back to an (n-1)-gram prediction model.
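
The method above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the app's actual code: the function names (`build_counts`, `score`, `predict`), the tiny toy corpus, and the alphabetical tie-breaking are all hypothetical. Only the 0.4 multiplier, the uni- to quad-gram range, and the top 3 output come from the description above.

```python
from collections import Counter

ALPHA = 0.4  # fixed backoff multiplier described above
MAX_N = 4    # highest n-gram order (quad-grams)


def build_counts(tokens, max_n=MAX_N):
    """Count every n-gram of order 1..max_n, keyed by a tuple of words."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def score(word, context, counts, total_unigrams):
    """Stupid Backoff score of `word` following `context` (a tuple of words)."""
    penalty = 1.0
    while context:
        ngram = context + (word,)
        if counts[ngram] > 0:
            # Seen at this level: relative frequency, scaled by the penalty so far.
            return penalty * counts[ngram] / counts[context]
        # Unseen: drop the oldest context word and apply the 0.4 multiplier.
        context = context[1:]
        penalty *= ALPHA
    # Unigram level: plain relative frequency of the word.
    return penalty * counts[(word,)] / total_unigrams


def predict(text, counts, vocab, total_unigrams, k=3):
    """Return the k highest-scoring next words for the typed text."""
    context = tuple(text.lower().split())[-(MAX_N - 1):]
    scored = {w: score(w, context, counts, total_unigrams) for w in vocab}
    # Assumption: ties are broken alphabetically to keep the output stable.
    return sorted(scored, key=lambda w: (-scored[w], w))[:k]


# Toy corpus (hypothetical), standing in for the blog and Twitter samples.
tokens = "i love data science and i love data analysis and i love coffee".split()
counts = build_counts(tokens)
vocab = sorted(set(tokens))
total = len(tokens)

print(predict("i love", counts, vocab, total))  # ['data', 'coffee', 'i']
```

The scores are not true probabilities, since nothing is renormalised after backing off. That is acceptable here because the app only needs a ranking of candidate words.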