Capstone SwiftKey Project: Next Word Prediction

Robert Jeenchen Chen
2020-01-15

Overview

Coursera Data Science Johns Hopkins Capstone Project

  • Natural language processing (NLP) and artificial intelligence (AI)
  • You input the text, and we predict the next word.
  • Shiny Web App
  • 5-page pitch slides

Methods

  • Prediction algorithm: N-Grams
  • Based on a subset of the provided Twitter, news, and blog data
  • N-grams of 1 up to 5 words were generated and saved to disk with the saveRDS function
  • The app loads the saved models and uses them to make predictions
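
As a rough illustration of the steps above, the sketch below counts N-grams of 1 to 5 words and round-trips them through a file. It is written in Python for brevity and is not the project's R code: pickle stands in for saveRDS/readRDS, and the tokenizer, function names, and three-line corpus are all hypothetical.

```python
import os
import pickle
import re
import tempfile
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed cleaning step)."""
    return re.findall(r"[a-z']+", text.lower())


def build_ngram_counts(lines, max_n=5):
    """Count every n-gram of length 1..max_n, keeping one Counter per order."""
    counts = {n: Counter() for n in range(1, max_n + 1)}
    for line in lines:
        tokens = tokenize(line)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[n][tuple(tokens[i:i + n])] += 1
    return counts


def save_models(counts, path):
    """Serialize the counts to disk (stand-in for R's saveRDS)."""
    with open(path, "wb") as handle:
        pickle.dump(counts, handle)


def load_models(path):
    """Read the counts back (stand-in for R's readRDS)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Tiny hypothetical corpus; the project samples Twitter, news, and blog text.
sample_lines = ["I love data science", "I love data", "You love data science too"]
models = build_ngram_counts(sample_lines)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "ngram_models.pkl")
    save_models(models, model_path)
    restored = load_models(model_path)
```

Building the tables once and loading them at startup keeps the expensive counting step out of the app itself.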

Prediction Algorithm

  • The algorithm makes 5 suggestions for the next word
  • It starts by using the 5 most common words from the training data as the initial predictions
  • Then it checks whether at least one word has been typed into the input field and whether the 2-word N-grams contain predictions for that word
  • The same check is repeated for each higher order, up to the 5-word N-grams
  • If the algorithm finds predictions from a higher-order N-gram, it overwrites the old predictions
  • If no prediction is found, the most common words are displayed as the suggestions
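
The procedure above could be sketched roughly as follows, in Python rather than the project's R. All names and the toy corpus are hypothetical. One detail is an assumption: the slides do not say whether a higher-order match replaces all five slots or only the leading ones, so this sketch puts the new matches first and keeps older predictions as filler, which always returns five suggestions.

```python
from collections import Counter, defaultdict


def build_lookup(sentences, max_n=5):
    """Return unigram counts and, per order n, a map from context to next-word counts."""
    unigrams = Counter()
    lookup = {n: defaultdict(Counter) for n in range(2, max_n + 1)}
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        for n in range(2, max_n + 1):
            for i in range(len(tokens) - n + 1):
                context = tuple(tokens[i:i + n - 1])
                lookup[n][context][tokens[i + n - 1]] += 1
    return unigrams, lookup


def merge(new, old, k):
    """Put new predictions first, then fill up to k with older ones (assumed behavior)."""
    merged = list(new)
    for word in old:
        if word not in merged:
            merged.append(word)
    return merged[:k]


def predict_next(text, unigrams, lookup, k=5, max_n=5):
    """Suggest k next words, letting higher-order n-grams override lower ones."""
    tokens = text.lower().split()
    # Start from the most common words in the training data.
    predictions = [word for word, _ in unigrams.most_common(k)]
    for n in range(2, max_n + 1):
        if len(tokens) < n - 1:
            break  # not enough typed words for this order
        context = tuple(tokens[-(n - 1):])
        candidates = lookup[n].get(context)
        if candidates:
            best = [word for word, _ in candidates.most_common(k)]
            predictions = merge(best, predictions, k)
    return predictions


# Hypothetical toy corpus standing in for the sampled training data.
corpus = ["the cat sat on the mat", "the cat ate the fish", "the dog sat on the rug"]
unigrams, lookup = build_lookup(corpus)
suggestions = predict_next("cat sat on the", unigrams, lookup)
```

With this toy corpus, the 5-word N-gram for "cat sat on the" has a single continuation, so it ends up as the first suggestion; unknown input such as "zebra" falls back to the most common words.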

Shiny Web App

  • The Shiny app has a text input field for the input data and 5 buttons for the next-word recommendations
  • When the user types text into the input field, the recommendations are updated automatically
  • When the user clicks a button, the recommendation is appended to the input text (preceded by a space if there isn't one already)
  • After a button is clicked, the input field is refocused so the user can keep typing without having to click on it again
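
The button behavior described above comes down to a small string rule, sketched here in Python as a plain function rather than as Shiny code. The function name is hypothetical, and the handling of an empty input field (no leading space) is an assumption not covered by the slides.

```python
def append_suggestion(current_text, suggestion):
    """Return the input text with the clicked suggestion appended."""
    # Add a separating space only when the text does not already end with one.
    # Assumption: an empty field gets no leading space.
    if current_text and not current_text.endswith(" "):
        current_text += " "
    return current_text + suggestion


updated = append_suggestion("I love", "data")
```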

Links and Sources