Shiny Application on Next Word Prediction

Mohamed Chenini
2018-06-13

Overview

This presentation is part of the project for Data Science Capstone, a course in the Coursera Data Science Specialization.

It consists of the following:

  • A Shiny application that takes a phrase (multiple words) typed into a text box and outputs a prediction of the next word.
  • A presentation built with RStudio Presenter, which you are viewing now.

Shiny Application

The Shiny application is composed of three R files:

  • ui.R, which controls the user interface.

  • server.R, which receives the input from ui.R and calls the function in ngram.R.

  • ngram.R, a function that determines which n-gram table to use, based on the number of words entered, in order to predict the next word.

How the Application works

The n-gram data frames created in the Milestone Project of this course are loaded first:

  • BiWords <- readRDS("biWords.rds")
  • TriWords <- readRDS("triWords.rds")
  • QuadWords <- readRDS("quadWords.rds")
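
As a rough illustration of what such a file might hold, the sketch below saves and reloads a toy bigram table. The column names (word1, word2, freq), the toy counts, and the name BiWordsToy are assumptions made for this example. The slides do not show the real layout of biWords.rds.

```r
# Hypothetical bigram table: one row per word pair plus a frequency count.
# The real columns of biWords.rds are not shown in the presentation.
toy_bi <- data.frame(
  word1 = c("of", "of", "in"),
  word2 = c("the", "a", "the"),
  freq  = c(120L, 45L, 90L),
  stringsAsFactors = FALSE
)

# Write to a temporary file so the example does not touch any real data.
rds_path <- tempfile(fileext = ".rds")
saveRDS(toy_bi, rds_path)

# readRDS() restores the single R object stored in the file.
BiWordsToy <- readRDS(rds_path)
```

Storing each table as an .rds file means the application can load a ready-made data frame at startup instead of rebuilding the n-grams from the raw text.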

The “next word prediction” algorithm works like this:

  • The user's input words are cleaned and converted to lower case.
  • If the input contains 3 words, the quadgram is used first.
  • Otherwise, if the input contains 2 words, the trigram is used.
  • Otherwise, the bigram is used.
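
The selection steps above can be sketched in base R as follows. This is a minimal sketch and not the code in ngram.R. The toy tables, their column names, the cleaning rule (keep only letters, apostrophes, and spaces), and the function names clean_input and predict_next_word are all assumptions made for illustration. It also assumes the bigram lookup uses the last word typed.

```r
# Toy n-gram tables with assumed columns; the real tables come from the .rds files.
bi_toy <- data.frame(word1 = c("the", "the"), word2 = c("end", "cat"),
                     freq = c(5L, 3L), stringsAsFactors = FALSE)
tri_toy <- data.frame(word1 = "of", word2 = "the", word3 = "year",
                      freq = 4L, stringsAsFactors = FALSE)
quad_toy <- data.frame(word1 = "one", word2 = "of", word3 = "the",
                       word4 = "best", freq = 2L, stringsAsFactors = FALSE)

# Assumed cleaning: lower-case, drop everything except letters,
# apostrophes, and spaces, then split on whitespace.
clean_input <- function(phrase) {
  phrase <- tolower(phrase)
  phrase <- gsub("[^a-z' ]", " ", phrase)
  words <- strsplit(trimws(phrase), "\\s+")[[1]]
  words[nzchar(words)]
}

# Pick the table from the number of input words, then return the
# most frequent continuation, or NA if nothing matches.
predict_next_word <- function(phrase) {
  words <- clean_input(phrase)
  n <- length(words)
  if (n == 0) return(NA_character_)

  if (n == 3) {
    hits <- quad_toy[quad_toy$word1 == words[1] &
                     quad_toy$word2 == words[2] &
                     quad_toy$word3 == words[3], ]
    next_col <- "word4"
  } else if (n == 2) {
    hits <- tri_toy[tri_toy$word1 == words[1] &
                    tri_toy$word2 == words[2], ]
    next_col <- "word3"
  } else {
    # Any other word count falls through to the bigram table,
    # matched on the last word (an assumption of this sketch).
    hits <- bi_toy[bi_toy$word1 == words[n], ]
    next_col <- "word2"
  }

  if (nrow(hits) == 0) return(NA_character_)
  hits[[next_col]][which.max(hits$freq)]
}

predict_next_word("One of THE")
predict_next_word("of the")
predict_next_word("the")
```

With these toy tables, the three calls use the quadgram, trigram, and bigram tables respectively. A phrase with no match returns NA rather than falling back to a shorter n-gram, because the steps above do not describe a fallback.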

The application can be accessed on the shinyapps.io website at Next Word Predict