Introduction.

This presentation describes the final assignment for the Data Science Capstone course on Coursera.

The goal of the project is to build a predictive text model. The model is served through a Shiny app whose UI predicts the next word as the user types a sentence.

It is intended to simulate the next-word suggestions of current smartphone keyboards, such as those that use technology from SwiftKey.

Downloading and Getting Data.

To build the app and the predictive model, the following steps were carried out:

  • Downloading the data from the source.
  • Cleaning and processing the data. This is the most important step: it puts the data into a form that can be processed by removing extra white space, punctuation, numbers and more. You can review more here.
  • Building the corresponding n-grams and saving them, to save computational effort and time.
  • Looking up the user's input terms in the n-grams and sorting the matches by frequency in descending order, to predict the most likely next words.
  • Building the Shiny app and deploying it.
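
The cleaning, n-gram and lookup steps above can be sketched in a few lines. This is only an illustration in Python, not the code behind the Shiny app; the function names (`clean`, `build_ngrams`, `next_words`) and the toy corpus are assumptions made for the example.

```python
import re
from collections import Counter

def clean(text):
    """Lowercase, drop numbers and punctuation, and collapse white space."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)      # keep only letters and white space
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of white space

def build_ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def next_words(ngrams, history, k=3):
    """Return up to k candidate next words for `history`, most frequent first."""
    history = tuple(history)
    matches = Counter({gram[-1]: count for gram, count in ngrams.items()
                       if gram[:-1] == history})
    return [word for word, _ in matches.most_common(k)]

# Hypothetical toy corpus; the real project uses the downloaded text data.
corpus = "I like green tea. I like green apples! I like 2 green tea bags."
tokens = clean(corpus).split()
trigrams = build_ngrams(tokens, 3)
print(next_words(trigrams, ["like", "green"]))  # ['tea', 'apples']
```

Counting the n-grams once and saving the table means that each prediction is only a lookup and a sort, which is what keeps the app responsive while the user types.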

Katz Backoff Model.

The model behind the Shiny app is based on the Katz Backoff Model.

It is a generative n-gram language model that estimates the conditional probability of a word given its history in the n-gram. It accomplishes this estimation by backing off through progressively shorter history models under certain conditions[1]. By doing so, the model with the most reliable information about a given history is used to provide better results. (Source)
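
To make the backing-off idea concrete, here is a minimal Python sketch for the bigram case. It is a simplification and not the app's implementation: it uses a fixed absolute discount (the `discount` parameter, an assumption) where Katz's original method uses Good-Turing discounting, and the function name and toy sentence are hypothetical.

```python
from collections import Counter

def backoff_bigram_prob(word, prev, unigrams, bigrams, discount=0.5):
    """Estimate P(word | prev), backing off to unigrams for unseen bigrams.

    `unigrams` and `bigrams` must be Counters, so missing keys count as 0.
    """
    total = sum(unigrams.values())
    # Words observed after `prev`, with their bigram counts.
    seen = {w2: c for (w1, w2), c in bigrams.items() if w1 == prev}
    if not seen:
        # Nothing is known about this history: use the shorter model directly.
        return unigrams[word] / total
    history_count = sum(seen.values())
    if word in seen:
        # Seen bigram: discounted relative frequency.
        return (seen[word] - discount) / history_count
    # Probability mass freed by discounting the seen bigrams.
    leftover = discount * len(seen) / history_count
    # Unigram mass of the words NOT seen after `prev`, used to renormalize.
    unseen_mass = 1 - sum(unigrams[w] for w in seen) / total
    if unseen_mass == 0:
        return 0.0
    return leftover * (unigrams[word] / total) / unseen_mass

# Hypothetical toy data.
tokens = "the cat sat on the mat the cat ran".split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
print(backoff_bigram_prob("cat", "the", unigrams, bigrams))  # seen bigram
print(backoff_bigram_prob("sat", "the", unigrams, bigrams))  # backed-off estimate
```

The discount takes a little probability away from bigrams that were observed and hands it to the shorter-history model, so words never seen after a given history still receive a non-zero estimate while the probabilities for that history continue to sum to one.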

Predict2Word App.

Links.