Data Science Capstone Project

Justin E. Rice

This slide deck describes the word predictor Shiny app. The project is offered in cooperation with SwiftKey, a leader in language prediction technology, whose mission is to enhance the interaction between people and technology.

Explanation & Purpose of Word Predictor Technology

Predictive text is an input technology in which each key press is used to suggest likely words. It can allow an entire word to be entered with a single key press, making efficient use of a small number of device keys when writing a text message, an e-mail, an address book entry, a calendar entry, and the like.

This type of application could benefit the billions of mobile phone users around the world as they use their devices for social networking, banking, etc. As stated by SwiftKey, “technology should adapt to you and not the other way around”.

Development Procedures

There were several key questions to consider during development. For example: what sample size will be effective? Which tokens should be constructed? Below is an outline of the steps taken:

  • Data Collection
  • Data Cleaning
  • Exploratory Analysis
  • n-Gram Tokenization
  • Build the Prediction Model
  • Create the Shiny App
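
The cleaning, tokenization, and modelling steps above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the code behind the app. The function names (clean, build_ngrams, predict), the maximum n-gram order of 3, and the simple back-off to shorter contexts are all assumptions, since the deck does not say how the model was built.

```python
import re
from collections import Counter, defaultdict


def clean(text):
    """Lowercase the text and keep only letters and apostrophes (assumed cleaning rules)."""
    text = re.sub(r"[^a-z' ]+", " ", text.lower())
    return text.split()


def build_ngrams(tokens, max_n=3):
    """Map each context (a tuple of 0 to max_n - 1 preceding words) to counts of the next word.

    Simplification: sentence boundaries are ignored, so contexts may span sentences.
    """
    model = defaultdict(Counter)
    for i, word in enumerate(tokens):
        for n in range(max_n):
            if i - n < 0:
                break
            context = tuple(tokens[i - n:i])
            model[context][word] += 1
    return model


def predict(model, text, k=3, max_n=3):
    """Return up to k candidate next words, backing off from the longest context to the shortest."""
    tokens = clean(text)
    for n in range(max_n - 1, -1, -1):
        if n > len(tokens):
            continue
        context = tuple(tokens[len(tokens) - n:])
        if context in model:
            return [w for w, _ in model[context].most_common(k)]
    return []


# Hypothetical toy corpus; a real model would be trained on a large text sample.
corpus = "I love data science. I love data analysis. I love coffee."
model = build_ngrams(clean(corpus))
suggestions = predict(model, "I love")
```

The empty context acts as a unigram fallback, so an unseen phrase still returns the most frequent words in the sample.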

The Application

A preview of the application can be seen below or by clicking here. The user interface is simple: the left panel takes the user's input text, and the predictions appear in the right panel.

Future Considerations

  • Further work can be done to develop the tokenization. The n-grams used here are based on a sample of extremely clean text: profanity, numbers, and punctuation were removed. A more thorough predictive app would also consider predictions that include punctuation and numbers.

  • Another point to consider is an adaptive predictive model. The model used here is static. The app would be more user-friendly if the user's activity were stored and used to inform future predictions.
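
One possible shape for the second idea is sketched below in Python, for illustration only. The class name AdaptivePredictor, the bigram-only model, and the user_weight parameter are hypothetical choices rather than part of the current app. The sketch keeps a static base table and a separate table of the user's own input, then combines the two when ranking suggestions.

```python
from collections import Counter, defaultdict


class AdaptivePredictor:
    """Bigram predictor that blends static corpus counts with counts learned from the user."""

    def __init__(self, user_weight=2):
        self.base = defaultdict(Counter)   # counts from the training sample (static)
        self.user = defaultdict(Counter)   # counts from the user's own input (grows over time)
        self.user_weight = user_weight     # assumed factor favouring the user's habits

    @staticmethod
    def _bigrams(text):
        tokens = text.lower().split()
        return zip(tokens, tokens[1:])

    def train(self, text):
        """Add bigram counts from the training sample."""
        for prev, nxt in self._bigrams(text):
            self.base[prev][nxt] += 1

    def observe(self, text):
        """Record text the user has actually typed."""
        for prev, nxt in self._bigrams(text):
            self.user[prev][nxt] += 1

    def predict(self, prev_word, k=3):
        """Rank next words by base count plus weighted user count."""
        prev = prev_word.lower()
        scores = Counter(self.base.get(prev, {}))
        for word, count in self.user.get(prev, {}).items():
            scores[word] += self.user_weight * count
        return [w for w, _ in scores.most_common(k)]


# Hypothetical usage: the top suggestion shifts once the user's input is observed.
predictor = AdaptivePredictor()
predictor.train("good morning everyone good morning team good night")
before = predictor.predict("good")
predictor.observe("good night good night")
after = predictor.predict("good")
```

Keeping the user's counts separate from the base counts means the personal history can be weighted, reset, or stored per user without retraining the base model.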