SwiftKey Project Presentation

Gayathri Nagarajan
12/01/2020

2. Introduction/Executive Summary

I have developed an app that predicts the five most probable next words based on what you type in the input box. Only 10% of the corpus provided for the project is used, because of memory, speed, and processing constraints when hosting on the Shiny server (1 GB memory limit).

For more details on my code, please visit my site.

  • Get words from an input text box
  • Clean and process them
  • Look up the model to pick the five most probable next words
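
The three steps above can be sketched roughly as follows. The app itself is written in R; this is a Python illustration only, and the names `clean_input`, `last_word`, `predict`, and the tiny `TOY_MODEL` table are hypothetical stand-ins, not the app's actual code or data.

```python
import re

# Hypothetical toy lookup table standing in for the real model:
# maps a word to its candidate next words, most probable first.
TOY_MODEL = {"thank": ["you", "god", "goodness"]}

def clean_input(text):
    """Lowercase the text and return its word tokens (contractions kept whole)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def last_word(text):
    """Return the last cleaned token, or None if the input has no words."""
    tokens = clean_input(text)
    return tokens[-1] if tokens else None

def predict(text, model=TOY_MODEL, k=5):
    """Look up the last typed word and return at most k candidate next words."""
    return model.get(last_word(text), [])[:k]
```

With this toy table, `predict("Many THANK ")` cleans the input, picks `thank` as the last word, and returns its three stored candidates; an empty input returns an empty list.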

3. The Model

The model is built using the following process:

  • Read in the English-language input text files only.
  • Clean them, remove profane words, and split sentences into word tokens.
  • Arrange the tokens as 1- to 4-word combinations (n-grams) and store them in files so that they can be read back in next time.
  • On every run, read the 2-gram file specifically and load it into memory.
  • Write the prediction function to look up the last word and return the five most probable next words based on the 2-grams built.
  • If no match is found, perform a similar search on a website, which returns five probable next words.
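
As a rough illustration of the tokenizing, n-gram, and 2-gram lookup steps above, here is a minimal Python sketch (the app itself is built in R). It keeps the 2-gram counts in memory instead of writing them to files, and the `PROFANITY` set, the function names, and the four-sentence corpus are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

PROFANITY = {"darn"}  # hypothetical placeholder for a real profanity list

def tokenize(sentence):
    """Lowercase, split into word tokens, and drop profane words."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", sentence.lower())
    return [w for w in words if w not in PROFANITY]

def ngrams(tokens, n):
    """Return all consecutive n-word combinations as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_bigram_table(sentences):
    """Count, for every word, how often each other word follows it."""
    table = defaultdict(Counter)
    for sentence in sentences:
        for first, second in ngrams(tokenize(sentence), 2):
            table[first][second] += 1
    return table

def predict_next(table, text, k=5):
    """Return up to k most frequent followers of the last word in text."""
    tokens = tokenize(text)
    if not tokens or tokens[-1] not in table:
        return []
    return [word for word, _ in table[tokens[-1]].most_common(k)]

# Toy corpus, made up for this example
corpus = ["I love you", "I love cake", "I love you so much", "darn I like you"]
table = build_bigram_table(corpus)
print(predict_next(table, "We LOVE"))  # ['you', 'cake']
```

Ranking by raw follower counts is the simplest possible choice; it is shown here only because the slides describe a lookup on the last word against the 2-grams.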

4. The App

This is a web app built in RStudio and hosted on a Shiny server. Performance and memory were the main considerations in making this work. Thanks to the Coursera forums, Google Search, wiki pages, the Stack Overflow community, the R-bloggers community, and YouTube videos, without which I could not have completed this project.

5. How to use my App

The link to the app is here.

Steps

  • Enter a word. The app shows a data table with the five most probable next words.
  • Use them to type your next word.
  • The app continues to show predicted next words based on the last word you typed.
  • Predictions come from the 10% sample of the corpus and are therefore limited. If the corpus has no match, the app searches a website and returns five probable next words for the last word typed. It displays a message if no match is found.
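
The fallback order described in the last step (corpus first, then a website, then a message) could look roughly like this in Python. It is a sketch under assumptions: the app is written in R, the slides do not name the website, and `corpus_lookup` and `web_lookup` are hypothetical callables that each take a word and return a list of candidates.

```python
def predict_with_fallback(word, corpus_lookup, web_lookup, k=5):
    """Try the corpus model first, then the web search, else report no match.

    Returns a (words, message) pair; message is empty when words were found.
    """
    words = corpus_lookup(word)[:k]
    if words:
        return words, ""
    words = web_lookup(word)[:k]  # only reached when the corpus has no match
    if words:
        return words, ""
    return [], "No match found."

# Hypothetical stand-ins for the 2-gram model and the website search
def toy_corpus_lookup(word):
    return {"good": ["morning", "luck", "night"]}.get(word, [])

def toy_web_lookup(word):
    return {"quantum": ["physics", "computing"]}.get(word, [])
```

Passing the two lookups in as arguments keeps the ordering logic separate from how each source is actually queried.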