Zach Colburn
March 25, 2017
I developed this application to enable software-assisted typing: it uses the text a user has entered to predict the word that user will type next. The project can be divided into three parts: data collection, model development, and model evaluation.
Part 1 - Data collection
Data was acquired from HC Corpora (https://d396qusza40orc.cloudfront.net/dsscapstone/dataset/Coursera-SwiftKey.zip). The data consists of text scraped from:
Model development consisted of four key steps:
The model shown on the next slide takes the user's input text and returns the specified number of predictions, along with frequency statistics for the matched prefixes (the input words) and suffixes (the predicted words).
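To make the prefix and suffix idea concrete, here is a minimal Python sketch of a frequency-table lookup. It is not the application's actual code (see the documentation link for that); the function names `build_table` and `predict`, the whitespace tokenization, and the toy sentences are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_table(sentences, prefix_len):
    """Count how often each word (suffix) follows each prefix of prefix_len words."""
    table = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()  # assumption: naive whitespace tokenization
        for i in range(len(words) - prefix_len):
            prefix = tuple(words[i:i + prefix_len])
            suffix = words[i + prefix_len]
            table[prefix][suffix] += 1
    return table


def predict(table, text, prefix_len, n_predictions=3):
    """Return up to n_predictions (suffix, count, share) tuples, most frequent first.

    The prefix is the last prefix_len words of the input text. An unseen
    prefix yields an empty list (this sketch has no fallback strategy).
    """
    words = text.lower().split()
    prefix = tuple(words[-prefix_len:])
    suffixes = table.get(prefix)
    if not suffixes:
        return []
    total = sum(suffixes.values())
    return [(word, count, count / total)
            for word, count in suffixes.most_common(n_predictions)]


if __name__ == "__main__":
    # Hypothetical toy corpus, not the HC Corpora data.
    corpus = ["i like green eggs", "i like green ham", "i like green eggs too"]
    table = build_table(corpus, prefix_len=2)
    for word, count, share in predict(table, "I like green", prefix_len=2, n_predictions=2):
        print(word, count, round(share, 2))
```

Each returned tuple pairs a predicted word with its raw count and its share of all continuations seen for that prefix, which is one plausible form the frequency statistics could take.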
Input text of the indicated prefix length was used to predict the following word. Each cell reports the percentage of cases in which the correct word was among the allowed number of guesses (Prediction 1 allows one guess, Prediction 3 allows three).
| | Prediction 1 | Prediction 2 | Prediction 3 |
|---|---|---|---|
| Prefix length 1 | 5.91 | 50.20 | 82.46 |
| Prefix length 2 | 8.04 | 12.96 | 19.24 |
| Prefix length 3 | 5.40 | 8.76 | 12.26 |
| Prefix length 4 | 5.35 | 8.41 | 11.76 |
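A success rate of this kind can be computed as sketched below in Python. This is an illustrative assumption about the procedure, not the author's evaluation code: `success_rate`, `toy_predictor`, and the test cases are hypothetical names and data.

```python
def success_rate(predict_fn, test_cases, max_guesses):
    """Percent of test cases whose true next word is among the first max_guesses predictions.

    predict_fn maps a prefix string to a list of predicted words, best first.
    test_cases is a list of (prefix, actual_next_word) pairs.
    """
    if not test_cases:
        return 0.0
    hits = 0
    for prefix, actual in test_cases:
        guesses = predict_fn(prefix)[:max_guesses]
        if actual in guesses:
            hits += 1
    return 100.0 * hits / len(test_cases)


def toy_predictor(prefix):
    """Hypothetical stand-in model: ranks next words by the last word of the prefix."""
    lookup = {
        "green": ["eggs", "ham", "tea"],
        "the": ["end", "cat", "dog"],
    }
    words = prefix.lower().split()
    return lookup.get(words[-1], []) if words else []


if __name__ == "__main__":
    cases = [
        ("i like green", "eggs"),
        ("i like green", "ham"),
        ("see the", "dog"),
        ("hello", "world"),
    ]
    for k in (1, 2, 3):
        print(f"{k} guess(es): {success_rate(toy_predictor, cases, k):.2f}%")
```

Under this definition the rate can only stay the same or rise as more guesses are allowed, because every hit within k guesses is also a hit within k + 1.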
Application: https://zcolburn.shinyapps.io/predictive_typing_application/
Documentation: https://github.com/zcolburn/predictiveTypingApplication