Predictive Text Model

Artem Larionov
09/25/2016

Introduction

The goal of this project is to develop a predictive text model that makes typing easier. This matters most on mobile devices, where touch typing is not available.

The application analyses the user's input and predicts the next word the user is going to type.

Training Data

For training, the SwiftKey dataset was used, specifically the English-language files:

  • en_US.blogs.txt
  • en_US.news.txt
  • en_US.twitter.txt

Algorithm

Based on the exploratory analysis, the Stupid Back-off algorithm was chosen. It is given by the formula:

\[ P(\omega_{i}|\omega^{i-1}_{i-k+1})= \begin{cases} p(\omega^{i}_{i-k+1}),& \text{if } (\omega^{i}_{i-k+1}) \text{ is found}\\ \lambda(\omega^{i-1}_{i-k+1})P(\omega^{i}_{i-k+2}), & \text{otherwise} \end{cases} \]

where \( p(\cdot) \) are pre-computed and stored probabilities, and \( \lambda(\cdot) \) are back-off weights.
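
The back-off rule above can be illustrated with a minimal Python sketch. It is not the application's actual code. It assumes that the stored values are relative frequencies computed from raw n-gram counts and that the back-off weight is a single constant of 0.4 (the value suggested in the original Stupid Backoff paper). The names `count_ngrams`, `score`, `predict_next` and `ALPHA` are hypothetical.

```python
from collections import Counter

ALPHA = 0.4  # assumed constant back-off weight, not taken from the application


def count_ngrams(tokens, max_order=3):
    """Count every n-gram of order 1..max_order in a list of tokens."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def score(word, context, counts, total_tokens):
    """Stupid Back-off score of `word` given the preceding words in `context`."""
    context = tuple(context)
    weight = 1.0
    while True:
        ngram = context + (word,)
        if counts[ngram] > 0:
            # n-gram found: relative frequency, scaled by accumulated weight
            denominator = counts[context] if context else total_tokens
            return weight * counts[ngram] / denominator
        if not context:
            return 0.0  # the word was never seen in the training data
        # n-gram not found: drop the earliest context word and apply the weight
        context = context[1:]
        weight *= ALPHA


def predict_next(context, counts, total_tokens, vocabulary):
    """Return the vocabulary word with the highest back-off score."""
    return max(sorted(vocabulary), key=lambda w: score(w, context, counts, total_tokens))


tokens = "the cat sat on the mat the cat ran".split()
counts = count_ngrams(tokens)
vocabulary = set(tokens)
best = predict_next(("the",), counts, len(tokens), vocabulary)
```

Note that these scores are not normalised probabilities: they only rank candidate words, which is all that next-word prediction requires.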

How to use

The application is easy to use: just start typing and the application will predict what you are going to type next.

  • if your input ends with a letter, the application tries to complete the current word
  • if your input ends with a space, the application tries to predict the next word
[Screenshots: predicting the possible current word; predicting the possible next word]
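
The two input modes can be sketched in Python as follows. This is an illustration only, assuming whitespace tokenisation and a plain dictionary of word counts. The function names `parse_input` and `complete` are hypothetical and do not come from the application.

```python
def parse_input(text):
    """Split raw input into (mode, context_words, prefix).

    mode is "next" when the input is empty or ends with whitespace,
    and "current" when it ends inside a word.
    """
    words = text.split()
    if not words or text[-1].isspace():
        return "next", words, ""
    return "current", words[:-1], words[-1]


def complete(prefix, word_counts, limit=3):
    """Return up to `limit` known words starting with `prefix`, most frequent first."""
    matches = [w for w in word_counts if w.startswith(prefix)]
    return sorted(matches, key=lambda w: (-word_counts[w], w))[:limit]


mode, context, prefix = parse_input("how are y")
suggestions = complete(prefix, {"you": 7, "your": 3, "yes": 3, "no": 9})
```

In "current" mode the prefix filters the candidate words. In "next" mode the whole input serves as context for the language model.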

Links