Predictive Text Model

Artem Larionov
09/25/2016

Introduction

The goal of this project is to develop a predictive text model that improves the typing experience. This matters most on mobile devices, where touch typing is not available.

The application analyses the user's input and predicts the next word the user is going to type.

Training Data

For training, the SwiftKey dataset was used, specifically the English-language files:

en_US.blogs.txt
en_US.news.txt
en_US.twitter.txt

Algorithm

Based on the exploratory analysis, the Stupid Back-off algorithm was chosen. It is given by the formula:

\[ P(\omega_{i}|\omega^{i-1}_{i-k+1})= \begin{cases} p(\omega^{i}_{i-k+1}),& \text{if } (\omega^{i}_{i-k+1}) \text{ is found}\\ \lambda(\omega^{i-1}_{i-k+1})P(\omega_{i}|\omega^{i-1}_{i-k+2}), & \text{otherwise} \end{cases} \]

where \( p(\cdot) \) are pre-computed and stored probabilities, and \( \lambda(\cdot) \) are back-off weights.
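To make the recursion concrete, here is a minimal Python sketch of Stupid Back-off scoring. It is an illustration under stated assumptions, not the application's actual code: the function names (`count_ngrams`, `score`, `predict_next`) are hypothetical, the stored values are taken to be relative frequencies of n-gram counts, and the back-off weight is assumed to be a single constant, 0.4, as suggested in the original Stupid Back-off paper by Brants et al. (2007). The results are scores rather than normalized probabilities.

```python
from collections import Counter

# Assumption: one fixed back-off weight, 0.4 as suggested by Brants et al. (2007).
ALPHA = 0.4


def count_ngrams(sentences, max_order=3):
    """Count every n-gram of order 1..max_order in tokenized sentences."""
    counts = Counter()
    for tokens in sentences:
        for n in range(1, max_order + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def score(word, context, counts, total_unigrams):
    """Stupid Back-off score of `word` given the preceding words in `context`."""
    weight = 1.0
    context = tuple(context)
    while True:
        ngram = context + (word,)
        if counts[ngram] > 0:
            # n-gram found: relative frequency count(context + word) / count(context).
            denom = counts[context] if context else total_unigrams
            return weight * counts[ngram] / denom
        if not context:
            return 0.0  # word was never seen, even as a unigram
        # Not found: drop the oldest context word and apply the back-off weight.
        context = context[1:]
        weight *= ALPHA


def predict_next(context, counts, top_n=3):
    """Return the `top_n` highest-scoring vocabulary words for `context`."""
    total = sum(c for ngram, c in counts.items() if len(ngram) == 1)
    vocab = [ngram[0] for ngram in counts if len(ngram) == 1]
    ranked = sorted(vocab, key=lambda w: score(w, context, counts, total), reverse=True)
    return ranked[:top_n]
```

For example, with counts built from a few toy sentences, `predict_next(("i", "like"), counts)` ranks words seen after "i like" first, and words that need one or two back-off steps receive scores scaled by 0.4 or 0.16.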

How to use

The application is easy to use: start typing, and it predicts what you are going to type next.

If the input ends with a letter, the application tries to complete the current word.
If the input ends with a space, the application tries to predict the next word.
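The two cases above can be sketched as a small dispatch step in Python. This is a hedged illustration, not the application's code: `parse_input` and `suggest` are hypothetical names, `vocabulary` is assumed to be a list of known words ordered by preference, and `next_word_model` is assumed to be any callable that maps a list of context words to ranked candidates. Any input that does not end in whitespace is treated here as a partially typed word.

```python
def parse_input(text):
    """Split typed text into (context_words, prefix).

    `prefix` is the partially typed current word, or "" when the text
    is empty or ends with whitespace.
    """
    words = text.lower().split()
    if not words or text[-1].isspace():
        return words, ""
    return words[:-1], words[-1]


def suggest(text, vocabulary, next_word_model, top_n=3):
    """Complete the current word or predict the next one, depending on the input."""
    context, prefix = parse_input(text)
    if prefix:
        # Input ends mid-word: offer known words that start with the prefix.
        matches = [w for w in vocabulary if w.startswith(prefix)]
        return matches[:top_n]
    # Input ends with a space (or is empty): ask the model for the next word.
    return next_word_model(context)[:top_n]
```

With this sketch, `parse_input("how are y")` yields the context `["how", "are"]` and the prefix `"y"`, while `parse_input("how are ")` yields the same context with an empty prefix, which triggers next-word prediction.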
