Introduction

This app predicts the next word using a trigram model trained on the SwiftKey dataset.

Data and Preprocessing

  • Data: blogs, news, and Twitter text
  • Cleaned with regular expressions; stopwords removed
  • Tokenized into trigrams (three-word sequences)
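
A minimal sketch of the cleaning and tokenizing steps above, written in Python for illustration only; it is not the app's own code. The stopword list, the regular expression, and the names `STOPWORDS`, `clean_tokens`, and `trigrams` are all assumptions, since the slides do not give the actual rules.

```python
import re

# Hypothetical, tiny stopword list; the app's real list is not given in the slides.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in"}

def clean_tokens(text):
    """Lowercase, replace anything but letters, apostrophes, and spaces, drop stopwords."""
    text = text.lower()
    text = re.sub(r"[^a-z' ]+", " ", text)
    return [w for w in text.split() if w not in STOPWORDS]

def trigrams(tokens):
    """Return every run of three consecutive tokens as a tuple."""
    return list(zip(tokens, tokens[1:], tokens[2:]))

tokens = clean_tokens("The cat sat on the mat, 2 times!")
print(tokens)            # ['cat', 'sat', 'on', 'mat', 'times']
print(trigrams(tokens))  # [('cat', 'sat', 'on'), ('sat', 'on', 'mat'), ('on', 'mat', 'times')]
```

Note that removing stopwords before tokenizing means the trigrams span words that were not adjacent in the original text.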

Algorithm

  • Predicts the third word of the most frequent trigram that starts with the last two words typed
  • Used dplyr to filter the matching trigrams
  • No deep learning, so the app stays lightweight
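
The prediction rule above can be sketched roughly as follows. This is an illustrative Python version, not the dplyr code the app uses; the function names `build_model` and `predict_next` and the toy trigram list are hypothetical.

```python
from collections import Counter, defaultdict

def build_model(trigram_list):
    """Count, for each two-word prefix, how often each third word follows it."""
    counts = defaultdict(Counter)
    for w1, w2, w3 in trigram_list:
        counts[(w1, w2)][w3] += 1
    return counts

def predict_next(model, w1, w2):
    """Return the most frequent third word after (w1, w2), or None if the pair is unseen."""
    followers = model.get((w1, w2))
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Toy data, made up for this example.
model = build_model([
    ("i", "love", "you"),
    ("i", "love", "you"),
    ("i", "love", "pizza"),
    ("love", "you", "too"),
])
print(predict_next(model, "i", "love"))    # you
print(predict_next(model, "no", "match"))  # None
```

The lookup is a single dictionary access, which is one reason a count-based approach like this stays fast without a neural model.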

App Demo

Summary

  • Fast and simple trigram-based app
  • Could be improved with a bigram fallback for unseen word pairs and with smoothing
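
One way the bigram fallback might look, as a hedged Python sketch rather than a planned implementation: try the trigram table first, and if the last two words were never seen together, fall back to the most frequent word after the last word alone. The names `build_tables` and `predict_with_fallback` and the sample sentence are assumptions made for this example; smoothing is not shown.

```python
from collections import Counter, defaultdict

def build_tables(tokens):
    """Build trigram and bigram follower counts from one token sequence."""
    tri = defaultdict(Counter)
    bi = defaultdict(Counter)
    for w1, w2 in zip(tokens, tokens[1:]):
        bi[w1][w2] += 1
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        tri[(w1, w2)][w3] += 1
    return tri, bi

def predict_with_fallback(tri, bi, w1, w2):
    """Try the trigram table first, then the bigram table; None if both miss."""
    for table, key in ((tri, (w1, w2)), (bi, w2)):
        followers = table.get(key)
        if followers:
            return followers.most_common(1)[0][0]
    return None

# Made-up sample text for illustration.
tokens = "i love you and i love pizza and you love pizza".split()
tri, bi = build_tables(tokens)
print(predict_with_fallback(tri, bi, "you", "love"))  # pizza (trigram hit)
print(predict_with_fallback(tri, bi, "we", "love"))   # pizza (bigram fallback)
print(predict_with_fallback(tri, bi, "x", "zzz"))     # None
```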