Project: Smart Keyboard

Dr. Senthil
26th September 2019

Overview

  • Target Group: Mobile device users
  • Objective: A keyboard that predicts the next word as the user types
  • Methodology: Cleaning and analyzing the training data, then sampling it and building a predictive text model

Data Cleaning

Data cleaning steps applied to the training dataset:

  • Remove non-ASCII characters, numbers and punctuation
  • Convert lowercase to uppercase
  • Strip extra whitespace
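
The cleaning steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name clean_text is hypothetical, and the choice to delete punctuation outright (rather than replace it with a space) is an assumption.

```python
import re

def clean_text(text: str) -> str:
    """Hypothetical helper applying the three cleaning steps in order."""
    # Step 1a: drop any character that cannot be encoded as ASCII
    text = text.encode("ascii", errors="ignore").decode("ascii")
    # Step 1b: delete digits and punctuation, keeping letters and whitespace
    # (assumption: characters are removed, not replaced by a space)
    text = re.sub(r"[^A-Za-z\s]", "", text)
    # Step 2: convert lowercase to uppercase, as stated in the list above
    text = text.upper()
    # Step 3: collapse runs of whitespace and trim both ends
    return re.sub(r"\s+", " ", text).strip()
```

Normalizing case and whitespace before counting means that variants such as "New" and "new" are counted as the same word in the n-gram tables.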

Predictive Text - Algorithm

  • The algorithm used for predictive text is an N-gram model. It computes P(w|h), the probability of a word w given some history h.
  • The application first looks for a match in the highest-order n-gram (trigram); if none is found, it falls back to the next lower order (bigram).
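
The trigram-then-bigram lookup can be sketched as below. The slides do not say how P(w|h) is estimated, so this sketch assumes plain relative-frequency counts, where the most probable next word is simply the one seen most often after the history h. The names build_model and predict_next are hypothetical.

```python
from collections import Counter, defaultdict

def build_model(tokens, n):
    """Map each (n-1)-word history to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        history = tuple(tokens[i:i + n - 1])
        model[history][tokens[i + n - 1]] += 1
    return model

def predict_next(text, trigrams, bigrams):
    """Return (predicted word, model used), trying the trigram model first."""
    # Uppercase to match the cleaned training data (assumption)
    words = text.upper().split()
    # k is the history length: 2 words for trigrams, 1 word for bigrams
    for name, model, k in (("trigram", trigrams, 2), ("bigram", bigrams, 1)):
        history = tuple(words[-k:])
        if len(words) >= k and history in model:
            # The most frequent follower maximizes count(h, w) / count(h)
            word, _ = model[history].most_common(1)[0]
            return word, name
    return None, None  # no match at either order

# Toy corpus, for illustration only
tokens = "I LOVE NEW YORK I LOVE NEW IDEAS I LIKE NEW YORK".split()
trigrams = build_model(tokens, 3)
bigrams = build_model(tokens, 2)
```

With this toy corpus, predict_next("i love", trigrams, bigrams) is answered by the trigram model, while predict_next("brand new", trigrams, bigrams) has no trigram match and falls back to the bigram model. Returning the model name alongside the word lets an application display which n-gram order produced the prediction.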

Predictive Text - Application

  • Input: The user enters a word or sentence in the left-hand text box
  • Output: The predicted text appears in the right-hand box
  • The application also shows which N-gram model was used to predict the next word

Further Scope

  • The current algorithm is limited to contextual information up to 3-grams
  • To train on a larger dataset, including text from across regions
  • To incorporate clustering of the underlying training corpus and predict the cluster of the typed sentence