Project: Smart Keyboard
Dr. Senthil
26th September 2019
Overview
Target Group: Mobile device users
Objective: A keyboard that predicts the next word as the user types
Methodology: Cleaning, analyzing, and sampling the training data set, then building a predictive text model
Data Cleaning
Data cleaning steps applied to the training dataset:
Remove non-ASCII characters, numbers, and punctuation from the text
Convert lowercase letters to uppercase
Strip extra whitespace
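The cleaning steps above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the function name clean_text is hypothetical, and replacing removed characters with a space (rather than deleting them outright) is an assumption.

```python
import re

def clean_text(text: str) -> str:
    """Apply the listed cleaning steps, in order, to one piece of text."""
    # Step 1a: drop every non-ASCII character.
    text = text.encode("ascii", errors="ignore").decode("ascii")
    # Step 1b: replace digits and punctuation with a space (assumption),
    # keeping only letters and whitespace.
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    # Step 2: convert lowercase letters to uppercase, as the step list states.
    text = text.upper()
    # Step 3: collapse runs of whitespace and trim both ends.
    return " ".join(text.split())
```

For example, clean_text("Héllo, world 42!  ") would return "HLLO WORLD".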
Predictive Text - Algorithm
The algorithm used for predictive text is the N-gram model. It estimates P(w | h), the probability of a word w given some history h.
The application first checks the highest-order n-gram (trigram); if no match is found, it falls back to the next lower order (bigram).
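A minimal sketch of this trigram-then-bigram lookup is shown below. It assumes P(w | h) is estimated from raw counts, count(h followed by w) / count(h), so the most frequent follower of a history is also its most probable next word. The names build_model and predict_next, and the toy corpus, are hypothetical and not taken from the application.

```python
from collections import Counter, defaultdict

def build_model(tokens, n):
    """Map each (n-1)-word history to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        history = tuple(tokens[i:i + n - 1])
        model[history][tokens[i + n - 1]] += 1
    return model

def predict_next(text, trigrams, bigrams):
    """Return (predicted_word, model_name): try the trigram model, then the bigram model."""
    words = text.upper().split()
    for name, model, k in (("trigram", trigrams, 2), ("bigram", bigrams, 1)):
        history = tuple(words[-k:])
        # Use this model only if there are enough words and the history was seen.
        if len(history) == k and history in model:
            word, _count = model[history].most_common(1)[0]
            return word, name
    return None, "none"

# Toy corpus (made up for illustration), already cleaned and uppercased.
tokens = "I LIKE GREEN TEA AND I LIKE GREEN APPLES AND I LIKE GREEN TEA".split()
trigrams = build_model(tokens, 3)
bigrams = build_model(tokens, 2)

print(predict_next("i like green", trigrams, bigrams))  # ('TEA', 'trigram')
print(predict_next("fresh green", trigrams, bigrams))   # ('TEA', 'bigram')
```

Returning the model name alongside the word mirrors how an application could report which N-gram order produced the prediction.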
Predictive Text - Application
Input: The user enters a word or sentence in the left-hand text box
Output: The predicted text appears in the right-hand box
The application also shows which N-gram model was used to predict the next word
Further Scope
The current algorithm is limited to contextual information up to 3-grams
To train on a larger dataset that covers text from across regions
To incorporate clustering of the underlying training corpus and predict the cluster of the sentence typed