Zhenning Xu
11.03
The goal of this project is to create a dashboard (app) to provide an interface that can be used to make predictions based on natural language processing algorithms. This slide deck consists of slides pitching the algorithm and the app.
For this iteration of the project, the data come from a real-world app, SwiftKey (http://swiftkey.com/en/). The purpose of this project is to understand and build predictive text models like those used by SwiftKey.
My app detects the words typed and predicts the most likely next word(s) within seconds. Companies like Microsoft (SwiftKey's current owner) have recently introduced a popular app that uses NLP algorithms to predict the words we will write and offers sentence completion suggestions accordingly. See the following screenshot (https://www.microsoft.com/en-us/swiftkey?rtc=1&activetab=pivot_1:primaryr2):
The app includes the following features:
The data were tokenized three times with RWeka, once each for 1-grams, 2-grams, and 3-grams.
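To make the tokenization step concrete, here is a minimal sketch in base R. It is a stand-in for the RWeka tokenizer used in the project, not the project's actual code: the helper name `make_ngrams` and the sample sentence are assumptions made for illustration.

```r
# Base-R stand-in for an n-gram tokenizer (the project itself uses RWeka).
# `make_ngrams` is a hypothetical helper name, not part of any package.
make_ngrams <- function(text, n) {
  # Lowercase, then split on anything that is not a letter or apostrophe
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < n) return(character(0))
  # Slide a window of width n across the word vector
  starts <- seq_len(length(words) - n + 1)
  vapply(starts, function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Made-up sample text, for illustration only
sample_text <- "the cat sat on the mat and the cat slept"

# Tokenize three times, once per n-gram order, and count frequencies
ngram_counts <- lapply(1:3, function(n) {
  sort(table(make_ngrams(sample_text, n)), decreasing = TRUE)
})
names(ngram_counts) <- c("unigram", "bigram", "trigram")
```

Calling `ngram_counts$bigram` then lists each two-word sequence with its count, which is the kind of frequency table a next-word lookup could search.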
The algorithm predicts the next word from the last 3 text inputs the user entered, searching the 3-grams first. If no next word is found there, it backs off to the 2-grams, then the 1-grams. If nothing is found, it falls back to a “default” of the word most often seen.
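The back-off search described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the app's actual implementation: the function names `best_match` and `predict_next` are hypothetical, the counts are made up, and the sketch assumes a 3-gram lookup conditions on the two preceding words and a 2-gram lookup on the single preceding word. In this toy version, the 1-gram step and the "default" word coincide.

```r
# Toy frequency tables (made-up counts, for illustration only)
trigrams <- c("i love you" = 5, "i love it" = 3, "love you so" = 2)
bigrams  <- c("love you" = 7, "thank you" = 6, "you are" = 4)
unigrams <- c("the" = 50, "you" = 20, "love" = 10)

# Hypothetical helper: most frequent continuation of `prefix` in `freq`, or NA
best_match <- function(prefix, freq) {
  hits <- freq[startsWith(names(freq), paste0(prefix, " "))]
  if (length(hits) == 0) return(NA_character_)
  top <- names(hits)[which.max(hits)]
  # The predicted word is the last token of the winning n-gram
  sub(".* ", "", top)
}

# Hypothetical entry point: back off from 3-grams to 2-grams to 1-grams
predict_next <- function(input) {
  words <- strsplit(tolower(trimws(input)), "\\s+")[[1]]
  words <- words[nzchar(words)]
  n <- length(words)
  # Try the 3-gram table first, using the last two words as context
  if (n >= 2) {
    guess <- best_match(paste(words[(n - 1):n], collapse = " "), trigrams)
    if (!is.na(guess)) return(guess)
  }
  # Back off to the 2-gram table, using only the last word
  if (n >= 1) {
    guess <- best_match(words[n], bigrams)
    if (!is.na(guess)) return(guess)
  }
  # Default: the most frequent single word
  names(unigrams)[which.max(unigrams)]
}
```

With these toy tables, `predict_next("I love")` is resolved by the 3-grams, `predict_next("we love")` backs off to the 2-grams, and `predict_next("hello world")` finds no match and returns the default word.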
Please feel free to browse the Shiny app here: https://utjimmyx.shinyapps.io/shinynlp/.
# Load and display the app screenshot (the path is local to the author's machine)
library(imager)
myimg <- load.image("C:/Users/zxu3/Documents/R/shiny/nlp.png")
plot(myimg)