- The aim is to create a real-time, low-memory data product with an easy-to-use interface for word prediction, similar to SwiftKey.
- The application tries to predict the 5 best-matching words.
- The source data used in this project are text files (news, blogs, tweets) taken from the set of corpora provided by HC Corpora [http://www.corpora.heliohost.org]. Complete details about the data can be found at http://www.corpora.heliohost.org/aboutcorpus.html. The data set can be obtained from Coursera at https://d396qusza40orc.cloudfront.net/dsscapstone/dataset/Coursera-SwiftKey.zip.
- Only English language files are considered for this project.
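
To make the "5 best matching words" goal concrete, here is a minimal sketch of one common approach: count which words follow each word in the training text (a bigram model) and return the most frequent followers. This is an illustrative assumption, not the project's actual model. The function names `build_bigram_model` and `predict_next`, the whitespace tokenization, and the toy sentences are all hypothetical.

```python
from collections import Counter, defaultdict


def build_bigram_model(sentences):
    """Count, for every word, how often each other word directly follows it."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        # Assumption: simple lowercase + whitespace tokenization.
        tokens = sentence.lower().split()
        for current, nxt in zip(tokens, tokens[1:]):
            followers[current][nxt] += 1
    return followers


def predict_next(model, text, k=5):
    """Return up to k most frequent followers of the last word in `text`."""
    tokens = text.lower().split()
    if not tokens:
        return []
    # .get avoids creating an empty entry for unseen words in the defaultdict.
    candidates = model.get(tokens[-1], Counter())
    return [word for word, _ in candidates.most_common(k)]


# Toy training data (hypothetical; the real project uses the corpus files).
toy_sentences = [
    "I love data science",
    "I love data",
    "I love coffee",
    "I like tea",
]
model = build_bigram_model(toy_sentences)
suggestions = predict_next(model, "I love")
print(suggestions)
```

Storing only follower counts per word keeps the lookup fast and the memory footprint small, which matches the stated real-time and low-memory goals. A fuller model would typically also use longer contexts and fall back to shorter ones when a context is unseen.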