Suman
10/7/2017
This is the assignment for the Coursera Data Science Capstone project: predicting the next word of a phrase.
The Capstone is a collaboration between Coursera and SwiftKey.
Solution approach: The solution is done in 3 parts.
Code: The code is available at https://github.com/suman12345678/datasciencecapstone
App: The application can be accessed at https://suman123456.shinyapps.io/shiny/
Presentation: http://rpubs.com/suman12345678/capstoneproj
setwd("C:\\Users\\suman\\Desktop\\datasciencecoursera\\capstone\\shiny")
#nextword<-predictnextword("hello good",2)
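To give a rough idea of what a call like the one above might do, here is a minimal sketch of next-word prediction from bigram counts. It is not the project's actual `predictnextword` implementation: the toy corpus, the helper name `predict_next_sketch`, and the reading of the second argument as "number of suggestions" are all assumptions made for illustration.

```r
# Hypothetical toy corpus; the real app is built from a sample of the capstone data.
corpus <- c("hello good morning",
            "hello good friend",
            "good morning everyone",
            "good morning friend")

# Split each line into lowercase word tokens.
tokens <- strsplit(tolower(corpus), "[^a-z']+")

# Collect every adjacent word pair (bigram) into a two-column data frame.
bigrams <- do.call(rbind, lapply(tokens, function(w) {
  w <- w[nzchar(w)]
  if (length(w) < 2) return(NULL)
  data.frame(first = head(w, -1), second = tail(w, -1),
             stringsAsFactors = FALSE)
}))

# Hypothetical helper (an assumption, not the project's function):
# return up to n of the most frequent words seen after the last word of `phrase`.
predict_next_sketch <- function(phrase, n = 2) {
  words <- strsplit(tolower(phrase), "[^a-z']+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) == 0) return(character(0))
  last <- words[length(words)]
  followers <- bigrams$second[bigrams$first == last]
  if (length(followers) == 0) return(character(0))
  counts <- sort(table(followers), decreasing = TRUE)
  head(names(counts), n)
}

predict_next_sketch("hello good", 2)
```

With such a small corpus most phrases return nothing, which is the same effect a small training sample has on the real application.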
This uses only a sample of 0.15 million from the raw data, so there may be very few phrases to predict from.
The available 8 GB of RAM is enough to hold and process this much data, and the sample size is a balance between speed and size.
The application may be a bit slow, so please be patient.
Stopwords are removed, so users should not expect articles, prepositions, or auxiliary verbs as predicted words.
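The effect of stopword removal can be illustrated with a small base R sketch. The stopword list below is a deliberately tiny, assumed list and `remove_stopwords_sketch` is a hypothetical helper; the project's own cleaning code may use a different list or a text-mining package.

```r
# Assumed, deliberately tiny stopword list; a real list would be much longer.
stopwords_sketch <- c("a", "an", "the", "of", "in", "on", "to",
                      "is", "are", "was", "have")

# Hypothetical helper: lowercase the text, split it into words,
# and drop every word that appears in the stopword list.
remove_stopwords_sketch <- function(text) {
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  words <- words[nzchar(words)]
  words[!(words %in% stopwords_sketch)]
}

remove_stopwords_sketch("The cat is on the mat")
```

Because words such as "the" and "is" never reach the model, they can never come back as predictions.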
Stemming could perhaps be used to help ignore non-English words and very short words.
It could be integrated with Spark to process much larger data.
News and blog data are probably more grammatically correct, so a larger share of the sample should come from those sources.
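The idea of drawing more from some sources than others could look like the sketch below. The three in-memory vectors stand in for the source files, the third source name (`twitter`) and the sampling fractions are assumptions chosen only for illustration, and `sample_source` is a hypothetical helper.

```r
set.seed(42)  # make the random sample reproducible

# Hypothetical stand-ins for the lines read from each source file.
sources <- list(
  news    = paste("news line", 1:100),
  blogs   = paste("blog line", 1:100),
  twitter = paste("tweet", 1:100)
)

# Assumed sampling fractions: news and blogs are weighted more heavily.
fractions <- c(news = 0.4, blogs = 0.4, twitter = 0.1)

# Hypothetical helper: draw a random fraction of the lines from one source.
sample_source <- function(lines, fraction) {
  n <- round(length(lines) * fraction)
  sample(lines, n)
}

# Sample each source with its own fraction and combine the results.
training_sample <- unlist(
  Map(sample_source, sources, fractions[names(sources)]),
  use.names = FALSE
)

length(training_sample)
```

The fractions would need tuning against the memory limit and prediction quality; the values here are placeholders.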