Baguinebie Bazongo
21/04/2015
The purpose of our project is to build a model that predicts a single word given the previous words or phrase entered as input.
The steps we took to achieve this purpose were:
We collected training data from publicly available sources such as newspapers, personal blogs and Twitter;
We selected a random sample from the training data sets;
We cleaned the data with the R tm package and built 1-gram, 2-gram and 3-gram data.frames;
We applied maximum likelihood estimation to compute individual n-gram probabilities and selected the word with the highest probability.
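The last two steps can be sketched in base R as follows. This is only a minimal illustration, assuming a toy three-sentence corpus in place of the real sample and skipping the tm cleaning step. The function names (make_ngrams, count_ngrams, predict_next) are hypothetical and not taken from the project. The maximum likelihood estimate used here is P(w3 | w1 w2) = count(w1 w2 w3) / count(w1 w2).

```r
# Toy corpus standing in for the sampled, cleaned text (hypothetical data).
corpus <- c("i love new york", "i love new books", "i love old books")

# Split each line on spaces and build all n-grams of order n.
make_ngrams <- function(lines, n) {
  unlist(lapply(strsplit(lines, " ", fixed = TRUE), function(tokens) {
    if (length(tokens) < n) return(character(0))
    starts <- seq_len(length(tokens) - n + 1)
    vapply(starts,
           function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
}

# Count n-grams and store them in a data.frame.
count_ngrams <- function(lines, n) {
  counts <- table(make_ngrams(lines, n))
  data.frame(ngram = names(counts),
             count = as.integer(counts),
             stringsAsFactors = FALSE)
}

bigrams  <- count_ngrams(corpus, 2)
trigrams <- count_ngrams(corpus, 3)

# Maximum likelihood estimate: count(w1 w2 w3) / count(w1 w2).
# Returns the most probable next word, or NA if the history is unseen.
predict_next <- function(w1, w2) {
  history <- paste(w1, w2)
  prefix  <- paste0(history, " ")
  hits <- trigrams[startsWith(trigrams$ngram, prefix), ]
  if (nrow(hits) == 0) return(NA_character_)
  history_count <- bigrams$count[bigrams$ngram == history]
  hits$prob <- hits$count / history_count
  best <- hits[which.max(hits$prob), ]
  sub(prefix, "", best$ngram, fixed = TRUE)
}

predict_next("i", "love")  # returns "new" (probability 2/3)
```

An unseen history such as predict_next("foo", "bar") returns NA in this sketch; a fuller model would presumably fall back to the 2-gram or 1-gram tables in that case.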
The model we built uses 2 inputs to produce 2 outputs:
The inputs are:
The outputs are: