To keep the scope of the project managable, I only concidered so called Markov models. They are a class of probabilistic models that assume we can predict the probability of some future unit without looking too far into the past. I based my model on the stupid backoff -algorithm. Despite its name, it actually performs quite well given very large data. Actually, almost as well as some more complex models.
Stupid backoff -algorithm centers around n-grams. They mean contiguous word sequences of length n. Selection of the size depends on the genre of text you are trying to predict. Higher n-grams are not always preferable for prediction. Also the computing and storage needs grow expotentially with that parameter. I chose to let user choose n-grams of lengts one, two and three,… up to twinty. This means that the predictions can be based on maximum 19 previous words.