Future Enhancements
Several enhancements would add considerable lift to this algorithm and are opportunities for future development:
1. Partial word completion - The data is already processed and available for this addition, but it would take time to implement an additional character-based token search for the partially typed word, alongside the word-based token search in the current implementation. My customized “Katz's back-off model” would then need to be re-tuned.
2. Expansion to include a much larger corpus and additional topic areas (e.g. Wikipedia, Google).
3. Inclusion of spelling auto-correction for mistyped words.
4. Inclusion of word auto-capitalization for proper nouns.
5. Features that can be added at a cost. These would develop a revenue stream ($$) as the user base grows! :)
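As a rough illustration of enhancement 1, partial word completion could be layered on top of the existing word-based prediction by filtering its scored candidates against the characters typed so far. The sketch below is only an assumption about how that might look: the function name `complete_partial_word`, the `candidates` dictionary, and the example scores are hypothetical and do not come from the current implementation.

```python
def complete_partial_word(candidates, prefix, k=3):
    """Return up to k candidate words that start with `prefix`, best score first.

    `candidates` is a hypothetical mapping of word -> score, standing in for
    whatever the word-based back-off model already produces for the context.
    """
    prefix = prefix.lower()
    # Character-based step: keep only words matching the partially typed word.
    matches = [(word, score) for word, score in candidates.items()
               if word.lower().startswith(prefix)]
    # Highest score first; ties broken alphabetically for stable output.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in matches[:k]]


# Made-up scores for illustration only.
scores = {"the": 0.30, "that": 0.20, "this": 0.15, "to": 0.25, "a": 0.10}

print(complete_partial_word(scores, "th"))      # ['the', 'that', 'this']
print(complete_partial_word(scores, "t", k=2))  # ['the', 'to']
```

An empty prefix falls back to the ordinary top-k word prediction, so the same function could serve both cases. How the scores should be re-weighted once a prefix is known is exactly the re-tuning step mentioned in item 1 and is not addressed here.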