Data Science Capstone Final Project

Olusola Afuwape
21st February, 2020

Overview

The goal of the data science capstone final project is to create an application for word prediction. The application consists of a prediction algorithm and a user interface for running it. The interface is hosted on Shiny.

Algorithm

The algorithm uses the Coursera SwiftKey text data files, specifically the news.txt and twitter.txt files. Samples drawn from these files were used for word prediction. Data cleansing and n-gram tools were applied to build the unigram, bigram, trigram and quadgram tables. The algorithm feeds an interface for text input and word options.
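The cleansing and n-gram counting steps can be illustrated with a minimal Python sketch. It is not the project's actual code, which is presumably written in R. The cleaning rules, the function names `clean` and `ngram_counts`, and the three sample sentences are assumptions made for illustration only.

```python
import re
from collections import Counter

def clean(text):
    """Lowercase the text, replace anything that is not a letter,
    apostrophe or space with a space, then split into tokens.
    (Assumed cleaning rules; the project's own rules are not stated.)"""
    text = text.lower()
    text = re.sub(r"[^a-z' ]+", " ", text)
    return text.split()

def ngram_counts(lines, n):
    """Count every run of n consecutive tokens across all lines."""
    counts = Counter()
    for line in lines:
        tokens = clean(line)
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

# Hypothetical stand-in for a sample drawn from news.txt and twitter.txt.
sample = ["I love data science.", "I love the news!", "Data science is fun."]

# One frequency table per order: 1 = unigram ... 4 = quadgram.
tables = {n: ngram_counts(sample, n) for n in range(1, 5)}
```

Each table maps a tuple of words to the number of times it occurs, so `tables[2]` holds the bigram frequencies and `tables[4]` the quadgram frequencies.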

Application description

The application is made up of an interface for text input and text prediction. The interface consists of a text box for the typed word or words and another text field that displays the predictions. When a user types a word into the text box, the application displays a drop-down list of word options, and the user chooses one by clicking on it. There is also a button or tab to clear the predicted word. The application also displays the n-gram, frequency and proportion of each predicted word.
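How the word options and their n-gram, frequency and proportion might be derived can be sketched as follows, again in Python and purely as an illustration. The source does not say how candidates are selected, so the back-off from the longest matching context to shorter ones is an assumption, as are the names `build_model` and `predict` and the toy corpus.

```python
from collections import Counter, defaultdict

def build_model(sentences, max_n=4):
    """Map each context (tuple of preceding words) to a Counter of
    the words that follow it, for bigrams up to max_n-grams."""
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(2, max_n + 1):
            for i in range(len(tokens) - n + 1):
                context = tuple(tokens[i:i + n - 1])
                model[context][tokens[i + n - 1]] += 1
    return model

def predict(model, typed, max_n=4, top=3):
    """Return up to `top` tuples of (word, n-gram order, frequency,
    proportion), trying the longest context first (assumed back-off)."""
    tokens = typed.lower().split()
    for k in range(min(max_n - 1, len(tokens)), 0, -1):
        context = tuple(tokens[-k:])
        if context in model:
            following = model[context]
            total = sum(following.values())
            return [(word, k + 1, freq, freq / total)
                    for word, freq in following.most_common(top)]
    return []  # no known context matched

# Hypothetical toy corpus standing in for the sampled text data.
corpus = ["i love data science", "i love data",
          "i love the news", "data science is fun"]
model = build_model(corpus)
suggestions = predict(model, "I love")
```

With this toy corpus, "I love" matches a trigram context, so each suggestion reports order 3 along with how often the word followed that context and its share among all words seen after it. These are the kinds of values the drop-down list and the accompanying display would show.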

Graphical visualization

[Figure: plot generated from the news data]