Final Project Submission: Next Word Prediction

Norhan Osama Abdallah
June 6th, 2019

Instructions

The goal of this exercise is to create a product to highlight the prediction algorithm that you have built and to provide an interface that can be accessed by others. For this project you must submit:

Data Used in Project: Capstone Dataset

The corpora are collected from publicly available sources by a web crawler. The crawler checks for language, so as to mainly get texts consisting of the desired language.

  • Because the data set is large, this project uses a random 5% sample of the Blogs, Twitter, and News data.

  • For easier analysis, numbers and punctuation are removed from the data.
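
The two preparation steps above can be sketched roughly as follows. This is a minimal Python illustration, not the app's actual code: the function names `sample_lines` and `clean_line`, the fixed seed, the lowercasing, and the handling of apostrophes are all assumptions made for the example.

```python
import random
import re


def sample_lines(lines, fraction=0.05, seed=42):
    """Keep each line independently with probability `fraction`.

    The seed is fixed only so the sample is reproducible (an assumption).
    """
    rng = random.Random(seed)
    return [line for line in lines if rng.random() < fraction]


def clean_line(line):
    """Remove numbers and punctuation from one line of text.

    Lowercasing and dropping apostrophes (so "don't" becomes "dont")
    are assumptions; the source only mentions numbers and punctuation.
    Non-ASCII letters are also dropped by this simple pattern.
    """
    line = line.lower()
    line = line.replace("'", "")           # keep contractions as one token
    line = re.sub(r"[^a-z\s]", " ", line)  # digits and punctuation become spaces
    return " ".join(line.split())          # collapse repeated whitespace
```

In this sketch the sample size is only approximately 5% of the input, because each line is kept or dropped independently. Using `random.sample` instead would give an exact count.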

Prediction model

  • Create an N-gram algorithm. 2-gram, 3-gram, and 4-gram models are used as the basis for the analysis.

  • Count the frequency of each N-gram and take the top three as the predictions.
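
The counting and lookup described above might look like the Python sketch below. The source does not say how the three N-gram orders are combined, so the back-off order used here (try the 4-gram table first, then 3-gram, then 2-gram) is an assumption, as are the names `build_ngram_table` and `predict_next`.

```python
from collections import Counter, defaultdict


def build_ngram_table(sentences, n):
    """Map each (n-1)-word prefix to a Counter of the words that follow it."""
    table = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - n + 1):
            prefix = tuple(words[i:i + n - 1])
            table[prefix][words[i + n - 1]] += 1
    return table


def predict_next(text, tables, top_k=3):
    """Return up to `top_k` most frequent next words for `text`.

    `tables` maps n (2, 3, 4) to the table built for that order.
    Assumption: longer contexts are tried first, backing off to
    shorter ones when the prefix was never seen.
    """
    words = text.lower().split()
    for n in sorted(tables, reverse=True):
        if len(words) < n - 1:
            continue  # not enough words typed for this order
        prefix = tuple(words[-(n - 1):])
        followers = tables[n].get(prefix)
        if followers:
            return [word for word, _ in followers.most_common(top_k)]
    return []
```

A small usage example: after building `tables = {n: build_ngram_table(corpus, n) for n in (2, 3, 4)}` from a list of cleaned sentences, `predict_next("i love new", tables)` returns the most frequent continuations of that three-word context, or falls back to shorter contexts if it was never seen.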

Final output

Enter a sentence in the text input. The result appears in the right panel.

You can find the app at the link below:
https://bnaly.shinyapps.io/predict-words/
