Capstone Project Presentation

Prabeeti Bulani

12/26/2019

Introduction

The goal of this exercise is to build a product that showcases a next-word prediction algorithm for English text and to provide an interface that others can access. This presentation is the final step of the Capstone project for the Data Science Specialization offered by Johns Hopkins University through Coursera.

Data Exploration and Building the N-gram Model

The data (English text) is drawn from news, Twitter, and blog sources and is used to build the prediction algorithm. It was provided by SwiftKey for this assignment. The following steps were taken:
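As a rough illustration of what building an n-gram model involves, the sketch below lowercases text, splits it into word tokens, and counts n-grams. It is written in Python for illustration only; the function names `tokenize` and `ngram_counts`, the tokenization rule, and the sample sentences are assumptions, not taken from the project.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(lines, n):
    """Count every n-gram (as a tuple of tokens) within each line of text."""
    counts = Counter()
    for line in lines:
        tokens = tokenize(line)
        # Slide a window of width n over the tokens of this line only,
        # so n-grams never span two separate lines.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Hypothetical sample input, standing in for lines of the real corpus.
sample = ["I love data science.", "I love R!"]
bigram_counts = ngram_counts(sample, 2)
print(bigram_counts.most_common(1))
```

Counting within each line keeps unrelated sentences from contributing spurious n-grams; a real pipeline would likely also sample the corpus and filter unwanted tokens.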

Word Prediction

The Katz back-off model is used for next-word prediction. It is a generative n-gram language model that estimates the conditional probability of a word given its history in the n-gram, backing off to progressively shorter histories when a longer one has not been observed. The process works as follows:
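The back-off idea can be sketched in a few lines of Python. This is a simplified illustration, not the project's implementation: it covers only bigrams backing off to unigrams, and it uses a fixed absolute discount in place of the Good-Turing discounting that the Katz model is usually described with. The names `train`, `backoff_prob`, `predict_next`, and the `discount` value of 0.5 are assumptions.

```python
from collections import Counter, defaultdict


def train(tokens):
    """Collect unigram counts and, for each word, counts of the words that follow it."""
    unigrams = Counter(tokens)
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        bigrams[prev][nxt] += 1
    return unigrams, bigrams


def backoff_prob(word, prev, unigrams, bigrams, discount=0.5):
    """Estimate P(word | prev) with a discounted bigram and unigram back-off."""
    total = sum(unigrams.values())
    followers = bigrams.get(prev)
    if not followers:
        # The history was never seen with a follower: use the unigram estimate.
        return unigrams[word] / total
    context_count = sum(followers.values())
    if word in followers:
        # Seen bigram: discounted relative frequency.
        return (followers[word] - discount) / context_count
    # Probability mass freed by discounting the seen bigrams.
    alpha = discount * len(followers) / context_count
    # Unigram mass of the words never seen after `prev`, used to renormalize.
    unseen_mass = sum(c for w, c in unigrams.items() if w not in followers) / total
    if unseen_mass == 0:
        return 0.0
    return alpha * (unigrams[word] / total) / unseen_mass


def predict_next(prev, unigrams, bigrams, discount=0.5):
    """Return the vocabulary word with the highest back-off probability after `prev`."""
    return max(unigrams, key=lambda w: backoff_prob(w, prev, unigrams, bigrams, discount))


# Hypothetical toy corpus.
unigrams, bigrams = train("the cat sat on the mat the cat ran".split())
print(predict_next("the", unigrams, bigrams))
```

The discount removes a little probability from every observed bigram, and that freed mass is redistributed over unseen words in proportion to their unigram frequency, so the estimates for a given history still sum to one.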

Shiny App Screenshot