Prabeeti Bulani
12/26/2019
The goal of this exercise is to build a product that highlights the prediction algorithm for English text and to provide an interface that others can access. This presentation is the final step in the Capstone project for the Data Science Specialization offered by Johns Hopkins University through Coursera.
Source data for Project: https://d396qusza40orc.cloudfront.net/dsscapstone/dataset/Coursera-SwiftKey.zip
Source code on GitHub: https://github.com/prabeeti/CapstoneProject
For this project I have submitted the following:
The data (English text) is drawn from news, Twitter, and blogs and is used to build the prediction algorithm. SwiftKey provided this data for the assignment. The following steps were taken:
The Katz back-off model is used for next-word prediction. It is a generative n-gram language model that estimates the conditional probability of a word given its history in the n-gram. The process works as follows:
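As a rough illustration of the back-off idea, here is a minimal Python sketch. It is not the project's code: it assumes a bigram model that backs off to unigrams, and it swaps the Good-Turing discounting of the original Katz formulation for a fixed absolute discount to stay short. The names train, backoff_prob, predict_next, and the discount value 0.5 are all hypothetical.

```python
from collections import Counter, defaultdict


def train(tokens):
    """Count unigrams and bigrams, and record which words follow each history."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    followers = defaultdict(set)
    for w1, w2 in bigrams:
        followers[w1].add(w2)
    return unigrams, bigrams, followers


def backoff_prob(word, prev, unigrams, bigrams, followers, discount=0.5):
    """Estimate P(word | prev), backing off to unigrams for unseen bigrams.

    Assumption: a fixed absolute discount stands in for Good-Turing discounts.
    """
    total = sum(unigrams.values())
    seen = followers.get(prev, set())
    if not seen:
        # The history was never observed: fall back to the unigram estimate.
        return unigrams[word] / total
    hist_count = sum(bigrams[(prev, w)] for w in seen)
    if word in seen:
        # Observed bigram: use the discounted relative frequency.
        return (bigrams[(prev, word)] - discount) / hist_count
    # Unseen bigram: share the mass freed by discounting among unseen words,
    # in proportion to their unigram counts.
    alpha = discount * len(seen) / hist_count
    unseen_mass = sum(c for w, c in unigrams.items() if w not in seen)
    if unseen_mass == 0:
        return 0.0
    return alpha * unigrams[word] / unseen_mass


def predict_next(prev, unigrams, bigrams, followers, k=3):
    """Return the k most probable next words after prev."""
    scores = {w: backoff_prob(w, prev, unigrams, bigrams, followers)
              for w in unigrams}
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Toy corpus, for illustration only.
tokens = "i like tea i like coffee i drink tea".split()
model = train(tokens)
suggestions = predict_next("i", *model)
```

On the toy corpus, "like" follows "i" twice out of three times, so it ranks first among the suggestions. A production model would typically extend the same idea to trigrams and higher orders, backing off one order at a time.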