Final Report of the SwiftKey Capstone Project

Geeta Nain
January 25, 2016

Introduction

A Shiny application was developed for next-word prediction as the final assignment of the Capstone project in the Data Science Specialization.

This presentation highlights the functionality of the Shiny app developed during the course project.

Dataset and summary overview

This application is based on the SwiftKey dataset, which can be downloaded from this link [https://d396qusza40orc.cloudfront.net/dsscapstone/dataset/Coursera-SwiftKey.zip].

-The milestone report [http://api.rpubs.com/gnain/139327] provides summary statistics of the dataset used to build the predictive text model.

-The final Shiny app was built on the n-grams extracted from the data and a probability-calculating algorithm.
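As an illustration of what "n-grams extracted from the data" means, here is a minimal Python sketch that counts 1-, 2-, and 3-grams in a tokenized text. The function name `ngram_counts` and the toy sentence are assumptions made for this example. The app itself was built with R and Shiny, and its actual extraction code is not shown in this report.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Toy sentence standing in for the SwiftKey text (assumption, not the real data).
tokens = "i love data science and i love shiny apps".split()

unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)
trigrams = ngram_counts(tokens, 3)

# Show the most frequent bigram and its count.
print(bigrams.most_common(1))
```

Each table maps a tuple of words to the number of times it occurs, which is the raw material a probability model needs.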

Concept behind the probability calculation algorithm

-Prediction is made by ranking the probabilities of candidate words based on the last 3 words, 2 words, and 1 word.

-A combination of 3-grams, 2-grams, and 1-grams is therefore used for the final prediction of the next word, based on Good-Turing smoothing and multiple likelihood model techniques.
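The ranking idea above can be sketched as a simple back-off lookup: try the longest available context first and fall back to shorter ones when it has not been seen. This is a hedged illustration, not the app's implementation. The names `build_model` and `predict` are hypothetical, the corpus is a toy sentence, and plain relative frequencies are used in place of Good-Turing smoothing to keep the example short. The final fallback to overall word frequency when no context matches is also an assumption.

```python
from collections import Counter, defaultdict

def build_model(tokens, max_context=3):
    """Map each context (0 to max_context preceding words) to next-word counts."""
    model = defaultdict(Counter)
    for i, word in enumerate(tokens):
        for k in range(max_context + 1):
            if i - k < 0:
                break
            model[tuple(tokens[i - k:i])][word] += 1
    return model

def predict(model, text, top_k=5, max_context=3):
    """Return up to top_k (word, probability) pairs, most probable first."""
    words = text.lower().split()
    # Try the last 3 words, then 2, then 1, then no context at all.
    for k in range(min(max_context, len(words)), -1, -1):
        context = tuple(words[len(words) - k:])
        followers = model.get(context)
        if followers:
            total = sum(followers.values())
            # Sort by descending count, breaking ties alphabetically.
            ranked = sorted(followers.items(), key=lambda kv: (-kv[1], kv[0]))
            return [(w, c / total) for w, c in ranked[:top_k]]
    return []

# Toy corpus (assumption, not the SwiftKey data).
tokens = "i love data science and i love shiny apps and i love data".split()
model = build_model(tokens)

print(predict(model, "i love"))          # uses the 2-word context
print(predict(model, "we really love"))  # backs off to the 1-word context
```

Returning the five best candidates mirrors the app's display of one predicted word plus four alternatives. A smoothed model would replace the raw counts with discounted ones before normalizing.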

How does the app work?

  • Type an input sentence in the user input box
  • Wait until processing reaches 100%
  • Click "Update word cloud" to see the relative probabilities of the predicted words
  • The predicted word is shown on the dashboard, along with the next 4 top results in decreasing order of probability

This app can be further used to explore long-distance discrepancies in sentences.