1. Capstone Project

Marcelo Gomes Gadelha
10/28/2019

2. The Objective

  • This presentation gives an overview of the final project of the Data Science Specialization.
  • The main objective of this project is to create an application with a predictive text model.
  • For this task, it was necessary to create an application that receives a phrase and returns a predicted word.

3. Methodology

  • A predictive text model was created using texts from Twitter, blogs, and news.
  • An exploratory analysis was performed to understand the size and distribution of the data.
  • Then n-grams (bigrams, trigrams, and quadgrams) were created to understand the most common relationships between words.
  • The model was optimized to return results efficiently and accurately, based on the statistics and the number of n-grams.
  • Finally, a Shiny application was created to take a sentence and return a predicted word.
  • The user interface of this application was designed in the style of a web search page.
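
The n-gram step above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the project's code (the application itself is built with Shiny); the toy corpus and the helper names `tokenize` and `count_ngrams` are assumptions made for this example.

```python
from collections import Counter


def tokenize(text):
    # Simplistic tokenizer (assumption): lowercase and split on whitespace.
    # A real pipeline would likely also clean punctuation and other noise.
    return text.lower().split()


def count_ngrams(sentences, n):
    """Count every run of n consecutive words across the sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Toy corpus standing in for the Twitter, blog, and news texts.
corpus = [
    "thanks for the follow",
    "thanks for the help",
    "thanks for the follow back",
]

bigrams = count_ngrams(corpus, 2)
trigrams = count_ngrams(corpus, 3)
quadgrams = count_ngrams(corpus, 4)

# The most frequent n-grams show the most common word relationships.
print(trigrams.most_common(2))
```

Counting n-grams this way makes it easy to see which word most often follows a given context, which is the information a predictive model of this kind relies on.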

4. Application

  • (1) First, enter the phrase in the text field,
  • (2) then click the “Go!” button or press “Space” on the keyboard,
  • (3) the predicted word is shown on the right.
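
One way the lookup behind these steps could work is sketched below, again in Python for illustration rather than as the application's actual code. It assumes a simple back-off strategy: try the last three words of the phrase first, then the last two, then the last one. The source does not state which strategy the app uses, and the function `predict_next` and the tiny `tables` dictionary are hypothetical.

```python
def predict_next(phrase, tables):
    """Return a predicted word for the phrase, or None if nothing matches.

    `tables` maps a context length k to a dict of
    {tuple of k words: predicted word}. This layout is an assumption.
    """
    tokens = phrase.lower().split()
    # Back off from the longest context (3 words) to the shortest (1 word).
    for k in (3, 2, 1):
        if len(tokens) >= k:
            context = tuple(tokens[-k:])
            word = tables.get(k, {}).get(context)
            if word is not None:
                return word
    return None


# Hypothetical lookup tables; a real model would derive them from n-gram counts.
tables = {
    3: {("thanks", "for", "the"): "follow"},
    2: {("for", "the"): "first"},
    1: {("the",): "same"},
}

print(predict_next("Thanks for the ", tables))
print(predict_next("waiting for the", tables))
```

Longer contexts are tried first because they are more specific; shorter contexts act as a fallback when the longer one was never seen.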

Application Screenshot

5. Additional Information

  • The app is hosted on shinyapps.io:

https://marcelox3010.shinyapps.io/nextword

  • The code for this application and this presentation is hosted on GitHub:

https://github.com/marcelox3010/data_science_capstone_johns_hopkins_coursera