28/9/2020

Problem Description

Around the world, people are spending an increasing amount of time on their mobile devices for email, social networking, banking and a whole range of other activities. But typing on mobile devices can be a serious pain. For that reason, smart keyboards are widely used.

In this Capstone Project, the goal is to create a prediction model for the next word in a sentence. To do that, we have three files with English sentences from Twitter, blogs and news. Because these three files are very large, I use only a 5% sample of each of them.
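As a rough illustration of the sampling step, the sketch below draws 5% of the lines from a list. It is written in Python for illustration only and is not the project's code. The function name `sample_lines`, the fixed seed, and the choice of a random (rather than, say, first-5%) sample are all assumptions, since the slides do not say how the subset was taken.

```python
import random

def sample_lines(lines, fraction=0.05, seed=42):
    """Return a reproducible random subset holding `fraction` of the lines.

    Hypothetical helper: the name, the seed and the random-sampling
    strategy are assumptions, not details taken from the project.
    """
    rng = random.Random(seed)          # local generator, so the global state is untouched
    k = round(len(lines) * fraction)   # number of lines to keep
    return rng.sample(lines, k)        # sampling without replacement

# Stand-in data; the real input would be the lines of one of the three files.
lines = [f"line {i}" for i in range(1000)]
subset = sample_lines(lines)
print(len(subset))
```

Fixing the seed keeps the subset identical between runs, which makes results easier to compare while the model is being developed.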

How does the algorithm work?

To make a prediction, the algorithm follows these steps:

  • Transform the sentence (convert to lowercase, remove punctuation, etc.)
  • Take the last words of the cleaned sentence
  • Find the word that most often follows these last words

Example:

\(\rightarrow\) Input: “Hi, I’m your best”

\(\rightarrow\) Transform: “hi im your best”

\(\rightarrow\) Last Words: “your best”

\(\rightarrow\) Prediction: “friend”
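The steps and the example above can be sketched in a few lines using only a standard library. This is a minimal Python illustration, not the project's implementation: the function names `clean`, `build_model` and `predict`, the fixed context of two words, and the three-sentence toy corpus are all assumptions made up for the example.

```python
import re
from collections import Counter, defaultdict

def clean(sentence):
    """Lowercase, drop everything except letters and whitespace, split into words."""
    return re.sub(r"[^a-z\s]", "", sentence.lower()).split()

def build_model(lines, n=2):
    """Count which word follows each run of n consecutive words."""
    followers = defaultdict(Counter)
    for line in lines:
        words = clean(line)
        for i in range(len(words) - n):
            followers[tuple(words[i:i + n])][words[i + n]] += 1
    return followers

def predict(model, sentence, n=2):
    """Return the most common follower of the last n words, or None if unseen."""
    context = tuple(clean(sentence)[-n:])
    counts = model.get(context)
    return counts.most_common(1)[0][0] if counts else None

# Toy corpus, invented for illustration; the project uses samples of the
# Twitter, blogs and news files instead.
toy_corpus = [
    "You are your best friend.",
    "She is your best friend!",
    "Do your best work today.",
]
model = build_model(toy_corpus)
prediction = predict(model, "Hi, I’m your best")
print(prediction)
```

In this toy corpus, "your best" is followed by "friend" twice and by "work" once, so the sketch returns "friend". When the last words never appear in the corpus, it returns `None`; a fuller version might fall back to a shorter context in that case.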

How does the app work?

The app is very user-friendly. You only need to:

  • Enter an incomplete sentence
  • Click on submit
Shiny App


Final Comments

In this project, I tried a “from scratch” approach: I didn’t use any library other than the base package. I know that there are libraries for working with n-grams that would give better accuracy, but that would have been the easy way, and I wanted to do something different and raise the difficulty level. I hope I got it right.