Final Project Submission - Text Input Prediction

Carlos Rios
11/20/2020

Welcome to the Final Project!

1- Description of the Project.

The goal of this exercise is to create a product to highlight the prediction algorithm that you have built and to provide an interface that can be accessed by others.

2- Cleansing the Data.

First of all we need to:

  • Convert all phrases to lower-case
  • Remove string chains
  • Replace all non-alphanumeric characters with a space
  • Remove excess spaces
  • Remove missing data
  • Split the text at spaces to get a 1-gram dictionary.
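The steps above can be sketched roughly as follows. This is only a minimal illustration in Python, not the code used in the project: the function names clean_lines and unigram_counts are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits (only English text is used), and the "string chains" step is left out because the list does not say exactly what it covers.

```python
import re

def clean_lines(lines):
    """Apply the cleansing steps to an iterable of raw text lines."""
    cleaned = []
    for line in lines:
        if line is None:                            # remove missing data
            continue
        text = line.lower()                         # convert to lower-case
        text = re.sub(r"[^a-z0-9]", " ", text)      # non-alphanumeric -> space (ASCII assumed)
        text = re.sub(r"\s+", " ", text).strip()    # remove excess spaces
        if text:                                    # drop lines left empty after cleaning
            cleaned.append(text)
    return cleaned

def unigram_counts(cleaned):
    """Split each cleaned line at spaces and count the 1-grams."""
    counts = {}
    for text in cleaned:
        for word in text.split(" "):
            counts[word] = counts.get(word, 0) + 1
    return counts

# Small made-up sample, only to show the shape of the result.
sample = ["Hello,   World!!", None, "   ", "Hello again: it's 2020."]
cleaned = clean_lines(sample)
counts = unigram_counts(cleaned)
```

Here cleaned holds one normalized string per usable input line, and counts maps each word to how often it appears, which is one simple way to represent a 1-gram dictionary.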

Libraries are available to tokenize the text (omitting stopwords). For Twitter text we could use the function tokenize_tweets().

Exploratory Analysis and Explanation

It is possible to work with different sources, like Twitter.

The data file. In order to build a function that can provide word prediction, a predictive model is needed. Such models use known content to predict unknown content. For this package, that content comes from the HC Corpora collection, which is “a collection of corpora for various languages freely available to download.”

The version used was obtained from an archive maintained at Coursera. The file included three text document collections (blogs, news feeds, and tweets) in four languages (German, English, Finnish, and Russian), of which only the English collections were used.

Details

Prediction Model. An n-gram is “a contiguous sequence of n items from a given sequence of text or speech.” This package takes a key word or phrase, matches that key against the first n-1 words of the n-word terms in a term-document matrix (TDM), picks the most frequent matching term, and returns the nth word of that term.
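The lookup described above can be illustrated with a small sketch. It is written in Python as an assumption-laden example, not as the package's actual code: the names build_model and predict_next are hypothetical, a plain dictionary of counts stands in for the TDM, and the three sample sentences are made up.

```python
from collections import Counter, defaultdict

def build_model(sentences, n):
    """Count every n-word term, grouped by its first n-1 words."""
    if n < 2:
        raise ValueError("n must be at least 2")
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - n + 1):
            prefix = tuple(words[i:i + n - 1])      # the first n-1 words (the key)
            model[prefix][words[i + n - 1]] += 1    # tally the nth word
    return model

def predict_next(model, phrase, n):
    """Return the nth word of the most frequent term matching the key, or None."""
    key = tuple(phrase.lower().split()[-(n - 1):])  # last n-1 words of the input
    if key not in model:
        return None                                 # no n-word term starts with this key
    return model[key].most_common(1)[0][0]

# Made-up, already-cleaned sample text.
corpus = [
    "thank you very much",
    "thank you very much indeed",
    "thank you so much",
]
trigram_model = build_model(corpus, 3)
suggestion = predict_next(trigram_model, "Thank you", 3)
```

With this sample, "thank you very" occurs twice and "thank you so" once, so the key "thank you" yields "very". A key that never appears in the counts returns None, which is where a fuller model could fall back to shorter n-grams.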

You can find the app in the link below. : )

Thank you very much!! - Muchas gracias!!