2022-10-09

Introduction

  • Mobile phones are increasingly used to communicate through emails, text messages, and/or social media
  • To make typing on mobile phones easier, smart keyboards that use models to predict the next word have been developed
  • Creating such an app is the Capstone project of Coursera’s Data Science Specialization
  • This project required researching NLP (Natural Language Processing) techniques for processing text
  • The project deliverable is a prediction model that uses the SwiftKey data files to predict a user’s next word
  • The SwiftKey data used can be found here.
  • For information on the raw data and text processing methods I used, see the Milestone Report

Markov Chain Models with Back-Off

  • Markov chain models use n-grams - sequences of n consecutive words - to predict the next word
  • A typical algorithm looks up the most probable next word in the largest n-gram model that the entered text can fill
  • If no prediction is found, the algorithm falls back to successively smaller n-gram models until a word is found
  • This is known as the Back-Off method
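
The back-off idea above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind my app: the function names (`build_models`, `predict_backoff`), the toy corpus, and the choice of picking the most frequent follower by raw count are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def build_models(lines, max_n=4):
    """Count the words that follow every context of 1..max_n-1 words."""
    # models[k] maps a k-word context tuple to a Counter of following words
    # (so models[k] corresponds to the (k+1)-gram model).
    models = {k: defaultdict(Counter) for k in range(1, max_n)}
    for line in lines:
        words = line.lower().split()
        for k in range(1, max_n):
            for i in range(len(words) - k):
                context = tuple(words[i:i + k])
                models[k][context][words[i + k]] += 1
    return models

def predict_backoff(models, text):
    """Try the longest usable context first, then back off to shorter ones."""
    words = text.lower().split()
    for k in sorted(models, reverse=True):
        if len(words) < k:
            continue  # not enough words entered for this model
        followers = models[k].get(tuple(words[-k:]))
        if followers:
            return followers.most_common(1)[0][0]
    return None  # no model has seen any suffix of the input

# Hypothetical toy corpus, only for illustration.
corpus = [
    "i want to go home",
    "i want to go out",
    "i want to go home now",
    "we need to talk",
]
models = build_models(corpus)
print(predict_backoff(models, "i want to go"))
print(predict_backoff(models, "they need to"))
print(predict_backoff(models, "hello"))
```

In the second call the three-word context "they need to" was never seen, so the sketch backs off to the two-word context "need to" before it finds a prediction.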

Testing

  • Test data was processed using the same methods as training data
  • 20 tests were run - each test randomly selecting 50 lines from the test data - for a total of 1000 individual tests
  • Input was processed by every n-gram model whose order was equal to or less than the number of words entered
  • Results were compared to the actual next word in the test data
  • Sample results for one 50-line test are shown below
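
The sampling scheme described above could look roughly like the following sketch. It assumes that the last word of each sampled line is held out as the "actual next word" and that lines are drawn without replacement within a run; the function name `sample_test_cases` and the placeholder test lines are hypothetical.

```python
import random

def sample_test_cases(test_lines, lines_per_run=50, runs=20, seed=0):
    """Return a list of runs; each run is a list of (input_words, actual_word)."""
    rng = random.Random(seed)  # fixed seed so the sampling is repeatable
    all_runs = []
    for _ in range(runs):
        picked = rng.sample(test_lines, lines_per_run)
        cases = []
        for line in picked:
            words = line.split()
            if len(words) < 2:
                continue  # need at least one input word and one word to predict
            # Assumption: hold out the final word as the word to predict.
            cases.append((words[:-1], words[-1]))
        all_runs.append(cases)
    return all_runs

# Placeholder data standing in for the processed test set.
test_lines = [f"line {i} has some words" for i in range(200)]
test_runs = sample_test_cases(test_lines)
total_cases = sum(len(cases) for cases in test_runs)
print(len(test_runs), total_cases)
```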

Testing (Cont’d)

  • The first table shows the end results of five 50-line tests
    • PercNgCorrect = NoNgCorrect/NoNgPredicted
  • The second table shows the overall results for all 20 test runs
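
As a worked example of the ratio above, the sketch below computes PercNgCorrect per n-gram model. It assumes that NoNgPredicted counts the test cases where a model returned any word and NoNgCorrect counts the cases where that word matched the actual next word; the function name and the sample results are made up for illustration and are not my test data.

```python
from collections import defaultdict

def perc_ng_correct(results):
    """results: iterable of (n, predicted, actual); predicted is None if the
    n-gram model returned nothing. Returns {n: PercNgCorrect or None}."""
    predicted_counts = defaultdict(int)  # NoNgPredicted per model
    correct_counts = defaultdict(int)    # NoNgCorrect per model
    orders = set()
    for n, predicted, actual in results:
        orders.add(n)
        if predicted is None:
            continue
        predicted_counts[n] += 1
        if predicted == actual:
            correct_counts[n] += 1
    return {
        n: (correct_counts[n] / predicted_counts[n]) if predicted_counts[n] else None
        for n in sorted(orders)
    }

# Made-up results: (model order, predicted word, actual word).
sample_results = [
    (2, "home", "home"),
    (2, "out", "home"),
    (2, None, "now"),
    (3, "home", "home"),
    (3, None, "talk"),
]
scores = perc_ng_correct(sample_results)
print(scores)
```

Dividing by the number of predictions made, rather than by the number of test cases, means a model is not penalized for inputs it could not match at all.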

Shiny App

  • Seeing the different words predicted by each model during testing was interesting to me
  • Thinking others might find it interesting as well, I decided to have my app return the same information
  • My Shiny App will show you what each n-gram model predicts based on the text you submit
  • Give it a try and see which n-gram model looks the most accurate to you
  • My Shiny App