August 2024

Slide 1: Introduction

Predictive Text Algorithm and Application

  • Objective: Build a predictive text application using a statistical language model.
  • Key Features:
    • Predicts the next word based on user input.
    • Uses bigram and trigram frequencies to rank candidate next words.
    • Trained on lines sampled from large English text corpora: blogs, news, and tweets.

Slide 2: The Algorithm

How It Works

  • Data Collection:
    • Sampled lines from “en_US.blogs.txt”, “en_US.news.txt”, and “en_US.twitter.txt”.
  • N-gram Model:
    • Bigrams: Predict the next word based on the previous word.
    • Trigrams: Predict the next word based on the previous two words.
  • Prediction Mechanism:
    • If input has 1 word, use bigrams.
    • If input has 2+ words, use trigrams on the last two words (fall back to bigrams on the last word if no trigram matches).
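
The steps above can be sketched in base R. This is a minimal illustration, not the app's actual code: the toy corpus stands in for the sampled blog, news, and Twitter lines, and the helper names (tokenize, make_ngrams, lookup, predict_next) are hypothetical.

```r
# Toy corpus standing in for the sampled corpus lines (assumption).
corpus <- c(
  "I love new york",
  "I love new shoes",
  "I love new york city",
  "they love old movies"
)

# Lowercase, replace anything that is not a letter, apostrophe, or space,
# then split on whitespace and drop empty tokens.
tokenize <- function(line) {
  clean <- gsub("[^a-z' ]", " ", tolower(line))
  words <- strsplit(clean, "\\s+")[[1]]
  words[nzchar(words)]
}

# Collect every run of n consecutive words in each line as one string.
make_ngrams <- function(lines, n) {
  grams <- lapply(lines, function(line) {
    w <- tokenize(line)
    if (length(w) < n) return(character(0))
    starts <- seq_len(length(w) - n + 1)
    vapply(starts, function(i) paste(w[i:(i + n - 1)], collapse = " "),
           character(1))
  })
  unlist(grams)
}

# Frequency tables for bigrams and trigrams.
bigrams  <- table(make_ngrams(corpus, 2))
trigrams <- table(make_ngrams(corpus, 3))

# Most frequent continuation of `prefix` in an n-gram table, or NA if none.
lookup <- function(counts, prefix) {
  key  <- paste0(prefix, " ")
  hits <- counts[startsWith(names(counts), key)]
  if (length(hits) == 0) return(NA_character_)
  best <- names(hits)[which.max(hits)]
  substring(best, nchar(key) + 1)
}

# Trigrams on the last two words first, then bigrams on the last word.
predict_next <- function(input) {
  w <- tokenize(input)
  n <- length(w)
  if (n == 0) return(NA_character_)
  if (n >= 2) {
    guess <- lookup(trigrams, paste(w[n - 1], w[n]))
    if (!is.na(guess)) return(guess)
  }
  lookup(bigrams, w[n])
}

predict_next("I")          # "love" (bigram)
predict_next("love new")   # "york" (trigram)
predict_next("brand new")  # "york" (no trigram match, bigram fallback)
```

Scanning the whole table on every call is fine for a toy corpus; a full-size model would more likely precompute one best continuation per context.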

Slide 3: The Shiny App

Key Features of the App

  • User Interface:
    • Simple text input field for entering phrases.
    • Displays predicted next word(s) based on input.
  • Backend:
    • Loads pre-trained bigrams and trigrams from .rds files.
    • Matches user input against the preloaded tables to return predictions in real time.
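
A rough sketch of how such a backend might be wired together is shown below. The table layout (context and nextword columns), the file names, and the predict_word helper are assumptions made for illustration; the real .rds files may be structured differently. The Shiny part is only built when the shiny package is installed.

```r
# Hypothetical lookup tables: one row per context with its most likely next word.
bigram_tbl <- data.frame(
  context  = c("new", "love"),
  nextword = c("york", "new"),
  stringsAsFactors = FALSE
)
trigram_tbl <- data.frame(
  context  = "love new",
  nextword = "york",
  stringsAsFactors = FALSE
)

# Save once after training; file names and location are placeholders.
model_dir <- tempdir()
saveRDS(bigram_tbl,  file.path(model_dir, "bigrams.rds"))
saveRDS(trigram_tbl, file.path(model_dir, "trigrams.rds"))

# Load once at start-up, outside the server function, so the tables are
# read a single time rather than on every user session.
bigrams  <- readRDS(file.path(model_dir, "bigrams.rds"))
trigrams <- readRDS(file.path(model_dir, "trigrams.rds"))

# Trigram lookup on the last two words, then bigram lookup on the last word.
predict_word <- function(phrase) {
  w <- strsplit(trimws(tolower(phrase)), "\\s+")[[1]]
  w <- w[nzchar(w)]
  n <- length(w)
  if (n == 0) return("")
  if (n >= 2) {
    hit <- trigrams$nextword[trigrams$context == paste(w[n - 1], w[n])]
    if (length(hit) > 0) return(hit[1])
  }
  hit <- bigrams$nextword[bigrams$context == w[n]]
  if (length(hit) > 0) hit[1] else ""
}

# Minimal UI: one text input and one text output that updates as it changes.
if (requireNamespace("shiny", quietly = TRUE)) {
  ui <- shiny::fluidPage(
    shiny::textInput("phrase", "Enter a phrase:"),
    shiny::textOutput("prediction")
  )
  server <- function(input, output, session) {
    output$prediction <- shiny::renderText(predict_word(input$phrase))
  }
  app <- shiny::shinyApp(ui, server)  # launch with shiny::runApp(app)
}
```

Keeping predict_word as a plain function makes it possible to test the lookup logic without starting the app.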

Slide 4: Benefits and Applications

Why This Matters

  • Improves User Experience:
    • Suggests contextually relevant next words from the preceding one or two words.
  • Applications:
    • Can be used in text completion tools, chatbots, and writing aids.
  • Future Enhancements:
    • Extend the model to higher-order n-grams (beyond trigrams).
    • Incorporate user feedback for continuous improvement.

Slide 5: Conclusion and Next Steps

Summary and Future Directions

  • Summary:
    • Developed a predictive text app using bigrams and trigrams.
    • Deployed it as a Shiny app that returns predictions in real time.
  • Next Steps:
    • Gather user feedback and performance data.
    • Explore additional datasets and advanced algorithms for enhancement.
  • Call to Action:
    • Invest in further development and deployment to maximize impact.