2024-12-15

How the Algorithm Works

  • Data Preparation: Cleaned and tokenized the text, then built n-gram tables (unigrams, bigrams, trigrams).
  • Prediction Model:
    • Uses the last one or two words of the input as context (a trigram covers two context words plus the predicted word).
    • Searches for the highest-probability n-gram match.
  • Handling Unseen Words:
    • Backoff model falls back to a shorter n-gram when the longer one was not seen in training.
    • Ensures a prediction is returned even for rare or unseen word combinations.
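
The steps above can be sketched in a few lines. This is a minimal illustration in Python, not the app's actual code: the function names (tokenize, build_ngram_tables, predict_next), the toy corpus, and the cleaning rule are all assumptions, and the backoff shown is the simplest count-based variant with no discounting, which may differ from the backoff scheme the app uses.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep only runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def build_ngram_tables(sentences, max_n=3):
    """Map each context tuple (0 to max_n - 1 words) to a Counter of next words."""
    tables = defaultdict(Counter)
    for sentence in sentences:
        tokens = tokenize(sentence)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                context = tuple(tokens[i:i + n - 1])
                tables[context][tokens[i + n - 1]] += 1
    return tables


def predict_next(tables, phrase, k=3, max_n=3):
    """Return up to k candidate words, backing off to shorter contexts."""
    tokens = tokenize(phrase)
    # Try the longest available context first, then shorten it one word at a time.
    for size in range(min(max_n - 1, len(tokens)), -1, -1):
        context = tuple(tokens[len(tokens) - size:])
        if context in tables:
            return [word for word, _ in tables[context].most_common(k)]
    return []


# Toy corpus, for illustration only.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
tables = build_ngram_tables(corpus)

print(predict_next(tables, "sat on"))          # trigram context is found
print(predict_next(tables, "purple sat"))      # backs off to the bigram context
print(predict_next(tables, "zebra", k=1))      # backs off to unigram counts
```

Storing every n-gram order in one table keyed by context keeps the backoff loop short: an empty context tuple holds the unigram counts, so the last fallback is simply the most frequent word overall.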

App Description

  • Purpose: Predict the next word based on user input.
  • Features:
    • Input box for typing phrases.
    • Real-time next-word predictions displayed below.
  • Usage:
    1. Type a partial sentence into the input field.
    2. See predicted words ranked by likelihood.

Results and Performance

  • Accuracy:
    • 1st word: 80%
    • 2nd word: 70%
    • 3rd word: 60%
  • Efficiency:
    • Average response time: 0.2 seconds.
    • Model size: 5MB.
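
The slide does not state how these figures were measured. One common way to obtain numbers of this kind is to run the predictor over held-out (phrase, next word) pairs and record both the hit rate and the time per call. The sketch below assumes that setup; evaluate, toy_predict, and the test cases are hypothetical stand-ins, not the app's evaluation code or data.

```python
import time


def evaluate(predict, test_cases, k=3):
    """Return (top-k accuracy, mean seconds per prediction) for a predictor."""
    if not test_cases:
        raise ValueError("test_cases must not be empty")
    hits = 0
    total_seconds = 0.0
    for phrase, expected in test_cases:
        start = time.perf_counter()
        candidates = predict(phrase)[:k]
        total_seconds += time.perf_counter() - start
        if expected in candidates:
            hits += 1
    return hits / len(test_cases), total_seconds / len(test_cases)


def toy_predict(phrase):
    """Hypothetical stand-in for the real model: a fixed lookup on the last word."""
    table = {
        "good": ["morning", "night", "luck"],
        "thank": ["you", "goodness"],
    }
    words = phrase.split()
    return table.get(words[-1], []) if words else []


# Made-up held-out pairs, for illustration only.
cases = [("good", "morning"), ("good", "luck"), ("thank", "them"), ("thank", "you")]

top1, _ = evaluate(toy_predict, cases, k=1)
top3, mean_seconds = evaluate(toy_predict, cases, k=3)
print(f"top-1 accuracy: {top1:.2f}, top-3 accuracy: {top3:.2f}")
print(f"mean response time: {mean_seconds:.6f} s")
```

Timing only the call to the predictor keeps the response-time figure separate from the cost of scoring the result.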

Conclusion and Future Work

  • Summary:
    • Developed a Shiny app for real-time word prediction.
    • Achieved 60% to 80% prediction accuracy with a 0.2-second average response time and a 5MB model.
  • Future Improvements:
    • Train on larger datasets for better predictions.
    • Add support for additional languages.
  • Live App: Shiny App Link