August 2024
Slide 1: Introduction
Predictive Text Algorithm and Application
- Objective: Build a predictive text application using a statistical language model.
- Key Features:
- Predicts the next word based on user input.
- Uses bigram and trigram frequencies to pick the most likely next word.
- Trained on large text corpora including blogs, news, and tweets.
Slide 2: The Algorithm
How It Works
- Data Collection:
- Sampled lines from “en_US.blogs.txt”, “en_US.news.txt”, and “en_US.twitter.txt”.
- N-gram Model:
- Bigrams: Predict the next word based on the previous word.
- Trigrams: Predict the next word based on the previous two words.
- Prediction Mechanism:
- If input has 1 word, use bigrams.
- If input has 2+ words, use trigrams on the last two words (fall back to bigrams if there is no match).
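The steps on this slide can be sketched in a few lines. This is a minimal illustration in Python, not the app's own code (the app is built with Shiny). The helper names (sample_lines, tokenize, build_tables, predict_next), the tokenization rule, and the four-line toy corpus are all assumptions made for the example.

```python
import random
import re
from collections import Counter, defaultdict


def sample_lines(lines, fraction, seed=0):
    """Keep a random fraction of the corpus lines (the fraction is an assumed knob)."""
    rng = random.Random(seed)
    return [line for line in lines if rng.random() < fraction]


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())


def build_tables(lines):
    """Count which word follows each one-word and two-word context."""
    bigrams = defaultdict(Counter)
    trigrams = defaultdict(Counter)
    for line in lines:
        words = tokenize(line)
        for prev, nxt in zip(words, words[1:]):
            bigrams[prev][nxt] += 1
        for w1, w2, nxt in zip(words, words[1:], words[2:]):
            trigrams[(w1, w2)][nxt] += 1
    return bigrams, trigrams


def predict_next(text, bigrams, trigrams):
    """Return the most frequent next word, or None if no context matches."""
    words = tokenize(text)
    # Two or more words: try the trigram table on the last two words first.
    if len(words) >= 2:
        followers = trigrams.get((words[-2], words[-1]))
        if followers:
            return followers.most_common(1)[0][0]
    # One word, or no trigram match: fall back to the bigram table.
    if words:
        followers = bigrams.get(words[-1])
        if followers:
            return followers.most_common(1)[0][0]
    return None


# Toy corpus, made up for illustration only.
corpus = [
    "thank you for the follow",
    "thank you for the update",
    "thank you for the follow",
    "see you for lunch",
]
bigrams, trigrams = build_tables(sample_lines(corpus, fraction=1.0))

print(predict_next("thank you", bigrams, trigrams))  # trigram match
print(predict_next("hello the", bigrams, trigrams))  # falls back to bigrams
print(predict_next("zebra", bigrams, trigrams))      # no match at all
```

The fallback matters because most two-word contexts never appear in a sampled corpus, while single-word contexts are far more likely to have been seen.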
Slide 3: The Shiny App
Key Features of the App
- User Interface:
- Simple text input field for entering phrases.
- Displays predicted next word(s) based on input.
- Backend:
- Loads pre-trained bigrams and trigrams from .rds files.
- Efficiently processes user input to provide real-time predictions.
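As a rough picture of the backend's request path, the sketch below cleans the typed text and looks it up in tables that are already in memory. It is written in Python for illustration only. The BIGRAMS and TRIGRAMS dictionaries are hypothetical stand-ins for the tables loaded from the .rds files, their contents are made up, and the function names and the default of three suggestions are assumptions.

```python
import re

# Hypothetical stand-ins for the pre-trained tables loaded at startup.
# Each context maps to candidate next words, already sorted by frequency.
BIGRAMS = {
    "you": ["are", "can", "have"],
    "for": ["the", "a"],
}
TRIGRAMS = {
    ("thank", "you"): ["for", "so", "very"],
}


def clean_input(raw):
    """Normalize raw text-box input into lowercase word tokens."""
    return re.findall(r"[a-z']+", raw.lower())


def suggest(raw, k=3):
    """Return up to k suggested next words for the text typed so far."""
    words = clean_input(raw)
    if len(words) >= 2:
        candidates = TRIGRAMS.get((words[-2], words[-1]))
        if candidates:
            return candidates[:k]
    if words:
        return BIGRAMS.get(words[-1], [])[:k]
    return []


print(suggest("Thank you"))
print(suggest("  Waiting for "))
print(suggest(""))
```

Because the tables are built ahead of time, each keystroke costs only a tokenization pass and one or two dictionary lookups, which is one plausible way to keep predictions responsive.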
Slide 4: Benefits and Applications
Why This Matters
- Improves User Experience:
- Provides accurate and contextually relevant word predictions.
- Applications:
- Can be used in text completion tools, chatbots, and writing aids.
- Future Enhancements:
- Extend the model beyond bigrams and trigrams, for example to higher-order n-grams.
- Incorporate user feedback for continuous improvement.
Slide 5: Conclusion and Next Steps
Summary and Future Directions
- Summary:
- Developed a predictive text app using bigrams and trigrams.
- Successfully deployed a Shiny app for real-time predictions.
- Next Steps:
- Gather user feedback and performance data.
- Explore additional datasets and advanced algorithms for enhancement.
- Call to Action:
- Invest in further development and deployment to maximize impact.