June 2025
Slide 1: Introduction
Slide 2: The Algorithm
- N-gram Model: 3-grams with a 2-gram back-off (see the R sketch after this list)
- Data Processing:
  - Used the full Sherlock Holmes text (~0.6 MB)
  - Cleaning: lowercased the text; removed punctuation, numbers, and the Project Gutenberg header/footer
  - Robust handling of empty lines and other corpus irregularities
- Why It Works: fast, lightweight, and well suited to narrative text
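A minimal R sketch of this kind of 3-gram model with 2-gram back-off, following the cleaning steps listed above. The function names (clean_text, build_ngrams, predict_next) and the corpus_lines variable are illustrative assumptions, not taken from the app's actual source.

```r
# Clean raw text: lowercase, strip punctuation/numbers, collapse whitespace
clean_text <- function(lines) {
  txt <- tolower(paste(lines, collapse = " "))
  txt <- gsub("[^a-z ]", " ", txt)   # drop punctuation and numbers
  txt <- gsub("\\s+", " ", txt)      # collapse repeated whitespace
  trimws(txt)
}

# Count n-grams as a named table: "w1 w2 ... wn" -> frequency
build_ngrams <- function(words, n) {
  idx <- seq_len(length(words) - n + 1)
  grams <- vapply(idx, function(i) paste(words[i:(i + n - 1)], collapse = " "),
                  character(1))
  table(grams)
}

# Predict the next word: try the 3-gram table, then back off to 2-grams
predict_next <- function(phrase, tri, bi) {
  words <- strsplit(clean_text(phrase), " ")[[1]]
  if (length(words) >= 2) {
    ctx <- paste(tail(words, 2), collapse = " ")
    hits <- tri[startsWith(names(tri), paste0(ctx, " "))]
    if (length(hits) > 0)
      return(sub(".* ", "", names(hits)[which.max(hits)]))
  }
  ctx <- tail(words, 1)
  hits <- bi[startsWith(names(bi), paste0(ctx, " "))]
  if (length(hits) > 0)
    return(sub(".* ", "", names(hits)[which.max(hits)]))
  NA_character_
}

# Example usage (assumes corpus_lines holds the cleaned Sherlock Holmes text):
# words <- strsplit(clean_text(corpus_lines), " ")[[1]]
# tri <- build_ngrams(words, 3); bi <- build_ngrams(words, 2)
# predict_next("it was a", tri, bi)
```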
Slide 3: App Functionality
Slide 4: User Experience
- Ease of Use: type a phrase, click, and see the prediction (a minimal Shiny sketch follows this list)
- Testing: predicted the next word for 5 phrases:
  - “it was a” → “very”
  - “he said” → “that”
  - “the door” → “was”
  - “i have” → “a”
  - “in the” → “room”
- Feedback: Intuitive, reliable for narrative-style phrases
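A hypothetical Shiny sketch of that type-click-predict flow. It assumes a predict_next() function and tri/bi n-gram tables like those sketched under Slide 2; the widget IDs and labels are illustrative, not the app's actual code.

```r
library(shiny)

ui <- fluidPage(
  titlePanel("Next-Word Prediction"),
  textInput("phrase", "Enter a phrase:", value = "it was a"),
  actionButton("go", "Predict"),
  verbatimTextOutput("prediction")
)

server <- function(input, output) {
  output$prediction <- renderText({
    input$go                                  # re-run when the button is clicked
    isolate(predict_next(input$phrase, tri, bi))
  })
}

shinyApp(ui = ui, server = server)
```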
Slide 5: Why Hire Me?
- Novelty: Compact N-gram model optimized for small datasets
- Skills: R, Shiny, NLP, data preprocessing
- Impact: Ready for integration into real-world applications
- Hire Me: I bring technical expertise and innovation to your data science startup!