2024-04-08

Introduction

  • Brief Introduction to the Problem:
    • Predicting the next word in a sequence of text.
  • Importance:
    • Accurate next-word prediction supports widely used natural language processing applications such as mobile keyboards, autocomplete, and assistive writing tools.
  • Purpose:
    • Introduce our n-gram-based prediction algorithm and the Shiny app that demonstrates it.

Algorithm Description

  • Overview of the Algorithm Used:
    • We utilized n-gram models for next word prediction.
  • Explanation of N-gram Models:
    • An N-gram model counts sequences of N words and uses the preceding N−1 words to predict the next word.
    • Modern tokenization libraries (tokenizers, SnowballC) were used to create the n-grams, as shown below.
# Load the tokenization library
library(tokenizers)

# Read and preprocess the text data
text <- readLines("pg73352.txt")
clean_text <- tolower(text)                        # normalize case
clean_text <- gsub("[[:punct:]]", "", clean_text)  # strip punctuation
clean_text <- clean_text[clean_text != ""]         # drop empty lines

# Tokenize the text into unigrams, bigrams, and trigrams
unigrams <- unlist(tokenize_words(clean_text))
bigrams  <- unlist(tokenize_ngrams(clean_text, n = 2))
trigrams <- unlist(tokenize_ngrams(clean_text, n = 3))

# Build an n-gram model as a frequency table of observed n-grams
build_ngram_model <- function(ngram_data) {
  ngram_freq <- table(ngram_data)
  return(ngram_freq)
}
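
The frequency table returned by build_ngram_model can drive a simple lookup-based prediction. The sketch below is illustrative only: the predict_next_word helper and its matching strategy (last two words of the input against observed trigrams) are assumptions, not the app's exact code.

# Illustrative prediction sketch (assumed helper, not the exact app code):
# return the most frequent trigram that starts with the last two input words.
trigram_model <- build_ngram_model(trigrams)

predict_next_word <- function(phrase, model) {
  words <- unlist(strsplit(tolower(phrase), "\\s+"))
  context <- paste(tail(words, 2), collapse = " ")              # last two words
  candidates <- model[startsWith(names(model), paste0(context, " "))]
  if (length(candidates) == 0) return(NA_character_)            # no matching trigram
  best <- names(candidates)[which.max(candidates)]              # most frequent match
  tail(unlist(strsplit(best, " ")), 1)                          # its final word
}

predict_next_word("thank you very", trigram_model)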

App Description and Functionality

  • Datasets:
    • The plain-text corpus (pg73352.txt) used to build the n-gram model.
  • Overview of the Shiny App:
    • Our Shiny app allows users to input a phrase and receive a prediction for the next word (a minimal structural sketch follows this list).
  • Instructions:
    • Simply enter your phrase in the text box and click ‘Predict’.
  • Demonstration:
    • Screenshots or a GIF demonstrating how the app functions.
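
A minimal sketch of the app's structure, assuming a text input, a ‘Predict’ button, and the predict_next_word helper sketched above; the widget names and layout are illustrative, not the deployed app's exact code.

library(shiny)

ui <- fluidPage(
  titlePanel("Next Word Prediction"),
  textInput("phrase", "Enter your phrase:"),
  actionButton("predict", "Predict"),
  textOutput("prediction")
)

server <- function(input, output) {
  # Compute a prediction only when the 'Predict' button is clicked
  result <- eventReactive(input$predict, {
    predict_next_word(input$phrase, trigram_model)
  })
  output$prediction <- renderText(result())
}

shinyApp(ui = ui, server = server)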

User Experience and Conclusion

  • User Experience:
  • Conclusion:
    • Our app offers a user-friendly interface and responsive frequency-based predictions, providing a foundation for further text-analysis work.