Assignment: Final Project Submission

26/09/2024

Introduction

Title: Word Prediction App.
Objective: To present an innovative app for predicting the next word based on user input.
Target Audience: Businesses and researchers looking for advanced text prediction solutions.


Features:
- User-friendly interface with a text input box.
- Predicts the next word when the user clicks “Predict Next Word”.
- Displays predictions based on an extensive dictionary.

Demo:
- Live demo of the app.
- Examples of word predictions in action.

Shiny apps are contained in a single script called app.R. The script app.R lives in a directory (for example, newdir/), and the app can be run with runApp("newdir").
app.R has three components:
- a user interface object,
- a server function,
- a call to the shinyApp function.
The user interface (ui) object controls the layout and appearance of your app. The server function contains the instructions that your computer needs to build your app. Finally, the shinyApp function creates a Shiny app object from an explicit UI/server pair.
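The three components can be sketched in a minimal app.R as follows. This is an illustrative skeleton rather than the project's actual app: the output ID `echo` and the echoed message are assumptions made for the example, and only the shiny package is needed.

```r
library(shiny)

# 1. User interface object: controls layout and appearance.
ui <- fluidPage(
  titlePanel("Predict Next Word"),
  textInput("inputString", "Enter a partial sentence here"),
  textOutput("echo")  # hypothetical output ID, used only in this sketch
)

# 2. Server function: builds the outputs from the inputs.
server <- function(input, output) {
  output$echo <- renderText(paste("You typed:", input$inputString))
}

# 3. Call to shinyApp: combines the UI/server pair into an app object.
# The object is assigned here so that sourcing the file does not launch the
# app; in a real app.R this call would be the last expression in the file.
app <- shinyApp(ui = ui, server = server)
```

With the file saved as newdir/app.R, the app would be started with runApp("newdir"); in an interactive session, runApp(app) starts the object built above.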

Install the necessary packages and load the libraries.
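A small check like the one below could be run first to see which packages are missing. The package list is inferred from the functions used in this document (shiny for the interface, tm for TermDocumentMatrix, RWeka for NGramTokenizer and Weka_control); it is not a list given by the author, and the helper name `missingPackages` is made up for this sketch. Note that RWeka also requires a working Java installation.

```r
# Packages the code in this document appears to rely on (an inference from
# the function names, not an official requirements list).
required <- c("shiny", "tm", "RWeka")

# Return the subset of `pkgs` that cannot be loaded in this R installation.
missingPackages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

toInstall <- missingPackages(required)
if (length(toInstall) > 0) {
  message("Missing packages: ", paste(toInstall, collapse = ", "))
  # install.packages(toInstall)  # uncomment to install from CRAN
}
```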
Sample code (user interface):

    library(shiny)

    shinyUI(fluidPage(
      # Application title
      titlePanel("Predict Next Word"),
      h6("it may take several seconds"),
      # Sidebar with a slider input for the n-gram order
      sidebarLayout(
        sidebarPanel(
          sliderInput(inputId = "Ngram", label = "Select N for Ngram:",
                      min = 1, max = 20, value = 3, step = 1),
          textInput("inputString", "Enter a partial sentence here", value = ''),
          submitButton("Submit", icon("refresh"))),
        mainPanel(
          h2("Predicted Next Word"),
          strong("Sentence Input:"),
          tags$style(type = 'text/css'),
          textOutput('text1'),
          strong("Sentences with Next Word Candidates:"),
          textOutput("prediction"),
          strong("Note:"),
          tags$style(type = 'text/css'),
          textOutput('text2')))))
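The interface above declares two inputs (Ngram, inputString) and three text outputs (text1, prediction, text2), so a matching server function has to fill those three outputs. The sketch below shows one possible wiring. The server logic is not given in this document, so `predictNextWord` is a hypothetical placeholder that returns fixed words, and the note text is invented for illustration.

```r
library(shiny)

# Hypothetical placeholder so the sketch runs on its own. A real version
# would look up the last n - 1 words of the sentence in the n-gram tables.
predictNextWord <- function(sentence, n) {
  if (!nzchar(trimws(sentence))) return(character(0))
  c("the", "and", "to")  # fixed dummy candidates, not model output
}

server <- function(input, output, session) {
  # Echo the sentence the user typed.
  output$text1 <- renderText(input$inputString)

  # Show the sentence completed with each candidate word.
  output$prediction <- renderText({
    candidates <- predictNextWord(input$inputString, input$Ngram)
    if (length(candidates) == 0) return("No prediction available.")
    paste(trimws(input$inputString), candidates, collapse = " | ")
  })

  # Static note shown under the predictions (wording is an assumption).
  output$text2 <- renderText(
    "Candidates in this sketch come from a placeholder, not a trained model."
  )
}
```

Because the interface uses submitButton, the outputs would refresh only after the user clicks Submit, which suits a prediction step that may take several seconds.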

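    library(tm)     # TermDocumentMatrix
    library(RWeka)  # NGramTokenizer, Weka_control

    # Load the preprocessed corpus; readRDS expects a file written by saveRDS()
    corpus <- readRDS('corpus.RData')

    # Build one n-gram frequency table per order 2..n and store them in the
    # global list BackoffModel, keyed "2 grams", "3 grams", ...
    BackoffModels <- function(n) {
      BackoffModel <<- list()
      if (n < 2) return(invisible(NULL))  # 2:n would count down for n < 2
      for (i in 2:n) {
        BackoffModel[[paste(i, "grams")]] <<- createNgrams(corpus, i)
      }
    }

    # Return a data frame of n-grams and their counts, most frequent first
    createNgrams <- function(text, n) {
      ngram <- function(x) NGramTokenizer(x, Weka_control(min = n, max = n))
      ngrams <- TermDocumentMatrix(text, control = list(tokenize = ngram))
      ngrams_freq <- rowSums(as.matrix(ngrams))
      ngrams_freq <- sort(ngrams_freq, decreasing = TRUE)
      ngrams_freq_df <- data.frame(word = names(ngrams_freq), freq = ngrams_freq)
      ngrams_freq_df
    }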
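The lookup step that turns these frequency tables into a prediction is not shown here. A simple backoff lookup could look like the sketch below: it tries the highest available n-gram order first and falls back to shorter contexts until it finds a match. The function name `predictBackoff`, the `top` parameter, the lowercasing of the input, and the hand-written toy tables are all assumptions for illustration; the tables only mimic the word/freq layout returned by createNgrams and the "2 grams", "3 grams" keys used by BackoffModels.

```r
# Toy tables in the same shape as createNgrams() output (made-up counts).
toyModels <- list(
  "2 grams" = data.frame(word = c("want to", "to go", "want a"),
                         freq = c(5, 4, 2), stringsAsFactors = FALSE),
  "3 grams" = data.frame(word = c("i want to", "want to go", "want to eat"),
                         freq = c(3, 2, 1), stringsAsFactors = FALSE)
)

# Return up to `top` candidate next words, backing off from order n down to 2.
predictBackoff <- function(sentence, models, n, top = 3) {
  words <- strsplit(tolower(trimws(sentence)), "\\s+")[[1]]
  words <- words[nzchar(words)]
  # The usable order is limited by how many context words were typed.
  kMax <- min(n, length(words) + 1)
  if (kMax < 2) return(character(0))
  for (k in kMax:2) {
    tbl <- models[[paste(k, "grams")]]
    if (is.null(tbl)) next
    # Context = last k - 1 words; match n-grams that start with it.
    prefix <- paste0(paste(tail(words, k - 1), collapse = " "), " ")
    hits <- tbl[startsWith(as.character(tbl$word), prefix), ]
    if (nrow(hits) > 0) {
      hits <- hits[order(-hits$freq), ]
      # Keep only the final word of each matching n-gram.
      return(head(sub(".* ", "", as.character(hits$word)), top))
    }
  }
  character(0)
}

predictBackoff("I want", toyModels, n = 3)     # matched in the 3-gram table
predictBackoff("they want", toyModels, n = 3)  # backs off to the 2-gram table
```

In the app, `models` would be the BackoffModel list built by BackoffModels(n), and `n` would come from the Ngram slider.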