Christopher Stewart
April 26, 2015
A Natural Language Processing Project
Capstone Project for Coursera Data Science Specialization
One approach used the largest (highest-order) n-gram match with the highest likelihood score, a version of Katz's back-off model.
The second combined the likelihood estimates of all matches and kept the match with the highest combined likelihood, a linear interpolation approach.
The two approaches showed similar in-sample accuracy rates, so Predict-A-Word uses the back-off approach for its computational efficiency.
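# Back off from the longest matching n-gram to the shortest.
lookup <- function(corpus.type, string) {
  # lookup.blogs() is assumed to create whichever *.target objects it can match
  if (corpus.type == "blogs") lookup.blogs(string)
  if (exists("tetra.target")) {
    target <<- tetra.target
  } else if (exists("tri.target")) {
    target <<- tri.target
  } else if (exists("bi.target")) {
    target <<- bi.target
  }
}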
[Example of the server-side function used to accomplish back-off]
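The difference between the two approaches can be seen on a toy example. What follows is a minimal sketch, not the Predict-A-Word implementation: the candidate table, the likelihood values, the interpolation weights, and the function names predict_backoff and predict_interpolation are all hypothetical and chosen only for illustration.

```r
# Hypothetical candidate matches for one input phrase. Each row is a
# predicted next word found in an n-gram table of the given order,
# with a made-up likelihood estimate.
candidates <- data.frame(
  word  = c("day", "time", "day", "time", "way"),
  order = c(4, 3, 3, 2, 2),
  prob  = c(0.20, 0.70, 0.30, 0.10, 0.40),
  stringsAsFactors = FALSE
)

# Back-off style: use only the highest-order n-gram that matched,
# then take its most likely word.
predict_backoff <- function(cands) {
  top <- cands[cands$order == max(cands$order), ]
  top$word[which.max(top$prob)]
}

# Linear interpolation: weight each estimate by its n-gram order,
# sum the weighted estimates per word, and keep the highest total.
predict_interpolation <- function(cands, weights) {
  w <- weights[as.character(cands$order)]
  combined <- tapply(w * cands$prob, cands$word, sum)
  names(combined)[which.max(combined)]
}

# Assumed weights that sum to 1; real weights would be tuned on held-out data.
lambda <- c("2" = 0.2, "3" = 0.3, "4" = 0.5)

backoff_word <- predict_backoff(candidates)
interp_word  <- predict_interpolation(candidates, lambda)
```

With these made-up numbers the two methods disagree: back-off looks only at the single 4-gram match, while interpolation also counts the strong lower-order evidence for another word. Back-off can stop at the first order that matches, which is consistent with the efficiency argument above.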