Markov Babbler Web App

An exercise in generating arbitrary strings of text.

Anand Abraham

The App

This app takes source text and uses it to construct arbitrary phrases based on which words follow one another in that text. Given a sufficiently large corpus, it can imitate a person's writing style. The generated utterances are rarely grammatical, but occasionally they make a bit of sense.

Users can specify the length of the generated phrases and the complexity of the model, that is, the size of the n-grams it is built from. Higher-complexity models need more source material to avoid repetitive output, and their output generally reproduces the source more closely.
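To illustrate why larger n-grams stay closer to the source, here is a rough base-R sketch. The helper next_word_table and the toy sentence are made up for this illustration and are not part of the app or of any package. It lists, for every run of n consecutive words, the distinct words that follow it. As n grows, most runs have only one possible follower, so a generator has little choice but to copy the source.

```r
# Hypothetical helper for illustration only: for every run of n consecutive
# words (a "state"), collect the distinct words seen immediately after it.
next_word_table <- function(text, n = 2) {
  words <- strsplit(trimws(text), "\\s+")[[1]]
  stopifnot(length(words) > n)
  starts <- seq_len(length(words) - n)
  # Paste each run of n words into a single string key.
  states <- vapply(
    starts,
    function(i) paste(words[i:(i + n - 1)], collapse = " "),
    character(1)
  )
  followers <- words[starts + n]
  lapply(split(followers, states), unique)
}

toy <- "the cat sat on the mat and the cat ran"

tab1 <- next_word_table(toy, n = 1)
tab2 <- next_word_table(toy, n = 2)
tab3 <- next_word_table(toy, n = 3)

tab1[["the"]]      # "cat" "mat"
tab2[["the cat"]]  # "sat" "ran"
tab2[["on the"]]   # "mat"

# Average number of distinct followers per state; it shrinks toward 1 as n grows.
sapply(list(tab1, tab2, tab3), function(tab) mean(lengths(tab)))
```

With a text this small, every three-word run already has exactly one follower, which is why larger models need a larger corpus to produce anything new.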

How to Use it

To use the app, paste a body of text into the "Source Text" box. The more text you give the generator to train on, the better the result will be. You can adjust the length of the result and the size of the n-grams used. The default n-gram size is 2, but you can raise or lower it.

App Image

How it Works

This app uses the "ngram" package, available on CRAN. The package breaks long strings of text into n-grams and also includes a babbler that generates arbitrary strings of text from them. The following example uses ngram to generate some arbitrary text from the King James Bible.

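library(ngram)
set.seed(40)
# Read the whole file into a single character string
kjv <- readChar("kjv.txt", file.info("kjv.txt")$size)
# Build a model from the text using 2-grams
biblegrams <- ngram(kjv, n = 2)
# Generate 10 words of babble
babble(biblegrams, 10)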
## [1] "fly into the inner court of fine linen of woven "
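For readers curious about the underlying idea, the following is a minimal base-R sketch of a Markov-chain babbler. It is an illustration under simple assumptions (whitespace tokenization, a random restart at dead ends) and is not how the ngram package is implemented. The function toy_babble and the toy sentence are hypothetical names introduced here.

```r
# Hypothetical base-R babbler, for illustration only.
toy_babble <- function(text, n = 2, genlen = 10) {
  words <- strsplit(trimws(text), "\\s+")[[1]]
  stopifnot(length(words) > n, genlen >= n)
  starts <- seq_len(length(words) - n)
  # Each state is a run of n consecutive words pasted into one string.
  states <- vapply(
    starts,
    function(i) paste(words[i:(i + n - 1)], collapse = " "),
    character(1)
  )
  # Map each state to every word that followed it (repeats are kept).
  followers <- split(words[starts + n], states)

  # Pick a random state that has at least one follower.
  random_state <- function() {
    strsplit(sample(names(followers), 1), " ", fixed = TRUE)[[1]]
  }

  out <- random_state()
  while (length(out) < genlen) {
    state <- paste(tail(out, n), collapse = " ")
    candidates <- followers[[state]]
    if (is.null(candidates)) {
      # Dead end: this state only occurs at the very end of the source.
      out <- c(out, random_state())
    } else {
      # Repeats in candidates make frequent followers more likely.
      out <- c(out, candidates[sample.int(length(candidates), 1)])
    }
  }
  paste(head(out, genlen), collapse = " ")
}

toy <- "the cat sat on the mat and the cat ran"
set.seed(1)
toy_babble(toy, n = 2, genlen = 8)
```

Because the current state is always the last n words generated, each new word depends only on that state, which is what makes this a Markov chain.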

Possible Uses

This app can be used to generate filler text, similar to Lorem Ipsum. It can also be used to emulate the style of a particular writer, generating sentences that sound like something that writer could have written, even if they are not grammatically correct.

The app is also a great tool for teaching people about Markov-chain text generation, showing how such a simple technique can produce remarkable results.

Most importantly, the app can be used for fun. Experimenting with different corpora to generate text is enjoyable in its own right, and that was part of the purpose in creating the app.