This is a simple example of using the R package knitr for writing a report with embedded Bash, Python, Groovy, and Scala code. It’s not my work, but rather a minor adaptation of Yihui Xie’s polyglot R Markdown example. I’ve added some context for readers who are not familiar with R and inserted a Groovy section.
Why is this relevant? Because we want to be able to write web-based, well-formatted documents (such as bug reports, knowledge base articles, quick starts, and instructions) that contain chunks of executable code. The goal is to create reproducible artifacts: results are accompanied by the data and code needed to produce them.
For data scientists, reproducible research is a major concern. A common practice among users of R, the field's lingua franca, is to write reports in R Markdown, a variant of Markdown developed by RStudio that embeds chunks of R code. The knitr package is an engine for dynamic report generation with R.
However, knitr is not limited to R. Chunks can be run by other language engines as well, as long as the corresponding runtime is installed.
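To reproduce this example:
* Install R, for instance with `yum install R`.
* Install the knitr package from within R: `install.packages("knitr")`.
* Render the document with the knit2html function: `R -e 'library(knitr);knit2html("readme.Rmd")'`

Here is an example of using knitr with multiple languages to produce an HTML page from Markdown.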
toTest <- c("R", "bash", "python", "scala", "groovy")
where <- Sys.which(toTest)
exists <- nchar(where) > 0 # TODO: Only run chunk if runtime exists
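# Print one Markdown bullet per runtime, with its path or a placeholder
for(n in names(where)) {
  path <- where[n]
  if(nchar(path) <= 0) {
    path <- "<not found>"
  }
  # message() would emit each line a second time, so only cat() is used
  # message("* __", n, "__: `", path, "`\n")
  cat(paste("* __", n, "__: `", path, "`\n", sep=""))
}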
## * __R__: `/usr/bin/R`
## * __bash__: `/usr/bin/bash`
## * __python__: `/usr/bin/python`
## * __scala__: `/usr/bin/scala`
## * __groovy__: `/home/gerjantd/.gvm/groovy/current/bin/groovy`
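For comparison, the same runtime check can be sketched in Python, one of the other engines used here. This is only an illustration, assuming Python 3: the helper names `find_runtimes` and `format_report` are made up for this sketch and are not part of knitr or of the original example.

```python
import shutil


def find_runtimes(names):
    """Map each runtime name to its executable path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def format_report(found):
    """Build one Markdown bullet per runtime, mirroring the R chunk's output format."""
    return [
        "* __{}__: `{}`".format(name, path if path else "<not found>")
        for name, path in found.items()
    ]


if __name__ == "__main__":
    to_test = ["R", "bash", "python", "scala", "groovy"]
    for line in format_report(find_runtimes(to_test)):
        print(line)
```

A dictionary like the one returned by `find_runtimes` is the kind of lookup the TODO in the R chunk hints at: a chunk could be skipped when its runtime maps to nothing.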
Pass the string to be transformed to each engine's subprocess via an environment variable.
Sys.setenv(SOMETHING = "something")
Sys.setenv(SOME_URL = "http://www.redhat.com")
something <- Sys.getenv("SOMETHING")
somethingelse <- paste(something, "+ R")
cat(paste("'", something, "' is now '", somethingelse, "'", sep=""))
R> 'something' is now 'something + R'
something=$SOMETHING
somethingelse="$something + Bash"
echo "'$something' is now '$somethingelse'"
Bash> 'something' is now 'something + Bash'
import os
something = os.getenv("SOMETHING")
somethingelse = something + " + Python"
print("'" + something + "' is now '" + somethingelse + "'")
Python> 'something' is now 'something + Python'
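The chunks above all rely on the same mechanism: a parent process sets an environment variable, and each engine subprocess inherits and reads it. The sketch below reproduces that mechanism in Python alone, assuming Python 3.7 or later. The function name `run_child_with_env` and the " + child" suffix are invented for this illustration; knitr's own engine code is not shown here.

```python
import os
import subprocess
import sys


def run_child_with_env(value):
    """Start a child Python process that reads SOMETHING from its environment."""
    # Copy the parent's environment and add the variable, much like Sys.setenv in R.
    env = dict(os.environ, SOMETHING=value)
    # The child only sees the value through its environment, not through arguments.
    child_code = (
        "import os; s = os.getenv('SOMETHING'); "
        "print(\"'\" + s + \"' is now '\" + s + \" + child'\")"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_child_with_env("something"))
```

Environment variables only carry strings, which is enough for this example but would require serialization for richer data.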
Running small fragments without caching can take some time, as the Scala compiler has to launch and compile the script to JVM bytecode. The -savecompiled option (passed via engine.opts) will result in Scala caching the compiled script outside of knitr.
val something = System.getenv("SOMETHING")
val somethingelse = something + " + Scala"
println("'" + something + "' is now '" + somethingelse + "'")

// Try something slightly more substantial
def abs(x: Double) = if (x < 0) -x else x

def sqrt(x: Double) = {
  def sqrtIter(guess: Double): Double =
    if (isGoodEnough(guess)) guess
    else sqrtIter(improve(guess))

  def isGoodEnough(guess: Double) =
    abs(guess * guess - x) / x < 0.001

  def improve(guess: Double) =
    (guess + x / guess) / 2

  sqrtIter(1.0)
}

println(sqrt(2))
println(sqrt(9))
println(sqrt(1e-6))
println(sqrt(1e60))
Scala> 'something' is now 'something + Scala'
Scala> 1.4142156862745097
Scala> 3.00009155413138
Scala> 0.0010000001533016628
Scala> 1.0000788456669446E30
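The square root routine in the Scala chunk is Newton's method with a relative-error stopping test: dividing by x is what lets it cope with both 1e-6 and 1e60. As a rough cross-check, here is the same iteration written for Python 3. The name `newton_sqrt` and the `tolerance` parameter are my own choices for this sketch; it assumes x > 0, as the Scala version implicitly does.

```python
def newton_sqrt(x, tolerance=0.001):
    """Approximate sqrt(x) for x > 0 using Newton's method.

    Iteration stops once the relative error |guess^2 - x| / x drops
    below the tolerance, the same test as isGoodEnough in the Scala chunk.
    """
    guess = 1.0
    while abs(guess * guess - x) / x >= tolerance:
        # Newton step: average the guess with x / guess.
        guess = (guess + x / guess) / 2
    return guess


if __name__ == "__main__":
    for value in (2, 9, 1e-6, 1e60):
        print(newton_sqrt(value))
```

The loop replaces the tail-recursive `sqrtIter`, since Python does not optimize tail calls.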
The same caveat applies to running Groovy without caching: the Groovy compiler has to launch and compile the script to JVM bytecode on every run. There may be flags that reduce this overhead and could be passed via engine.opts, but I have not looked into them.
def NEW_LINE = System.getProperty("line.separator")
def address = System.getenv('SOME_URL')
def urlInfo = address.toURL()
println "===================================================================="
println "====== HEADER FIELDS FOR URL ${address}"
println "===================================================================="
def connection = urlInfo.openConnection()
headerFields = connection.getHeaderFields()
headerFields.each {println it}
Groovy> ====================================================================
Groovy> ====== HEADER FIELDS FOR URL http://www.redhat.com
Groovy> ====================================================================
Groovy> Transfer-Encoding=[chunked]
Groovy> null=[HTTP/1.1 200 OK]
Groovy> Server=[Apache]
Groovy> Connection=[Transfer-Encoding, keep-alive]
Groovy> Pragma=[no-cache]
Groovy> Last-Modified=[Tue, 30 Dec 2014 02:42:59 GMT]
Groovy> Date=[Wed, 31 Dec 2014 11:28:24 GMT]
Groovy> Cache-Control=[max-age=0, no-cache]
Groovy> ETag=["1419907379-1"]
Groovy> X-RedHat-Debug=[1]
Groovy> X-Drupal-Cache=[HIT]
Groovy> Set-Cookie=[AWSELB=4BFBD9AD0222295D2AB5799A247D1287032E7BB80B6043FD5041E7DAFE18C3328B86CDC7B77F21F72EFA1876C97427DAFB49805183581DA5A2A28E0C969DACD2572B74DAC9;PATH=/;MAX-AGE=30, AWSELB=7DE7FB19045D425DE69229FBB7F229663FD24433133B54DCEE77F98CE21EFB3E8FE1FE07F6F24C3F30EB76C3348446159423E48632EA549B25C21E60E2A7DB38A1480E4925;PATH=/;MAX-AGE=30, WL_DCID=origin-www-c; expires=Wed, 31-Dec-2014 19:28:24 GMT; path=/]
Groovy> Expires=[Wed, 31 Dec 2014 11:28:24 GMT]
Groovy> Link=[<http://www.redhat.com/en>; rel="canonical"]
Groovy> Content-Language=[en]
Groovy> X-Powered-By=[PHP/5.3.3]
Groovy> Content-Type=[text/html; charset=utf-8]