Curley Lab Website



Background
RMarkdown is a great format for writing reports and sharing the results of data analysis with others. Recently I’ve been adding more user interactivity to my RMarkdown visualizations. One way to do this is to use Shiny, which is excellent. However, I often want to do things that Shiny does not offer and to have more control over the customization of my visualizations. In particular, I like to create D3.js visualizations, and I wanted to experiment with embedding an interactive D3.js visualization in an RMarkdown workflow.

Here I show how to work with data in RMarkdown and present it in an interactive D3.js visualization without leaving RStudio. We will use the iris dataset to create a scatterplot that can be filtered by group, so that we can see how the regression line through the points changes from group to group.

Here is how it looks in final form:





I will now go through how we got this to be embedded in the RMarkdown document.



R object → JSON object

In our RMarkdown document we have our iris dataset that looks like this:

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa


We can convert this to JSON format directly in the RMarkdown document using the toJSON() function from the jsonlite library. D3.js works very straightforwardly with JSON data. For the current visualization, we will look at the relationship between Sepal.Width and Sepal.Length by Species. This is the code that we will include in our R chunk:

library(jsonlite)
cat(
  paste(
  '<script>
    var data = ',toJSON(iris[c(5,1,2)]),';
  </script>'
  , sep="")
)

To ensure that the newly created JSON data are available to the JavaScript that is going to execute our visualization, make sure that results="asis" is in the header of the chunk above.

Every row of the iris data will now be stored as an object of key-value pairs and will be available in the background for our visualization. The first six rows look like this:

[
{"Species":"setosa","Sepal.Length":5.1,"Sepal.Width":3.5},{"Species":"setosa","Sepal.Length":4.9,"Sepal.Width":3},{"Species":"setosa","Sepal.Length":4.7,"Sepal.Width":3.2},{"Species":"setosa","Sepal.Length":4.6,"Sepal.Width":3.1},{"Species":"setosa","Sepal.Length":5,"Sepal.Width":3.6},{"Species":"setosa","Sepal.Length":5.4,"Sepal.Width":3.9}

]
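To give a feel for how the JavaScript side can consume this array, here is a minimal sketch that filters the rows by species and fits an ordinary least-squares line. It is not the code behind the visualization: the function name linearFit, the hard-coded stand-in for the data variable, and the choice of Sepal.Length as x and Sepal.Width as y are all assumptions made for illustration. Note that keys containing a dot have to be read with bracket notation, for example d["Sepal.Length"].

```javascript
// Hypothetical stand-in for the `data` variable that the R chunk writes
// into the page (only the six rows shown above).
const data = [
  { Species: "setosa", "Sepal.Length": 5.1, "Sepal.Width": 3.5 },
  { Species: "setosa", "Sepal.Length": 4.9, "Sepal.Width": 3 },
  { Species: "setosa", "Sepal.Length": 4.7, "Sepal.Width": 3.2 },
  { Species: "setosa", "Sepal.Length": 4.6, "Sepal.Width": 3.1 },
  { Species: "setosa", "Sepal.Length": 5, "Sepal.Width": 3.6 },
  { Species: "setosa", "Sepal.Length": 5.4, "Sepal.Width": 3.9 }
];

// Ordinary least-squares fit of rows[yKey] on rows[xKey].
// Returns the slope, the intercept and R squared.
// Assumes at least two rows with some variation in both x and y.
function linearFit(rows, xKey, yKey) {
  const n = rows.length;
  const meanX = rows.reduce((sum, d) => sum + d[xKey], 0) / n;
  const meanY = rows.reduce((sum, d) => sum + d[yKey], 0) / n;

  let sxx = 0; // sum of squared deviations of x
  let syy = 0; // sum of squared deviations of y
  let sxy = 0; // sum of cross products
  for (const d of rows) {
    const dx = d[xKey] - meanX;
    const dy = d[yKey] - meanY;
    sxx += dx * dx;
    syy += dy * dy;
    sxy += dx * dy;
  }

  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  const r2 = (sxy * sxy) / (sxx * syy);
  return { slope, intercept, r2 };
}

// Keep one group, then fit. Dotted keys need bracket notation.
const setosa = data.filter(d => d.Species === "setosa");
const fit = linearFit(setosa, "Sepal.Length", "Sepal.Width");
console.log(fit);
```

Re-running the filter and the fit whenever the selected group changes is one way to keep a displayed equation and R² value in step with the points on the chart.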



Set up HTML, CSS & js

Now that we have the data, we just need to feed it to our visualization. I previously prototyped a scatterplot with interactive regression in D3.js at blockbuilder here. Each D3.js visualization essentially has three elements: the HTML that dictates where each major component of the visualization is placed, the CSS rules that dictate the styles of elements, and the JavaScript that executes the visualization.


HTML

In our RMarkdown document we need to add the <div> tags related to our visualization at the point in the document where we want it to appear. This visualization has four <div> tags. The one containing the SVG chart has class="chart". Two hold the regression equation and the R² value, which update each time the data are filtered; these have class="equation". The fourth wraps the dropdown menu that provides the interactivity, which is an HTML select element with id="selector". A select element is populated by the option tags inside it, one for each choice that can be selected. The D3.js code populates these options automatically with the groups found in the data, so a different dataset would produce different options without us having to change any code. The D3.js code also shows the number of subjects in each group in the dropdown menu.
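As a rough illustration of how the dropdown options could be derived from the data rather than hard-coded, the sketch below counts the rows in each group and builds one entry per group. The helper names groupCounts and optionLabels, the small sample array, and the "(n = ...)" label format are assumptions for this example, not taken from the original vizjs.js.

```javascript
// Count how many rows fall in each group, keeping first-seen order.
function groupCounts(rows, groupKey) {
  const counts = new Map();
  for (const d of rows) {
    const group = d[groupKey];
    counts.set(group, (counts.get(group) || 0) + 1);
  }
  return counts;
}

// Build one { value, label } pair per group for the <option> tags.
// The label format is a hypothetical choice.
function optionLabels(rows, groupKey) {
  return Array.from(groupCounts(rows, groupKey), ([value, n]) => ({
    value: value,
    label: `${value} (n = ${n})`
  }));
}

// Hypothetical sample rows standing in for the full dataset.
const sampleRows = [
  { Species: "setosa" },
  { Species: "setosa" },
  { Species: "versicolor" }
];
const options = optionLabels(sampleRows, "Species");
console.log(options);
```

In the page itself, an array like this could be bound to the select element with D3, for example d3.select("#selector").selectAll("option").data(options).enter().append("option"), setting each option's value attribute and text from the pair. Because the groups are read from the data, swapping in another dataset changes the menu without touching the code.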

I have had some issues with making sure that <div> tags are read properly by RMarkdown documents. The easiest way to deal with this seems to be to wrap the tags in html_preserve comments. See here for advice on inserting raw HTML into RMarkdown.


It should look something like this in the RMarkdown:

<!--html_preserve-->
<div><select id="selector"></select></div>
<div class="equation"></div>
<div class="equation"></div>
<div class="chart"></div>
<!--/html_preserve-->



CSS

We add custom CSS styling to the RMarkdown document using the css option in the YAML like this:

css: styles.css

This refers to a CSS file called styles.css in the working directory containing all the CSS for the visualization as well as for the RMarkdown HTML document itself. When making D3.js visualizations it’s often easiest to use CSS styling options to have finer control over elements of the visualization.



Javascript

The easiest way to add the JavaScript required to build the visualization is to include it between <script> tags. This visualization has two. The first loads the D3 library, which requires adding <script src="https://d3js.org/d3.v4.min.js"></script> directly to the RMarkdown file. It must come before the second <script> tag, which contains the rest of the code needed. For this, I include a separate .js file called vizjs.js in the working directory of the RMarkdown document and load it using the following: <script src="vizjs.js"></script>.

Another, cleaner, way to do this would be to create a .html file in the working directory that only contains the script tags.



Thanks

I hope this is helpful. If you have any questions or comments, contact me on Twitter or by email at jc3181 AT columbia DOT edu.