Introduction

Learning a new language from scratch in a short period of time is a difficult commitment. Although we have no background in JavaScript, we have been using R since 2018. We came across an interesting visualization library called D3.js, and we found that R, in version 3.6.0, also supports D3.js on its platform. We therefore decided to use R to learn some basics of D3 and to integrate the two languages, to see how smooth or challenging the task would be.

What is D3.js?

D3 is short for Data-Driven Documents. It is a visualization library that has gained immense popularity over the years. One reason for its widespread use is that people with prior knowledge of JavaScript can easily use the D3 library in their scripts to make very creative visualizations. Further, because the output is HTML, it can be embedded directly into a website. D3 renders its output by manipulating the Document Object Model (DOM). Users can bring data to life through CSS, SVG and HTML by using D3.js.

Why R?

R is an open-source programming language primarily used for statistical analysis, but it has branched out into various fields. It has several packages for visualization, e.g., ggplot2, plotly and leaflet, to name a few. Moreover, to make any visualization the data need to be prepared, as we hardly ever receive data sets in the form we want. Data manipulation is fast and simple in R, so we use R to carry out some data cleaning before passing the data to D3. The documentation of this project is written in R Markdown and the output is generated as HTML. The idea is to check the code and its output side by side. Further, publishing and sharing a report from R is very simple.

SVG

SVG stands for Scalable Vector Graphics. It is the building block for D3.js. An SVG typically occupies less space than an equivalent raster image file, and it supports interactivity and animation. We will first draw some basic shapes such as rectangles, circles, ellipses and lines to get a feel for D3. We will be using an R library called r2d3.

Installation of library

To install the library in R, a simple command is run: install.packages("r2d3"). After installation, the package needs to be loaded with the following code before carrying out any further analysis.

library(r2d3) ## calling the library r2d3

We can then insert a D3 chunk in the R Markdown file by clicking on the Insert button. (We can run Python, SQL, Bash, D3, Stan and C++ chunks from R without moving from one program to another.) Now we are ready to create a basic SVG shape. The next code creates a circle with center (100,100) and radius 80. The circle is filled with the color red.

var circle=svg.append('circle')
  .attr('cx',100)
  .attr('cy',100)
  .attr('r',80)
  .attr('fill','red');

The code calls svg.append, which appends a circle element to the SVG and stores the resulting selection in the variable circle. Each ‘.attr’ that follows is a method chained onto that selection to set a further attribute of the circle we are drawing. The above code is JavaScript, and we are using R not only to run the JavaScript but also to render the output.
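
To see why the ‘.attr’ calls can be chained one after another, here is a minimal plain-JavaScript sketch. The makeElement helper is a hypothetical stand-in written only for this illustration; it is not part of D3, which works on real DOM nodes rather than on a plain object.

```javascript
// Hypothetical stand-in for a D3 selection: it stores attributes in an object.
function makeElement(tag) {
  const attrs = {};
  const element = {
    // Setting an attribute returns the same object, which is what allows chaining.
    attr(name, value) {
      attrs[name] = value;
      return element;
    },
    // Serialize to SVG markup so that the result can be inspected.
    toString() {
      const pairs = Object.keys(attrs).map((k) => `${k}="${attrs[k]}"`);
      return `<${tag} ${pairs.join(' ')}/>`;
    },
  };
  return element;
}

const circleMarkup = makeElement('circle')
  .attr('cx', 100)
  .attr('cy', 100)
  .attr('r', 80)
  .attr('fill', 'red')
  .toString();

console.log(circleMarkup); // <circle cx="100" cy="100" r="80" fill="red"/>
```

Because every call hands back the object it was called on, the order of the ‘.attr’ calls does not matter for the final shape.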

Some more basic shapes

We can further add more shapes like rectangle, ellipse and line as SVGs as shown below.

var rect=svg.append('rect')
.attr('x',25)
.attr('y',0)
.attr('width',150)
.attr('height',60)
.attr('fill','yellow');

var ellipse= svg.append('ellipse')
.attr('cx',300)
.attr('cy',200)
.attr('rx',100)
.attr('ry',150)
.attr('fill','green');

var line=svg.append('line')
.attr('x1',200)
.attr('y1',30)
.attr('x2',300)
.attr('y2',30)
.attr('stroke','black')
.attr('stroke-width',3);

The origin of the SVG canvas is at the top-left. When we created a rectangle with x=25 and y=0, it was drawn 25 units from the left edge and at the very top of the canvas. The same holds for the other shapes. Now that we have created some basic shapes in SVG, let us create a row of circles in which the radius of each circle is taken from an array of data. We create the array in R, and that data is used as the input to the D3 code.

## Creating an array
array=seq(10,20,2)
array
## [1] 10 12 14 16 18 20

The above code creates a variable called array whose first value is 10 and last value is 20, increasing in steps of 2. We now want to create 6 circles whose radii correspond to the values in the array.

The array gives us two pieces of information: (1) the value of each data point and (2) its index, or position. We will use these two properties to create the 6 circles, passing functions that D3 calls once for each element.

var circles=svg.selectAll('circle')
.data(data)
.enter()
.append('circle')
.attr('cx',function(d,i){
return (i*50)+25;
})
.attr('cy',50)
.attr('r',function(d){
return d;
})
.attr('fill','green');

The first function receives the index i of each array element and returns a different cx for each circle, while the second function returns the array value d itself as the radius. We also see that a lot of space is wasted here because the canvas size is fixed by R. We need to change the canvas size to change the appearance of the output.
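
The values produced by the two callbacks can be checked without drawing anything. The following is a minimal sketch in plain JavaScript, assuming the same six values that the R array holds; the names radii and circleAttrs are introduced here only for illustration.

```javascript
// The same values that seq(10,20,2) produces in R.
const radii = [10, 12, 14, 16, 18, 20];

// For each value d at index i, compute the attributes the D3 code assigns.
const circleAttrs = radii.map(function (d, i) {
  return { cx: i * 50 + 25, cy: 50, r: d };
});

console.log(circleAttrs.map((c) => c.cx)); // [ 25, 75, 125, 175, 225, 275 ]
console.log(circleAttrs.map((c) => c.r));  // [ 10, 12, 14, 16, 18, 20 ]
```

The centers are 50 units apart, so even the largest circle (radius 20) does not touch its neighbor.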

Adding External data to D3

As discussed above, we can import data into R and manipulate it further before making any visualizations. Here we import a JSON (JavaScript Object Notation) file.

library(jsonlite)
building<-fromJSON('buildings.json')
building
##                        name height
## 1              Burj Khalifa    828
## 2            Shanghai Tower    623
## 3 Abraj Al-Bait Clock Tower    601
## 4    Ping An Finance Centre    599
## 5         Lotte World Tower  544.5
class(building$height)
## [1] "character"

The data contain the heights of the world’s five tallest buildings. We see that the height in the JSON file is actually stored as a character string. We need to convert it to a numeric value.

building$height<-as.numeric(building$height)
class(building$height)
## [1] "numeric"
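
For comparison, the same conversion can be sketched on the JavaScript side. The JSON text below is an assumption: it is reconstructed from the table printed above, since the actual buildings.json file is not shown, and the names rawJson and buildings are introduced only for this illustration.

```javascript
// Assumed content of buildings.json, reconstructed from the printed table.
const rawJson = `[
  {"name": "Burj Khalifa", "height": "828"},
  {"name": "Shanghai Tower", "height": "623"},
  {"name": "Abraj Al-Bait Clock Tower", "height": "601"},
  {"name": "Ping An Finance Centre", "height": "599"},
  {"name": "Lotte World Tower", "height": "544.5"}
]`;

// Parse the text, then convert each height from a string to a number.
const buildings = JSON.parse(rawJson).map(function (d) {
  return { name: d.name, height: Number(d.height) };
});

console.log(typeof JSON.parse(rawJson)[0].height); // string
console.log(typeof buildings[0].height);           // number
console.log(buildings[4].height);                  // 544.5
```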

Now this data can be passed to the D3 chunk. Here we try to make a bar chart from it.

var bars=svg.selectAll('rect')
.data(data)
.enter()
.append('rect')
.attr('x',function(d,i){
return (i*50)+25;
})
.attr('y',0)
.attr('width',20)
.attr('height',function(d){
return d.height;
})
.attr('fill','steelblue')
.attr('stroke','black');

So, here we have created a bar chart based on the heights of the tallest buildings. But the bars are hanging upside down and all of them appear to be the same length. This is because the height of our canvas is 500 pixels and every building height is more than 500 units, so each bar is cut off at the bottom edge of the canvas. We can shrink the bars by changing the code slightly, as follows.

var bars=svg.selectAll('rect')
.data(data)
.enter()
.append('rect')
.attr('x',function(d,i){
return (i*50)+25;
})
.attr('y',0)
.attr('width',20)
.attr('height',function(d){
return d.height*(1/3);
})
.attr('fill','steelblue')
.attr('stroke','black');

By multiplying the height of each building by \(\frac{1}{3}\), we have made every bar fit within the canvas. But this hard-coded factor can be replaced by using scales.
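
The arithmetic behind the two bar charts can be sketched in plain JavaScript. The canvas height of 500 is the value stated in the text; the variable names are introduced only for this illustration.

```javascript
const heights = [828, 623, 601, 599, 544.5];
const canvasHeight = 500; // canvas height stated in the text

// Visible part of each bar when drawn at full height: every bar is cut at 500.
const visible = heights.map((h) => Math.min(h, canvasHeight));
console.log(visible); // [ 500, 500, 500, 500, 500 ]

// After multiplying by 1/3, every bar fits inside the canvas.
const scaled = heights.map((h) => h * (1 / 3));
console.log(scaled.every((h) => h <= canvasHeight)); // true
console.log(Math.round(scaled[0]));                  // 276
```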

Scales

Scales are functions that map an input domain to an output range. Height is a continuous variable, so we use a linear scale. The code below shows the scale along with its domain and range. The maximum building height is 828 meters, and we want to map that maximum value to 400 pixels.

var y=d3.scaleLinear()
.domain([0,828])
.range([0,400]);
var bars=svg.selectAll('rect')
.data(data)
.enter()
.append('rect')
.attr('x',function(d,i){
return (i*50)+25;
})
.attr('y',0)
.attr('width',20)
.attr('height',function(d){
return y(d.height);// the y scale maps each height onto the output range
})
.attr('fill','steelblue')
.attr('stroke','black');
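
What a linear scale computes can be written out by hand. This is a sketch of the idea in plain JavaScript, not D3’s own implementation; linearScale and yScale are names introduced only for this illustration.

```javascript
// Hand-written linear map from a domain interval onto a range interval.
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return function (value) {
    const t = (value - d0) / (d1 - d0); // position within the domain, 0 to 1
    return r0 + t * (r1 - r0);          // same relative position within the range
  };
}

const yScale = linearScale([0, 828], [0, 400]);
console.log(yScale(0));   // 0
console.log(yScale(414)); // 200
console.log(yScale(828)); // 400
```

Changing the range to another interval rescales every bar at once, which is what makes a scale preferable to a hard-coded factor.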

Similarly, the x positions can be computed automatically (for example, if more data is added to the data set). We use a band scale here.

var x=d3.scaleBand()
.domain(['Burj Khalifa','Shanghai Tower','Abraj Al-Bait Clock Tower',
'Ping An Finance Centre','Lotte World Tower'])
.range([0,400])
.paddingInner(0.2)
.paddingOuter(0.2);

var y=d3.scaleLinear()
.domain([0,828])
.range([0,400]);
var bars=svg.selectAll('rect')
.data(data)
.enter()
.append('rect')
.attr('x',function(d){
return x(d.name);   //we use the scales to find out the x positions of the names
})
.attr('y',0)
.attr('width',x.bandwidth())// the width is determined by the band scale as well
.attr('height',function(d){
return y(d.height);// the y scale maps each height onto the output range
})
.attr('fill','steelblue')
.attr('stroke','black');
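
The positions and widths that the band scale produces can be worked out by hand. The sketch below follows the formula documented for d3.scaleBand, assuming the default centered alignment and no rounding; bandScale is a name introduced only for this illustration and is not D3’s implementation.

```javascript
// Sketch of a band scale: splits the range into evenly spaced bands with padding.
function bandScale(domain, range, paddingInner, paddingOuter) {
  const [start, stop] = range;
  const n = domain.length;
  // Distance between the left edges of two adjacent bands.
  const step = (stop - start) / Math.max(1, n - paddingInner + paddingOuter * 2);
  // Width of one band once the inner padding is taken out of the step.
  const bandwidth = step * (1 - paddingInner);
  // Left edge of the first band, after the outer padding.
  const first = start + step * paddingOuter;
  const scale = (name) => first + step * domain.indexOf(name);
  scale.step = step;
  scale.bandwidth = bandwidth;
  return scale;
}

const names = ['Burj Khalifa', 'Shanghai Tower', 'Abraj Al-Bait Clock Tower',
  'Ping An Finance Centre', 'Lotte World Tower'];
const xBand = bandScale(names, [0, 400], 0.2, 0.2);

console.log(xBand.step.toFixed(2));              // 76.92
console.log(xBand.bandwidth.toFixed(2));         // 61.54
console.log(xBand('Burj Khalifa').toFixed(2));   // 15.38
console.log(xBand('Shanghai Tower').toFixed(2)); // 92.31
```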

D3 min, max, extent and map

Hard-coding the building names is painstaking when we have a large data set. The maximum value will also change as more buildings are added. For this purpose, D3 provides functions to calculate the minimum value (d3.min), the maximum value (d3.max) and the range of values (d3.extent), while the array’s map function lets us collect the ordinal values for the domain. This is reflected in the code below.

var x=d3.scaleBand()
.domain(data.map(function(d){
return d.name;// using map function to populate the x values automatically
}))
.range([0,400])
.paddingInner(0.2)
.paddingOuter(0.2);

var y=d3.scaleLinear()
.domain([0,d3.max(data,function(d){
return d.height; // calculating the maximum values 
})])
.range([0,400]);
var bars=svg.selectAll('rect')
.data(data)
.enter()
.append('rect')
.attr('x',function(d){
return x(d.name);   //we use the scales to find out the x positions of the names
})
.attr('y',0)
.attr('width',x.bandwidth())// the width is determined by the band scale as well
.attr('height',function(d){
return y(d.height);// the y scale maps each height onto the output range
})
.attr('fill','steelblue')
.attr('stroke','black');
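
The values these helpers derive from the data can be sketched with plain JavaScript equivalents. The buildings array repeats the table shown earlier; Math.max and Math.min are used here as stand-ins for d3.max and d3.min, and the pair [minHeight, maxHeight] is what d3.extent would be expected to return for this data.

```javascript
const buildings = [
  { name: 'Burj Khalifa', height: 828 },
  { name: 'Shanghai Tower', height: 623 },
  { name: 'Abraj Al-Bait Clock Tower', height: 601 },
  { name: 'Ping An Finance Centre', height: 599 },
  { name: 'Lotte World Tower', height: 544.5 },
];

// map collects one property from every record.
const names = buildings.map((d) => d.name);
const heights = buildings.map((d) => d.height);

// Plain-JavaScript stand-ins for d3.max, d3.min and d3.extent.
const maxHeight = Math.max(...heights);
const minHeight = Math.min(...heights);
const extent = [minHeight, maxHeight];

console.log(names.length); // 5
console.log(maxHeight);    // 828
console.log(extent);       // [ 544.5, 828 ]
```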

Margins and group

Here, we use an SVG group element (g) to translate the plot within the canvas, leaving margins around it for the axes and labels.

var margin = { left:100, right:10, top:10, bottom:100 };

var width = 600 - margin.left - margin.right,
    height = 400 - margin.top - margin.bottom;
var g = svg.append("g")
.attr("transform", "translate(" + margin.left 
            + ", " + margin.top + ")")
var x=d3.scaleBand()
.domain(data.map(function(d){
return d.name;// using map function to populate the x values automatically
}))
.range([0,width])
.paddingInner(0.2)
.paddingOuter(0.2);

var y=d3.scaleLinear()
.domain([0,d3.max(data,function(d){
return d.height; // calculating the maximum values 
})])
.range([0,height]);
var bars=g.selectAll('rect') // translation of the plot to new margins
.data(data)
.enter()
.append('rect')
.attr('x',function(d){
return x(d.name);   //we use the scales to find out the x positions of the names
})
.attr('y',0)
.attr('width',x.bandwidth())// the width is determined by the band scale as well
.attr('height',function(d){
return y(d.height);// the y scale maps each height onto the output range
})
.attr('fill','steelblue')
.attr('stroke','black');
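
The margin arithmetic can be checked on its own. This small sketch uses the same numbers as the code above; totalWidth and totalHeight are names introduced here for the 600 and 400 that appear there as literals.

```javascript
const margin = { left: 100, right: 10, top: 10, bottom: 100 };
const totalWidth = 600;
const totalHeight = 400;

// Size of the inner plotting area once the margins are removed.
const width = totalWidth - margin.left - margin.right;
const height = totalHeight - margin.top - margin.bottom;

// The string passed to the group's transform attribute.
const transform = 'translate(' + margin.left + ', ' + margin.top + ')';

console.log(width, height); // 490 290
console.log(transform);     // translate(100, 10)
```

Everything appended to the group is then drawn in a 490 by 290 area whose origin sits 100 units from the left and 10 units from the top of the canvas.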

Axes and Labels

D3 provides built-in generators for axes. The following code adds an axis at the bottom and on the left of the chart.

var margin = { left:100, right:10, top:10, bottom:100 };

var width = 600 - margin.left - margin.right,
    height = 400 - margin.top - margin.bottom;
var g = svg.append("g")
.attr("transform", "translate(" + margin.left 
            + ", " + margin.top + ")")
var x=d3.scaleBand()
.domain(data.map(function(d){
return d.name;// using map function to populate the x values automatically
}))
.range([0,width])
.paddingInner(0.2)
.paddingOuter(0.2);

var y=d3.scaleLinear()
.domain([0,d3.max(data,function(d){
return d.height; // calculating the maximum values 
})])
.range([0,height]);

var xAxisCall=d3.axisBottom(x); // adding axis to bottom of the chart
g.append('g')
.attr('class','x axis')
.attr('transform','translate(0,'+height+')')
.call(xAxisCall);

var yAxisCall=d3.axisLeft(y); //adding axis to the left of the chart
g.append('g')
.attr('class','y axis')
.call(yAxisCall);

var bars=g.selectAll('rect') // translation of the plot to new margins
.data(data)
.enter()
.append('rect')
.attr('x',function(d){
return x(d.name);   //we use the scales to find out the x positions of the names
})
.attr('y',0)
.attr('width',x.bandwidth())// the width is determined by the band scale as well
.attr('height',function(d){
return y(d.height);// the y scale maps each height onto the output range
})
.attr('fill','steelblue')
.attr('stroke','black');

The x-axis labels overlap each other. This can be overcome by rotating the labels with a transform.

var margin = { left:100, right:10, top:10, bottom:100 };

var width = 600 - margin.left - margin.right,
    height = 400 - margin.top - margin.bottom;
var g = svg.append("g")
.attr("transform", "translate(" + margin.left 
            + ", " + margin.top + ")")
var x=d3.scaleBand()
.domain(data.map(function(d){
return d.name;// using map function to populate the x values automatically
}))
.range([0,width])
.paddingInner(0.2)
.paddingOuter(0.2);

var y=d3.scaleLinear()
.domain([0,d3.max(data,function(d){
return d.height; // calculating the maximum values 
})])
.range([0,height]);

var xAxisCall=d3.axisBottom(x);
g.append('g')
.attr('class','x axis')
.attr('transform','translate(0,'+height+')')
.call(xAxisCall)
.selectAll('text')
.attr('y',10)
.attr('x',-5)
.attr('text-anchor','end')
.attr('transform','rotate(-30)');

var yAxisCall=d3.axisLeft(y);
g.append('g')
.attr('class','y axis')
.call(yAxisCall);

var bars=g.selectAll('rect') // translation of the plot to new margins
.data(data)
.enter()
.append('rect')
.attr('x',function(d){
return x(d.name);   //we use the scales to find out the x positions of the names
})
.attr('y',0)
.attr('width',x.bandwidth())// the width is determined by the band scale as well
.attr('height',function(d){
return y(d.height);// the y scale maps each height onto the output range
})
.attr('fill','steelblue')
.attr('stroke','black');

The final Effort

Our bars are still hanging from top to bottom. Additionally, the y-axis runs from top to bottom as well. Therefore, we reverse the range of the y scale so that the y-axis labels increase upwards.

var margin = { left:100, right:10, top:10, bottom:100 };

var width = 600 - margin.left - margin.right,
    height = 400 - margin.top - margin.bottom;
var g = svg.append("g")
.attr("transform", "translate(" + margin.left 
            + ", " + margin.top + ")")
var x=d3.scaleBand()
.domain(data.map(function(d){
return d.name;// using map function to populate the x values automatically
}))
.range([0,width])
.paddingInner(0.2)
.paddingOuter(0.2);

var y=d3.scaleLinear()
.domain([0,d3.max(data,function(d){
return d.height; // calculating the maximum values 
})])
.range([height,0]); //reversing the scale

var xAxisCall=d3.axisBottom(x);
g.append('g')
.attr('class','x axis')
.attr('transform','translate(0,'+height+')')
.call(xAxisCall)
.selectAll('text')
.attr('y',10)
.attr('x',-5)
.attr('text-anchor','end')
.attr('transform','rotate(-30)');

var yAxisCall=d3.axisLeft(y);
g.append('g')
.attr('class','y axis')
.call(yAxisCall);

var bars=g.selectAll('rect') // translation of the plot to new margins
.data(data)
.enter()
.append('rect')
.attr('x',function(d){
return x(d.name);   //we use the scales to find out the x positions of the names
})
.attr('y',0)
.attr('width',x.bandwidth())// the width is determined by the band scale as well
.attr('height',function(d){
return y(d.height);// the y scale maps each height onto the output range
})
.attr('fill','steelblue')
.attr('stroke','black');

The scale has now been reversed, but the bars do not look quite right. Each bar must start at the scaled value y(d.height), and the bar height we actually want is the height of the plotting area minus that scaled value.

var margin = { left:100, right:10, top:10, bottom:100 };

var width = 600 - margin.left - margin.right,
    height = 400 - margin.top - margin.bottom;
var g = svg.append("g")
.attr("transform", "translate(" + margin.left 
            + ", " + margin.top + ")")
var x=d3.scaleBand()
.domain(data.map(function(d){
return d.name;// using map function to populate the x values automatically
}))
.range([0,width])
.paddingInner(0.2)
.paddingOuter(0.2);

var y=d3.scaleLinear()
.domain([0,d3.max(data,function(d){
return d.height; // calculating the maximum values 
})])
.range([height,0]); //reversing the scale

var xAxisCall=d3.axisBottom(x);
g.append('g')
.attr('class','x axis')
.attr('transform','translate(0,'+height+')')
.call(xAxisCall)
.selectAll('text')
.attr('y',10)
.attr('x',-5)
.attr('text-anchor','end')
.attr('transform','rotate(-30)');

var yAxisCall=d3.axisLeft(y);
g.append('g')
.attr('class','y axis')
.call(yAxisCall);

var bars=g.selectAll('rect') // translation of the plot to new margins
.data(data)
.enter()
.append('rect')
.attr('x',function(d){
return x(d.name);   //we use the scales to find out the x positions of the names
})
.attr('y',function(d){
return y(d.height);
})
.attr('width',x.bandwidth())// the width is determined by the band scale as well
.attr('height',function(d){
return height-y(d.height);// changing the height
})
.attr('fill','steelblue')
.attr('stroke','black');
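
The geometry of the flipped bars can be worked through with a few numbers. The sketch below is plain JavaScript with a hand-written reversed scale rather than d3.scaleLinear; plotHeight is the 290 obtained from the margins, and barGeometry is a name introduced only for this illustration.

```javascript
const plotHeight = 290; // 400 - 10 - 100, from the margins
const maxValue = 828;   // tallest building, used as the top of the domain

// Reversed linear scale: domain [0, maxValue] maps onto range [plotHeight, 0].
function yScale(value) {
  return plotHeight - (value / maxValue) * plotHeight;
}

// A bar starts at the scaled value and extends down to the x-axis.
function barGeometry(value) {
  const top = yScale(value);
  return { y: top, height: plotHeight - top };
}

console.log(barGeometry(828)); // { y: 0, height: 290 }
console.log(barGeometry(414)); // { y: 145, height: 145 }
console.log(barGeometry(0));   // { y: 290, height: 0 }
```

In every case y plus height equals the plot height, so the bottom edge of each bar rests on the x-axis.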

Next, we add a text label to the y-axis; the tick labels on the x-axis already name each building.

var margin = { left:100, right:10, top:10, bottom:100 };

var width = 600 - margin.left - margin.right,
    height = 400 - margin.top - margin.bottom;
var g = svg.append("g")
.attr("transform", "translate(" + margin.left 
            + ", " + margin.top + ")")
var x=d3.scaleBand()
.domain(data.map(function(d){
return d.name;// using map function to populate the x values automatically
}))
.range([0,width])
.paddingInner(0.2)
.paddingOuter(0.2);

var y=d3.scaleLinear()
.domain([0,d3.max(data,function(d){
return d.height; // calculating the maximum values 
})])
.range([height,0]); //reversing the scale

var xAxisCall=d3.axisBottom(x);
g.append('g')
.attr('class','x axis')
.attr('transform','translate(0,'+height+')')
.call(xAxisCall)
.selectAll('text')
.attr('y',10)
.attr('x',-5)
.attr('text-anchor','end')
.attr('transform','rotate(-40)');

var yAxisCall=d3.axisLeft(y);
g.append('g')
.attr('class','y axis')
.call(yAxisCall);

var bars=g.selectAll('rect') // translation of the plot to new margins
.data(data)
.enter()
.append('rect')
.attr('x',function(d){
return x(d.name);   //we use the scales to find out the x positions of the names
})
.attr('y',function(d){
return y(d.height);
})
.attr('width',x.bandwidth())// the width is determined by the band scale as well
.attr('height',function(d){
return height-y(d.height);// changing the height
})
.attr('fill','steelblue')
.attr('stroke','black');

g.append('text')
.attr('class','y axis-label')
.attr('x',-(height)/2)
.attr('y',-60)
.attr('font-size','20px')
.attr('text-anchor','middle')
.attr('transform','rotate(-90)')
.text('Height (m)');
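
The negative x and y values of the label make sense once the rotation is applied. The sketch below applies the standard rotation about the origin that an SVG rotate(angle) transform performs; the rotate helper is written only for this illustration.

```javascript
// Apply a rotation about the origin, as rotate(degrees) does in SVG.
function rotate(point, degrees) {
  const a = (degrees * Math.PI) / 180;
  return {
    x: point.x * Math.cos(a) - point.y * Math.sin(a),
    y: point.x * Math.sin(a) + point.y * Math.cos(a),
  };
}

const height = 290; // plot height from the margins

// The label is placed at (-(height)/2, -60) and then rotated by -90 degrees.
const anchor = rotate({ x: -height / 2, y: -60 }, -90);
console.log(Math.round(anchor.x), Math.round(anchor.y)); // -60 145
```

After the rotation, the label is centered 60 units to the left of the y-axis and halfway down the plot.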

Finally, we use the packages ‘ggplot2’ and ‘plotly’ to make the same bar chart in R.

library(ggplot2)
library(plotly)
g<-ggplot(building,aes(name,height))+geom_bar(stat='identity',fill='steelblue',
                                              color='black')+
  theme_classic()+labs(x="",y="Height(m)")
ggplotly(g)

Conclusion

We see that the ‘plotly’ version built from ggplot code takes only two statements, and its output also has a tooltip, like the one we have seen in Tableau. While making a bar chart from scratch in D3.js was challenging, it helped us understand, at a basic level, the logic that goes into the code. The more one learns the basic tools of D3.js, the more dynamic and interactive the charts one can make. Still, in terms of flexibility in data visualization, R is not far behind!

Finally, both languages work seamlessly together on the R platform. We would also like to bring in more interactivity and dynamic plots using D3.js in the future.