We all know how important Python is for data scientists. So I was curious whether we can use Python and its libraries from R, and write hybrid code that uses both languages. That would let developers work more efficiently.
When I searched the internet for R counterparts of Python libraries, I found quite a few, such as the popular cross-platform library Dlib, but my concern was whether I could use the Python libraries themselves, unchanged. Luckily, after searching for a while, I found a way to call Python as an external library from R. I then decided to narrow things down to one specific library to blog about, so that my fellow developers can benefit from it.
In this blog post I will describe how to use the Python library Matplotlib from R. It is mainly used for drawing graphs and charts. It can also display images, which makes it useful in other areas such as computer vision work.
I ran this on Ubuntu 16.04, where it works perfectly. While searching the internet I also found reports that it works on Windows with the Anaconda distribution.
Let’s start by setting up the environment for R on Ubuntu.
First of all, we have to install the dependencies needed to include Python in R code. Open a terminal in Ubuntu and run the following command.
sudo apt-get install python-dev python-matplotlib
Then we have to set the compiler and linker flags so that R can find Python. Run the following commands in an R session.
py_cflags <- system("python2.7-config --cflags", intern=TRUE)
Sys.setenv("PKG_CXXFLAGS"=sprintf("%s %s", Sys.getenv("PKG_CXXFLAGS"), py_cflags))
py_ldflags <- system("python2.7-config --ldflags", intern=TRUE)
Sys.setenv("PKG_LIBS"=sprintf("%s", py_ldflags))
Once this setup is complete, we can start writing the C++ code that R will compile through Rcpp and call directly.
In the following code we include the Python header file and then write one generic function that runs any Python command, which is how we call a Python library. Since we are using Matplotlib here, that is the library I import. To understand more, take a look at the pyrun function.
void pyrun(std::string command) {
PyRun_SimpleString(command.c_str());
}
To see which Python library gets loaded, take a look at the initialize_python() function and its pyrun("import matplotlib") line.
void initialize_python() {
#ifndef WIN32
  dlopen("libpython2.7.so", RTLD_LAZY | RTLD_GLOBAL);
  //Required to import matplotlib
#endif
  Py_Initialize();
  pyrun("import matplotlib");
  //pyrun("matplotlib.use('Qt4Agg')");
  pyrun("import matplotlib.pyplot as plt");
}
#include <Rcpp.h>
#include <Python.h>
#include <stdlib.h>
#ifndef WIN32
#include <dlfcn.h>
#endif
using namespace Rcpp;

//Run Python commands from R
// [[Rcpp::export]]
void pyrun(std::string command) {
  PyRun_SimpleString(command.c_str());
}

//You need to call this first
// [[Rcpp::export]]
void initialize_python() {
#ifndef WIN32
  dlopen("libpython2.7.so", RTLD_LAZY | RTLD_GLOBAL); //Required to import matplotlib
#endif
  Py_Initialize();
  pyrun("import matplotlib");
  //pyrun("matplotlib.use('Qt4Agg')");
  pyrun("import matplotlib.pyplot as plt");
}

//Call after you're done
// [[Rcpp::export]]
void finalize_python() {
  Py_Finalize();
}
The next step towards plotting a curve with matplotlib is to pass data to it, so we write the following code. It copies the data into Python's __main__ module. Commands run through pyrun execute in __main__, so that is where the variables have to be defined.
#include <Rcpp.h>
#include <Python.h>
#include <stdlib.h>
using namespace Rcpp;

//Convert NumericVector to Python List
PyObject* numvec_to_list(NumericVector x) {
  int n = x.length();
  PyObject *xpy = PyList_New(n); //Make new list
  PyObject *f;
  for (int i=0; i<n; i++) {
    f = PyFloat_FromDouble(x[i]);
    PyList_SetItem(xpy, i, f); //Fill list from NumericVector (steals the reference to f)
  }
  return(xpy);
}

//Copy a numeric vector from R to Python
// [[Rcpp::export]]
void numvec_to_python(std::string name, NumericVector x) {
  PyObject *xpy = numvec_to_list(x);
  PyObject *m = PyImport_AddModule("__main__");
  PyObject *main = PyModule_GetDict(m); //Get the dictionary of the __main__ module
  PyDict_SetItemString(main, name.c_str(), xpy); //Add variable to that dictionary
  Py_DECREF(xpy); //The dictionary holds its own reference, so release ours
}
In the last part we plot a graph from R using Python and matplotlib. First we copy the vectors to Python, then we plot them by running commands through pyrun. The code below shows how to plot sine and cosine with matplotlib's plt.plot().
x <- seq(0, 2*pi, length = 100)
sx <- sin(x)
cx <- cos(x)
initialize_python()
#Copy variables to Python
numvec_to_python("x", x)
numvec_to_python("sx", sx)
numvec_to_python("cx", cx)
#Set plot size
pyrun("plt.rcParams.update({'figure.figsize' : (7,4)})")
#Create plots
pyrun("plt.plot(x, sx)")
pyrun("plt.plot(x, cx, '--r', linewidth=2)")
pyrun("plt.legend(('sin(x)', 'cos(x)'))")
pyrun("plt.savefig('../figure/2015-04-02-pyplot.png')")
#pyrun("plt.show()") #Uncomment this line to show the plot
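For comparison, here is a sketch of the same plot written directly in Python, which is roughly what the pyrun calls execute inside the embedded interpreter. The function names, the choice of the Agg backend and the output file name sin_cos.png are assumptions made for this illustration, not part of the original code.

```python
import math


def make_curves(n=100):
    """Return x, sin(x), cos(x) on n evenly spaced points in [0, 2*pi]."""
    x = [2 * math.pi * i / (n - 1) for i in range(n)]  # like seq(0, 2*pi, length = 100)
    sx = [math.sin(v) for v in x]
    cx = [math.cos(v) for v in x]
    return x, sx, cx


def plot_curves(path="sin_cos.png"):
    """Plot both curves and save the figure to `path` (requires matplotlib)."""
    # Imported here so make_curves() stays usable without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # assumed non-interactive backend, so no display is needed
    import matplotlib.pyplot as plt

    x, sx, cx = make_curves()
    plt.rcParams.update({"figure.figsize": (7, 4)})
    plt.plot(x, sx)
    plt.plot(x, cx, "--r", linewidth=2)  # dashed red line
    plt.legend(("sin(x)", "cos(x)"))
    plt.savefig(path)


x, sx, cx = make_curves()
```

Calling plot_curves() writes the figure to the given path, just as the plt.savefig line does in the R version.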
##Final Output: