We will use this document to run reticulate (https://github.com/rstudio/reticulate), which lets us interact with Python from RStudio!

This material is based on the amazing reticulate webpage: https://rstudio.github.io/reticulate/index.html

Setup

Reticulate is quite recent and many bugs with RStudio have been solved in the last month, so I recommend installing the latest versions of everything we need.

For the current version of the reticulate package to work out of the box, I recommend following these instructions:

Now you are ready to install the reticulate package:

install.packages("reticulate")

Once it is installed, we need to load the package:

library(reticulate)

You may have an older Python version installed, with reticulate pointing to it (remember that you need a recent version of Python, e.g. Python 3.6.5). Let's check this:

py_discover_config()

If the Python version is not the one from the relevant Anaconda environment, change the environment by typing:

use_condaenv("relevantenv")
py_discover_config()

Once you run use_condaenv("relevantenv") Python is loaded. This means that you need to restart R if you want to change the Python version. The same is true whenever you type py_config() instead of py_discover_config().

If changing the conda environment doesn’t work, try changing the path directly using:

use_python("Path...", required=TRUE)

We can have a final check of the Python version being used. It should be 3.6…

py_config()
## python:         /Users/jfranco1/anaconda/envs/r-reticulate/bin/python
## libpython:      /Users/jfranco1/anaconda/envs/r-reticulate/lib/libpython3.6m.dylib
## pythonhome:     /Users/jfranco1/anaconda/envs/r-reticulate:/Users/jfranco1/anaconda/envs/r-reticulate
## version:        3.6.5 | packaged by conda-forge | (default, Apr  6 2018, 13:44:09)  [GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)]
## numpy:          /Users/jfranco1/anaconda/envs/r-reticulate/lib/python3.6/site-packages/numpy
## numpy_version:  1.14.3
## 
## python versions found: 
##  /Users/jfranco1/anaconda/envs/r-reticulate/bin/python
##  /usr/bin/python
##  /usr/local/bin/python3
##  /Users/jfranco1/anaconda/bin/python
##  /Users/jfranco1/anaconda/envs/general/bin/python
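
The same check can also be made from the Python side, for instance inside a Python chunk or the REPL described later. The sketch below uses only the standard library. The helper name is_recent_enough and the 3.6 threshold (taken from the version mentioned above) are illustrative assumptions, not part of reticulate.

```python
import sys

# Minimum version taken from the text above (Python 3.6); adjust as needed.
MIN_VERSION = (3, 6)


def is_recent_enough(version_info=None, minimum=MIN_VERSION):
    """Return True if a (major, minor, ...) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


# Path of the interpreter running this code; it should match the
# "python:" line reported by py_config().
print(sys.executable)
print(sys.version)
print(is_recent_enough())
```

If the printed path is not the one inside the expected conda environment, restart R and select the environment again before loading Python.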

Packages

You can easily install packages directly through reticulate. Let's try installing a useful package (this will be installed to the “r-reticulate” environment).

If you are using an environment that already has the packages installed, you can skip this step or install the package directly in your environment. You can even manage your conda environment directly from R: https://rstudio.github.io/reticulate/articles/python_packages.html

py_install("pandas")
## 
## Installation complete.
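
To confirm from Python that a package is visible to the interpreter reticulate is using, a small standard-library check along these lines may help. The helper name is_installed is a hypothetical one chosen for this sketch, and the module names in the loop are only examples.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the active interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Report whether each example module can be found, without importing it.
for name in ("pandas", "numpy"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

A "missing" result after a successful py_install() usually suggests that Python is running from a different environment than the one the package was installed into.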

We are now all set to start running some Python code! There are 4 ways to interact with Python using this package:

1. Python in R Markdown: Supports communication between R and Python (R chunks can access Python objects and vice-versa).

2. Importing Python modules: The import() function enables you to import any Python module and call its functions directly from R.

3. Sourcing Python scripts: The source_python() function enables you to source a Python script the same way you would source() an R script (Python functions and objects defined within the script become directly available to the R session).

4. Python interactive session: The repl_python() function creates an interactive Python console within R. Objects you create within Python are available to your R session (and vice versa).

1. Python in R Markdown

With this option you can communicate between Python and R while generating great documents that contain your whole data analysis pipeline!

Warning: Communication between R and Python chunks (the pieces of code in an R Markdown document) has only been supported since the RStudio v1.2 preview release. In earlier versions it only works when you knit the document, not when you run it chunk by chunk.

Let's look at an example from the reticulate documentation. First, create an R Markdown document and insert an R chunk: Insert (top right of the source pane) > R. Type here all the preliminaries we discussed so far (no need to install the package again). Try it out.

Let's load our data set in the R chunk:

#R
autos = cars

Once this is done we are ready to create a Python chunk! Go to Insert (top right of the source pane) > Python.

Access objects created within R chunks from Python using the r object, here r.autos. This will access the R data frame defined earlier.

#Python
import pandas 
autos_py = r.autos
autos_py['time']=autos_py['dist']/autos_py['speed']

Access objects created within Python chunks from R using py$autos_py. This will access the pandas DataFrame defined in Python.

#R
plot(py$autos_py)

Challenge 1: Install matplotlib (Python) and gamclass (R). Save the loti data set and plot the yearly average temperature anomalies using matplotlib. Tip: After installing matplotlib you need to restart the R session.

2. Importing Python modules

You can use the import() function to import any Python module and call it from R. Functions and other data within Python modules and classes can be accessed via the $ operator.

Let’s import Pandas to R:

library(reticulate)
pandas = import("pandas")
titanic = pandas$read_csv("https://goo.gl/4Gqsnz")
titanic$describe()

Did you get an error? What is wrong?

Reticulate automatically converts Python objects to R objects, so titanic is already an R data frame, which has no describe() method. The conversions are:

| R | Python | Examples |
| --- | --- | --- |
| Single-element vector | Scalar | 1, 1L, TRUE, "foo" |
| Multi-element vector | List | c(1.0, 2.0, 3.0), c(1L, 2L, 3L) |
| List of multiple types | Tuple | list(1L, TRUE, "foo") |
| Named list | Dict | list(a = 1L, b = 2.0), dict(x = x_data) |
| Matrix/Array | NumPy ndarray | matrix(c(1,2,3,4), nrow = 2, ncol = 2) |
| Data Frame | Pandas DataFrame | data.frame(x = c(1,2,3), y = c("a", "b", "c")) |
| Function | Python function | function(x) x + 1 |
| NULL, TRUE, FALSE | None, True, False | NULL, TRUE, FALSE |
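
Seen from the Python side, the table above can be restated as a small lookup. This is only an illustrative sketch: r_type_for is a hypothetical helper, not a reticulate function, and it leaves out the NumPy and pandas rows so that it runs with the standard library alone.

```python
def r_type_for(value):
    """Name the R type that the table above pairs with a Python value."""
    if value is None:
        return "NULL"
    # Scalars (bool, int, float, str) pair with single-element vectors.
    if isinstance(value, (bool, int, float, str)):
        return "single-element vector"
    if isinstance(value, list):
        return "multi-element vector"
    if isinstance(value, tuple):
        return "list of multiple types"
    if isinstance(value, dict):
        return "named list"
    if callable(value):
        return "function"
    return "not covered by this sketch"


# Print the pairing for one example value of each kind.
for example in (1, "foo", [1.0, 2.0, 3.0], (1, True, "foo"), {"a": 1}, None):
    print(repr(example), "->", r_type_for(example))
```

The real conversion rules have more cases than this, so treat the function as a reading aid for the table rather than a description of reticulate's behaviour.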

We can override this behaviour by specifying convert = FALSE:

library(reticulate)
pd = import("pandas", convert = FALSE)
titanic = pd$read_csv("https://goo.gl/4Gqsnz")
description = titanic$describe()

This should work! But what if you want to jump between objects in Python and R?

Reticulate allows us to convert a Python object to an R object and vice versa using py_to_r() and r_to_py():

description_r = py_to_r(description)
description_py = r_to_py(description_r)

Challenge 2: Repeat Challenge 1 using matplotlib directly from R, but this time save the figure directly from matplotlib (savefig(filename)). Warning: matplotlib.pyplot.show() doesn’t work yet in RStudio if run directly from R (by importing the module).

3. Sourcing Python scripts

Let's create a Python script “functions.py” with the following code:

def add(x, y):
  return x + y

Now we can load this function with source_python() and call it from R:

source_python('functions.py')
add(5, 10)
## [1] 15
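
The description of source_python() earlier says that objects as well as functions defined in a script become available to R. A slightly extended functions.py might look like the sketch below. The GREETING constant and the docstring are illustrative additions, not part of the original script.

```python
# functions.py (illustrative extension of the script above)

# A module-level object; GREETING is a hypothetical name used for illustration.
GREETING = "hello from Python"


def add(x, y):
    """Return the sum of x and y."""
    return x + y
```

After source_python('functions.py'), both add and GREETING would be expected to be usable from R, following that description.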

Challenge 3: Write a function in Python that calculates the average of two numbers, and source it.

4. Python interactive session (REPL)

You can work with Python interactively using the repl_python() function. This will only work in the console.

As before, you can access R objects with the r. prefix. Afterwards, back in R, you can access Python objects created in the REPL with py$.

When you are done, type exit.

repl_python()
autosPy=r.autos
autosPy.describe()
exit

Challenge 4: Create a string in the REPL console, s="I am a Python Variable", and save it as an R object from the R console.

Reticulate and be Free of R-strictions!



Other options to interact between R and Python

1. rpy2 package

1.1 Use R from a Python console

This is a package that allows you to use R from a Python console; it is essentially the reverse of the reticulate package in R.

This package has a lot of functionality, but it is not as straightforward as reticulate.

Refer to the rpy2 documentation for details and a simple example.

1.2. Use R from IPython and Jupyter notebooks

The rpy2 package allows you to run R code directly from IPython and Jupyter notebooks. It allows you to run chunks of R code as well as execute inline R expressions. However, you need to specify each time which Python variables will be used as input to the R chunk and which ones will be exported from R to Python.

Refer to the rpy2 documentation on IPython magic integration for details and a simple example.