We are going to use this document to run reticulate (https://github.com/rstudio/reticulate), which allows us to interact with Python from RStudio!
This material is based on the amazing reticulate webpage: https://rstudio.github.io/reticulate/index.html
Reticulate is quite recent and many bugs with RStudio have been solved in the last month, so I recommend installing the latest versions of everything we need.
For the current reticulate package to work out of the box, I recommend following these steps:
Install Anaconda Python 3.6.5
Install the latest version of R
Install the RStudio v1.2 preview release. The new version of RStudio fixes some bugs.
Now you are ready to install the reticulate package:
install.packages("reticulate")
Once it is installed, we need to load the package:
library(reticulate)
You may have an older Python version installed, and reticulate may be pointing to it (remember that you need a recent version of Python, e.g. Python 3.6.5). Let's check this:
py_discover_config()
If the Python version is not the one from the relevant Anaconda environment, change the environment by typing:
use_condaenv("relevantenv")
py_discover_config()
Once you run use_condaenv("relevantenv"), Python is loaded. This means that you need to restart R if you want to change the Python version. The same is true whenever you type py_config() instead of py_discover_config().
If changing the conda environment doesn't work, try setting the path directly using:
use_python("Path...", required=TRUE)
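If you are unsure which path to pass to use_python(), one option is to ask the interpreter you want for its own location. The sketch below is an illustration, not part of reticulate: the helper name describe_interpreter is hypothetical, and it assumes you run it with the Python from the environment you intend to use.

```python
import sys

def describe_interpreter():
    """Return the path and version of the running Python interpreter."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return {"python": sys.executable, "version": version}

info = describe_interpreter()
print(info["python"])   # candidate path for use_python()
print(info["version"])  # major.minor.patch of this interpreter
```

The printed path is the kind of value that goes in place of the "Path..." placeholder above.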
We can have a final check of the Python version being used. It should be 3.6…
py_config()
## python: /Users/jfranco1/anaconda/envs/r-reticulate/bin/python
## libpython: /Users/jfranco1/anaconda/envs/r-reticulate/lib/libpython3.6m.dylib
## pythonhome: /Users/jfranco1/anaconda/envs/r-reticulate:/Users/jfranco1/anaconda/envs/r-reticulate
## version: 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 13:44:09) [GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)]
## numpy: /Users/jfranco1/anaconda/envs/r-reticulate/lib/python3.6/site-packages/numpy
## numpy_version: 1.14.3
##
## python versions found:
## /Users/jfranco1/anaconda/envs/r-reticulate/bin/python
## /usr/bin/python
## /usr/local/bin/python3
## /Users/jfranco1/anaconda/bin/python
## /Users/jfranco1/anaconda/envs/general/bin/python
You can easily install packages directly through reticulate. Let's try installing a useful package (this will be installed to the "r-reticulate" environment).
If you are using an environment that already has the packages installed, you can skip this step or install the package directly in your environment. You can even manage your conda environment directly from R: https://rstudio.github.io/reticulate/articles/python_packages.html
py_install("pandas")
##
## Installation complete.
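To confirm from the Python side that a package can be found in the active environment, a minimal sketch could look like the following. The helper name is_importable is hypothetical, and the module names checked are only examples.

```python
import importlib.util

def is_importable(module_name):
    """Return True if Python can locate the module, without importing it."""
    return importlib.util.find_spec(module_name) is not None

# Report which of the example packages this interpreter can see.
for name in ("pandas", "numpy"):
    status = "available" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

If a package shows as missing, the interpreter in use is probably not the environment the package was installed into.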
We are now all set to start running some Python code! There are four ways to interact with Python using this package:
1. Python in R Markdown: Supports communication between R and Python (R chunks can access Python objects and vice versa).
2. Importing Python modules: The import() function enables you to import any Python module and call its functions directly from R.
3. Sourcing Python scripts: The source_python() function enables you to source a Python script the same way you would source() an R script (Python functions and objects defined within the script become directly available to the R session).
4. Python interactive session: The repl_python() function creates an interactive Python console within R. Objects you create within Python are available to your R session (and vice versa).
With the first option (Python in R Markdown) you can communicate between Python and R while generating great documents that contain your whole data analysis pipeline!
Warning: Communication between R and Python chunks (the pieces of code in an R Markdown document) is supported only from the RStudio v1.2 preview release onward. In earlier versions it works only when you knit the document, not when you run the document chunk by chunk.
Let's look at an example from the reticulate documentation. The first thing you need to do is create an R Markdown document and insert an R chunk: Insert (top right of the source pane) > R. Type here all the preliminaries we discussed so far (no need to install the package again). Try it out.
Let's load our data set in the R chunk:
#R
autos = cars
Once this is done we are ready to create a Python chunk! Go to Insert (top right of the source pane) > Python.
Access objects created within R chunks from Python using the r object, for example r.autos. This will access the R data frame defined earlier.
#Python
import pandas
autos_py = r.autos
autos_py['time']=autos_py['dist']/autos_py['speed']
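The column division in the chunk above is applied row by row. As a dependency-free sketch of the same computation, the example below spells it out with plain Python records. The rows are illustrative stand-ins for r.autos, and the helper name add_time is hypothetical.

```python
# Illustrative rows only: speed/dist values standing in for r.autos.
rows = [
    {"speed": 4.0, "dist": 2.0},
    {"speed": 4.0, "dist": 10.0},
    {"speed": 7.0, "dist": 4.0},
]

def add_time(records):
    """Return new records with a 'time' field equal to dist / speed."""
    return [dict(rec, time=rec["dist"] / rec["speed"]) for rec in records]

with_time = add_time(rows)
for rec in with_time:
    print(rec)
```

With pandas the same result comes from a single vectorized expression, which is why the chunk above needs no explicit loop.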
Access objects created within Python chunks from R using py$autos_py. This will access the Pandas dataframe defined in Python.
#R
plot(py$autos_py)
Challenge 1: Install matplotlib (Python) and gamclass (R). Save the loti data set and plot the yearly average temperature anomalies using matplotlib. Tip: After installing matplotlib you need to restart the R session.
You can use the import() function to import any Python module and call it from R. Functions and other data within Python modules and classes can be accessed via the $ operator.
Let’s import Pandas to R:
library(reticulate)
pandas = import("pandas")
titanic = pandas$read_csv("https://goo.gl/4Gqsnz")
titanic$describe()
Did you get an error? What is wrong?
Reticulate automatically converts Python objects to their R equivalents (and R objects to Python when calling into Python):
R | Python | Examples |
---|---|---|
Single-element vector | Scalar | 1, 1L, TRUE, "foo" |
Multi-element vector | List | c(1.0, 2.0, 3.0), c(1L, 2L, 3L) |
List of multiple types | Tuple | list(1L, TRUE, "foo") |
Named list | Dict | list(a = 1L, b = 2.0), dict(x = x_data) |
Matrix/Array | NumPy ndarray | matrix(c(1,2,3,4), nrow = 2, ncol = 2) |
Data Frame | Pandas DataFrame | data.frame(x = c(1,2,3), y = c("a", "b", "c")) |
Function | Python function | function(x) x + 1 |
NULL, TRUE, FALSE | None, True, False | NULL, TRUE, FALSE |
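To make the Python column of the table concrete, here is a small sketch of the Python-side values and their type names. The variable names are illustrative, and the NumPy and Pandas rows are left out on purpose so that the snippet has no dependencies.

```python
# Python-side counterparts for some rows of the conversion table.
# NumPy ndarray and Pandas DataFrame rows are omitted (no dependencies).
examples = {
    "Single-element vector": 1,
    "Multi-element vector": [1.0, 2.0, 3.0],
    "List of multiple types": (1, True, "foo"),
    "Named list": {"a": 1, "b": 2.0},
    "Function": lambda x: x + 1,
    "NULL": None,
}

# Map each table label to the name of the Python type that holds it.
type_names = {label: type(value).__name__ for label, value in examples.items()}
for label, name in type_names.items():
    print(f"{label:24} -> {name}")
```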
We can override this automatic conversion by specifying convert=FALSE:
library(reticulate)
pd = import("pandas", convert = FALSE)
titanic = pd$read_csv("https://goo.gl/4Gqsnz")
description = titanic$describe()
This should work! But what if you want to jump between objects in Python and R?
Reticulate allows us to transform a Python object to an R object and vice versa using py_to_r() and r_to_py():
description_r = py_to_r(description)
description_py = r_to_py(description_r)
Challenge 2: Repeat challenge 1 using matplotlib directly from R, but this time save the figure directly from matplotlib (savefig(filename)). Warning: matplotlib.pyplot.show() doesn't work yet in RStudio if run directly from R (importing a module).
Let's create a Python script "functions.py" with the following code:
def add(x, y):
    return x + y
Now we can load this function with source_python() and call it from R:
source_python('functions.py')
add(5, 10)
## [1] 15
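Before sourcing a script from R, it can help to check its functions in plain Python first. The sketch below repeats the add function with a docstring and a quick check. The check lines are an addition for illustration only: because sourcing runs the whole script, they would also run from R, so you would probably remove them afterwards.

```python
def add(x, y):
    """Return the sum of x and y."""
    return x + y

# Quick check in plain Python before sourcing the file from R.
result = add(5, 10)
print(result)
```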
Challenge 3: Write a function in Python that calculates the average of two numbers, and source it from R.
You can work with Python interactively using the repl_python() function. This will only work in the console. You can access R objects as before using r., and afterwards you can access Python objects created in the REPL from R using py$. When you are done, type exit.
repl_python()
autosPy=r.autos
autosPy.describe()
exit
Challenge 4: Generate a string in the REPL console, s = "I am a Python Variable", and save it as an R object using the R console.
rpy2 is a package that allows you to use R from a Python console; it is essentially the reverse of the reticulate package in R.
This package has a lot of functionality, yet it is not as straightforward as reticulate.
You can find here the documentation. Click here for a simple example.
The rpy2 package allows you to run R code directly from IPython and Jupyter notebooks. It allows you to run chunks of R code as well as execute inline R expressions. However, each time you need to specify which Python variables will be used as input to the R chunk and which ones will be exported from R to Python.
You can find the documentation on IPython magic integration here. Click here for a simple example.