rpy2#

rpy2 is a mature wrapper package that enables Python to call and execute R code seamlessly. I will discuss three main ways in which rpy2 makes this possible.

Installation#

Use conda with the conda-forge channel to install R and RStudio in a local or remote conda environment. Bioconda can work too, depending on the use case. Launch R from bash to check that the correct version of R is installed.
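The same version check can also be scripted from Python, which is handy on a remote environment. The snippet below is a minimal sketch, not part of rpy2: the helper name `r_version` is made up for illustration, and it assumes that the `R` executable, if installed, is on the `PATH` of the active environment.

```python
import shutil
import subprocess


def r_version():
    """Return the first line of `R --version`, or None if R is not on PATH."""
    r_path = shutil.which("R")  # locate the R executable in the active environment
    if r_path is None:
        return None
    result = subprocess.run(
        [r_path, "--version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


print(r_version())
```

If this prints `None`, R is probably not installed in the environment that Python is running from.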

Ipython#

First, install rpy2 using

pip install rpy2

Once installed, load the rpy2 extension in a Jupyter notebook. The extension provides Rmagic, which allows users to run both R and Python in the same notebook (as opposed to separate notebooks for R and Python using IRkernel and the Python kernel).

%load_ext rpy2.ipython

To run R code, use %R for a single line or %%R for a whole cell.

%R b <- c(1,2,3)

%%R
a <- c(4,5,6)

%R creates a separate R environment that runs the R commands in the line or cell natively. The notebook displays any figures produced by the R code without needing any auxiliary code.

Another important feature of Rmagic is its set of utility functions (e.g. %Rpush, %Rpull), which let Python and R exchange variables.

import numpy as np

X = np.array([4.5,6.3,7.9])
X.mean()
6.2333333333333343
%Rpush X
%R mean(X)
array([ 6.23333333])

While these features significantly improve interactive REPL coding in a Jupyter/IPython environment, they have drawbacks: the code is written in two languages, and Python itself cannot parse or execute the R code.

Scripts#

Instead of using Rmagic and mixing two different syntaxes in one IDE, use the full rpy2 wrapper library to wrap R functions and variables as Python objects. Adapting R code to the rpy2 ecosystem can be more tedious than using Rmagic, but it scales better.
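As a rough illustration of the script-based approach, the sketch below computes the same mean as in the Rmagic example, this time through `rpy2.robjects`. The function name `mean_via_r` and the pure-Python fallback are assumptions added so that the snippet still runs on a machine without rpy2 or R; only the `FloatVector` conversion and the `ro.r[...]` lookup are rpy2 itself.

```python
import statistics

try:
    import rpy2.robjects as ro  # needs both rpy2 and a working R installation
except Exception:  # ImportError, or R could not be located at import time
    ro = None


def mean_via_r(values):
    """Return the mean of `values`, using R's mean() when rpy2 is usable."""
    if ro is None:
        # Fallback for illustration only, so the sketch runs without R.
        return statistics.fmean(values)
    r_vector = ro.FloatVector(values)  # Python list -> R numeric vector
    r_mean = ro.r["mean"]              # look up R's mean() as a Python callable
    return r_mean(r_vector)[0]         # R returns a length-1 vector


print(mean_via_r([4.5, 6.3, 7.9]))
```

Here the R function is an ordinary Python object, so it can be passed around, wrapped, and tested like any other callable. That is the main gain over the magic commands.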

Atom hydrogen#

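Hydrogen, a package for the Atom editor, runs code from a script file interactively through Jupyter kernels. This lets us combine the best of both worlds: the interactivity of a notebook and the structure of a script.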

How to plot#


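from IPython.display import Image
from rpy2.robjects.lib import grdevices

# Render the R plot into an in-memory PNG buffer instead of a file.
with grdevices.render_to_bytesio(grdevices.png) as b:
    ...  # call your R plotting function here
data = b.getvalue()
Image(data=data, format='png', embed=True)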

Another method uses a custom context manager, so that the code layout mimics R code (open a device, plot, close the device). This manager also saves the figure to an intermediate file, test.png.

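from pathlib import Path
from rpy2.robjects.lib import grdevices

class rimg(object):
    """Open an R graphics device for `file` and close it when the block exits."""
    def __init__(self, file='test.png', width=1200, height=1200):
        self.file = file
        # The file extension selects the R device function (e.g. png, pdf).
        extension = Path(self.file).suffix.replace('.', '')
        if extension == 'pdf':
            print('Generating PDF, ignoring width and height')
            getattr(grdevices, extension)(file=self.file)
        else:
            getattr(grdevices, extension)(file=self.file, width=width, height=height)
    def __enter__(self):
        return self
    def __exit__(self, type, value, tr):
        # Close the device so that R writes the figure to disk.
        grdevices.dev_off()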
from IPython.display import Image

file = 'test.png'
with rimg(file):
    ...  # call your R plotting function here

Image(filename=file)

Installing Hilbert Curve package from Bioconductor#