Using R

Running R in Batch Mode Under SLURM

It's easy to run R in batch mode with the SLURM scheduler. To use version 3.4.2 of R compiled with the Intel 2018 compiler, reading input from a file named lognormal.r, your script would look something like this:

#!/bin/bash
#SBATCH -J R_Job
#SBATCH -p background
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --mem=1900m
#SBATCH -o R_job-%j.out
#SBATCH -e R_job-%j.err

echo "Starting at $(date) on $(hostname)"
stime=$(date +%s)

# Print the SLURM job ID.
echo "SLURM_JOBID=$SLURM_JOBID"

module load intel/18.0.0
module load bzip2 curl pcre xz zlib
module load R

# Run R with the input file lognormal.r
R --slave --quiet --file=lognormal.r

etime=$(date +%s)
echo "Ended at $(date) on $(hostname). Elapsed time $((etime - stime)) seconds."
exit 0

Multi-threaded R

R is capable of using multiple threads to achieve parallelism, in particular when calling LAPACK and BLAS routines. To enable multi-threading, define MKL_NUM_THREADS=N before loading the modules for R. The value of N should match the number of SLURM processors requested with the --ntasks (or -n) option. More threads do not always mean a faster run, so test your code to see how the number of threads affects the overall execution time of your application. Here's an example requesting 8 processors and using 8 threads:

#!/bin/bash
#SBATCH -J R_Job
#SBATCH -p background
#SBATCH --time=00:10:00
#SBATCH --ntasks=8
#SBATCH --mem=1900m
#SBATCH -o R_job-%j.out
#SBATCH -e R_job-%j.err

echo "Starting at $(date) on $(hostname)"
stime=$(date +%s)

# Print the SLURM job ID.
echo "SLURM_JOBID=$SLURM_JOBID"

export MKL_NUM_THREADS=8
module load intel/18.0.0
module load bzip2 curl pcre xz zlib
module load R

# Run R with the input file lognormal.r
R --slave --quiet --file=lognormal.r

etime=$(date +%s)
echo "Ended at $(date) on $(hostname). Elapsed time $((etime - stime)) seconds."
exit 0
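
To avoid keeping the thread count and the --ntasks request in sync by hand, the thread count can be taken from SLURM_NTASKS, which SLURM sets inside a job when --ntasks is given. The lines below are a minimal sketch that would replace the export MKL_NUM_THREADS=8 line, before the module load commands. The helper name thread_count is hypothetical, and falling back to one thread when the variable is unset (for example, when testing outside of a job) is an assumption.

```shell
# Hypothetical helper: print the number of threads to use.
# Uses SLURM_NTASKS when it is set and non-empty; otherwise falls back to 1.
thread_count() {
    echo "${SLURM_NTASKS:-1}"
}

# Set the MKL thread count before loading the R modules.
export MKL_NUM_THREADS=$(thread_count)
echo "MKL_NUM_THREADS=$MKL_NUM_THREADS"
```

With #SBATCH --ntasks=8 this should give the same setting as the hard-coded example above, and changing the request changes the thread count with it.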

NOTICE: R does not run in parallel across multiple nodes, so do not submit R jobs to the MPI partitions. You may submit R jobs to the serial, background, background-4g, and stakeholder (if you are a paying Brazos stakeholder) partitions only. R jobs found running in the MPI partitions will be terminated. If you can demonstrate that you have Rmpi functioning with your application, we will allow it to run in the MPI partitions, provided it does not leave processors idle.

Installing R Packages

If you wish to install additional packages for R, we recommend installing them to your home directory on Brazos. In the examples below $HOME/R/lib is used, but any path under $HOME will work.

First, create the library directory and set the R_LIBS_USER environment variable to point to it. For example:

mkdir -p $HOME/R/lib
export R_LIBS_USER=$HOME/R/lib

If you wish for the R_LIBS_USER environment variable to be set upon login, you can add the export line to your .bashrc file.
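
If the export line is appended by a script or by hand more than once, .bashrc ends up with duplicate entries. The following is a minimal sketch of an append that is safe to repeat. The helper name add_line_once is hypothetical and not part of any Brazos tooling.

```shell
# Hypothetical helper: append line $2 to file $1 unless an identical
# line is already present. Creates the file if it does not exist.
add_line_once() {
    # grep flags: -q quiet, -x match the whole line, -F fixed string (no regex).
    grep -qxF "$2" "$1" 2>/dev/null || echo "$2" >> "$1"
}
```

For example: add_line_once "$HOME/.bashrc" 'export R_LIBS_USER=$HOME/R/lib'. The single quotes keep $HOME unexpanded in the file, so it is evaluated at each login.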

Place the example code below in your $HOME/.Rprofile. It sets the CRAN download mirror so that you are not prompted to choose one each time you install a package.

message("Loading .Rprofile")
r <- getOption("repos")             # hard code the US repo for CRAN
r["CRAN"] <- "http://cran.us.r-project.org"
options(repos = r)
rm(r)

Now you can install a package. This example uses the intel/2013_sp1.3 and R/3.1.1 modules, but others are available. You can find the available R modules by executing module spider R.

module load intel/2013_sp1.3 R/3.1.1
R

Alternatively, to load R version 3.4.2 compiled with the Intel 2018 compiler and launch R:

module load intel/18.0.0
module load bzip2 curl pcre xz zlib
module load R
R

The following commands are executed from the R shell.

> install.packages("likelihood")
Installing package into ‘/home/treydock/R/lib’
(as ‘lib’ is unspecified)
trying URL 'http://cran.us.r-project.org/src/contrib/likelihood_1.7.tar.gz'
Content type 'application/x-gzip' length 37620 bytes (36 Kb)
opened URL
==================================================
downloaded 36 Kb

Loading .Rprofile
* installing *source* package ‘likelihood’ ...
** package ‘likelihood’ successfully unpacked and MD5 sums checked
** R
** data
** demo
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
Loading .Rprofile
* DONE (likelihood)

The downloaded source packages are in
	‘/tmp/RtmpBxRZte/downloaded_packages’
>

You can now verify the likelihood package was installed from the R shell.

> (.packages(all.available=TRUE))
 [1] "likelihood" "base"       "boot"       "class"      "cluster"
 [6] "codetools"  "compiler"   "datasets"   "foreign"    "graphics"
[11] "grDevices"  "grid"       "KernSmooth" "lattice"    "MASS"
[16] "Matrix"     "methods"    "mgcv"       "nlme"       "nnet"
[21] "parallel"   "rpart"      "spatial"    "splines"    "stats"
[26] "stats4"     "survival"   "tcltk"      "tools"      "utils"
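
The same check can be made from the login shell without starting R, since each installed package lives in its own subdirectory of the library and that subdirectory contains a DESCRIPTION file. This is a minimal sketch, assuming R_LIBS_USER is set as described earlier. The helper name r_pkg_installed is hypothetical.

```shell
# Hypothetical helper: succeed if package $1 is installed in library
# directory $2 (defaults to $R_LIBS_USER). A package counts as installed
# when <library>/<package>/DESCRIPTION exists.
r_pkg_installed() {
    [ -f "${2:-${R_LIBS_USER:-}}/$1/DESCRIPTION" ]
}

if r_pkg_installed likelihood; then
    echo "likelihood is installed"
else
    echo "likelihood was not found in the user library"
fi
```

This only looks in the user library. Packages that ship with the R module itself, such as base or MASS, live in R's own library directory and are not found by this check.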