R
On the HPC, R can be used on the command line, in batch scripts submitted via Slurm, or through RStudio, a web GUI available via Open OnDemand.
Modules
One version of R is installed for every year, and each is accessible through environment modules. To access a specific version of R, first load the year module and then the R version available for that year.
Extension bundle modules for R are also available. Their module files list the installed extensions, which makes the extensions searchable.
Searching for extensions
In order to search for a particular extension, use the module key command
module key [extensionName]
For instance, when searching for the terra extension,
module load 2023
module key terra
The following output is then printed
-----------------------------------------------------------------------------------------------------------
The following modules match your search criteria: "terra"
-----------------------------------------------------------------------------------------------------------
  R-bundle-CRAN: R-bundle-CRAN/2023.12-foss-2023a
    Bundle of R packages from CRAN
-----------------------------------------------------------------------------------------------------------
This indicates that the terra extension is contained in the R-bundle-CRAN/2023.12-foss-2023a module. Loading this bundle also loads the corresponding R version for that year and adds the extensions to it.
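When the search result is needed inside a script, the full module name can be pulled out of the output. The following is a minimal sketch, assuming the output keeps the "name: name/version" layout shown above. The sample text is pasted into a variable so the snippet runs without the module command; on the cluster one would capture the real output instead (Lmod typically writes it to standard error, so a 2>&1 redirect is likely needed).

```shell
# Sample of the search output shown above, stored in a variable for illustration.
search_output='The following modules match your search criteria: "terra"
  R-bundle-CRAN: R-bundle-CRAN/2023.12-foss-2023a
    Bundle of R packages from CRAN'

# Keep lines whose first field ends in ":" and whose second field
# contains a "/", then print that second field (the full module name).
matching_modules=$(printf '%s\n' "$search_output" |
    awk '$1 ~ /:$/ && index($2, "/") > 0 { print $2 }')

echo "$matching_modules"
# On the cluster, the result could then be passed to: module load "$matching_modules"
```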
User Local Library
R allows users to create their own local library, where they can install their own packages. These extension libraries are stored in a folder of your choice. It is handy to keep track of the version of R used for each library, so try to keep your folders organized, for example by including the R version in the folder name.
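One way to keep library folders organized is to derive the folder name from the R version. The sketch below follows the naming used in the examples on this page (R 4.3.2 becomes .R_432_ext); the helper name r_lib_dir is hypothetical and the convention itself is only a suggestion.

```shell
# Build a library folder name that encodes the R version,
# e.g. version 4.3.2 -> <home>/.R_432_ext
r_lib_dir() {
    version="$1"                 # R version, e.g. 4.3.2
    home_dir="${2:-$HOME}"       # base folder, defaults to the user's home
    compact=$(printf '%s' "$version" | tr -d '.')   # 4.3.2 -> 432
    printf '%s/.R_%s_ext\n' "$home_dir" "$compact"
}

r_lib_dir 4.3.2 /home/WUR/user001
# prints /home/WUR/user001/.R_432_ext
```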
The first step is to load a version of R:
module load 2023
module load R/4.3.2
R
The commands above load the 2023 year module, load the R/4.3.2 module, and start the interactive R runtime.
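In a script, it can be useful to confirm that loading the modules actually put R on the PATH before continuing. A minimal sketch; the helper name require_rscript is hypothetical:

```shell
# Return 0 if Rscript can be found on the PATH, otherwise print a hint and return 1.
require_rscript() {
    if command -v Rscript >/dev/null 2>&1; then
        return 0
    fi
    echo "Rscript not found: load an R or R-bundle module first" >&2
    return 1
}

if require_rscript; then
    echo "Rscript found at: $(command -v Rscript)"
fi
```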
Once inside the runtime (or a Jupyter notebook running an R kernel), you can check your library paths with
.libPaths()
Creating a new library
Set the path to your new library in the variable new_library:
new_library='/home/WUR/user001/.R_432_ext/'
Create the library folder:
dir.create(file.path(new_library), showWarnings = TRUE)
If the folder already exists, R will display a warning.
Once created, you can add the new folder to the front of your library paths, so that it is searched first:
.libPaths(c(new_library, .libPaths()))
You can check whether the operation was successful by running
.libPaths()
Installing New Extensions
To install a new package, use install.packages as in the example below.
install.packages("ggplot2", repos = "https://cran.r-project.org", lib = "/home/WUR/user001/.R_432_ext/")
In this example, we install the ggplot2 extension. The repos argument specifies the repository the package is downloaded from, which in this case is the R project's CRAN website. The lib argument specifies the library folder in which the package is installed.
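A path added with .libPaths() only lasts for the current R session. One way to make the same library available to later sessions and to batch jobs is the R_LIBS_USER environment variable, which R reads at startup. A minimal sketch, assuming the library folder from the example above lives under your home directory:

```shell
# Library folder from the example above; adjust to your own path.
lib_dir="$HOME/.R_432_ext"

# R only adds R_LIBS_USER to its library paths if the folder exists.
mkdir -p "$lib_dir"

# Export it so that R and Rscript started from this shell can see it.
export R_LIBS_USER="$lib_dir"
echo "R_LIBS_USER=$R_LIBS_USER"
```

Placing the export line in a Slurm job script, after the module load lines, should make the library visible to the R script run by that job.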
Submitting Slurm jobs
A Slurm job script uses bash as its interpreter (note the #!/bin/bash on the first line), so it cannot execute R code directly. Its job is to allocate resources on the cluster, load modules, and execute any other bash commands your job needs.
Thus, to launch an R job via Slurm, you need two scripts: an R script and a Slurm (bash) script.
Here is an example of a very simple R script that lists all installed extensions.
#!/usr/bin/env Rscript
installed.packages()[,1]
Let's call this script list_ext.r and place it in ~/myRScripts. The first line points to the R interpreter, Rscript. For Rscript to be found, the R (or R-bundle) module must have been loaded beforehand; this is done in the sbatch script.
Important: Make sure that the R script, in this example list_ext.r, is executable.
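The R script can be created and made executable from the shell. A minimal sketch, using the folder and file name from the example above (note that it overwrites an existing file of the same name):

```shell
# Folder and file name follow the example on this page.
script_dir="$HOME/myRScripts"
script="$script_dir/list_ext.r"

mkdir -p "$script_dir"

# Write the two-line R script (overwrites an existing file of the same name).
cat > "$script" <<'EOF'
#!/usr/bin/env Rscript
installed.packages()[,1]
EOF

# Give the owner execute permission, then confirm it.
chmod u+x "$script"
if [ -x "$script" ]; then
    echo "$script is executable"
fi
```

Alternatively, calling Rscript ~/myRScripts/list_ext.r in the job script runs the file without needing the executable bit.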
The Slurm (sbatch) script then looks as follows:
#!/bin/bash
#SBATCH --comment="List R extensions"
#SBATCH --time=0-0:10:00 # 10 minutes
#SBATCH --mem=1G
#SBATCH --ntasks=1
#SBATCH --output=output_%j.txt
#SBATCH --error=error_output_%j.txt
#SBATCH --job-name=r_job.sh
module load 2023
module load R-bundle-CRAN/2023.12-foss-2023a
#execute the R-script
~/myRScripts/list_ext.r
Since this is just an illustrative job, we have only allocated 1 GB of RAM and a single CPU. We can store the text above under the name r_job.sh.
We can then submit the job with the command
sbatch r_job.sh
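After submission, sbatch normally prints a confirmation line containing the job ID, and the %j in the --output and --error options is replaced by that ID. The sketch below shows how the file names can be derived from the confirmation line. The job ID 12345 is an illustrative value, not real output, and the helper name job_id_from_sbatch is hypothetical.

```shell
# Extract the job ID (the last word) from sbatch's confirmation line,
# e.g. "Submitted batch job 12345" -> "12345".
job_id_from_sbatch() {
    line="$1"
    echo "${line##* }"
}

# On the cluster this would be: submission=$(sbatch r_job.sh)
submission="Submitted batch job 12345"   # illustrative value

job_id=$(job_id_from_sbatch "$submission")
echo "stdout file: output_${job_id}.txt"
echo "stderr file: error_output_${job_id}.txt"

# While the job is queued or running, it can be listed with: squeue -u "$USER"
```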