R: Difference between revisions

From HPCwiki
Revision as of 12:21, 5 August 2024


On the HPC, R can be used from the command line, in batch scripts submitted via Slurm, or through a web GUI (RStudio) in Open OnDemand.

Modules

One version of R is installed for every year. These versions are accessible through environment modules. To use a specific version of R, first load the year module and then load the R version available for that year.

Extension bundle modules for R are also available. Their module files list the extensions they contain, so the bundles are searchable.

Searching for extensions

To search for a particular extension, use the module key command:

module key [extensionName]

For instance, when searching for the terra extension,

module load 2023
module key terra

The following output is then printed:

-----------------------------------------------------------------------------------------------------------
The following modules match your search criteria: "terra"
-----------------------------------------------------------------------------------------------------------

  R-bundle-CRAN: R-bundle-CRAN/2023.12-foss-2023a
    Bundle of R packages from CRAN

-----------------------------------------------------------------------------------------------------------

This indicates that the terra extension is contained in the R-bundle-CRAN/2023.12-foss-2023a module. This bundle loads the corresponding R version for that year and adds extensions to it.
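
When the search is part of a script, the module name can be pulled out of that output. The following is a minimal sketch, not an official tool: the helper name modules_from_key_output is hypothetical, and it assumes the output layout shown above (an indented "name: module/version" line per match).

```shell
#!/bin/bash
# Hypothetical helper: read `module key` output on stdin and print the part
# after "name: " for indented lines such as
#   R-bundle-CRAN: R-bundle-CRAN/2023.12-foss-2023a
# Description lines ("    Bundle of R packages from CRAN") and the header
# line do not match the pattern and are skipped.
modules_from_key_output() {
    awk -F': ' '/^[[:space:]]+[^[:space:]]+: / { print $2 }'
}

# Usage on the cluster. Assumption: the module system prints its search
# results to stderr (as Lmod usually does), hence the 2>&1.
#   module load 2023
#   module key terra 2>&1 | modules_from_key_output
```

The printed name can then be passed directly to module load.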

User Local Library

R allows users to create their own local environment, in which they can install their own packages. These extension libraries are stored in a local folder. Because installed packages are tied to the R version they were built with, it is handy to keep track of that version, for example by including it in the folder name.
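
One way to keep library folders organized per R version is to derive the folder name from the version string. This is only a sketch: the helper r_lib_dir is hypothetical, and the ".R_<digits>_ext" pattern simply mirrors the folder name used further down this page.

```shell
#!/bin/bash
# Hypothetical helper: build a per-version library folder name.
# Example: version 4.3.2 gives $HOME/.R_432_ext
r_lib_dir() {
    local version="$1"                                # e.g. 4.3.2
    # ${version//./} removes every dot from the version string
    printf '%s/.R_%s_ext\n' "$HOME" "${version//./}"
}

# Print the folder that would be used for R 4.3.2
r_lib_dir 4.3.2
```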

The first step is to load a version of R:

module load 2023
module load R/4.3.2
R


The commands above load the 2023 year module, load the R/4.3.2 module and start the interactive R runtime. Once inside the runtime (or a Jupyter notebook running an R kernel), you can check your library paths with

.libPaths()

Creating a new library

Set the path to your new library in the variable new_library:

new_library='/home/WUR/user001/.R_432_ext/'

Create the library folder:

dir.create(file.path(new_library), showWarnings = TRUE) 

If the folder already exists, R will display a warning.

Once created, you can add the new folder to the front of your library paths:

.libPaths(c(new_library, .libPaths()))

You can check that the operation was successful by running

.libPaths()
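
A path added with .libPaths() only applies to the current R session. One possible way to have the folder picked up in every session is the R_LIBS_USER environment variable, which R reads at startup. The sketch below assumes the folder name from the example above; R only adds the folder to its library paths if it exists, so the folder is created first.

```shell
#!/bin/bash
# Assumption: the personal library lives in ~/.R_432_ext, as in the example above.
export R_LIBS_USER="$HOME/.R_432_ext"

# R ignores R_LIBS_USER entries that do not exist, so create the folder.
mkdir -p "$R_LIBS_USER"

echo "R_LIBS_USER is set to $R_LIBS_USER"
```

These lines can be placed in a job script before R is started, or in a shell startup file such as ~/.bashrc.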


Installing New Extensions

To install a new package, use install.packages as in the example below.

install.packages("ggplot2", repos="https://cran.r-project.org", lib=new_library)


In this example, we are installing the ggplot2 extension. The repos argument specifies the repository the package is downloaded from, which in this case is the r-project website. The lib argument specifies the library folder in which the package is installed, here the folder stored in new_library.
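
The same installation can also be started from the shell without opening an interactive R session. The following is a sketch under assumptions: the helper r_install_expr is hypothetical and only builds the R expression as text, and the library folder is assumed to be ~/.R_432_ext.

```shell
#!/bin/bash
# Hypothetical helper: build the install.packages() call for a package
# and a target library folder, and print it as one line of R code.
r_install_expr() {
    local pkg="$1" lib="$2"
    printf 'install.packages("%s", repos="https://cran.r-project.org", lib="%s")\n' "$pkg" "$lib"
}

# Show the expression that would be run for ggplot2
r_install_expr ggplot2 "$HOME/.R_432_ext"

# Usage on the cluster, after an R module has been loaded:
#   module load 2023
#   module load R/4.3.2
#   Rscript -e "$(r_install_expr ggplot2 "$HOME/.R_432_ext")"
```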

Submitting Slurm jobs

Slurm job scripts use bash as an interpreter (note the #!/bin/bash on the first line), so they cannot execute R code directly. Their job is to allocate resources in the cluster, load modules and execute any other bash command you need in your job.

Thus, to launch an R job via Slurm, you need two scripts: an R script and a Slurm (bash) script.


Here is an example of a very simple R script that will list all installed extensions.

#!/usr/bin/env Rscript
 
installed.packages()[,1]

Let's call this script list_ext.r and place it in ~/myRScripts. Note that the first line points to the R interpreter, Rscript. For Rscript to be found, the R (or R-bundle) module must have been loaded beforehand. This is done in the sbatch script.

Important: Make sure that the R script, in this example list_ext.r, is executable.
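
As a sketch of that step, the commands below write the R script and set its executable bit with chmod +x. The variable script_dir is introduced here for illustration only; it defaults to a temporary folder so the example can be tried anywhere, while on the cluster it would be ~/myRScripts.

```shell
#!/bin/bash
# Assumption: SCRIPT_DIR may be set to ~/myRScripts; otherwise a temporary
# folder is used so that the example has no side effects.
script_dir="${SCRIPT_DIR:-$(mktemp -d)}"
mkdir -p "$script_dir"

# Write the R script from the example above
cat > "$script_dir/list_ext.r" <<'EOF'
#!/usr/bin/env Rscript

installed.packages()[,1]
EOF

# Make the script executable so that the job can call it directly
chmod +x "$script_dir/list_ext.r"
ls -l "$script_dir/list_ext.r"
```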

#!/bin/bash
#SBATCH --comment="List R extensions" 
#SBATCH --time=0-0:10:00 # 10 minutes
#SBATCH --mem=1G
#SBATCH --ntasks=1
#SBATCH --output=output_%j.txt
#SBATCH --error=error_output_%j.txt
#SBATCH --job-name=
 
module load 2023
module load R-bundle-CRAN/2023.12-foss-2023a
 
#execute the R-script
~/myRScripts/list_ext.r
 

Since this is just an illustrative job, we have only allocated 1 GB of RAM and a single CPU.
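
Submitting the job and finding its output could look like the sketch below. The file name run_list_ext.sh for the sbatch script and the helper job_id_from_sbatch are assumptions made for illustration; the helper relies on the "Submitted batch job <id>" line that sbatch prints on success.

```shell
#!/bin/bash
# Hypothetical helper: extract the job ID from sbatch's confirmation line,
# for example "Submitted batch job 12345" gives 12345.
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Usage on the cluster (run_list_ext.sh is an assumed file name):
#   job_id=$(sbatch run_list_ext.sh | job_id_from_sbatch)
#   squeue -j "$job_id"              # check whether the job is pending or running
#   cat "output_${job_id}.txt"       # matches --output=output_%j.txt in the script
```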

  • RStudio in Open OnDemand