R: Difference between revisions

From HPCwiki
Revision as of 12:21, 5 August 2024


On the HPC, R can be used on the command line with batch scripts submitted via Slurm, or through a web GUI, RStudio, in Open OnDemand.

Modules

One version of R is installed for every year. These are accessible through environment modules. Thus in order to access a specific version of R one must first load the year module, followed by the available R version for that year.

Additionally, extension bundle modules for R are available. These modules list the extensions they contain in their module files and are thus searchable.

Searching for extensions

In order to search for a particular extension, use the module key command

module key [extensionName]

For instance, when searching for the terra extension,

module load 2023
module key terra

The following output is then printed

-----------------------------------------------------------------------------------------------------------
The following modules match your search criteria: "terra"
-----------------------------------------------------------------------------------------------------------

  R-bundle-CRAN: R-bundle-CRAN/2023.12-foss-2023a
    Bundle of R packages from CRAN

-----------------------------------------------------------------------------------------------------------

This indicates that the terra extension is contained in the R-bundle-CRAN/2023.12-foss-2023a module. This bundle loads the corresponding R version for that year and adds extensions to it.

Installing your own R extensions

R lets you create your own local library, where you can install packages yourself and store them in a folder of your choice. It is handy to keep track of the R version the packages were installed with, so keep your folders organized, for example by including the R version in the folder name.

The first step is to load a version of R

module load 2023
module load R/4.3.2
R


The commands above load the 2023 bucket, load the R/4.3.2 module and start the interactive R runtime. Once inside the runtime (or a Jupyter notebook running an R kernel), you can check your library paths with

.libPaths()

Creating a new library

Set the path to your new library in the variable new_library

new_library='/home/WUR/user001/.R_432_ext/'

Create the library folder

dir.create(file.path(new_library), showWarnings = TRUE) 

If the folder already exists, R will display a warning.

Once created, you can add the new folder to the front of your library paths, so that R searches it first

.libPaths(c(new_library, .libPaths()))

You can check whether the operation was successful by running

.libPaths()
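
As an optional complement to calling .libPaths() in every session, R also reads the R_LIBS_USER environment variable at startup and adds that directory to the library paths, provided the directory exists. The shell sketch below assumes the folder naming used above, placed directly under $HOME; the variable names r_version_tag and new_library are illustrative and not part of the cluster setup.

```shell
# Illustrative names: adjust the tag to the R version you loaded.
r_version_tag="432"
new_library="${HOME}/.R_${r_version_tag}_ext"

# Create the folder if it is missing (no error if it already exists).
mkdir -p "${new_library}"

# R adds this directory to .libPaths() at startup, but only if it exists.
export R_LIBS_USER="${new_library}"
echo "R_LIBS_USER=${R_LIBS_USER}"
```

If you use this approach, the export line has to be run in every shell (or job script) before R is started, after the modules are loaded.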


Installing New Extensions

In order to install a new package, use install.packages as in the example below.

install.packages("ggplot2", repos="https://cran.r-project.org", lib=new_library)


In this example, we are installing the ggplot2 extension. The repos argument specifies the repository the package is downloaded from, in this case the CRAN site of the R project. The lib argument specifies the library folder the package is installed into, here the one created above.

Submitting Slurm jobs

A Slurm job script uses bash as its interpreter (note the #!/bin/bash on the first line), so it cannot execute R code directly. Its job is to allocate resources in the cluster, load modules and execute any other bash command you need in your job.

Thus in order to launch an R job via slurm, one needs to have two scripts: an R script and a slurm (bash) script.


Here is an example of a very simple R script that will list all installed extensions.

#!/usr/bin/env Rscript
 
installed.packages()[,1]

Let's call this script list_ext.r and place it in ~/myRScripts. Note that the first line is used to point to the R interpreter, Rscript. In order to access it, the R (or R-bundle) module must have been loaded beforehand. This is done in the sbatch script.

Important: Make sure that the R script, in this example list_ext.r, is executable.
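
A minimal shell sketch of this step, assuming the script name and folder mentioned above. It writes the two-line R script to disk (skip that part if you have already created the file) and then sets the executable bit for the owner.

```shell
script_dir="${HOME}/myRScripts"
script="${script_dir}/list_ext.r"

mkdir -p "${script_dir}"

# Write the example R script; this overwrites an existing file of that name.
cat > "${script}" <<'EOF'
#!/usr/bin/env Rscript

installed.packages()[,1]
EOF

# Make the script executable for its owner and show the resulting permissions.
chmod u+x "${script}"
ls -l "${script}"
```

Without the executable bit, calling the script by its path from the job script fails with a "Permission denied" error.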

#!/bin/bash
#SBATCH --comment="List R extensions" 
#SBATCH --time=0-0:10:00 # 10 minutes
#SBATCH --mem=1G
#SBATCH --ntasks=1
#SBATCH --output=output_%j.txt
#SBATCH --error=error_output_%j.txt
#SBATCH --job-name=
 
module load 2023
module load R-bundle-CRAN/2023.12-foss-2023a
 
#execute the R-script
~/myRScripts/list_ext.r
 

Since this is just an illustrative job, we have only allocated 1 GB of RAM and a single CPU.
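
The job script can then be submitted from a login node. The sketch below assumes the sbatch script was saved as list_ext.sh in the current directory; that file name is hypothetical and not prescribed by the cluster. It uses sbatch --parsable to capture the job ID and squeue to show the job's state.

```shell
job_script="list_ext.sh"  # hypothetical file name for the sbatch script above

if command -v sbatch >/dev/null 2>&1; then
    # --parsable prints only the job ID (optionally followed by ";cluster").
    job_id=$(sbatch --parsable "${job_script}")
    job_id="${job_id%%;*}"
    echo "Submitted job ${job_id}"

    # Show the job while it is pending or running.
    squeue -j "${job_id}"

    # %j in the --output and --error file names is replaced by the job ID.
    echo "Output will be written to output_${job_id}.txt"
else
    echo "sbatch not found: run this on a node where Slurm is available" >&2
fi
```

By default, Slurm writes the output files relative to the directory from which the job was submitted.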

  • RStudio in Open OnDemand