Parallel R code on SLURM
Latest revision as of 20:07, 19 February 2019

== Using R code on SLURM for embarrassingly parallel calculations ==

The most well-known R packages that provide parallel functionality, e.g. doParallel or doSNOW, do not work properly on Anunna. These packages are particularly problematic when you try to run (array) jobs across multiple nodes. However, the [https://cran.r-project.org/web/packages/rslurm/vignettes/rslurm.html rslurm package] lets you run [https://en.wikipedia.org/wiki/Embarrassingly_parallel embarrassingly parallel] calculations on SLURM. The package automatically divides the computation over multiple nodes and writes the necessary submission scripts. It also includes functions to retrieve and combine the output from different nodes, as well as wrappers for common SLURM commands.

=== Example code ===
library(rslurm)
# test_func and pars must exist before the call; a minimal toy example:
test_func <- function(x, y) x + y                 # called once per row of pars
pars <- data.frame(x = 1:100, y = 101:200)        # one row of arguments per task
sjob <- slurm_apply(test_func, pars, jobname = 'test_apply',
                    nodes = 2, cpus_per_node = 2, submit = FALSE)
Please be aware that new SLURM jobs have a few seconds of lead time before they start executing, so make sure that each task is appropriately long-running (~60 s is a good minimum). Otherwise, most of your 'compute' time will be spent waiting for jobs to start and stop.
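Once the job has actually been submitted (either by calling slurm_apply with submit = TRUE, or by running the generated submission script with sbatch), the per-node output can be collected with rslurm's own helper functions. A minimal sketch, assuming an sjob object returned by slurm_apply as above; this only works on a node with access to the SLURM scheduler:

library(rslurm)
# Wait for the job to finish, then read the output of all nodes;
# outtype = "table" combines the per-task results into one data frame.
res <- get_slurm_out(sjob, outtype = "table", wait = TRUE)
# Remove the temporary _rslurm_<jobname> folder that slurm_apply created.
cleanup_files(sjob)

With the toy test_func above, res would contain one row per row of pars, so the combined result can be inspected or saved like any ordinary data frame.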