Running Snakemake pipelines: Difference between revisions

Latest revision as of 12:25, 5 January 2022

Author: Carolina Pita Barros
Contact: carolina.pitabarros@wur.nl
ABG

You can find my pipelines here

The Snakemake pipelines shared here use modules loaded from the HPC and tools installed with conda.

Click here for an introduction to Snakemake

Clone the repository

From github

Go to the repository’s page, click the green “Code” button and copy the path
In your terminal go to where you want to download it to and run

git clone <path you copied from github>

From the WUR HPC (Anunna)

Go to /lustre/nobackup/WUR/ABGC/shared/PIPELINES/ and choose which pipeline you want to use.

cp -r <pipeline directory> <directory where you want to save it to>

First you’ll need to do some set up. Go to the pipeline’s directory.

Installation

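Install conda if you don’t have it

Update 05/01/2022:
Here I show how to install Miniconda on a Linux system.
Download installer: https://docs.conda.io/en/latest/miniconda.html
Installation instructions: https://conda.io/projects/conda/en/latest/user-guide/install/index.html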

Download the installer to your home directory. Choose the version that matches your operating system. You can right-click the link, copy its address and download it with

wget <link>

At the time of writing this update, for me it would be:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

To install miniconda, run:

bash <installer name>

The installer name could be, for example, Miniconda3-latest-Linux-x86_64.sh

Set up the conda channels in this order:

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
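Each conda config --add channels call puts the new channel at the top of the list, so after the three commands above conda-forge should have the highest priority, followed by bioconda and defaults. A minimal sketch for checking this, assuming conda is on your PATH; top_channel is a hypothetical helper name, not part of conda:

```shell
# top_channel is a hypothetical helper (not part of conda).
# It reads the output of "conda config --show channels" on stdin and
# prints the first channel listed, i.e. the one with the highest priority.
top_channel() {
    awk '/^[ \t]*-[ \t]/ { print $2; exit }'
}

# Only query conda if it is installed, so the snippet is harmless elsewhere.
if command -v conda >/dev/null 2>&1; then
    conda config --show channels | top_channel
fi
```

If this prints something other than conda-forge, the channels were probably added in a different order.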

Create conda environment

conda create --name <name-of-pipeline> --file requirements.txt

I recommend giving it the same name as the pipeline

This environment contains snakemake and the other packages that are needed to run the pipeline.
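If you are not sure whether you already created the environment, you can check the list of environments first. The sketch below is one possible way to do that; env_listed is a hypothetical helper name, and it assumes the usual layout of conda env list output (environment name in the first column):

```shell
# env_listed is a hypothetical helper (not part of conda).
# It reads "conda env list" output on stdin and succeeds only if an
# environment whose name equals $1 is listed.
env_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example use (not executed here):
#   conda env list | env_listed <name-of-pipeline> \
#       || conda create --name <name-of-pipeline> --file requirements.txt
```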

Activate environment

conda activate <name-of-pipeline>

To deactivate the environment (if you want to leave the conda environment)

conda deactivate

File configuration

Create HPC config file

This is necessary for Snakemake to prepare and submit jobs.

Start with creating the directory

mkdir -p ~/.config/snakemake/<name-of-pipeline>
cd ~/.config/snakemake/<name-of-pipeline>

Create config.yaml and include the following:

My pipelines are configured to work with SLURM

jobs: 10
cluster: "sbatch -t 1:0:0 --mem=16000 -c 16 --job-name={rule} --exclude=fat001,fat002,fat101,fat100 --output=logs_slurm/{rule}.out --error=logs_slurm/{rule}.err"

use-conda: true

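Here you should configure the resources you want to use: jobs is the maximum number of jobs submitted at the same time, and in the sbatch command -t sets the time limit (here 1 hour), --mem the memory in MB and -c the number of CPUs per job.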
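The profile steps above can also be scripted. This is a minimal sketch, not part of the pipelines: write_profile is a hypothetical helper that creates the profile directory and writes the same config.yaml shown above. It assumes a Snakemake version that still supports the cluster option in profiles. The optional second argument (a base directory) is only there so the helper can be tried out somewhere other than ~/.config/snakemake.

```shell
# write_profile is a hypothetical helper (not part of Snakemake).
# $1 = profile name, i.e. <name-of-pipeline>
# $2 = base directory (optional, defaults to ~/.config/snakemake)
write_profile() {
    profile_dir="${2:-$HOME/.config/snakemake}/$1"
    mkdir -p "$profile_dir"
    # Quoted 'EOF' keeps {rule} and everything else literal.
    cat > "$profile_dir/config.yaml" <<'EOF'
jobs: 10
cluster: "sbatch -t 1:0:0 --mem=16000 -c 16 --job-name={rule} --exclude=fat001,fat002,fat101,fat100 --output=logs_slurm/{rule}.out --error=logs_slurm/{rule}.err"

use-conda: true
EOF
    # Print the path of the file that was written.
    echo "$profile_dir/config.yaml"
}

# Example use (not executed here):
#   write_profile <name-of-pipeline>
```

Because the directory is named after the pipeline, snakemake --profile <name-of-pipeline> can find it under ~/.config/snakemake.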

Go to the pipeline directory and open config.yaml

Configure your paths, but keep the variable names that are already in the config file.

OUTDIR: /path/to/output
READS_DIR: /path/to/reads/ 
ASSEMBLY: /path/to/assembly
PREFIX: <output name>

If you want the results to be written to the pipeline directory itself (not to a separate output directory), open the Snakefile and comment out workdir: config["OUTDIR"], then ignore or comment out OUTDIR: /path/to/output in the config file.
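A typo in one of these paths usually only shows up once Snakemake starts, so it can help to check them beforehand. The following is a rough sketch under some assumptions: config_value and check_config are hypothetical helpers, the config file uses plain, unquoted KEY: value lines as in the example above, and READS_DIR and ASSEMBLY are the input paths (your pipeline may use other variable names).

```shell
# config_value and check_config are hypothetical helpers for illustration.

# Print the value of top-level key $1 from the simple "KEY: value" file $2.
config_value() {
    sed -n "s/^$1:[[:space:]]*//p" "$2" | sed 's/[[:space:]]*$//' | head -n 1
}

# Check that the input paths in the config file ($1, default config.yaml) exist.
# Returns 0 if all are found, 1 otherwise.
check_config() {
    cfg="${1:-config.yaml}"
    rc=0
    for key in READS_DIR ASSEMBLY; do
        value=$(config_value "$key" "$cfg")
        if [ -z "$value" ]; then
            echo "$key is not set in $cfg" >&2
            rc=1
        elif [ ! -e "$value" ]; then
            echo "$key points to a missing path: $value" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example use from the pipeline directory (not executed here):
#   check_config config.yaml && echo "paths look ok"
```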

Now the setup is complete

How to run the pipeline

Since the pipelines can take a while to run, it’s best to use a screen session. Snakemake then stays active in that session while it’s running, so it won’t stop if your connection goes down. You can detach from the session with Ctrl+A followed by D and re-attach later with screen -r <name of session>.

Start by creating a screen session:

screen -S <name of session>

You'll need to activate the conda environment again

conda activate <name-of-pipeline>

Then run

snakemake -np

This will show you the steps and commands that will be executed without running them. Check the commands and file names for mistakes.

If all looks ok, you can now run your pipeline

snakemake --profile <name-of-pipeline>

If everything was set up correctly, the jobs should be submitted and you should be able to see the progress of the pipeline in your terminal.
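The two steps above (dry run, then real run) can be combined so that jobs are only submitted when the dry run succeeds. This is a small sketch, not something the pipelines ship with: run_pipeline is a hypothetical wrapper, and the SNAKEMAKE variable is an assumption added so the command can be swapped out, for example when testing the wrapper.

```shell
# Command to run; defaults to snakemake but can be overridden.
SNAKEMAKE="${SNAKEMAKE:-snakemake}"

# run_pipeline is a hypothetical wrapper (not part of Snakemake).
# $1 = profile name, i.e. <name-of-pipeline>
run_pipeline() {
    if [ -z "${1:-}" ]; then
        echo "usage: run_pipeline <name-of-pipeline>" >&2
        return 2
    fi
    # -n: dry run, -p: print the shell commands that would be executed
    if ! "$SNAKEMAKE" -np; then
        echo "Dry run failed; nothing was submitted." >&2
        return 1
    fi
    # The dry run succeeded, so submit the jobs using the profile.
    "$SNAKEMAKE" --profile "$1"
}

# Example use inside the screen session, with the environment activated
# (not executed here):
#   run_pipeline <name-of-pipeline>
```

Note that this skips the manual check of the dry-run output, so it is best used once you have looked at that output at least once.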

