Running Snakemake pipelines
Revision as of 07:39, 1 July 2021
Author: Carolina Pita Barros
Contact: carolina.pitabarros@wur.nl
ABG
You can find my pipelines at https://github.com/CarolinaPB/
The Snakemake pipelines shared here use modules loaded from the HPC and tools installed with conda.
Click here for an introduction to Snakemake
Clone the repository
From GitHub
Go to the repository’s page, click the green “Code” button and copy the path. In your terminal, go to where you want to download it to and run
git clone <path you copied from github>
From the WUR HPC (Anunna)
Go to /lustre/nobackup/WUR/ABGC/shared/PIPELINES/
and choose which pipeline you want to use.
cp -r <pipeline directory> <directory where you want to save it to>
First you’ll need to do some set up. Go to the pipeline’s directory.
Installation
Install conda if you don’t have it
Create conda environment
conda create --name <name-of-pipeline> --file requirements.txt
I recommend giving it the same name as the pipeline
This environment contains snakemake and the other packages that are needed to run the pipeline.
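If you rerun the setup, it can help to check whether the environment already exists before creating it. The following is a minimal sketch, not part of the pipelines themselves: ensure_env is a hypothetical helper name, and it assumes that the first column of conda env list holds the environment names.

```shell
# Sketch: create the conda environment only if it does not exist yet.
# ensure_env is a hypothetical helper, not something shipped with the pipelines.
ensure_env() {
    local name="$1"
    local requirements="${2:-requirements.txt}"   # same file as in the command above

    # Assumption: the first column of `conda env list` is the environment name
    if conda env list | awk '{print $1}' | grep -qxF -- "$name"; then
        echo "environment '$name' already exists"
    else
        conda create --name "$name" --file "$requirements"
    fi
}
```

Usage, from the pipeline's directory: ensure_env <name-of-pipeline>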
Activate environment
conda activate <name-of-pipeline>
To deactivate the environment (if you want to leave the conda environment)
conda deactivate
File configuration
Create HPC config file
This file is necessary for Snakemake to prepare and submit jobs to the cluster.
Start by creating the directory
mkdir -p ~/.config/snakemake/<name-of-pipeline>
cd ~/.config/snakemake/<name-of-pipeline>
My pipelines are configured to work with SLURM.
Create config.yaml and include the following:
jobs: 10
cluster: "sbatch -t 1:0:0 --mem=16000 -c 16 --job-name={rule} --exclude=fat001,fat002,fat101,fat100 --output=logs_slurm/{rule}.out --error=logs_slurm/{rule}.err"
use-conda: true
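Here you should configure the resources you want to use: in the sbatch command, -t is the time limit, --mem is the memory (in MB) and -c is the number of CPUs per job, while jobs sets how many jobs can be submitted at once.
Next, go to the pipeline directory and open its config.yaml (this is the pipeline's own configuration, separate from the HPC config file above)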
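The HPC config steps above can also be scripted. The sketch below writes the same three settings into a profile directory. write_profile and its second argument (an alternative base directory) are hypothetical names used for illustration; the default location is the one used above.

```shell
# Sketch: create the profile directory and write config.yaml into it.
# write_profile is a hypothetical helper, not part of Snakemake.
write_profile() {
    local pipeline="$1"
    local base_dir="${2:-$HOME/.config/snakemake}"   # default profile location
    local profile_dir="$base_dir/$pipeline"

    mkdir -p "$profile_dir" || return 1

    # Quoted heredoc delimiter: {rule} and the double quotes are written literally
    cat > "$profile_dir/config.yaml" <<'EOF'
jobs: 10
cluster: "sbatch -t 1:0:0 --mem=16000 -c 16 --job-name={rule} --exclude=fat001,fat002,fat101,fat100 --output=logs_slurm/{rule}.out --error=logs_slurm/{rule}.err"
use-conda: true
EOF

    # Print the path of the file that was written
    echo "$profile_dir/config.yaml"
}
```

Usage: write_profile <name-of-pipeline>, then edit the sbatch options in the resulting file to match the resources you need.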
Configure your paths, but keep the variable names that are already in the config file.
OUTDIR: /path/to/output
READS_DIR: /path/to/reads/
ASSEMBLY: /path/to/assembly
PREFIX: <output name>
If you want the results to be written to this directory instead of a new one, open the Snakefile and comment out workdir: config["OUTDIR"], then ignore or comment out the OUTDIR: /path/to/output line in the config file.
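Before running anything, it can be useful to confirm that the input paths in the pipeline's config.yaml actually exist. This is a rough sketch under stated assumptions: config_value and check_config_paths are hypothetical helpers, the config is assumed to use simple unquoted top-level KEY: value lines as in the example above, and only READS_DIR and ASSEMBLY are checked.

```shell
# Sketch: read a top-level "KEY: value" entry from a simple YAML file.
# Assumption: values are unquoted and sit on one line; this is not a YAML parser.
config_value() {
    local key="$1"
    local file="${2:-config.yaml}"
    # Strip "KEY:" and leading spaces, then trailing spaces; keep the first match
    sed -n "s/^${key}:[[:space:]]*//p" "$file" | sed 's/[[:space:]]*$//' | head -n 1
}

# Sketch: report input paths that are missing from the config or from disk.
check_config_paths() {
    local file="${1:-config.yaml}"
    local status=0 key path
    for key in READS_DIR ASSEMBLY; do
        path="$(config_value "$key" "$file")"
        if [ -z "$path" ]; then
            echo "missing key: $key"
            status=1
        elif [ ! -e "$path" ]; then
            echo "not found: $key=$path"
            status=1
        fi
    done
    return "$status"
}
```

Usage, from the pipeline directory: check_config_paths config.yaml. It prints nothing when both paths exist.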
Now the setup is complete
How to run the pipeline
Since the pipelines can take a while to run, it’s best to use a screen session. Inside a screen session, Snakemake stays “active” in the shell while it’s running, so there’s no risk of Snakemake stopping if your connection goes down.
Start by creating a screen session:
screen -S <name of session>
Then do a dry run
snakemake -np
This will show you the steps and commands that will be executed. Check the commands and file names to see if there’s any mistake.
If all looks ok, you can now run your pipeline:
snakemake --profile <name-of-pipeline>
If everything was set up correctly, the jobs should be submitted and you should be able to see the progress of the pipeline in your terminal.
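The two steps above (dry run, then the real run) can be wrapped in a small function that only submits jobs after the dry run succeeds and you confirm. This is a sketch, with run_pipeline as a hypothetical name; it does not replace reading the dry-run output yourself.

```shell
# Sketch: dry run first, then submit with the profile after confirmation.
# run_pipeline is a hypothetical helper; it only uses the two commands shown above.
run_pipeline() {
    local profile="$1"
    local answer

    if [ -z "$profile" ]; then
        echo "usage: run_pipeline <name-of-pipeline>" >&2
        return 2
    fi

    # -n: dry run (nothing is executed), -p: print the shell commands
    if ! snakemake -np; then
        echo "dry run failed; not submitting jobs" >&2
        return 1
    fi

    printf 'Submit jobs with profile "%s"? [y/N] ' "$profile"
    read -r answer
    case "$answer" in
        y|Y) snakemake --profile "$profile" ;;
        *)   echo "aborted" ;;
    esac
}
```

Usage, inside your screen session and from the pipeline directory: run_pipeline <name-of-pipeline>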