HPCwiki - User contributions [en]

File:Whole-genome-alignment-workflow.png

2022-07-18T13:46:53Z

Moiti001:

File:Single-cell-processing-workflow.png

2022-07-18T09:03:43Z

Moiti001:

File:Population-var-calling-workflow.png

2022-03-02T13:45:46Z

Moiti001:

Workflow Engines (Snakemake, Nextflow)

2022-01-05T11:25:11Z

Moiti001: added info on how to install miniconda

Author: Carolina Pita Barros 
Contact: carolina.pitabarros@wur.nl 
ABG

 
You can find my pipelines [https://github.com/CarolinaPB/ here]

The Snakemake shared here use modules loaded from the HPC and tools installed with conda.

Click [https://github.com/CarolinaPB/snakemake-template/blob/master/Short%20introduction%20to%20Snakemake.pdf here] for an introduction to Snakemake

== Clone the repository ==

==== From github ====

Go to the repository’s page, click the green “Code” button and copy the path 
In your terminal go to where you want to download it to and run

<pre>git clone <path you copied from github></pre>
==== From the the WUR HPC (Anunna) ====

Go to <code>/lustre/nobackup/WUR/ABGC/shared/PIPELINES/</code> and choose which pipeline you want to use.

<pre>cp -r <pipeline directory> <directory where you want to save it to></pre>
First you’ll need to do some set up. Go to the pipeline’s directory.

== Installation ==

Install <code>conda</code> if you don’t have it
''Update 05/01/2022:'' 
Here I show how to install miniconda in a linux system 
[https://docs.conda.io/en/latest/miniconda.html Download installer] 
[https://conda.io/projects/conda/en/latest/user-guide/install/index.html Installation instructions]

# Download the installer to your home directory. Choose the version according to your operating system. You can right click the link, copy and download with

<pre>wget <link></pre>
At the time of writing this update, for me it would be:

<pre>wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh</pre>
To install miniconda, run:

<pre>bash <installer name></pre>
installer name could be <code>Miniconda3-latest-Linux-x86_64.sh</code>

Set up the conda channels in this order:

<pre>conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge</pre>

=== Create conda environment ===

<pre>conda create --name <name-of-pipeline> --file requirements.txt</pre>
<blockquote>I recommend giving it the same name as the pipeline
</blockquote>
This environment contains snakemake and the other packages that are needed to run the pipeline.

=== Activate environment ===

<pre>conda activate <name-of-pipeline></pre>
=== To deactivate the environment (if you want to leave the conda environment) ===

<pre>conda deactivate</pre>
== File configuration ==

=== Create HPC config file ===

Necessary for snakemake to prepare and send jobs.

==== Start with creating the directory ====

<pre>mkdir -p ~/.config/snakemake/<name-of-pipeline>
cd ~/.config/snakemake/<name-of-pipeline></pre>
==== Create config.yaml and include the following: ====

<blockquote>My pipelines are configured to work with SLURM
</blockquote>
<pre>jobs: 10
cluster: "sbatch -t 1:0:0 --mem=16000 -c 16 --job-name={rule} --exclude=fat001,fat002,fat101,fat100 --output=logs_slurm/{rule}.out --error=logs_slurm/{rule}.err"

use-conda: true</pre>
<blockquote>Here you should configure the resources you want to use.
</blockquote>
=== Go to the pipeline directory and open config.yaml ===

Configure your paths, but keep the variable names that are already in the config file.

<pre>OUTDIR: /path/to/output
READS_DIR: /path/to/reads/
ASSEMBLY: /path/to/assembly
PREFIX: <output name></pre>
If you want the results to be written to this directory (not to a new directory), open the Snakefile and comment out <code>workdir: config["OUTDIR"]</code> and ignore or comment out the <code>OUTDIR: /path/to/output</code> in the config file.

'''Now the setup is complete'''

== How to run the pipeline ==

Since the pipelines can take a while to run, it’s best if you use a [https://linuxize.com/post/how-to-use-linux-screen/ screen session]. By using a screen session, Snakemake stays “active” in the shell while it’s running, there’s no risk of the connection going down and Snakemake stopping.

Start by creating a screen session:

<pre>screen -S <name of session></pre>

You'll need to activate the conda environment again
<pre>conda activate <name-of-pipeline></pre>

Then run

<pre>snakemake -np</pre>
This will show you the steps and commands that will be executed. Check the commands and file names to see if there’s any mistake.

If all looks ok, you can now run your pipeline

<pre>snakemake --profile <name-of-pipeline></pre>
If everything was set up correctly, the jobs should be submitted and you should be able to see the progress of the pipeline in your terminal.

File:Nanopore-assembly-workflow.png

2021-11-01T10:03:49Z

Moiti001: Moiti001 uploaded a new version of File:Nanopore-assembly-workflow.png

File:Nanopore-assembly-workflow.png

2021-10-29T08:38:21Z

Moiti001:

File:Pop-sv-calling-workflow.png

2021-10-28T08:43:27Z

Moiti001: Moiti001 uploaded a new version of File:Pop-sv-calling-workflow.png

population structural variation calling pipeline workflow

File:Mapping-variant-calling-workflow.png

2021-10-22T11:48:00Z

Moiti001: workflow for mapping and variant calling pipeline

workflow for mapping and variant calling pipeline

Workflow Engines (Snakemake, Nextflow)

2021-10-08T12:57:02Z

Moiti001:

Author: Carolina Pita Barros 
Contact: carolina.pitabarros@wur.nl 
ABG

 
You can find my pipelines [https://github.com/CarolinaPB/ here]

The Snakemake shared here use modules loaded from the HPC and tools installed with conda.

Click [https://github.com/CarolinaPB/snakemake-template/blob/master/Short%20introduction%20to%20Snakemake.pdf here] for an introduction to Snakemake

== Clone the repository ==

==== From github ====

Go to the repository’s page, click the green “Code” button and copy the path 
In your terminal go to where you want to download it to and run

<pre>git clone <path you copied from github></pre>
==== From the the WUR HPC (Anunna) ====

Go to <code>/lustre/nobackup/WUR/ABGC/shared/PIPELINES/</code> and choose which pipeline you want to use.

<pre>cp -r <pipeline directory> <directory where you want to save it to></pre>
First you’ll need to do some set up. Go to the pipeline’s directory.

== Installation ==

Install <code>conda</code> if you don’t have it

=== Create conda environment ===

<pre>conda create --name <name-of-pipeline> --file requirements.txt</pre>
<blockquote>I recommend giving it the same name as the pipeline
</blockquote>
This environment contains snakemake and the other packages that are needed to run the pipeline.

=== Activate environment ===

<pre>conda activate <name-of-pipeline></pre>
=== To deactivate the environment (if you want to leave the conda environment) ===

<pre>conda deactivate</pre>
== File configuration ==

=== Create HPC config file ===

Necessary for snakemake to prepare and send jobs.

==== Start with creating the directory ====

<pre>mkdir -p ~/.config/snakemake/<name-of-pipeline>
cd ~/.config/snakemake/<name-of-pipeline></pre>
==== Create config.yaml and include the following: ====

<blockquote>My pipelines are configured to work with SLURM
</blockquote>
<pre>jobs: 10
cluster: "sbatch -t 1:0:0 --mem=16000 -c 16 --job-name={rule} --exclude=fat001,fat002,fat101,fat100 --output=logs_slurm/{rule}.out --error=logs_slurm/{rule}.err"

use-conda: true</pre>
<blockquote>Here you should configure the resources you want to use.
</blockquote>
=== Go to the pipeline directory and open config.yaml ===

Configure your paths, but keep the variable names that are already in the config file.

<pre>OUTDIR: /path/to/output
READS_DIR: /path/to/reads/
ASSEMBLY: /path/to/assembly
PREFIX: <output name></pre>
If you want the results to be written to this directory (not to a new directory), open the Snakefile and comment out <code>workdir: config["OUTDIR"]</code> and ignore or comment out the <code>OUTDIR: /path/to/output</code> in the config file.

'''Now the setup is complete'''

== How to run the pipeline ==

Since the pipelines can take a while to run, it’s best if you use a [https://linuxize.com/post/how-to-use-linux-screen/ screen session]. By using a screen session, Snakemake stays “active” in the shell while it’s running, there’s no risk of the connection going down and Snakemake stopping.

Start by creating a screen session:

<pre>screen -S <name of session></pre>

You'll need to activate the conda environment again
<pre>conda activate <name-of-pipeline></pre>

Then run

<pre>snakemake -np</pre>
This will show you the steps and commands that will be executed. Check the commands and file names to see if there’s any mistake.

If all looks ok, you can now run your pipeline

<pre>snakemake --profile <name-of-pipeline></pre>
If everything was set up correctly, the jobs should be submitted and you should be able to see the progress of the pipeline in your terminal.

File:Population-mapping-workflow.png

2021-09-29T12:51:44Z

Moiti001: workflow for population mapping snakemake pipeline

workflow for population mapping snakemake pipeline

File:Pop-sv-calling-workflow.png

2021-09-22T11:30:39Z

Moiti001: population structural variation calling pipeline workflow

population structural variation calling pipeline workflow

Workflow Engines (Snakemake, Nextflow)

2021-07-01T13:38:17Z

Moiti001:

Author: Carolina Pita Barros 
Contact: carolina.pitabarros@wur.nl 
ABG

 
You can find my pipelines [https://github.com/CarolinaPB/ here]

The Snakemake shared here use modules loaded from the HPC and tools installed with conda.

Click [https://github.com/CarolinaPB/snakemake-template/blob/master/Short%20introduction%20to%20Snakemake.pdf here] for an introduction to Snakemake

== Clone the repository ==

==== From github ====

Go to the repository’s page, click the green “Code” button and copy the path 
In your terminal go to where you want to download it to and run

<pre>git clone <path you copied from github></pre>
==== From the the WUR HPC (Anunna) ====

Go to <code>/lustre/nobackup/WUR/ABGC/shared/PIPELINES/</code> and choose which pipeline you want to use.

<pre>cp <pipeline directory> <directory where you want to save it to></pre>
First you’ll need to do some set up. Go to the pipeline’s directory.

== Installation ==

Install <code>conda</code> if you don’t have it

=== Create conda environment ===

<pre>conda create --name <name-of-pipeline> --file requirements.txt</pre>
<blockquote>I recommend giving it the same name as the pipeline
</blockquote>
This environment contains snakemake and the other packages that are needed to run the pipeline.

=== Activate environment ===

<pre>conda activate <name-of-pipeline></pre>
=== To deactivate the environment (if you want to leave the conda environment) ===

<pre>conda deactivate</pre>
== File configuration ==

=== Create HPC config file ===

Necessary for snakemake to prepare and send jobs.

==== Start with creating the directory ====

<pre>mkdir -p ~/.config/snakemake/<name-of-pipeline>
cd ~/.config/snakemake/<name-of-pipeline></pre>
==== Create config.yaml and include the following: ====

<blockquote>My pipelines are configured to work with SLURM
</blockquote>
<pre>jobs: 10
cluster: "sbatch -t 1:0:0 --mem=16000 -c 16 --job-name={rule} --exclude=fat001,fat002,fat101,fat100 --output=logs_slurm/{rule}.out --error=logs_slurm/{rule}.err"

use-conda: true</pre>
<blockquote>Here you should configure the resources you want to use.
</blockquote>
=== Go to the pipeline directory and open config.yaml ===

Configure your paths, but keep the variable names that are already in the config file.

<pre>OUTDIR: /path/to/output
READS_DIR: /path/to/reads/
ASSEMBLY: /path/to/assembly
PREFIX: <output name></pre>
If you want the results to be written to this directory (not to a new directory), open the Snakefile and comment out <code>workdir: config["OUTDIR"]</code> and ignore or comment out the <code>OUTDIR: /path/to/output</code> in the config file.

'''Now the setup is complete'''

== How to run the pipeline ==

Since the pipelines can take a while to run, it’s best if you use a [https://linuxize.com/post/how-to-use-linux-screen/ screen session]. By using a screen session, Snakemake stays “active” in the shell while it’s running, there’s no risk of the connection going down and Snakemake stopping.

Start by creating a screen session:

<pre>screen -S <name of session></pre>

You'll need to activate the conda environment again
<pre>conda activate <name-of-pipeline></pre>

Then run

<pre>snakemake -np</pre>
This will show you the steps and commands that will be executed. Check the commands and file names to see if there’s any mistake.

If all looks ok, you can now run your pipeline

<pre>snakemake --profile <name-of-pipeline></pre>
If everything was set up correctly, the jobs should be submitted and you should be able to see the progress of the pipeline in your terminal.

Workflow Engines (Snakemake, Nextflow)

2021-07-01T08:03:10Z

Moiti001:

Author: Carolina Pita Barros 
Contact: carolina.pitabarros@wur.nl 
ABG

 
You can find my pipelines [https://github.com/CarolinaPB/ here]

The Snakemake shared here use modules loaded from the HPC and tools installed with conda.

Click [https://github.com/CarolinaPB/snakemake-template/blob/master/Short%20introduction%20to%20Snakemake.pdf here] for an introduction to Snakemake

== Clone the repository ==

==== From github ====

Go to the repository’s page, click the green “Code” button and copy the path 
In your terminal go to where you want to download it to and run

<pre>git clone <path you copied from github></pre>
==== From the the WUR HPC (Anunna) ====

Go to <code>/lustre/nobackup/WUR/ABGC/shared/PIPELINES/</code> and choose which pipeline you want to use.

<pre>cp <pipeline directory> <directory where you want to save it to></pre>
First you’ll need to do some set up. Go to the pipeline’s directory.

== Installation ==

Install <code>conda</code> if you don’t have it

=== Create conda environment ===

<pre>conda create --name <name-of-pipeline> --file requirements.txt</pre>
<blockquote>I recommend giving it the same name as the pipeline
</blockquote>
This environment contains snakemake and the other packages that are needed to run the pipeline.

=== Activate environment ===

<pre>conda activate <name-of-pipeline></pre>
=== To deactivate the environment (if you want to leave the conda environment) ===

<pre>conda deactivate</pre>
== File configuration ==

=== Create HPC config file ===

Necessary for snakemake to prepare and send jobs.

==== Start with creating the directory ====

<pre>mkdir -p ~/.config/snakemake/<name-of-pipeline>
cd ~/.config/snakemake/<name-of-pipeline></pre>
==== Create config.yaml and include the following: ====

<blockquote>My pipelines are configured to work with SLURM
</blockquote>
<pre>jobs: 10
cluster: "sbatch -t 1:0:0 --mem=16000 -c 16 --job-name={rule} --exclude=fat001,fat002,fat101,fat100 --output=logs_slurm/{rule}.out --error=logs_slurm/{rule}.err"

use-conda: true</pre>
<blockquote>Here you should configure the resources you want to use.
</blockquote>
=== Go to the pipeline directory and open config.yaml ===

Configure your paths, but keep the variable names that are already in the config file.

<pre>OUTDIR: /path/to/output
READS_DIR: /path/to/reads/
ASSEMBLY: /path/to/assembly
PREFIX: <output name></pre>
If you want the results to be written to this directory (not to a new directory), open the Snakefile and comment out <code>workdir: config["OUTDIR"]</code> and ignore or comment out the <code>OUTDIR: /path/to/output</code> in the config file.

'''Now the setup is complete'''

== How to run the pipeline ==

Since the pipelines can take a while to run, it’s best if you use a [https://linuxize.com/post/how-to-use-linux-screen/ screen session]. By using a screen session, Snakemake stays “active” in the shell while it’s running, there’s no risk of the connection going down and Snakemake stopping.

Start by creating a screen session:

<pre>screen -S <name of session></pre>
Then run

<pre>snakemake -np</pre>
This will show you the steps and commands that will be executed. Check the commands and file names to see if there’s any mistake.

If all looks ok, you can now run your pipeline

<pre>snakemake --profile <name-of-pipeline></pre>
If everything was set up correctly, the jobs should be submitted and you should be able to see the progress of the pipeline in your terminal.

Workflow Engines (Snakemake, Nextflow)

2021-07-01T06:39:30Z

Moiti001:

Author: Carolina Pita Barros 
Contact: carolina.pitabarros@wur.nl 
ABG

 
You can find my pipelines [https://github.com/CarolinaPB/ here]

The Snakemake shared here use modules loaded from the HPC and tools installed with conda.

Click [https://github.com/CarolinaPB/snakemake-template/blob/master/Short%20introduction%20to%20Snakemake.pdf here] for an introduction to Snakemake

== Clone the repository ==

==== From github ====

Go to the repository’s page, click the green “Code” button and copy the path In your terminal go to where you want to download it to and run

<pre>git clone <path you copied from github></pre>
==== From the the WUR HPC (Anunna) ====

Go to <code>/lustre/nobackup/WUR/ABGC/shared/PIPELINES/</code> and choose which pipeline you want to use.

<pre>cp <pipeline directory> <directory where you want to save it to></pre>
First you’ll need to do some set up. Go to the pipeline’s directory.

== Installation ==

Install <code>conda</code> if you don’t have it

=== Create conda environment ===

<pre>conda create --name <name-of-pipeline> --file requirements.txt</pre>
<blockquote>I recommend giving it the same name as the pipeline
</blockquote>
This environment contains snakemake and the other packages that are needed to run the pipeline.

=== Activate environment ===

<pre>conda activate <name-of-pipeline></pre>
=== To deactivate the environment (if you want to leave the conda environment) ===

<pre>conda deactivate</pre>
== File configuration ==

=== Create HPC config file ===

Necessary for snakemake to prepare and send jobs.

==== Start with creating the directory ====

<pre>mkdir -p ~/.config/snakemake/<name-of-pipeline>
cd ~/.config/snakemake/<name-of-pipeline></pre>
==== Create config.yaml and include the following: ====

<blockquote>My pipelines are configured to work with SLURM
</blockquote>
<pre>jobs: 10
cluster: "sbatch -t 1:0:0 --mem=16000 -c 16 --job-name={rule} --exclude=fat001,fat002,fat101,fat100 --output=logs_slurm/{rule}.out --error=logs_slurm/{rule}.err"

use-conda: true</pre>
<blockquote>Here you should configure the resources you want to use.
</blockquote>
=== Go to the pipeline directory and open config.yaml ===

Configure your paths, but keep the variable names that are already in the config file.

<pre>OUTDIR: /path/to/output
READS_DIR: /path/to/reads/
ASSEMBLY: /path/to/assembly
PREFIX: <output name></pre>
If you want the results to be written to this directory (not to a new directory), open the Snakefile and comment out <code>workdir: config["OUTDIR"]</code> and ignore or comment out the <code>OUTDIR: /path/to/output</code> in the config file.

'''Now the setup is complete'''

== How to run the pipeline ==

Since the pipelines can take a while to run, it’s best if you use a [https://linuxize.com/post/how-to-use-linux-screen/ screen session]. By using a screen session, Snakemake stays “active” in the shell while it’s running, there’s no risk of the connection going down and Snakemake stopping.

Start by creating a screen session:

<pre>screen -S <name of session></pre>
Then run

<pre>snakemake -np</pre>
This will show you the steps and commands that will be executed. Check the commands and file names to see if there’s any mistake.

If all looks ok, you can now run your pipeline

<pre>snakemake --profile <name-of-pipeline></pre>
If everything was set up correctly, the jobs should be submitted and you should be able to see the progress of the pipeline in your terminal.

Workflow Engines (Snakemake, Nextflow)

2021-06-30T14:24:33Z

Moiti001: Created page with "Author: Carolina Pita Barros Contact: carolina.pitabarros@wur.nl ABG You can find my pipelines [https://github.com/CarolinaPB/ here] The Snakemake sha..."

Author: Carolina Pita Barros 
Contact: carolina.pitabarros@wur.nl 
ABG

 
You can find my pipelines [https://github.com/CarolinaPB/ here]

The Snakemake shared here use modules loaded from the HPC and tools installed with conda.

Click [https://github.com/CarolinaPB/snakemake-template/blob/master/Short%20introduction%20to%20Snakemake.pdf here] for an introduction to Snakemake

== Clone the repository ==

==== From github ====

Go to the repository’s page, click the green “Code” button and copy the path In your terminal go to where you want to download it to and run

<pre>git clone <path you copied from github></pre>
==== From the the WUR HPC (Anunna) ====

Go to <code>/lustre/nobackup/WUR/ABGC/shared/PIPELINES/</code> and choose which pipeline you want to use.

<pre>cp <pipeline directory> <directory where you want to save it to></pre>
First you’ll need to do some set up. Go to the pipeline’s directory.

== Installation ==

Install <code>conda</code> if you don’t have it

=== Create conda environment ===

<pre>conda create --name <name-of-pipeline> --file requirements.txt</pre>
<blockquote>I recommend giving it the same name as the pipeline
</blockquote>
This environment contains snakemake and the other packages that are needed to run the pipeline.

=== Activate environment ===

<pre>conda activate <name-of-pipeline></pre>
=== To deactivate the environment (if you want to leave the conda environment) ===

<pre>conda deactivate</pre>
== File configuration ==

=== Create HPC config file ===

Necessary for snakemake to prepare and send jobs.

==== Start with creating the directory ====

<pre>mkdir -p ~/.config/snakemake/<name-of-pipeline>
cd ~/.config/snakemake/<name-of-pipeline></pre>
==== Create config.yaml and include the following: ====

<blockquote>My pipelines are configured to work with SLURM
</blockquote>
<pre>jobs: 10
cluster: "sbatch -t 1:0:0 --mem=16000 -c 16 --job-name={rule} --exclude=fat001,fat002,fat101,fat100 --output=logs_slurm/{rule}.out --error=logs_slurm/{rule}.err"

use-conda: true</pre>
<blockquote>Here you should configure the resources you want to use.
</blockquote>
=== Go to the pipeline directory and open config.yaml ===

Configure your paths, but keep the variable names that are already in the config file.

<pre>OUTDIR: /path/to/output
READS_DIR: /path/to/reads/
ASSEMBLY: /path/to/assembly
PREFIX: <output name></pre>
If you want the results to be written to this directory (not to a new directory), open the Snakefile and comment out <code>workdir: config["OUTDIR"]</code> and ignore or comment out the <code>OUTDIR: /path/to/output</code> in the config file.

'''Now the setup is complete'''

== How to run the pipeline ==

Since the pipelines can take a while to run, it’s best if you use a [https://linuxize.com/post/how-to-use-linux-screen/ screen session]. By using a screen session, Snakemake stays “active” in the shell while it’s running, there’s no risk of the connection going down and Snakemake stopping.

Start by creating a screen session:

<pre>screen -S <name of session></pre>
Then run

<pre>snakemake -np</pre>
This will show you the steps and commands that will be executed. Check the commands and file names to see i