Mapping and variant calling pipeline

Revision as of 10:38, 1 November 2021

Author: Carolina Pita Barros
Contact: carolina.pitabarros@wur.nl
ABG

Path to pipeline: /lustre/nobackup/WUR/ABGC/shared/PIPELINES/mapping-variant-calling

Link to the repository: https://github.com/CarolinaPB/WUR_mapping-variant-calling

First follow the instructions here:

Step by step guide on how to use my pipelines
Click here for an introduction to Snakemake

ABOUT

This pipeline maps short reads to a reference assembly. It outputs the mapped reads and a Qualimap report, and performs variant calling.

Tools used:

  • Bwa-mem2 - mapping
  • Samtools - processing
  • Qualimap - mapping summary
  • Freebayes - variant calling
  • Bcftools - VCF statistics

Figure: Pipeline workflow (Mapping-variant-calling-workflow.png)

Edit config.yaml with the paths to your files

OUTDIR: /path/to/output 
READS_DIR: /path/to/reads/ # don't add the reads files, just the directory where they are
ASSEMBLY: /path/to/assembly
PREFIX: <output name>
  • OUTDIR - directory where Snakemake will run and where the results will be written
  • READS_DIR - path to the directory that contains the reads
  • ASSEMBLY - path to the assembly file
  • PREFIX - prefix for the final mapped reads file
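
Before launching the pipeline, it can help to check which of the four keys above are actually set in config.yaml. The following is a minimal sketch, not part of the pipeline itself: the helper names `parse_simple_config` and `unset_keys` are hypothetical, the value `my_sample` is a placeholder, and the parser assumes a flat file of `KEY: value` lines (it would mis-handle values that contain `#`). For anything more complex, a real YAML parser is the better choice.

```python
# Keys described in the configuration section of this page.
EXPECTED_KEYS = ("OUTDIR", "READS_DIR", "ASSEMBLY", "PREFIX")


def parse_simple_config(text):
    """Parse flat 'KEY: value' lines, skipping blank lines and '#' comments."""
    config = {}
    for raw_line in text.splitlines():
        # Drop everything after '#', so commented-out keys are ignored.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config


def unset_keys(config, expected=EXPECTED_KEYS):
    """Return the expected keys that are missing or empty."""
    return [key for key in expected if not config.get(key)]


# Placeholder paths taken from the template above; 'my_sample' is made up.
example = """\
OUTDIR: /path/to/output
READS_DIR: /path/to/reads/ # don't add the reads files, just the directory where they are
ASSEMBLY: /path/to/assembly
PREFIX: my_sample
"""

config = parse_simple_config(example)
print("READS_DIR =", config["READS_DIR"])
print("Unset keys:", unset_keys(config))
```

A key that has been commented out simply shows up in the list returned by `unset_keys`, which makes it easy to confirm that the file says what you intended.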

If you want the results to be written to this directory (not to a new directory), comment out

OUTDIR: /path/to/output

RESULTS

  • dated file with an overview of the files used to run the pipeline (for documentation purposes)
  • sorted_reads directory with the file containing the mapped reads
  • results directory containing the qualimap results
  • variant_calling directory containing the variant calling VCF file and a file with VCF statistics
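
After a run, a quick way to confirm that the output directories listed above were created is to count their contents. This is a rough sketch under stated assumptions: `check_outputs` is a hypothetical helper, not something shipped with the pipeline, and the file name `example.bam` in the demo is invented purely for illustration. Only the three directory names come from this page.

```python
import tempfile
from pathlib import Path

# Output directories named in the RESULTS section.
EXPECTED_DIRS = ("sorted_reads", "results", "variant_calling")


def check_outputs(outdir):
    """Map each expected directory to its entry count, or None if it is missing."""
    outdir = Path(outdir)
    report = {}
    for name in EXPECTED_DIRS:
        directory = outdir / name
        if directory.is_dir():
            report[name] = sum(1 for _ in directory.iterdir())
        else:
            report[name] = None
    return report


# Demo on a throwaway directory; in practice, pass your OUTDIR instead.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sorted_reads").mkdir()
    (Path(tmp) / "sorted_reads" / "example.bam").touch()  # hypothetical file name
    demo_report = check_outputs(tmp)

for name, count in demo_report.items():
    status = "missing" if count is None else f"{count} entries"
    print(f"{name}: {status}")
```

A directory reported as missing or empty usually means the corresponding step did not finish, so the Snakemake log is the next place to look.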