Command line tricks for manipulating fastq

From HPCwiki
Revision as of 12:25, 8 December 2013 by Megen002

Extracting a sequence based on read name

<source lang='bash'> gunzip -c reads.fq.gz | sed -n '/^@readname/,+3 p' </source>
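
A sed address matches by regular expression, so a name such as `read1` also selects `read10`, and the `,+3` range is a GNU sed extension. The following is a minimal sketch of a stricter variant that compares the first word of each header line exactly. It assumes four-line FASTQ records; the function name `extract_read` is hypothetical and not part of any tool.

```shell
# Hypothetical helper: print the 4-line FASTQ record whose header starts
# with exactly "@NAME" (NAME is passed without the leading "@").
extract_read() {
    awk -v name="@$1" '
        NR % 4 == 1 { keep = ($1 == name) }  # header line: compare first word only
        keep                                 # print every line of a matching record
    '
}
```

Usage would look like `gunzip -c reads.fq.gz | extract_read readname`. Because only lines in header position are tested, a quality line that happens to begin with `@` is never mistaken for a read name.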

From BAM/SAM to fastq

<source lang='bash'> samtools view bamfile.bam | awk -F'\t' '{print "@"$1"\n"$10"\n+\n"$11}' </source>
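
The one-liner above copies SEQ and QUAL exactly as stored in the alignment. For reads aligned to the reverse strand (FLAG bit 0x10), SAM stores the sequence reverse-complemented and the qualities reversed, so the output differs from the original FASTQ for those reads. Below is a minimal sketch that restores the original orientation. The function name `sam_to_fastq` is hypothetical, and the sketch makes no attempt to drop secondary or supplementary alignments or to label mate 1 and mate 2.

```shell
# Hypothetical helper: convert SAM text on stdin to FASTQ, undoing the
# reverse-complement that SAM applies to reverse-strand reads (FLAG 0x10).
sam_to_fastq() {
    awk -F'\t' '
        function revcomp(s,    i, out, c, base) {
            out = ""
            for (i = length(s); i >= 1; i--) {
                c = substr(s, i, 1)
                base = (c in comp) ? comp[c] : "N"   # unknown symbols become N
                out = out base
            }
            return out
        }
        function reverse(s,    i, out) {
            out = ""
            for (i = length(s); i >= 1; i--) out = out substr(s, i, 1)
            return out
        }
        BEGIN { comp["A"]="T"; comp["C"]="G"; comp["G"]="C"; comp["T"]="A"; comp["N"]="N" }
        /^@/ { next }                        # skip SAM header lines, if present
        {
            seq = $10; qual = $11
            if (int($2 / 16) % 2 == 1) {     # FLAG bit 0x10: reverse strand
                seq = revcomp(seq); qual = reverse(qual)
            }
            print "@" $1 "\n" seq "\n+\n" qual
        }
    '
}
```

Usage would look like `samtools view bamfile.bam | sam_to_fastq`. Recent samtools releases also provide a dedicated `samtools fastq` subcommand, which is likely the better choice for paired-end data.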

From fastq to fasta

<source lang='bash'> gunzip -c fastqfile.fq.gz | sed 's/^@/>/' | awk '{print;getline;print;getline;getline}' </source>
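
The pipeline above relies on every record occupying exactly four lines. The same idea can be written as a single awk program that selects lines by their position within the record, sketched here under that same four-line assumption. The function name `fastq_to_fasta` is hypothetical.

```shell
# Hypothetical helper: convert 4-line FASTQ records on stdin to FASTA.
fastq_to_fasta() {
    awk '
        NR % 4 == 1 { print ">" substr($0, 2) }  # header: replace leading "@" with ">"
        NR % 4 == 2 { print }                    # sequence line, copied unchanged
    '
}
```

Usage would look like `gunzip -c fastqfile.fq.gz | fastq_to_fasta > fastafile.fa`. Only header lines are rewritten, so quality strings that begin with `@` are never touched.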

Counting number of bases in a fastq file

<source lang='bash'> gunzip -c fastqfile.fq.gz | awk '{getline;print;getline;getline}' | tr -d '\n' | wc -c </source>
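
Note that plain `wc` reports lines, words and characters, and its character count includes one newline per sequence line, so the number of bases is the character count minus the line count. A minimal sketch that reports reads and bases in one pass, assuming four-line FASTQ records, is shown below. The function name `fastq_stats` is hypothetical.

```shell
# Hypothetical helper: count reads and bases in 4-line FASTQ read from stdin.
fastq_stats() {
    awk '
        NR % 4 == 2 { reads++; bases += length($0) }   # sequence lines only
        END { printf "reads\t%d\nbases\t%d\n", reads, bases }
    '
}
```

Usage would look like `gunzip -c fastqfile.fq.gz | fastq_stats`. The output is two tab-separated lines, one for the read count and one for the base count.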