Command line tricks for manipulating fastq

From HPCwiki
Revision as of 13:25, 8 December 2013 by Megen002

Extracting a sequence based on read name

gunzip -c reads.fq.gz | sed -n '/readname/,+3 p'
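The sed pattern above matches "readname" anywhere on any line, so it also picks up longer names that contain it (read1 matches read10) and, in rare cases, a quality line. A minimal sketch of an exact-match alternative follows. The function name extract_read is hypothetical, and the sketch assumes strict 4-line records with no wrapped sequence or quality lines.

```shell
# extract_read NAME  (hypothetical helper, not part of any toolkit)
# Reads FASTQ on stdin and prints the 4-line record whose header is "@NAME",
# optionally followed by whitespace and a description.
# Assumes every record is exactly 4 lines.
extract_read() {
    awk -v name="$1" '
        NR % 4 == 1 {
            # Header line: drop the leading "@" and keep only the first word
            id = substr($1, 2)
            keep = (id == name)
        }
        keep { print }   # prints the header and the 3 lines that follow it
    '
}

# Usage:
#   gunzip -c reads.fq.gz | extract_read readname
```

Because the name is compared only on header lines (every fourth line, starting at the first), a quality string that happens to begin with "@readname" is never mistaken for a header.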

From BAM/SAM to fastq

samtools view bamfile.bam | grep -v '^@' | awk '{print "@"$1"\n"$10"\n+\n"$11}'
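The one-liner above prints one record per alignment line, so a read with secondary or supplementary alignments is written more than once. The sketch below skips those lines by testing the FLAG column (bits 0x100 and 0x800). The function name sam_to_fastq is hypothetical. Like the one-liner, it does not reverse-complement reads aligned to the reverse strand (stored reverse-complemented in BAM) and does not separate the two mates of a pair. Recent samtools releases include a "samtools fastq" subcommand that is likely the better choice when those details matter.

```shell
# sam_to_fastq  (hypothetical helper)
# Reads SAM text on stdin and writes FASTQ, one record per primary alignment line.
# Caveat: sequences of reverse-strand reads are emitted as stored in the SAM
# record, that is, NOT converted back to their original orientation.
sam_to_fastq() {
    awk -F '\t' '
        /^@/ { next }                      # header lines (present only with "samtools view -h")
        int($2 / 256) % 2 == 1 { next }    # FLAG bit 0x100: secondary alignment
        int($2 / 2048) % 2 == 1 { next }   # FLAG bit 0x800: supplementary alignment
        { print "@" $1 "\n" $10 "\n+\n" $11 }   # QNAME, SEQ, separator, QUAL
    '
}

# Usage:
#   samtools view bamfile.bam | sam_to_fastq
```

The flag tests use integer division rather than a bitwise function so that the sketch does not depend on gawk-specific extensions.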

From fastq to fasta

gunzip -c fastqfile.fq.gz | sed 's/^@/>/' | awk '{print;getline;print;getline;getline}'
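The same conversion can be written as a single awk program that selects lines by their position in the record, which avoids rewriting "@" on lines that are later discarded. This is a sketch under the assumption of strict 4-line records. The function name fastq_to_fasta is hypothetical.

```shell
# fastq_to_fasta  (hypothetical helper)
# Reads FASTQ on stdin and writes FASTA on stdout.
# Assumes every record is exactly 4 lines (no wrapped sequences).
fastq_to_fasta() {
    awk '
        NR % 4 == 1 { print ">" substr($0, 2) }   # header: replace the leading "@" with ">"
        NR % 4 == 2 { print }                     # sequence line, copied unchanged
    '
}

# Usage:
#   gunzip -c fastqfile.fq.gz | fastq_to_fasta > fastafile.fa
```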

Counting number of bases in a fastq file

gunzip -c fastqfile.fq.gz | awk 'NR % 4 == 2 {n += length($0)} END {printf "%.0f\n", n}'
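When the number of reads is wanted alongside the number of bases, both can be collected in one pass over the sequence lines. The sketch below is one way to do that. The function name fastq_stats and the tab-separated output layout are hypothetical choices, and strict 4-line records are assumed.

```shell
# fastq_stats  (hypothetical helper)
# Reads FASTQ on stdin and prints read count, base count and mean read length.
# Assumes every record is exactly 4 lines and lines end in "\n" (not "\r\n").
fastq_stats() {
    awk '
        NR % 4 == 2 { reads++; bases += length($0) }   # sequence lines only
        END {
            # %.0f is used instead of %d so that totals above 2^31 print in full
            printf "reads\t%.0f\n", reads
            printf "bases\t%.0f\n", bases
            printf "mean_length\t%.1f\n", (reads ? bases / reads : 0)
        }
    '
}

# Usage:
#   gunzip -c fastqfile.fq.gz | fastq_stats
```

Summing length($0) counts only the characters of each sequence line, so the trailing newlines are not included in the total.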