Making slices from BAM files

This article describes how to automate the creation of slices from a collection of BAM files. This procedure assumes that all BAM files are together in a single folder.

Variable that points to the collection of BAM files

First, make a variable BAMDIR that holds the path to the BAM files. Keep the trailing slash, because the later commands append file names directly to this path:

BAMDIR='/shared/Sus/BAM_files_hjm_newbuild10_2/'

Create a variable with all your BAM file names

Then create a variable, BAMFILE, that holds the names of all the BAM files, without the directory and without the .bam extension:

BAMFILE=$(ls "$BAMDIR"*.bam | sed 's|.*/||' | sed 's/\.bam$//')
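As an alternative to piping ls through sed, the names can be collected with a shell glob and basename. The following is only a sketch: bam_names is a hypothetical helper name, and the demonstration runs on a throwaway directory of empty placeholder files rather than on real BAM files.

```shell
# Hypothetical helper: print the name of every .bam file in a directory,
# without the directory part and without the .bam suffix.
bam_names() {
    for bn_file in "$1"/*.bam; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$bn_file" ] || continue
        basename "$bn_file" .bam
    done
}

# Demonstration on a temporary directory with empty placeholder files.
demo_dir=$(mktemp -d)
touch "$demo_dir/sampleA.bam" "$demo_dir/sampleB.bam" "$demo_dir/notes.txt"
bam_names "$demo_dir"
```

With the real data, BAMFILE=$(bam_names "$BAMDIR") should give the same list of names as the command above.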

Check that your variable contains valid information

Optional: if you are not sure that everything went well, check what the variable contains:

echo $BAMFILE
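For a long list, it can be easier to print one name per line and to count the names. This is a minimal sketch: the BAMFILE value is a stand-in used only for illustration, and count_words is a hypothetical helper.

```shell
# Stand-in value for illustration; normally BAMFILE is set as described above.
BAMFILE='sampleA sampleB sampleC'

# Print one name per line. $BAMFILE is left unquoted on purpose,
# so that the shell splits it into separate words.
printf '%s\n' $BAMFILE

# Hypothetical helper: print how many arguments it received.
count_words() { echo "$#"; }

# Number of names, to compare with the number of .bam files you expect.
count_words $BAMFILE
```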

Make a slice for each BAM file

The slices can then be extracted with a short shell loop. It assumes that samtools is installed and in your PATH, and that each BAM file is coordinate-sorted and indexed, which samtools needs in order to extract a region. The example here slices out the region containing the KIT gene in Sus scrofa build 10.2.

for i in $BAMFILE; do
    echo "$i"
    samtools view -b "$BAMDIR$i.bam" Ssc10_2_8:43212683-43981653 > "$i.SSC8_43212683-43981653.bam"
done
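To reuse the loop for other regions, the region can be passed as a parameter, and each BAM file can be checked for an index first. The sketch below makes several assumptions: slice_name, has_index and slice_all are hypothetical helper names, the output naming (region with ':' replaced by '_') differs from the hand-written name used above, and the index is assumed to sit next to the BAM file as <name>.bam.bai or <name>.bai.

```shell
# Hypothetical helper: build an output file name from a sample name and a
# region, replacing ':' so the region is safe to use in a file name.
slice_name() {
    printf '%s.%s.bam\n' "$1" "$(printf '%s' "$2" | tr ':' '_')"
}

# Hypothetical helper: succeed if an index file sits next to the BAM file
# (assumes the conventional <name>.bam.bai or <name>.bai naming).
has_index() {
    [ -e "$1.bai" ] || [ -e "${1%.bam}.bai" ]
}

# Slice one region out of every named BAM file.
# Usage: slice_all DIR REGION NAME...
# DIR must end with a slash, like BAMDIR above.
slice_all() {
    sa_dir=$1
    sa_region=$2
    shift 2
    for sa_name in "$@"; do
        sa_bam="$sa_dir$sa_name.bam"
        if ! has_index "$sa_bam"; then
            echo "skipping $sa_name: no index found" >&2
            continue
        fi
        samtools view -b "$sa_bam" "$sa_region" > "$(slice_name "$sa_name" "$sa_region")"
    done
}
```

The KIT example would then be run as: slice_all "$BAMDIR" Ssc10_2_8:43212683-43981653 $BAMFILE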