Making slices from BAM files

Latest revision as of 14:21, 14 March 2012

This article describes how to automate the creation of slices from a collection of BAM files. This procedure assumes that all BAM files are together in a single folder.

== Variable that points to the collection of BAM files ==
First, make a variable BAMDIR that holds the path to the BAM files. Keep the trailing slash, because later commands append the file name directly to this path:
<source lang='bash'>
BAMDIR='/shared/Sus/BAM_files_hjm_newbuild10_2/'
</source>
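Because the path and the file name are joined without a separator, a missing trailing slash would produce a wrong path. The following is a minimal sketch of one way to guard against that using standard shell parameter expansion; the path is the one from this article, written without the slash on purpose.

```shell
# Path from this article, deliberately written without the trailing slash.
BAMDIR='/shared/Sus/BAM_files_hjm_newbuild10_2'

# Strip one trailing slash if present, then add exactly one back.
BAMDIR="${BAMDIR%/}/"

echo "$BAMDIR"
```

The same line is harmless when the slash is already there, so it can be left in place regardless of how BAMDIR was typed.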

== Create a variable with all your BAM file names ==
Then create a variable, BAMFILE, that holds the names of all BAM files, with the directory and the .bam extension stripped:
<source lang='bash'>
BAMFILE=$(ls "$BAMDIR"*.bam | sed 's/.\+\///' | sed 's/\.bam$//')
</source>
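The same list of names can also be built without parsing the output of ls. The sketch below is one possible alternative, not the method used in this article: it loops over a glob and strips the directory and extension with parameter expansion. The temporary directory and the file names sampleA, sampleB, and notes.txt are hypothetical and exist only so the example can be run anywhere.

```shell
# Throwaway demo directory with hypothetical file names.
BAMDIR="$(mktemp -d)/"
touch "${BAMDIR}sampleA.bam" "${BAMDIR}sampleB.bam" "${BAMDIR}notes.txt"

BAMFILE=''
for path in "$BAMDIR"*.bam; do
    [ -e "$path" ] || continue            # skip the unexpanded pattern if no BAM files exist
    name="${path##*/}"                    # drop the directory part
    name="${name%.bam}"                   # drop the .bam extension
    BAMFILE="${BAMFILE:+$BAMFILE }$name"  # append, separated by a space
done

echo "$BAMFILE"

# Remove the demo directory again.
rm -r "$BAMDIR"
```

Like the original command, this stores the names as a space-separated string, so it assumes that the file names contain no whitespace.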

== Check that your variable contains valid information ==
Optional: to confirm that the previous step worked, print the contents of the variable:
<source lang='bash'>
echo $BAMFILE
</source>
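Printing the variable shows the names but not whether each one maps back to a readable file. A slightly stricter check could look like the sketch below. The demo directory, the names sampleA and sampleB, and the counter variable missing are hypothetical; sampleB is left out on purpose to show what a failure looks like.

```shell
# Throwaway demo setup: only sampleA.bam actually exists.
BAMDIR="$(mktemp -d)/"
touch "${BAMDIR}sampleA.bam"
BAMFILE='sampleA sampleB'

missing=0
for i in $BAMFILE; do
    if [ -r "$BAMDIR$i.bam" ]; then
        echo "ok       $i"
    else
        echo "MISSING  $i"
        missing=$((missing + 1))
    fi
done
echo "$missing file(s) missing"

# Remove the demo directory again.
rm -r "$BAMDIR"
```

With a real BAMDIR and BAMFILE, only the loop and the final echo are needed.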

== Make a slice for each BAM file ==
The slices can then be extracted with a short shell loop. It assumes that samtools is installed and in your path, and that each BAM file is indexed, since samtools needs the index to look up a region. The example here slices out the region containing the KIT gene in Sus scrofa build 10.2.
<source lang='bash'>
for i in $BAMFILE; do
    echo "$i"
    samtools view -b "$BAMDIR$i.bam" Ssc10_2_8:43212683-43981653 > "$i.SSC8_43212683-43981653.bam"
done
</source>
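Before running the loop on a large collection, it can help to preview the commands it would execute. The sketch below only prints the samtools command for each name and does not run it. The variables REGION and LABEL, the function slice_command, and the names sampleA and sampleB are hypothetical helpers introduced for illustration; the region, output suffix, and path are the ones used in this article.

```shell
REGION='Ssc10_2_8:43212683-43981653'   # region used in this article
LABEL='SSC8_43212683-43981653'         # output suffix used in this article

# Print (do not run) the samtools command for one BAM name.
# $1 is the directory including the trailing slash, $2 is the BAM name without extension.
slice_command() {
    printf 'samtools view -b %s%s.bam %s > %s.%s.bam\n' "$1" "$2" "$REGION" "$2" "$LABEL"
}

BAMDIR='/shared/Sus/BAM_files_hjm_newbuild10_2/'
for i in sampleA sampleB; do           # hypothetical names standing in for $BAMFILE
    slice_command "$BAMDIR" "$i"
done
```

Keeping the region in a single variable also means that slicing a different region requires changing only REGION and LABEL.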