JBrowse



* JBrowse is available for multiple species:
** https://jbrowse.hpcagrogenomics.wur.nl/pig/
** https://jbrowse.hpcagrogenomics.wur.nl/chicken/
** https://jbrowse.hpcagrogenomics.wur.nl/cattle/
** https://jbrowse.hpcagrogenomics.wur.nl/turkey/
** https://jbrowse.hpcagrogenomics.wur.nl/Cyprinus_carpio/
* Users are free to add useful commands to this tutorial.
 
=== Install JBrowse ===
 
Download the latest JBrowse here: http://jbrowse.org/
 
Make a directory in <code>/shared/apps/jbrowse/</code> for your species of interest (e.g. <code>mkdir Cyprinus_carpio</code>) and move the downloaded JBrowse source files there. All further procedures on this wiki page assume that you are working from that directory. NOTE: if your species of interest is already there, contact the maintainer of that JBrowse instance.
Run the setup script to install the Perl dependencies and required modules:
 
<pre>
unzip JBrowse-1.12.0.zip
mv JBrowse-1.12.0/* $PWD
./setup.sh
</pre>
 
=== Add reference sequence ===
 
Example for the chicken genome:
 
<pre>
bin/prepare-refseqs.pl --fasta /lustre/nobackup/WUR/ABGC/shared/public_data_store/genomes/chicken/Ensembl74/Gallus_gallus.Galgal4.74.dna.toplevel.fa
</pre>
 
To remove a track, use the following command (<code>-D</code> also deletes the track data, not only the track configuration):
 
<pre>
bin/remove-track.pl -D --trackLabel 'trackname'
</pre>
 
===Add annotation files (GFF/BED)===
 
Data can be downloaded from the Ensembl FTP site: http://www.ensembl.org/info/data/ftp/index.html
 
Add gene features:
 
<pre>
bin/flatfile-to-json.pl --key "Genes" --type gene --config '{ "category": "GalGal4.83 Annotation" }' --trackLabel Genes --gff ../ensembl_data/Gallus_gallus.Galgal4.83.gff3
</pre>
 
Add corresponding transcripts:
 
<pre>
bin/flatfile-to-json.pl --key "Transcripts" --className transcript --subfeatureClasses '{"exon": "exon", "CDS": "CDS", "five_prime_UTR": "five_prime_UTR", "three_prime_UTR": "three_prime_UTR"}' --config '{ "category": "GalGal4.83 Annotation" }' --type transcript --trackLabel Transcripts --gff ../ensembl_data/Gallus_gallus.Galgal4.83.gff3
</pre>
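The <code>--type</code> option selects features by the type given in column 3 of the GFF3 file, and the type names differ between annotation sources and releases. A minimal sketch for listing which types a file actually contains before loading it; the function name <code>gff_types</code> is made up for this example and is not part of JBrowse:

```shell
# List the feature types (GFF3 column 3) in a file, with a count for each.
# gff_types is a hypothetical helper name, not a JBrowse script.
gff_types() {
    # Skip comment and directive lines; count only complete 9-column records.
    awk -F '\t' '!/^#/ && NF >= 9 { count[$3]++ }
                 END { for (t in count) printf "%d\t%s\n", count[t], t }' "$1" | sort -k2
}

# Example call (path taken from the commands above):
# gff_types ../ensembl_data/Gallus_gallus.Galgal4.83.gff3
```

If the list shows, for example, <code>mRNA</code> rather than <code>transcript</code>, the value passed to <code>--type</code> probably needs to be adjusted accordingly.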
 
===Alignment tracks (BAM)===
 
You can load a single BAM file with the following command:
<pre>
bin/add-bam-track.pl --label <label> --bam_url <url>
</pre>
 
To load all BAM files in a given directory, use:
<pre>
mkdir -p track_symlinks
for bam in /<dir>/*.bam; do
        ln -s "$bam" track_symlinks/      ## Symlink the BAM file
        ln -s "$bam.bai" track_symlinks/  ## Symlink the BAM index file
        tissue=$(basename "$bam" .bam)    ## Use the file name without .bam as trackLabel

        ## Add BAM in alignment mode (Alignments2)
        echo '{
                "label" : "'${tissue}'_alignment",
                "key" : "'${tissue}'_alignment",
                "storeClass" : "JBrowse/Store/SeqFeature/BAM",
                "urlTemplate" : "../track_symlinks/'${tissue}'.bam",
                "category" : "3. RNA-seq alignments",
                "type" : "Alignments2"
        }' | bin/add-track-json.pl data/trackList.json

        ## Add BAM in coverage mode (SNPCoverage)
        echo '{
                "label" : "'${tissue}'_coverage",
                "key" : "'${tissue}'_coverage",
                "storeClass" : "JBrowse/Store/SeqFeature/BAM",
                "urlTemplate" : "../track_symlinks/'${tissue}'.bam",
                "category" : "3. RNA-seq alignments",
                "type" : "SNPCoverage"
        }' | bin/add-track-json.pl data/trackList.json

done
</pre>
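Before writing to <code>data/trackList.json</code>, it can help to preview the configuration that would be generated for one BAM file. The following is only a sketch: the function names <code>track_name</code> and <code>bam_track_json</code> are invented for this example, and it assumes the symlink lives in <code>track_symlinks/</code> under its original file name.

```shell
# Derive the track name from a BAM path: drop the directory and the .bam suffix.
# Both function names are hypothetical helpers, not JBrowse scripts.
track_name() {
    basename "$1" .bam
}

# Print an Alignments2 track configuration for one BAM file to stdout.
# Assumes the BAM file is symlinked into track_symlinks/ under its own name.
bam_track_json() {
    name=$(track_name "$1")
    cat <<EOF
{
    "label" : "${name}_alignment",
    "key" : "${name}_alignment",
    "storeClass" : "JBrowse/Store/SeqFeature/BAM",
    "urlTemplate" : "../track_symlinks/${name}.bam",
    "category" : "3. RNA-seq alignments",
    "type" : "Alignments2"
}
EOF
}
```

Once the output looks right, it can be piped into <code>bin/add-track-json.pl data/trackList.json</code> in the same way as the <code>echo</code> commands in the loop.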


Make sure the BAM file is readable by everybody. If it is not, use:
<pre>
chmod +r <BAM_file>
</pre>


Make sure that all directories in the full path of the BAM file are executable:
<pre>
chmod +x <dir>
</pre>
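Both permission checks can be done in one pass by walking up from the BAM file to the root directory. This is a rough sketch: <code>check_world_access</code> is a made-up name, and it assumes the web server reaches the files through the "other" permission bits (it does not look at group membership or ACLs).

```shell
# Report every part of a path that "other" users cannot use.
# check_world_access is a hypothetical helper, not part of JBrowse.
check_world_access() {
    file=$1
    status=0
    # ls -ldL follows symlinks; characters 8 and 10 of the mode string
    # are the read and execute bits for "other".
    perms=$(ls -ldL "$file" | cut -c1-10)
    case $(printf '%s' "$perms" | cut -c8) in
        r) ;;
        *) echo "not world-readable: $file"; status=1 ;;
    esac
    dir=$(dirname "$file")
    while :; do
        perms=$(ls -ldL "$dir" | cut -c1-10)
        case $(printf '%s' "$perms" | cut -c10) in
            x|t) ;;   # t = executable with the sticky bit set (e.g. /tmp)
            *) echo "not world-executable: $dir"; status=1 ;;
        esac
        # Stop at the root, or at "." when a relative path was given.
        if [ "$dir" = "/" ] || [ "$dir" = "." ]; then break; fi
        dir=$(dirname "$dir")
    done
    return $status
}
```

Each reported path can then be fixed with the matching <code>chmod +r</code> or <code>chmod +x</code> command shown above.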


===Variant tracks (VCF)===


To load a VCF file in JBrowse, make sure the file is compressed with bgzip and indexed with tabix:
 
<pre>
tabix -p vcf Gallus_gallus_incl_consequences.vcf.gz
 
echo ' {
      "label" : "Gallus_gallus_incl_consequences",
      "key" : "Gallus_gallus_incl_consequences",
      "storeClass" : "JBrowse/Store/SeqFeature/VCFTabix",
      "urlTemplate" : "../../ensembl_data/VCF/Gallus_gallus_incl_consequences.vcf.gz",
      "category" : "2. Variants",
      "type" : "HTMLVariants"
    } ' | bin/add-track-json.pl data/trackList.json
</pre>
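A file compressed with plain <code>gzip</code> has the same <code>.gz</code> extension as a bgzip-compressed one, but tabix cannot index it. The sketch below checks the two preconditions before a track is added. The function names are invented for this example, and the header test assumes the BGZF layout described in the SAM/BAM specification, with the <code>BC</code> extra subfield first.

```shell
# Return success if the file starts with a BGZF (bgzip) header.
# is_bgzf and vcf_ready are hypothetical helpers, not JBrowse or htslib tools.
is_bgzf() {
    # First 14 bytes as hex: gzip magic 1f 8b, method 08, flags 04,
    # then 8 bytes we ignore, then the extra-subfield id "BC" (42 43).
    header=$(od -An -tx1 -N14 "$1" | tr -d ' \n')
    case $header in
        1f8b0804????????????????4243) return 0 ;;
        *) return 1 ;;
    esac
}

# Check that a VCF is bgzip-compressed and has a tabix index next to it.
vcf_ready() {
    if ! is_bgzf "$1"; then
        echo "$1 is not bgzip-compressed"
        return 1
    fi
    if [ ! -f "$1.tbi" ]; then
        echo "$1.tbi is missing; run: tabix -p vcf $1"
        return 1
    fi
}
```

For example, <code>vcf_ready Gallus_gallus_incl_consequences.vcf.gz</code> prints nothing and returns success only when both conditions hold.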
 
===Wiggle/BigWig tracks (WIG)===
 
You can load a single BigWig file with the following command:
<pre>
bin/add-bw-track.pl --label <label> --bw_url <url>
</pre>
 
===Evidence tracks===
 
Evidence tracks can be loaded in BED, GFF and GenBank (gbk) format using:
 
<pre>
bin/flatfile-to-json.pl
</pre>


Examples are given above.

Latest revision as of 10:01, 16 June 2023

Typical commands used to set up a JBrowse instance

Author: Martijn Derks
