Assembly & Annotation: Difference between revisions

Revision as of 11:45, 21 January 2016

Protocol with typical commands used for de novo assembly and annotation

Preprocessing
Assembly
Annotation
Submission
Visualization

Software

Preprocessing

Quality control:

Check quality of your data using FastQC and fastq_stats.py.

 fastqc ../*.gz

Review the report to assess read quality and to identify potential adapter and primer sequences in the reads.
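
When many files are checked at once, it can help to list only the modules that FastQC flagged. The sketch below assumes the report has been unzipped and that its `summary.txt` is tab-separated with the status (PASS, WARN or FAIL) in the first column and the module name in the second. The function name `flagged_modules` is hypothetical and not part of this protocol's scripts.

```shell
# List every FastQC module that did not pass.
# $1: path to summary.txt inside an unzipped FastQC report
#     (assumed columns: status <TAB> module name <TAB> file name)
flagged_modules() {
    awk -F '\t' '$1 != "PASS" { print $1 ": " $2 }' "$1"
}

# Usage (path is a placeholder):
#   flagged_modules <sample>_fastqc/summary.txt
```

A WARN or FAIL on the adapter content module is the cue to look up the adapter sequences before trimming.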

K-mer analysis:

Use the script kmer_analysis.sh to estimate genomic properties from the k-mer distribution, including genome size and percentage of heterozygosity.

 preprocessing/kmer_analysis.sh -m <kmer_size> -c <error-cutoff> -s <hash-size> -t <threads> -i <R1.fastq.gz R2.fastq.gz ...> -o <output_dir>
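
To illustrate how a genome size can be read off a k-mer distribution, here is a minimal sketch of one common approximation: total k-mers above the error cutoff divided by the depth of the main peak. It is not necessarily what kmer_analysis.sh does internally. The function name `estimate_genome_size` and the two-column histogram input (`<depth> <count>` per line) are assumptions made for this example.

```shell
# Rough genome size estimate from a k-mer histogram.
# $1: histogram file, one "<depth> <count>" pair per line (assumed format)
# $2: error cutoff; depths below this are treated as sequencing errors
estimate_genome_size() {
    awk -v cutoff="$2" '
        $1 >= cutoff {
            total += $1 * $2                 # k-mers contributed at this depth
            if ($2 > peak_count) {           # track the most frequent depth
                peak_count = $2
                peak_depth = $1
            }
        }
        END {
            if (peak_depth > 0) printf "%.0f\n", total / peak_depth
        }' "$1"
}

# Usage (file name and cutoff are placeholders):
#   estimate_genome_size <kmer_histogram.txt> <error-cutoff>
```

In a strongly heterozygous genome the tallest peak may be the heterozygous one at half the homozygous depth, in which case this simple estimate would be roughly twice too large, so the histogram should still be inspected by eye.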

Trimming:

Use Trimmomatic to trim Illumina data. Make sure the FASTA file containing the adapter sequences matches the adapters reported by FastQC.

Use the following script:

 preprocessing/run_trimmomatic.sh -t <num_threads> -f <FW_reads.fastq> -r <RV_reads.fastq>
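
One way to confirm that the chosen adapter file matches the data is to count the reads containing a given adapter sequence before and after trimming. The sketch below reads uncompressed FASTQ on standard input and assumes four lines per record. The function name `count_adapter_reads` is hypothetical, and the adapter sequence must be supplied by the user.

```shell
# Count reads whose sequence line contains the given adapter string.
# $1: adapter sequence (exact match, no mismatches allowed)
# stdin: FASTQ with exactly four lines per record
count_adapter_reads() {
    awk -v adapter="$1" '
        NR % 4 == 2 && index($0, adapter) > 0 { hits++ }
        END { print hits + 0 }'
}

# Usage (placeholders in angle brackets):
#   count_adapter_reads <adapter_sequence> < <FW_reads.fastq>
#   gzip -cd <FW_reads.fastq.gz> | count_adapter_reads <adapter_sequence>
```

If the count drops to near zero after trimming, the adapter file most likely matched. Because the match is exact, partial adapters at read ends and adapters with sequencing errors are not counted.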

Error correction:

Lighter is a fast tool for correcting errors in Illumina reads.

Use the following script:

 preprocessing/run_lighter_error_correction.sh -g <genome_size> -c <coverage> -f <FW_reads.fastq> -r <RV_reads.fastq>
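
The script expects a coverage value. If it is not known, it can be approximated as the total number of sequenced bases divided by the genome size. The sketch below does this for uncompressed FASTQ files with four lines per record. The function name `estimate_coverage` is hypothetical, and the genome size is assumed to come from the k-mer analysis or from the literature.

```shell
# Approximate average coverage: total bases in the reads / genome size.
# $1: genome size in bases
# remaining arguments: uncompressed FASTQ files (four lines per record)
estimate_coverage() {
    local genome_size="$1"
    shift
    cat "$@" | LC_ALL=C awk -v g="$genome_size" '
        NR % 4 == 2 { bases += length($0) }   # sequence lines only
        END { if (g > 0) printf "%.1f\n", bases / g }'
}

# Usage (placeholders in angle brackets):
#   estimate_coverage <genome_size> <FW_reads.fastq> <RV_reads.fastq>
```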

Organelle assembly:

Download a suitable reference from the NCBI database. Use the IOGA pipeline to assemble the organellar genome.

 assembly/run_IOGA.sh -a <assembly> -f <fw_reads.fastq> -r <reverse_reads.fastq> -i <insert_size> -t <num_threads> -n <name_prefix>

Map your reads to the newly assembled genome and manually check if it is circular.

Use Pilon to correct remaining errors in the assembly using the mapped reads.

Annotate using MITOS or DOGMA online tools.

Submit here: http://www.ncbi.nlm.nih.gov/LargeDirSubs/dir_submit.cgi
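
As a complement to the manual circularity check on the mapped reads described above, a quick heuristic is to test whether the end of the assembled sequence repeats its start, which assemblers often produce for circular molecules. The sketch below is not part of the IOGA pipeline. The function name `terminal_overlap` and its default thresholds are assumptions, and only exact overlaps in a single-record FASTA file are detected.

```shell
# Report the length of the longest exact overlap between the start and
# the end of a sequence (0 if none is found).
# $1: FASTA file with a single record
# $2: minimum overlap length to report (default 20)
# $3: maximum overlap length to test (default 1000)
terminal_overlap() {
    awk -v min_len="${2:-20}" -v max_len="${3:-1000}" '
        !/^>/ { seq = seq toupper($0) }       # join sequence lines
        END {
            n = length(seq)
            limit = int(n / 2)
            if (limit > max_len) limit = max_len
            for (k = limit; k >= min_len; k--) {
                if (substr(seq, 1, k) == substr(seq, n - k + 1, k)) {
                    print k
                    exit
                }
            }
            print 0
        }' "$1"
}

# Usage (file name is a placeholder):
#   terminal_overlap <organelle_assembly.fasta>
```

A non-zero result suggests the contig wraps around and the duplicated end can be trimmed. A result of 0 does not rule out circularity, so the read mapping should still be inspected.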

Assembly

Annotation

Functional annotation

Submission

Other types of analysis
