Assembly & Annotation: Difference between revisions

Revision as of 10:45, 21 January 2016

Protocol with typical commands used for de novo assembly and annotation

Preprocessing
Assembly
Annotation
Submission
Visualization

Software

Preprocessing

Quality control:

Check the quality of your data using FastQC and fastq_stats.py. <source lang='bash'> fastqc ../*.gz </source> Explore the report to assess read quality and to identify potential adapter and primer sequences.
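
The fastq_stats.py script is not shown on this page, so its exact output is unknown. As a rough stand-in, the sketch below computes a few basic numbers (read count, total bases, mean read length) directly in the shell. The function name `fastq_summary` is hypothetical, and the code assumes standard four-line FASTQ records without wrapped sequence lines.

```shell
# Hypothetical helper, not part of the protocol scripts.
# Prints read count, total bases and mean read length for one FASTQ file.
# Assumes four lines per record (sequence on every 2nd line of each record).
fastq_summary() {
    local fq="$1"
    # Decompress .gz input, read plain FASTQ as-is
    case "$fq" in
        *.gz) gzip -cd "$fq" ;;
        *)    cat "$fq" ;;
    esac | awk '
        NR % 4 == 2 { reads++; bases += length($0) }
        END {
            mean = (reads > 0) ? bases / reads : 0
            printf "reads=%d bases=%d mean_length=%.1f\n", reads, bases, mean
        }'
}
```

For example, `fastq_summary ../sample_R1.fastq.gz` (file name made up for illustration) prints one summary line that can be compared between the forward and reverse files.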

K-mer analysis:

Use the script kmer_analysis.sh to estimate genomic properties from the k-mer distribution. These properties include the genome size and the percentage of heterozygosity.

<source lang='bash'> Preprocessing/kmer_analysis.sh -m <kmer_size> -c <error-cutoff> -s <hash-size> -t <threads> -i <R1.fastq.gz R2.fastq.gz ...> -o <output_dir> </source>
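
How kmer_analysis.sh derives its estimates is not documented here. A common back-of-the-envelope approach is to divide the total number of non-error k-mers by the depth of the main peak in the k-mer histogram. The sketch below illustrates that idea only; the function name `estimate_genome_size` and the two-column histogram input (multiplicity, number of distinct k-mers, as produced by many k-mer counters) are assumptions, and the error cutoff plays a role similar to the `-c` option above.

```shell
# Hypothetical helper, not part of the protocol scripts.
# Usage: estimate_genome_size <histogram_file> <error_cutoff>
# Input: two columns per line: <multiplicity> <number_of_distinct_kmers>
# K-mers with multiplicity below the cutoff are treated as sequencing errors.
estimate_genome_size() {
    local histo="$1" cutoff="$2"
    awk -v c="$cutoff" '
        ($1 + 0) >= (c + 0) {
            total += $1 * $2                      # total k-mer occurrences
            if ($2 > peak_n) { peak_n = $2; peak = $1 }   # most populated depth
        }
        END {
            if (peak == 0) {
                print "no k-mers at or above the cutoff" > "/dev/stderr"
                exit 1
            }
            printf "peak_depth=%d genome_size=%d\n", peak, total / peak
        }' "$histo"
}
```

In a strongly heterozygous genome the most populated depth may be the half-coverage (heterozygous) peak rather than the homozygous one, so the peak picked here should be checked against a plot of the histogram.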

Trimming:

Use Trimmomatic to trim Illumina data. Make sure the FASTA file with the adapters contains the adapters found in the FastQC report. Use the following script: <source lang='bash'> preprocessing/run_trimmomatic.sh -t <num_threads> -f <FW_reads.fastq> -r <RV_reads.fastq> </source>
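
One way to check that the adapter FASTA file covers what FastQC reported is to search it for the reported sequence. The sketch below is a minimal example of such a check; the function name `find_adapter` is hypothetical, and the adapter sequence must be copied by hand from the FastQC report.

```shell
# Hypothetical helper, not part of the protocol scripts.
# Usage: find_adapter <adapter_sequence> <adapters.fasta>
# Prints the name of every FASTA record containing the sequence
# (case-insensitive, sequence lines of a record are joined first).
# Exit status is 1 when no record matches.
find_adapter() {
    local seq="$1" fasta="$2"
    awk -v q="$seq" '
        function check() {
            if (name != "" && index(toupper(s), toupper(q)) > 0) {
                print name
                found = 1
            }
        }
        /^>/ { check(); name = substr($0, 2); s = ""; next }
             { s = s $0 }
        END  { check(); exit(found ? 0 : 1) }
    ' "$fasta"
}
```

If the function prints nothing for a sequence that FastQC flags as overrepresented, that adapter is probably missing from the FASTA file and should be added before trimming.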

Error correction

Lighter is a fast tool to error correct your Illumina data. Use the following script: <source lang='bash'> preprocessing/run_lighter_error_correction.sh -g <genome_size> -c <coverage> -f <FW_reads.fastq> -r <RV_reads.fastq> </source>
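
The script above expects a genome size and a coverage value. The genome size can come from the k-mer analysis; the coverage can then be approximated as the total number of sequenced bases divided by the genome size. The sketch below shows that calculation under the assumption of four-line FASTQ records; the function name `estimate_coverage` is hypothetical.

```shell
# Hypothetical helper, not part of the protocol scripts.
# Usage: estimate_coverage <genome_size_bp> <reads.fastq[.gz]> [more files ...]
# Prints total bases in all files divided by the genome size.
estimate_coverage() {
    local genome_size="$1"; shift
    local fq
    for fq in "$@"; do
        # Decompress .gz input, read plain FASTQ as-is
        case "$fq" in
            *.gz) gzip -cd "$fq" ;;
            *)    cat "$fq" ;;
        esac
    done | awk -v g="$genome_size" '
        NR % 4 == 2 { bases += length($0) }   # sequence line of each record
        END {
            if (g + 0 <= 0) exit 1            # reject a missing or zero genome size
            printf "%.1f\n", bases / g
        }'
}
```

The result is only an approximation, since it counts every base before any further filtering, but it is usually close enough to choose the coverage argument.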

Organelle assembly

Download a suitable reference from the NCBI database. Use the IOGA pipeline to assemble the organellar genome. <source lang='bash'> assembly/run_IOGA.sh -a <assembly> -f <fw_reads.fastq> -r <reverse_reads.fastq> -i <insert_size> -t <num_threads> -n <name_prefix> </source> Map your reads to the newly assembled genome and manually check whether it is circular.
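
The page does not name a mapper for this final check. As one possible sketch, the function below uses bwa and samtools (both are assumptions, as is the function name `map_reads`) to produce a sorted, indexed BAM file that can be opened in a genome browser.

```shell
# Hypothetical helper, not part of the protocol scripts.
# Assumes bwa and samtools (1.3 or newer) are installed and on the PATH.
# Usage: map_reads <assembly.fasta> <fw_reads.fastq> <rv_reads.fastq> <num_threads> <out_prefix>
# The body runs in a subshell so the strict shell options do not leak
# into the calling shell.
map_reads() (
    set -euo pipefail
    ref="$1"; fw="$2"; rv="$3"; threads="$4"; prefix="$5"

    # Build the BWA index for the assembled organelle genome
    bwa index "$ref"

    # Align the read pairs and write a coordinate-sorted BAM file
    bwa mem -t "$threads" "$ref" "$fw" "$rv" \
        | samtools sort -@ "$threads" -o "${prefix}.sorted.bam" -

    # Index the BAM so a genome browser can load it
    samtools index "${prefix}.sorted.bam"
)
```

When inspecting the result, read pairs whose mates map to opposite ends of the sequence, together with even coverage up to both ends, are the usual hints that the molecule is circular; the final call remains a manual one.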

