Plant & Animal de novo Sequencing
Cases
Technical Information
Contact Us / Wish List

de novo genome sequencing is used to sequence uncharacterized genomes where there is no reference sequence available, or known genomes where significant structural variation is expected. A detailed genetic analysis of an organism is possible only after de novo sequencing has been performed. To date, BGI Tech has sequenced 656 plant and animal reference genomes. The completed projects include rice, cucumber, potato, wheat, silkworm, panda, ant, oyster, minke whale, and so on.

Benefits:

  1. Best assembly quality with the highest N50 scaffold and contig sizes provided
  2. More comprehensive maps of genetic variation
  3. Variable gradient insert libraries that enable fine mapping of the genome

Comparative Analysis of Bat Genomes Provides Insight into the Evolution of Flight and Immunity. Science. 339: 456-460 (2013).

Comparative analysis of Fruit bat Pteropus alecto and insectivorous Myotis davidii genomes (~ 2 Gb) provides insight into the phylogenetic placement of bats, and moreover reveals evidence of genetic changes that may have contribution to their evolution.

A Heterozygous Moth Genome Provides Insights Into Herbivory and Detoxification. Nature Genetics. 45, 220–225 (2013).

The first genome sequence of the diamondback moth (DBM, 339 Mb) has been published, and this insect is the most destructive pest of brassica crops. This work shows the genetic and molecular bases for the evolutionary success of this worldwide herbivore, and offers insect insights into insect adaptation to host plant and opens new ways for more sustainable pest management.

Draft Genome of the Wheat A-genome Progenitor Triticum urartu. Nature. 496, 87–90 (2013).

The Triticum urartu (AA) draft genome sequence (4.94 Gb) provides a diploid reference for analysis of polyploid wheat genomes and is a valuable resource for the evolution, domestication, and genetic improvement of wheat.

Aegilops tauschii Draft Genome Sequence Reveals a gene Repertoire for Wheat Adaptation. Nature. 496, 91–95 (2013).

The Aegilops tauschii draft genome (4.36 Gb) provides novel insights into its role in enabling environmental adaptation of common wheat and in defining the large and complicated genomes of wheat species.

The Sequence and de novo Assembly of the Giant Panda Genome. Nature. 463:311-317 (2010).

The panda genome was the first genome completely sequenced by next generation sequencing platform alone. It provides clues to the understanding of everything from the panda’s strict bamboo diet to it’s genetic diversity. It may also aid in the panda conservation in the future.

To provide better quality service for our customers, we offer two types of plant and animal de novo sequencing services, X bio and Intelligen.

X bio, which includes genome survey, sequencing, and genome assembly, is best for those customers with some bioinformatics knowledge and the ability to conduct further analysis of the assembly results themselves.

Intelligen is suitable for those customers with greater needs for bioinformatics analysis, and we can provide end-to-end solutions for your specific research needs. This service includes what is offered in X bio plus gene annotation, evolution analysis, and any other personalized analysis.

Workflow:

workflow

Research Strategies:

  1. Library construction strategies:
    1. Paired-end library construction strategy with insert size of 170 bp, 500 bp, and 800 bp
    2. Long-span mate-pair library construction strategy with insert size of 2 kb, 5 kb, 10 kb, 20 kb, and 40 kb
  2. Sequencing strategy:
    1. The average sequence depth is greater than 80X.
    2. Sequencing strategy of pair end library: 91PE or 101PE sequencing
    3. Sequencing strategy of mate-pair library: 50PE or 91PE sequencing
    4. Sequencing using PacBio’s single molecule real-time (SMRT) technology is also available for specific research needs.

Bioinformatics:

X bio -- Best for those customers with some bioinformatics knowledge

Genome survey
  1. Data filtering
  2. K-mer depth distribution analysis and genome size estimation
  3. Genome heterozygous rate estimation
  4. Preliminary assembly
Genome assembly
  1. Genome assembly
  2. GC-Depth distribution analysis
  3. GC-Content distribution analysis
  4. Sequencing depth analysis

Intelligen -- Best for those customers with greater needs for bioinformatics analysis

Genome survey
  1. Data filtering
  2. K-mer depth distribution analysis and genome size estimation
  3. Genome heterozygous rate estimation
  4. Preliminary assembly
Genome assembly
  1. Genome assembly
  2. GC-Depth distribution analysis
  3. GC-Content distribution analysis
  4. Sequencing depth analysis
  5. Evaluation of coverage of autosomal chromosome (BAC or Fosmid sequence should be provided)
  6. Evaluation of coverage of interested genes (EST or transcriptome data should be provided)
Annotation
  1. Repeat annotation
  2. Gene prediction
  3. Gene function annotation
  4. ncRNA annotation
Evolution analysis
  1. Genome map with GC skew and annotation
  2. Gene family identification (animal: TreeFam; plant: OrthoMCL)
  3. Phylogenetic analysis
  4. Estimation of species divergence time
  5. Genomic island prediction
  6. Genome-wide synteny analysis
  7. Segmental duplication analysis (animal: WGD; plant: WGAC)

Sample Requirements

  1. Sample quantity required (single pair):
    • Short-insert libraries: ≥3 µg
    • 2 kb large-insert libraries: ≥20 µg
    • 5 kb-6 kb large-insert libraries: ≥20 µg
    • 10 kb large-insert libraries: ≥30 µg
    • 20 kb and 40 kb large-insert libraries: ≥60 µg
    Note: The total quantity of sample required is also determined by the experimental strategy, as well as the type and number of libraries to be constructed.
  2. Sample concentration:
    • Short-insert libraries: ≥30 ng/ µL
    • Large-insert libraries: ≥133 ng/ µL
  3. Sample purity: OD260/280= 1.8-2.0

Turnaround Time:

Services X bio Intelligen – Common genome Intelligen – Complex genome
Turnaround time 4–9 months (depending on genome complexity) 6 months 1 year

Completion Criteria

Species Completion Indicator
Birds , Mammals (except Chiroptera) Contig N50 ≥ 30 kb; Scaffold N50 ≥ 2 Mb
Plants Contig N50 ≥ 30 kb; Scaffold N50 ≥ 800 kb
Complex genome Contig N50 ≥ 20 kb; Scaffold N50 ≥ 300 kb
Others Contig N50 ≥ 20 kb; Scaffold N50 ≥ 600 kb
 

Technologies