de novo Sequencing

De novo sequencing produces the first genome sequence of an organism. With the advent of rapid, low-cost next-generation sequencing (NGS) technology, researchers can now obtain whole-genome data for organisms previously considered too low a priority to sequence. The availability of these data has enabled large-scale genomic studies that were unimaginable just a few years ago. To date, BGI Tech has sequenced 656 plant and animal reference genomes. Completed projects include rice, cucumber, potato, wheat, silkworm, panda, ant, oyster, and minke whale, among others.

Benefits:

  1. More comprehensive maps of genetic variation
  2. Variable gradient insert libraries enable fine mapping of the genome
  3. Reliable genome assembly with SOAPdenovo, BGI’s independently developed software
  4. NGS high-throughput sequencing reduces cost
  5. Experienced bioinformatics team

Comparative Analysis of Bat Genomes Provides Insight into the Evolution of Flight and Immunity. Science. 339: 456-460 (2013).

Comparative analysis of the genomes (~2 Gb) of the fruit bat Pteropus alecto and the insectivorous bat Myotis davidii provides insight into the phylogenetic placement of bats and reveals evidence of genetic changes that may have contributed to their evolution.

A Heterozygous Moth Genome Provides Insights Into Herbivory and Detoxification. Nature Genetics. 45, 220–225 (2013).

The first genome sequence of the diamondback moth (DBM, 339 Mb), the most destructive pest of brassica crops, has been published. This work reveals the genetic and molecular bases for the evolutionary success of this worldwide herbivore, offers insights into insect adaptation to host plants, and opens new avenues for more sustainable pest management.

Draft Genome of the Wheat A-genome Progenitor Triticum urartu. Nature. 496, 87–90 (2013).

The Triticum urartu (AA) draft genome sequence (4.94 Gb) provides a diploid reference for the analysis of polyploid wheat genomes and is a valuable resource for studying the evolution, domestication, and genetic improvement of wheat.

Aegilops tauschii Draft Genome Sequence Reveals a Gene Repertoire for Wheat Adaptation. Nature. 496, 91–95 (2013).

The Aegilops tauschii draft genome (4.36 Gb) provides novel insights into its role in enabling environmental adaptation of common wheat and in defining the large and complicated genomes of wheat species.

The Sequence and de novo Assembly of the Giant Panda Genome. Nature. 463:311-317 (2010).

The panda genome was the first genome sequenced entirely with a next-generation sequencing platform. It provides clues to understanding everything from the panda’s strict bamboo diet to its genetic diversity, and it may also aid panda conservation in the future.

Workflow:

[Figure: workflow diagram]

Bioinformatics:

Assembly

  1. K-mer depth distribution analysis and genome size estimate
  2. Genome heterozygosity rate estimate
  3. Preliminary assembly
  4. GC-Depth distribution analysis
  5. Sequence depth distributions
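
A common way to obtain the genome size estimate in item 1 is to divide the total number of k-mers by the depth at the main peak of the k-mer depth distribution. The sketch below illustrates only that arithmetic on a toy histogram. The function name, the `min_depth` error cutoff, and the histogram values are assumptions made for illustration and do not describe the service's actual pipeline.

```python
def estimate_genome_size(histogram, min_depth=3):
    """Estimate genome size from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers observed at that depth.
    Depths below min_depth are treated as sequencing errors (an assumption).
    Returns (estimated_size, peak_depth).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Main peak: the depth shared by the largest number of distinct k-mers.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences: each distinct k-mer counted once per observation.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth, peak_depth


# Toy histogram (made-up numbers): error k-mers at depths 1-2, main peak at 20.
toy_hist = {1: 5000, 2: 800, 18: 900, 19: 1500, 20: 2000, 21: 1500, 22: 900}
size, peak = estimate_genome_size(toy_hist)
print(f"peak depth = {peak}, estimated genome size = {size:.0f} bp")
```

Real data usually need more care, for example a heterozygous genome produces a second peak at roughly half the main depth, so this simple peak picker is only a starting point.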

Annotation

  1. Repeat annotation
  2. Gene prediction
  3. Gene function annotation
  4. ncRNA annotation

Evolution analysis for animal and plant species

  1. Orthologous gene clusters (animal: TreeFam, plant: OrthoMCL)
  2. Phylogenetic analysis
  3. Divergence time estimation
  4. Whole genome alignment (genome synteny)
  5. Segmental duplication (animal: WGAC, plant: WGD)

Advanced bioinformatics for microbial species

  1. Genome map with GC skew and annotation
  2. Synteny analysis
  3. Gene family
  4. CRISPR prediction
  5. Genomic island prediction
  6. Prophage prediction
  7. Secreted protein prediction
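
The GC skew shown on the genome map in item 1 is conventionally defined per window as (G - C) / (G + C). The following is a minimal sketch of that calculation, assuming non-overlapping windows by default; the function name, window size, and toy sequence are illustrative choices, not part of the service description.

```python
def gc_skew(sequence, window=1000, step=None):
    """Return a list of (window_start, skew) pairs.

    skew = (G - C) / (G + C) within each window; windows containing
    neither G nor C are reported as 0.0.
    """
    step = step or window
    seq = sequence.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        values.append((start, skew))
    return values


# Toy sequence: a G-rich half followed by a C-rich half.
toy = "GGGC" * 5 + "CCCG" * 5
for start, skew in gc_skew(toy, window=20):
    print(start, skew)
```

In bacterial chromosomes the sign of the skew often changes near the replication origin and terminus, which is why it is typically drawn as a ring on circular genome maps.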

 

Sample Requirements

  1. Sample quantity required (single pair):
    • Short-insert libraries: ≥2.5 µg
    • 2 kb large-insert libraries: ≥20 µg
    • 5 kb-6 kb large-insert libraries: ≥20 µg
    • 10 kb large-insert libraries: ≥30 µg
    • 20 kb and 40 kb large-insert libraries: ≥60 µg
    • PCR-free libraries with high or low GC content: ≥30 µg

    Note: the total sample quantity required is also determined by the experimental strategy, as well as the type and number of libraries to be constructed.

  2. Sample concentration:
    • Short-insert libraries: ≥25 ng/µL
    • Large-insert libraries: ≥133 ng/µL
  3. Sample quality: genomic DNA should be intact.
  4. Sample purity: OD260/280 = 1.8-2.0
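
As a quick pre-submission check, the quantity and concentration requirements above can be combined: total mass is concentration times volume, and both the mass and the concentration must clear their minimums. This is a hypothetical helper written for illustration; the function name and the example sample values are assumptions.

```python
def check_sample(conc_ng_per_ul, volume_ul, min_mass_ug, min_conc_ng_per_ul):
    """Return (passes, total_mass_ug) for a DNA sample.

    total mass (µg) = concentration (ng/µL) * volume (µL) / 1000
    """
    total_mass_ug = conc_ng_per_ul * volume_ul / 1000
    passes = total_mass_ug >= min_mass_ug and conc_ng_per_ul >= min_conc_ng_per_ul
    return passes, total_mass_ug


# Short-insert library: needs >= 2.5 µg at >= 25 ng/µL.
ok_short, mass_short = check_sample(50, 60, min_mass_ug=2.5, min_conc_ng_per_ul=25)
print(ok_short, mass_short)

# 2 kb large-insert library: needs >= 20 µg at >= 133 ng/µL.
# This sample has enough mass but is too dilute.
ok_large, mass_large = check_sample(100, 250, min_mass_ug=20, min_conc_ng_per_ul=133)
print(ok_large, mass_large)
```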

Turnaround Time:

Animals/Plants
    • Survey: 2 months from sample qualification
    • Common genome: 6 months from sample qualification
    • Complex genome: 12 months from sample qualification
Fungi
    • Survey: 40 business days
    • Draft map: 50 business days
    • Fine map: 50 business days (from completion of survey)
Bacteria
    • Survey: 40 business days
    • Fine map: 60 business days
    • Complete map: 75 business days

 

Completion Criteria

 

Genomic map of plant or animal species (common genome)

Genome Size (GS)                             Assembly Indicator
GS ≤ 300 Mb                                  Contig N50 > 20 kb; Scaffold N50 > 600 kb
300 Mb < GS ≤ 1500 Mb (except birds)         Contig N50 > 20 kb; Scaffold N50 > 600 kb
1500 Mb < GS ≤ 3000 Mb (except mammals)      Contig N50 > 20 kb; Scaffold N50 > 300 kb
                                             Contig N50 > 10 kb; Scaffold N50 > 150 kb
GS < 1600 Mb (birds)                         Contig N50 > 20 kb; Scaffold N50 > 300 kb
GS < 3200 Mb (mammals, except Chiroptera)    Contig N50 > 20 kb; Scaffold N50 > 2 Mb
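
The assembly indicators above are expressed as contig and scaffold N50 values. N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The sketch below computes N50 and checks an assembly against one pair of thresholds; the function names and the toy length lists are assumptions for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from longest to shortest until half the total length is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def meets_criteria(contig_lengths, scaffold_lengths,
                   min_contig_n50, min_scaffold_n50):
    """Check both N50 values against strict lower bounds."""
    return (n50(contig_lengths) > min_contig_n50
            and n50(scaffold_lengths) > min_scaffold_n50)


# Toy assembly (made-up lengths in bp).
contigs = [30_000, 25_000, 22_000, 8_000, 5_000]
scaffolds = [900_000, 700_000, 400_000]
print(n50(contigs), n50(scaffolds))
# Thresholds: Contig N50 > 20 kb; Scaffold N50 > 600 kb.
print(meets_criteria(contigs, scaffolds, 20_000, 600_000))
```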

 

Genomic map of microbial species

Fungi
    • Survey: sequencing depth ≥ 30X
    • Draft map: sequencing depth ≥ 50X
    • Fine map: coverage of the chromosome or chromatin genome > 95%; coverage of gene regions > 98%; Scaffold N50 > 300 kb; overall sequencing depth ≥ 50X
Bacteria
    • Survey: sequencing depth ≥ 100X
    • Fine map: coverage of the chromosome or chromatin genome > 95%; coverage of gene regions > 98%; overall sequencing depth ≥ 100X
    • Complete map: 1 contig sequence with PCR validation
Small genome
    • Survey: sequencing depth ≥ 100X
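
The depth thresholds above (for example, ≥ 100X for a bacterial survey) refer to average sequencing depth, which is usually calculated as total sequenced bases divided by genome size. A minimal sketch of that calculation follows, assuming the genome size is already known or estimated; the function names and the example run sizes are hypothetical.

```python
def sequencing_depth(total_bases, genome_size):
    """Average depth = total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def passes_depth(total_bases, genome_size, min_depth):
    """True when the average depth reaches the required minimum."""
    return sequencing_depth(total_bases, genome_size) >= min_depth


# Hypothetical bacterial run: 600 Mb of reads against a 5 Mb genome.
depth = sequencing_depth(600_000_000, 5_000_000)
print(f"{depth:.0f}X", passes_depth(600_000_000, 5_000_000, min_depth=100))
```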

 

Technologies