Target Region Sequencing

Target region capture enriches specific regions (e.g., the MHC region) or specific genes by hybridization with probes designed against the genomic regions of interest. Targeted region sequencing is a cost-effective way to find variants across large numbers of samples. BGI has completed over 65,000 analyses using target region sequencing and has developed a suite of software tools (e.g., SOAPsnp) to analyze the sequencing data and generate precise alignments, accurate variant calls, and custom analysis results. Many successful cases have been published in journals such as Science and Mammalian Genome.

Benefits:

  1. Targeted: Focus on the regions of interest, such as exons, promoters, and enhancers.
  2. Cost effective: Restricting sequencing to a narrowed region greatly lowers cost, which is a significant advantage for projects with large sample sizes and deep sequencing.
  3. Wide applications: QTL fine mapping, association studies, result validation for large numbers of samples, and clinical sequencing.
  4. High quality data (See examples of data generated by BGI).
  5. Rich experience: Our technicians have performed a large number of targeted region sequencing projects; they are familiar with experimental methods and trouble-shooting techniques.
  6. Custom-tailored: Custom bioinformatics analysis is available for specific research purposes and data characteristics. Internal software evaluation is available to update the analysis pipeline and ensure an optimized final analysis result for the customer.

Customer Testimonial:

"We use BGI as a NGS service provider for many of our targeted region re-sequencing projects. BGI has consistently provided high-quality data with a very quick turnaround time. I know exactly what to expect in terms of delivery time and data quality, which is vital for our business operations." -Dr. Eric Lin, Chairman of the Board, Otogenetics Corporation

Targeted Next-generation Sequencing as a Comprehensive Test for Patients With and Female Carriers of DMD/BMD: a Multi-population Diagnostic Study. European Journal of Human Genetics. Doi: 10.1038/ejhg. (2013).

Duchenne and Becker muscular dystrophies (DMD/BMD) are the most commonly inherited neuromuscular diseases. However, accurate and convenient molecular diagnosis cannot be achieved easily because of the enormous size of the dystrophin gene and complex causative mutation spectrum. Here, we introduce a new single-step method for the genetic analysis of DMD patients and female carriers in real clinical settings and demonstrate the validation of its accuracy. A total of 89 patients, 18 female carriers and 245 non-DMD patients were evaluated using our targeted NGS approaches. We detected novel partial deletions of exons in nine samples for which the breakpoints were located within exonic regions. The results proved that our new method is suitable for routine clinical practice, with a shorter turnaround time, higher accuracy, and better insight into comprehensive genetic information (detailed breakpoints) for ensuing gene therapy.

An Abundance of Rare Functional Variants in 202 Drug Target Genes Sequenced in 14,002 People. Science. 337(6090): 100-4. (2012).

Rare genetic variants contribute to complex disease risk. However, the abundance of rare variants in human populations remains unknown. This paper explored this spectrum of variation by sequencing 202 genes encoding drug targets in 14,002 individuals. The research found that rare variants are abundant (1 every 17 bases) and geographically localized, so that even with large sample sizes, rare variant catalogs will be largely incomplete. The paper used the observed patterns of variation to estimate population growth parameters, the proportion of variants in a given frequency class that are putatively deleterious, and mutation rates for each gene. The paper concludes that because of rapid population growth and weak purifying selection, human populations harbor an abundance of rare variants, many of which are deleterious and have relevance to understanding disease risk.

Bioinformatics:

Standard Bioinformatics Analysis

  1. Data filtering (removing adaptor contamination and low-quality reads from raw reads)
  2. Read alignment to the human reference genome (UCSC build hg19) using BWA
  3. Assessment of sequencing quality, including data production statistics, sequencing depth distribution and coverage uniformity
  4. SNP calling (tools: SAMtools, SOAPsnp, or GATK)
  5. SNP annotation (annotate each SNP to the corresponding gene functional units in RefGene database, including nucleotide and amino acid changes)
  6. SNP validation and comparison [with dbSNP database, 1000 Genomes Project database, publicly available exome databases (ESP), and YH (YH is applied only in the Pan Asia-Pacific region)]
  7. Functionality and conservation prediction of SNPs (based on SIFT, PolyPhen-2, PhyloP, GERP score, MutationAssessor, Condel, and FATHMM)
  8. Statistics of SNPs in each functional element
  9. InDel calling (tools: SAMtools or GATK)
  10. InDel annotation (annotate each InDel to the corresponding gene functional units in RefGene database, including nucleotide and amino acid changes)
  11. InDel validation and comparison [with dbSNP database, 1000 Genomes Project database, publicly available exome databases (ESP), and YH (YH is applied only in the Pan Asia-Pacific region)]
  12. Statistics of InDels in each functional element
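As an illustration of the sequencing quality assessment in step 3 above, the following minimal sketch summarizes per-base depth over the target region. It is not BGI's pipeline: the function name, the input format (a plain list of per-base depths), the depth thresholds, and the uniformity definition (fraction of bases covered at 20% of the mean depth or more) are all assumptions chosen for the example.

```python
from statistics import mean


def coverage_summary(depths, thresholds=(1, 10, 20), uniformity_factor=0.2):
    """Summarize per-base read depth over the targeted bases.

    depths: one read depth per targeted base (hypothetical input, e.g.
            parsed from a per-base depth table restricted to the targets).
    thresholds: depth cutoffs to report coverage at (assumed values).
    uniformity_factor: a base counts as uniformly covered when its depth
            is at least this fraction of the mean depth (assumed definition).
    """
    if not depths:
        raise ValueError("depths must not be empty")
    n = len(depths)
    avg = mean(depths)
    summary = {"mean_depth": avg}
    # Fraction of targeted bases covered at each depth cutoff.
    for t in thresholds:
        summary[f"frac_ge_{t}x"] = sum(d >= t for d in depths) / n
    # Coverage uniformity relative to the mean depth.
    cutoff = uniformity_factor * avg
    summary["uniformity"] = sum(d >= cutoff for d in depths) / n
    return summary


example = coverage_summary([0, 5, 12, 30, 40, 33, 25, 15])
print(example)
```

In practice the per-base depths would come from the aligned reads, and the cutoffs would be set to match the depth required by the downstream variant caller.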

Advanced Bioinformatics Analysis

General Advanced Analysis

  1. Non-coding SNP calling, annotation, and statistics
  2. Non-coding InDel calling, annotation, and statistics

Cancer Advanced Analysis

  1. Preliminary identification for the paired tumor-normal samples based on MassARRAY (recommended prior to sequencing)
  2. Somatic SNV and InDel calling, annotation, and statistics [somatic SNVs and InDels are compared against public databases such as dbSNP, 1000 Genomes Project, and YH (YH is applied only in the Pan Asia-Pacific region)]
  3. Somatic SNV/InDel annotation against the COSMIC database
  4. Functionality and conservation prediction of somatic SNVs (based on SIFT, PolyPhen-2, PhyloP, GERP score, MutationAssessor, Condel, and FATHMM)
  5. Non-synonymous annotation for mutated genes against the CancerGeneCensus database

Complex Diseases Advanced Analysis

  1. Sample design and power calculation (during project design stage)
  2. Population-level SNP calling and linkage disequilibrium (LD)-based genotype calling
  3. SNP annotation and statistics (including OMIM and ENCODE annotation)
  4. Quality control (QC) for population SNPs, including base quality, read mapping quality, allele balance, DNA strand bias, homopolymer runs, HWE test, and SNP filtering within 5 bp of an InDel
  5. Sample QC, including cryptic relatedness analysis, sample contamination detection based on inbreeding coefficient, and PCA population stratification detection
  6. Single site SNP association test
  7. eQTL analysis
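The HWE test listed in the SNP QC step above can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. This is a textbook version given for illustration only; the function name and inputs are hypothetical, and the source does not say which HWE test is actually used. Exact tests are generally preferred when genotype counts are small.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test for Hardy-Weinberg equilibrium at a biallelic SNP.

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    # Allele frequencies estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    # Classes with zero expected count (monomorphic sites) are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2_ok, p_ok = hwe_chi_square(25, 50, 25)    # counts exactly in HWE
chi2_bad, p_bad = hwe_chi_square(50, 0, 50)   # no heterozygotes at all
print(chi2_ok, p_ok, chi2_bad, p_bad)
```

A QC filter would then drop SNPs whose p-value falls below a chosen threshold; that threshold is a project-level decision and is not specified in the source.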

Population Advanced Analysis

  1. Population-level SNP calling, annotation, and statistics
  2. Quality control (QC) for population SNPs, including base quality, read mapping quality, allele balance, DNA strand bias, homopolymer runs, HWE test, and SNP filtering within 5 bp of an InDel
  3. Genotype imputation and haplotype phasing analyses based on reference panels
  4. Sample QC, including relatedness detection and contamination detection based on inbreeding coefficient
  5. Selection signal detection, especially recent selection events, based on iHS and XP-EHH tests with validation using DDAF, Fst, and Tajima's D methods
  6. GO functional analysis, KEGG and PANTHER pathway enrichment analyses for the candidate selected genes
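Of the statistics named in the selection signal step above, Fst is the simplest to show. The sketch below uses the basic heterozygosity-based definition, Fst = (H_T - H_S) / H_T, for one biallelic SNP with equally weighted populations. It is a simplified textbook estimator given for illustration, not necessarily the one used in the service; the function name and input format are assumptions.

```python
def fst_biallelic(allele_freqs):
    """Wright's Fst for one biallelic SNP from per-population frequencies.

    allele_freqs: frequency of the same allele in each population
                  (hypothetical input; populations are weighted equally).
    """
    if len(allele_freqs) < 2:
        raise ValueError("need at least two populations")
    k = len(allele_freqs)
    # H_S: mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in allele_freqs) / k
    # H_T: expected heterozygosity of the pooled population.
    p_bar = sum(allele_freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        raise ValueError("site is monomorphic across all populations")
    return (h_t - h_s) / h_t


fixed_difference = fst_biallelic([0.0, 1.0])   # populations fixed for different alleles
no_difference = fst_biallelic([0.3, 0.3])      # identical allele frequencies
print(fixed_difference, no_difference)
```

Values near 1 indicate strong differentiation between populations at the site, and values near 0 indicate little or none.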

Mendelian Disorders Advanced Analysis

Please contact technical support ([email protected]) for details.

Custom Bioinformatics Analysis

Customized Analysis of Complex Disease

(Applied to large-scale sample design, ≥200 cases & controls are recommended)
  1. Gene-based association analysis (CMC, SKAT-O, WSS, KBAC, Collapse)
  2. Candidate gene prioritization
  3. Conditional analysis for candidate SNPs
  4. LD and haplotype block analyses involving candidate SNPs
  5. GO functional analysis and KEGG pathway enrichment analysis for the candidate selected genes
  6. Pathway-based protein interaction analysis (pairwise or multiple interaction)
  7. ROC curve plot and heritability estimation
  8. eQTL analysis
  9. Population-level CNV calling, annotation, and statistics
  10. Association analysis of CNVs (Merge & Split)

De novo mutation analysis based on family samples

  1. De novo SNV calling, annotation, and statistics
  2. De novo InDel calling, annotation, and statistics
  (The following items apply to studies of multiple families; ≥30 trios or 10-20 quads are recommended.)

  3. Association analysis of de novo SNVs/InDels with disease phenotypes
  4. GO functional analysis, KEGG pathway enrichment analysis, and protein-protein interaction analysis
  5. Parental origin tracing for de novo SNVs

Kits Available for Target Region Sequencing

Sample Requirements:

Human Target Region Sequencing

For the genomic DNA samples:
  1. Purity: OD260/280 = 1.8-2.0, without degradation and RNA contamination
  2. Concentration: ≥ 37.5 ng/μl
  3. Quantity: 1 μg (2.5 μg gDNA recommended)

Mouse Target Region Sequencing

For the genomic DNA samples:
  1. Purity: OD260/280 = 1.8-2.0, without degradation and RNA contamination
  2. Concentration: ≥ 37.5 ng/μl
  3. Quantity: 1 μg (2.5 μg gDNA recommended)
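The sample requirements above can be turned into a quick pre-submission check. The sketch below is illustrative only: the function name and inputs are hypothetical, and it treats 1 μg as a hard minimum and 2.5 μg as the recommended amount, which is one reading of the requirement. Total quantity is derived as concentration times volume.

```python
def check_gdna_sample(od_260_280, conc_ng_per_ul, volume_ul):
    """Return a list of problems for a genomic DNA sample (empty if none).

    Thresholds are taken from the stated requirements: OD260/280 of
    1.8-2.0, concentration of at least 37.5 ng/ul, and at least 1 ug of
    DNA (assumed to be a hard minimum; 2.5 ug is recommended).
    """
    problems = []
    if not 1.8 <= od_260_280 <= 2.0:
        problems.append("OD260/280 outside 1.8-2.0")
    if conc_ng_per_ul < 37.5:
        problems.append("concentration below 37.5 ng/ul")
    # Total quantity in micrograms = ng/ul * ul / 1000.
    total_ug = conc_ng_per_ul * volume_ul / 1000.0
    if total_ug < 1.0:
        problems.append("total quantity below 1 ug")
    elif total_ug < 2.5:
        problems.append("total quantity below the recommended 2.5 ug")
    return problems


good_sample = check_gdna_sample(1.85, 50.0, 60.0)   # 3 ug in total
poor_sample = check_gdna_sample(1.60, 30.0, 20.0)   # 0.6 ug in total
print(good_sample, poor_sample)
```

At the minimum concentration of 37.5 ng/μl, the 1 μg quantity corresponds to roughly 27 μl of sample. Degradation and RNA contamination still need to be assessed separately, for example by gel electrophoresis.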

Turnaround Time:

The standard turnaround time can be as short as 6-8 weeks from receipt of the chips for projects of up to 100 samples. This includes library construction, sequencing, and standard bioinformatics analysis.
