Virus Integration Sequencing

A century of tumor virology has revealed that seven types of viruses cause 10–15% of all human malignancies. Viruses can cause cellular transformation by expressing viral oncogenes, by integrating into the genome and altering the activity of cellular proto-oncogenes or tumor suppressors, or by inducing inflammation that promotes oncogenesis. Viral etiology is particularly evident in cervical carcinoma, which is almost exclusively caused by high-risk human papillomaviruses (HPV), and in hepatocellular carcinoma, where infection with hepatitis B virus (HBV) or hepatitis C virus (HCV) is the predominant cause in some countries. BGI provides comprehensive solutions for identifying virus integration sites in virus-related tumors using whole genome sequencing and target region sequencing, which can help elucidate the mechanisms of virus-induced tumorigenesis and tumor development.


  1. Novel and more accurate capture method using probes designed from virus sequences
  2. Comprehensive investigation of virus integration sites in a cost-effective way

Genome-wide Survey of Recurrent HBV Integration in Hepatocellular Carcinoma. Nature Genetics. 44:765-769 (2012).

Hepatocellular carcinoma (HCC) is a common solid tumor and represents the third leading cause of cancer deaths worldwide. Thus far, three mechanisms have been reported to explain how hepatitis B virus (HBV) promotes carcinogenesis. In this study, we investigated HBV integration events and their effects on the HCC genome using whole-genome sequencing (~30X depth, coverage >99%) and integrated expression profiling analyses. Ultimately, 339 HBV integration breakpoints were discovered. After validation by RNA-Seq and Sanger sequencing, recurrent HBV integration events (i.e., in ≥4 HCCs) were identified at several genes that have been linked to cancer, including TERT, MLL4 and CCNE1, which also showed upregulated expression in tumors compared to normal tissue.
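The recurrence criterion above (integration at the same gene in ≥4 HCCs) can be illustrated with a minimal Python sketch. It is not the analysis code from the study: the function name, the input layout (sample ID paired with the gene nearest each breakpoint), and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def recurrent_genes(breakpoints, min_samples=4):
    """Return {gene: number of distinct samples} for genes hit in >= min_samples samples.

    `breakpoints` is an iterable of (sample_id, gene) pairs; this layout is
    a hypothetical one chosen for the example.
    """
    samples_by_gene = defaultdict(set)
    for sample_id, gene in breakpoints:
        # A set ensures several breakpoints in one tumor count only once.
        samples_by_gene[gene].add(sample_id)
    return {
        gene: len(samples)
        for gene, samples in samples_by_gene.items()
        if len(samples) >= min_samples
    }


# Toy, made-up input; not data from the study.
toy = [
    ("T1", "TERT"), ("T2", "TERT"), ("T3", "TERT"), ("T4", "TERT"),
    ("T1", "TERT"),                      # second breakpoint in the same tumor
    ("T1", "GENE_X"), ("T2", "GENE_X"),  # seen in only two tumors
]
print(recurrent_genes(toy))  # {'TERT': 4}
```

Counting distinct samples rather than breakpoints keeps a single tumor with many integrations near one gene from being mistaken for a recurrent event.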

Traditional methods for studying virus integration include chromosome walking PCR, qPCR, and FISH. However, these methods are tedious, low-throughput, and imprecise with regard to location and copy number, which greatly limits progress in the field. To alleviate these problems, new techniques such as whole genome sequencing (WGS) have been applied to the study of virus integration. WGS has single-base resolution, so integration sites can be located accurately in a single experiment. Unfortunately, this technology has been cost-prohibitive thus far. To resolve this issue, BGI has developed an HBV integration capture sequencing technique that uses capture probes designed from viral sequences. This technique can comprehensively, accurately, and cost-effectively identify virus integration sites and virus type in virus-related tumors.
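The text does not describe how integration sites are called from the sequencing reads. One common idea, sketched here purely as an assumption and not as BGI's pipeline, is to align reads to a combined human plus virus reference and keep read pairs in which one mate maps to a human chromosome and the other to the viral genome. The contig names and the pair representation below are hypothetical.

```python
def is_chimeric(pair, viral_contigs):
    """True if exactly one mate of the pair maps to a viral contig."""
    mate1_contig, mate2_contig = pair
    return (mate1_contig in viral_contigs) != (mate2_contig in viral_contigs)


def chimeric_pairs(pairs, viral_contigs=frozenset({"HBV"})):
    """Keep only the human-virus read pairs, which are candidate integration evidence."""
    return [pair for pair in pairs if is_chimeric(pair, viral_contigs)]


# Toy input: each pair holds the contig names that mate 1 and mate 2 aligned to.
toy_pairs = [
    ("chr5", "chr5"),   # both mates human: not informative
    ("chr5", "HBV"),    # human-virus pair: candidate
    ("HBV", "HBV"),     # both mates viral: free or episomal virus
    ("HBV", "chr19"),   # human-virus pair: candidate
]
print(chimeric_pairs(toy_pairs))  # [('chr5', 'HBV'), ('HBV', 'chr19')]
```

A real workflow would also use split reads that cross the junction to place the breakpoint at base resolution, and would filter on mapping quality; both are omitted here for brevity.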

The workflow is as follows:

Figure 1. Workflow of virus integration sequencing at BGI.

Specifically, BGI has developed a virus capture array using Agilent SureSelect target enrichment (Figure 2). We have unique virus capture chips for four of the most relevant viruses (HBV, HPV, EBV, and HIV), and our experience and process are most extensive for HBV capture. To date, more than 1,000 liver cancer samples have been analyzed for HBV integration at BGI.

Virus integration capture sequencing was highly concordant with whole genome sequencing (WGS) and also detected low-frequency integration events that WGS missed. For example, when the two methods were applied to the same lung cancer sample, WGS detected 63 breakpoints, while virus integration capture sequencing identified all 63 of those breakpoints plus 41 additional ones (BGI internal data). This suggests that virus integration capture sequencing is more sensitive than WGS.
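The comparison above amounts to set arithmetic on the two call sets. The sketch below reproduces only the counts quoted in the text (63 and 41); the breakpoint identifiers are placeholders, and exact matching is a simplifying assumption, since a real comparison would probably treat calls within a few bases of each other as the same breakpoint.

```python
def compare_calls(wgs_calls, capture_calls):
    """Count breakpoints shared by both methods and unique to each."""
    wgs, capture = set(wgs_calls), set(capture_calls)
    return {
        "shared": len(wgs & capture),
        "wgs_only": len(wgs - capture),
        "capture_only": len(capture - wgs),
    }


# Placeholder IDs standing in for breakpoints; only the counts come from the text.
wgs = {f"bp{i}" for i in range(63)}
capture = wgs | {f"bp{i}" for i in range(63, 104)}

summary = compare_calls(wgs, capture)
print(summary)  # {'shared': 63, 'wgs_only': 0, 'capture_only': 41}

# Share of all breakpoints seen by either method that WGS recovered.
wgs_fraction = len(wgs) / len(wgs | capture)
print(round(wgs_fraction, 3))  # 0.606
```

On these counts, WGS recovered 63 of the 104 breakpoints found by either method, which is the basis for the sensitivity claim.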

Figure 2. Virus capture array developed by BGI.