Identification of a novel gene fusion (BMX-ARHGAP) in gastric cardia adenocarcinoma
Diagnostic Pathology volume 9, Article number: 218 (2014)
Gastric cardia adenocarcinoma (GCA) is one of the major causes of cancer-related mortality worldwide. We aimed to provide new understanding of the pathogenesis of GCA by investigating alterations in gene expression.
We performed RNA-Seq on one pair of GCA and matched non-tumor tissues. Differentially expressed genes (DEGs) and fusion genes were identified. PCR and gel analysis were performed in an additional 14 pairs of samples to validate the chimeric transcripts.
In total, 1590 up-regulated and 709 down-regulated genes were detected. Functional analysis revealed that these DEGs were significantly overrepresented in gene ontology terms related to cell cycle, tumor invasion and proliferation. Moreover, we discovered 3 fusion genes in GCA for the first time: BMX-ARHGAP, LRP5-LITAF and CBX3-C15orf57. The chimeric transcript BMX-ARHGAP was validated and occurred recurrently, in 4/15 tumor tissues.
Our results may provide new understanding of GCA and biomarkers for further therapeutic studies.
The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/13000_2014_218
Gastric cancer is the fourth most common malignant cancer and the second major cause of cancer-related death. It is widely believed that gastric cancer is a heterogeneous disease with multiple environmental and genetic etiologies. To date, aberrant gene expression and epigenetic alterations have been identified as being involved in the pathogenesis of gastric cancer. These abnormal changes could lead to perturbations in normal cellular homeostasis and neoplastic transformation of the gastric mucosa. In particular, disruption of a number of regulatory signaling pathways could create a permissive environment for carcinogenesis, invasiveness and metastasis. Gastric adenocarcinoma comprises 95% of malignant gastric tumors. It is classified as proximal (originating in the cardia) or distal (originating distal to the cardia). Unlike that of distal adenocarcinomas, the incidence of gastric cardia adenocarcinoma (GCA) has increased significantly in recent years. In addition, compared with distal adenocarcinomas, GCA seems to be more aggressive, with deeper gastric wall invasion and worse prognosis. Therefore, more effort is needed to uncover the pathogenesis of GCA.
With the rapid development of next generation sequencing (NGS), many cancer-related genes have been identified. NGS technology makes it possible to comprehensively map the genetic alterations of cancer. Specifically, massively parallel RNA sequencing (RNA-Seq) allows identification of genome-wide gene expression and structural variation in individual samples, and facilitates full characterization of cellular transcriptomes. Consequently, RNA-Seq has become a revolutionary tool for transcriptome profiling and for measuring the expression levels of various transcripts and isoforms. Currently, investigations of GCA using RNA-Seq are still limited.
In this study, we generated comprehensive mRNA profiles in a pair of GCA and adjacent non-tumor tissues. We performed transcriptome-wide, unbiased analyses of the RNA-Seq data to identify different kinds of gene transcriptional aberrations (mRNA expression and chimeric transcript) to investigate the molecular mechanism of gastric cancer pathogenesis. Interesting fusion genes were further validated in other independent samples. Our results may provide new understanding of GCA pathogenesis and new targets for future therapeutic studies.
The GCA tissue and adjacent non-tumor tissue (5 cm away from the tumor) were collected from a 65-year-old male patient who was diagnosed with T4N0M0 stage IIB GCA in 2012. The tumor size was 5 × 4 × 1 cm. The tumor was moderately differentiated and had not spread to nearby lymph nodes or other organs. However, both venous invasion and nerve invasion were positive.
For validation of fusion genes of interest, another 14 pairs of stage IIB GCA and adjacent non-tumor tissues were obtained with the same procedure. Signed informed consent documents were obtained from all patients. The scientific use of these samples was approved by the Institutional Review Board of the People’s Hospital, Jingjiang, Jiangsu, China.
RNA-Seq library preparation and Illumina sequencing
Total RNA was isolated from frozen tissue using Trizol (Invitrogen) and its quality was assessed using an Agilent Bioanalyzer. RNA-Seq libraries were prepared with the TruSeq RNA Sample Prep Kit (Illumina Cat. No. FC-122-1001) according to standard protocols. The libraries were then sequenced on an Illumina Genome Analyzer IIx with 115 bp paired-end reads.
RNA-Seq data processing
For the raw data, low-quality reads were filtered according to the following criteria: (1) reads containing adaptors were removed; (2) nucleotides with a quality score lower than 20 were trimmed. The clean reads were aligned against both the hg19 genome and the transcript reference using Bowtie 2.0.0 and TopHat 1.3.1. Mapped reads without biological interest, such as ribosomal RNAs and mitochondrial RNA, were removed. The read mapping results were visualized using the Integrative Genomics Viewer (IGV 2.0.26).
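The two read-filtering criteria can be illustrated with a minimal sketch. It is not the pipeline used in the study: the adaptor sequence, the Phred+33 quality encoding, trimming from the 3' end only, and the minimum retained length (`min_len`) are all assumptions introduced here for illustration.

```python
from typing import Optional, Tuple

PHRED_OFFSET = 33          # assumption: Phred+33 quality encoding
ADAPTER = "AGATCGGAAGAGC"  # assumption: adaptor sequence is not given in the text


def trim_low_quality_tail(seq: str, qual: str, min_q: int = 20) -> Tuple[str, str]:
    """Remove bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < min_q:
        end -= 1
    return seq[:end], qual[:end]


def clean_read(seq: str, qual: str, min_q: int = 20,
               min_len: int = 30) -> Optional[Tuple[str, str]]:
    """Return the cleaned read, or None if the read is discarded.

    Criterion 1: reads containing the adaptor are discarded.
    Criterion 2: low-quality 3' bases (score < min_q) are trimmed.
    min_len is a hypothetical extra threshold, not taken from the text.
    """
    if ADAPTER in seq:
        return None
    seq, qual = trim_low_quality_tail(seq, qual, min_q)
    if len(seq) < min_len:
        return None
    return seq, qual
```

For example, `clean_read("ACGTACGT", "IIIIII##", min_len=4)` keeps the first six bases, because `#` encodes a Phred score of 2 under the assumed encoding.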
Detection of differentially expressed genes (DEGs)
To identify DEGs, expression values of human genes were normalized as fragments per kilobase of exon per million mapped fragments (FPKM) using Cufflinks (1.0.3). DEGs between GCA and non-tumor tissues were determined with a significance cutoff of q-value <0.05. In addition, gene ontology (GO) and KEGG pathway enrichment analyses were performed via the web tool DAVID (http://david.abcc.ncifcrf.gov/) with a false discovery rate (FDR) <0.05.
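As a rough illustration of the FPKM normalization and the q-value cutoff, the sketch below uses the textbook formula FPKM = 10^9 × C / (N × L), where C is the number of fragments assigned to a gene, N the total number of mapped fragments and L the exon length in bp. Cufflinks estimates fragment assignment with a statistical model, so this is a simplification; the function names and the example numbers are hypothetical.

```python
def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and total fragments must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def classify_gene(fpkm_tumor: float, fpkm_normal: float,
                  q_value: float, q_cutoff: float = 0.05) -> str:
    """Label a gene as up- or down-regulated in tumor when q < q_cutoff."""
    if q_value >= q_cutoff or fpkm_tumor == fpkm_normal:
        return "not significant"
    return "up" if fpkm_tumor > fpkm_normal else "down"


# Hypothetical gene: 500 fragments, 2 kb of exon, 20 million mapped fragments.
example = fpkm(500, 2000, 20_000_000)  # 12.5
```

A gene with `example` as its tumor FPKM, a lower non-tumor FPKM and a q-value of 0.01 would be labelled "up" by `classify_gene`.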
Alternative splicing event detection
Alternative splicing events were detected with the MISO software, including alternative 3’ splice sites (A3SS), alternative 5’ splice sites (A5SS), mutually exclusive exons (MXE), retained introns (RI) and skipped exons (SE). Differentially expressed isoforms were further identified with the following criteria: 1) the value of gene expression difference >0.2; 2) Bayes factor >10; 3) inclusive reads >1, exclusive reads >1 and sum of inclusive and exclusive reads >10.
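The three filtering criteria can be written as a single predicate. This is a sketch only: it assumes that the "difference" in criterion 1 is the absolute difference in Psi between tumor and non-tumor tissue, and the record layout (`SplicingEvent`) is hypothetical rather than MISO's own output format.

```python
from dataclasses import dataclass


@dataclass
class SplicingEvent:
    event_id: str
    event_type: str       # one of A3SS, A5SS, MXE, RI, SE
    delta_psi: float      # assumption: Psi(tumor) - Psi(non-tumor)
    bayes_factor: float
    inclusion_reads: int
    exclusion_reads: int


def passes_filters(ev: SplicingEvent,
                   min_abs_delta: float = 0.2,
                   min_bayes_factor: float = 10.0) -> bool:
    """Apply the three criteria listed above to one splicing event."""
    return (
        abs(ev.delta_psi) > min_abs_delta
        and ev.bayes_factor > min_bayes_factor
        and ev.inclusion_reads > 1
        and ev.exclusion_reads > 1
        and ev.inclusion_reads + ev.exclusion_reads > 10
    )


def filter_events(events):
    """Keep only the events that satisfy every criterion."""
    return [ev for ev in events if passes_filters(ev)]
```

Keeping the thresholds as keyword arguments makes it easy to check how sensitive the number of retained events is to each cutoff.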
Fusion gene identification
Fusion genes were identified by deFuse and TopHat. The filtering processes of deFuse were carried out as previously described. Fusion genes detected by TopHat had to meet the following criteria: 1) the number of supporting reads should be more than 3; 2) the supporting reads should not map to ribosomal protein or small nuclear ribonucleoprotein genes. Only fusion genes detected by both methods were retained.
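The consensus step (keep only fusions reported by both callers, after filtering the TopHat calls) might look like the sketch below. The input structures are hypothetical, and the gene-symbol prefix test is a crude stand-in, assumed here, for checking whether supporting reads map to ribosomal protein or small nuclear ribonucleoprotein genes.

```python
from typing import Dict, Set, Tuple

FusionPair = Tuple[str, str]  # (5' partner gene, 3' partner gene)

# Assumption: symbol prefixes used as a rough proxy for ribosomal protein
# and snRNP genes; a real pipeline would use a curated gene list.
EXCLUDED_PREFIXES = ("RPL", "RPS", "SNRP")


def filter_tophat_calls(calls: Dict[FusionPair, int],
                        min_support: int = 3) -> Set[FusionPair]:
    """Keep fusions with more than min_support reads and no excluded partner."""
    kept = set()
    for (gene5, gene3), support in calls.items():
        if support <= min_support:
            continue
        if gene5.startswith(EXCLUDED_PREFIXES) or gene3.startswith(EXCLUDED_PREFIXES):
            continue
        kept.add((gene5, gene3))
    return kept


def consensus_fusions(defuse_calls: Set[FusionPair],
                      tophat_calls: Dict[FusionPair, int]) -> Set[FusionPair]:
    """Return fusions present in the deFuse set and the filtered TopHat set."""
    return defuse_calls & filter_tophat_calls(tophat_calls)
```

Representing each fusion as an ordered pair keeps the 5' and 3' partners distinct, so a call with the partners reversed is not counted as agreement between the two methods.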
Gene fusion validation
To validate fusion transcripts, we performed PCR and gel analysis in the sequenced sample and another 14 pairs of GCA and adjacent non-tumor tissues. All of these patients were also diagnosed with T4N0M0 stage IIB GCA. Primer pairs were designed using Primer 5 software, and RT-PCR was performed using the following procedure: 94°C for 1 min, 40 cycles of 94°C for 20 sec, 55°C for 20 sec and 72°C for 15 sec, followed by 72°C for 1 min. The PCR products of the fusion genes were visualized by agarose gel electrophoresis.
To explore the spectrum of gene transcriptional changes in GCA, we sequenced the whole transcriptome of one pair of GCA and matched non-tumor tissues from a 65-year-old patient. We generated a total of 5.36 and 6.12 Gb of data for the tumor and non-tumor tissue, respectively. Approximately 80% of reads were mapped to the reference genome, achieving an average depth of coverage of 38.63 X and 43.32 X, respectively (Table 1). The sequencing reads showed good evenness and integrity (Figure 1).
Next, we analyzed gene expression levels by calculating FPKM with Cufflinks. About 15 thousand genes were expressed (FPKM >1) in our samples, among which 1590 genes were up-regulated and 709 genes were down-regulated in the tumor. According to the enrichment analysis, these DEGs were significantly overrepresented in 16 GO terms and 3 pathways, which were related to cell cycle, tumor invasion and proliferation (Table 2).
Alternative splicing events were also detected using MISO. We found 311 differential alternative splicing events. After the filtering process, 61 alternative splicing events were retained according to Bayes factors, Psi values and confidence intervals (Table 3).
Many fusion genes have been reported as potential causes of tumorigenesis, including in gastric cancer. Using deFuse and TopHat, 7 candidate fusion genes were identified with stringent filtering criteria. Finally, 3 fusion genes were selected after TopHat realignment and manual inspection (Table 4, Figure 2). The fusion gene BMX-ARHGAP was validated using RT-PCR and gel analysis (Table 5) and was recurrently present in tumor tissues of about 26.7% (4/15) of GCA patients (Figure 2).
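The reported recurrence is simple arithmetic (4 of 15 patients, about 26.7%). The sketch below reproduces that figure and, as an addition that is not part of the original analysis, computes a Wilson score interval to show how much uncertainty a cohort of 15 leaves around the estimate. The choice of the Wilson method and of z = 1.96 (an approximately 95% interval) are assumptions made here.

```python
import math
from typing import Tuple


def recurrence_percent(positive: int, total: int) -> float:
    """Percentage of patients carrying the fusion, rounded to one decimal."""
    return round(100.0 * positive / total, 1)


def wilson_interval(positive: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("need 0 <= positive <= total and total > 0")
    p = positive / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


rate = recurrence_percent(4, 15)    # 26.7
low, high = wilson_interval(4, 15)  # bounds of the approximate 95% interval
```

With only 15 patients the interval is wide, so the recurrence rate is best read as a preliminary estimate pending larger cohorts.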
The development of cancer is a multistep process during which cells acquire a series of mutations that eventually lead to unrestrained cell growth and division, inhibition of cell differentiation, and evasion of cell death. To comprehensively study aberrant gene expression in GCA, we performed the current study using paired-end RNA-Seq technology. The results provided information on DEGs, alternative splicing and gene fusions, which may have potential application in therapeutic studies.
In the present study, over 400 M reads were sequenced on the Illumina platform, reaching about 40 X coverage of the whole transcriptome. With this high sequencing depth, 2299 DEGs were detected between GCA and non-tumor tissue. Further analysis showed that ECM and cell cycle were the most enriched biological pathways among these abnormally expressed genes.
Alternative splicing is an important regulatory process during gene expression and results in substantial transcript diversity. Abnormally spliced mRNAs have also been found in multiple cancers, including gastric cancer, breast cancer and colon cancer. In our study, 61 alternative splicing events were identified using the MISO software. Among them, CD44 has also been found to be abnormally spliced in colorectal cancer, where this was suggested to be characteristic of metastatically potent tumor cells.
Gene fusion is another oncogenic activation mechanism involved in the development of various types of malignancies, including leukemia, lymphoma, breast cancer and prostate cancer. Importantly, several fusion genes, such as DUS4L-BCAP29 and CD44-SLC1A2, have been reported in gastric cancer in Western countries. In our study, 3 chimeric transcripts were supported by the two detection methods (Table 4). Among them, CBX3-C15orf57 has previously been detected in healthy individuals. In addition, both CBX3 and C15orf57 were recurrently partnered with other genes in breast cancer. Our results implicate the potential involvement of CBX3-C15orf57 in GCA. BMX-ARHGAP12 was validated here for the first time and was detected recurrently in 4 out of 15 GCA patients (Figure 2). BMX encodes a non-receptor tyrosine kinase and ARHGAP12 encodes a Rho GTPase activating protein. It is possible that the chimeric transcript activates BMX-mediated tumorigenicity in GCA patients, since BMX has been suggested to promote tumor growth and metastasis in other cancers. In addition, overexpression of ARHGAP12 would impair cell scattering, invasion and adhesion to fibronectin. This fusion gene may play important roles in tumorigenesis and may be a potential target for novel therapeutic strategies.
In summary, we performed transcriptome-wide analysis to identify gene expression aberrations in GCA. One of the identified fusion genes, BMX-ARHGAP12, was further confirmed by RT-PCR and was detected recurrently in 4 of 15 GCA patients. Our results may provide new understanding of the pathogenesis of GCA and new targets for further therapeutic investigations.
Shen L, Shan Y-S, Hu H-M, Price TJ, Sirohi B, Yeh K-H, Yang Y-H, Sano T, Yang H-K, Zhang X, Park SR, Fujii M, Kang Y-K, Chen L-T: Management of gastric cancer in Asia: resource-stratified guidelines. Lancet Oncol. 2013, 14: e535-e547. 10.1016/S1470-2045(13)70436-4.
Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D: Global cancer statistics. CA Cancer J Clin. 2011, 61: 69-90. 10.3322/caac.20107.
Nagini S: Carcinoma of the stomach: a review of epidemiology, pathogenesis, molecular genetics and chemoprevention. World J Gastrointest Oncol. 2012, 4: 156-169. 10.4251/wjgo.v4.i7.156.
Dicken BJ, Bigam DL, Cass C, Mackey JR, Joy AA, Hamilton SM: Gastric adenocarcinoma: review and considerations for future directions. Ann Surg. 2005, 241: 27-39.
Brown LM, Devesa SS: Epidemiologic trends in esophageal and gastric cancer in the United States. Surg Oncol Clin N Am. 2002, 11: 235-256. 10.1016/S1055-3207(02)00002-9.
Orengo MA, Casella C, Fontana V, Filiberti R, Conio M, Rosso S, Tumino R, Crosignani P, De Lisi V, Falcini F, Vercelli M, Group AW: Trends in incidence rates of oesophagus and gastric cancer in Italy by subsite and histology, 1986–1997. Eur J Gastroenterol Hepatol. 2006, 18: 739-746. 10.1097/01.meg.0000223905.78116.38.
Fock KM: Review article: the epidemiology and prevention of gastric cancer. Aliment Pharmacol Ther. 2014, 40: 250-260. 10.1111/apt.12814.
Piazuelo MB, Correa P: Gastric cancer: overview. Colombia Medica. 2013, 44: 192-201.
Wang Z, Gerstein M, Snyder M: RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009, 10: 57-63. 10.1038/nrg2484.
Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009, 10: R25. 10.1186/gb-2009-10-3-r25.
Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL: TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013, 14: R36. 10.1186/gb-2013-14-4-r36.
Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP: Integrative genomics viewer. Nat Biotechnol. 2011, 29: 24-26. 10.1038/nbt.1754.
Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L: Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012, 7: 562-578. 10.1038/nprot.2012.016.
da Huang W, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009, 4: 44-57. 10.1038/nprot.2008.211.
Katz Y, Wang ET, Airoldi EM, Burge CB: Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat Methods. 2010, 7: 1009-1015. 10.1038/nmeth.1528.
McPherson A, Hormozdiari F, Zayed A, Giuliany R, Ha G, Sun MG, Griffith M, Heravi Moussavi A, Senz J, Melnyk N, Pacheco M, Marra MA, Hirst M, Nielsen TO, Sahinalp SC, Huntsman D, Shah SP: deFuse: an algorithm for gene fusion discovery in tumor RNA-Seq data. PLoS Comput Biol. 2011, 7: e1001138. 10.1371/journal.pcbi.1001138.
Kim D, Salzberg SL: TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biol. 2011, 12: R72. 10.1186/gb-2011-12-8-r72.
Steidl C, Shah SP, Woolcock BW, Rui L, Kawahara M, Farinha P, Johnson NA, Zhao Y, Telenius A, Neriah SB, McPherson A, Meissner B, Okoye UC, Diepstra A, van den Berg A, Sun M, Leung G, Jones SJ, Connors JM, Huntsman DG, Savage KJ, Rimsza LM, Horsman DE, Staudt LM, Steidl U, Marra MA, Gascoyne RD: MHC class II transactivator CIITA is a recurrent gene fusion partner in lymphoid cancers. Nature. 2011, 471: 377-381. 10.1038/nature09754.
Kim HP, Cho GA, Han SW, Shin JY, Jeong EG, Song SH, Lee WC, Lee KH, Bang D, Seo JS, Kim JI, Kim TY: Novel fusion transcripts in human gastric cancer revealed by transcriptome analysis. Oncogene. 2013, 10.1038/onc.2013.490.
Tao J, Deng NT, Ramnarayanan K, Huang B, Oh HK, Leong SH, Lim SS, Tan IB, Ooi CH, Wu J, Lee M, Zhang S, Rha SY, Chung HC, Smoot DT, Ashktorab H, Kon OL, Cacheux V, Yap C, Palanisamy N, Tan P: CD44-SLC1A2 gene fusions in gastric cancer. Sci Transl Med. 2011, 3: 77ra30. 10.1126/scitranslmed.3001423.
Li H, Guo L, Li J, Liu N, Liu J: Alternative splicing of RHAMM gene in chinese gastric cancers and its in vitro regulation. Zhonghua Yi Xue Yi Chuan Xue Za Zhi. 2000, 17: 343-347.
Pal S, Gupta R, Davuluri RV: Alternative transcription and alternative splicing in cancer. Pharmacol Ther. 2012, 136: 283-294. 10.1016/j.pharmthera.2012.08.005.
Banky B, Raso-Barnett L, Barbai T, Timar J, Becsagh P, Raso E: Characteristics of CD44 alternative splice pattern in the course of human colorectal adenocarcinoma progression. Mol Cancer. 2012, 11: 83. 10.1186/1476-4598-11-83.
Heim S, Mitelman F: Molecular screening for new fusion genes in cancer. Nat Genet. 2008, 40: 685-686. 10.1038/ng0608-685.
Schrider DR, Navarro FC, Galante PA, Parmigiani RB, Camargo AA, Hahn MW, de Souza SJ: Gene copy-number polymorphism caused by retrotransposition in humans. PLoS Genet. 2013, 9: e1003242. 10.1371/journal.pgen.1003242.
Chen K, Navin NE, Wang Y, Schmidt HK, Wallis JW, Niu B, Fan X, Zhao H, McLellan MD, Hoadley KA, Mardis ER, Ley TJ, Perou CM, Wilson RK, Ding L: BreakTrans: uncovering the genomic architecture of gene fusions. Genome Biol. 2013, 14: R87. 10.1186/gb-2013-14-8-r87.
Guo S, Sun F, Guo Z, Li W, Alfano A, Chen H, Magyar CE, Huang J, Chai TC, Qiu S, Qiu Y: Tyrosine kinase ETK/BMX is up-regulated in bladder cancer and predicts poor prognosis in patients with cystectomy. PLoS ONE. 2011, 6: e17778. 10.1371/journal.pone.0017778.
Cenni B, Gutmann S, Gottar-Guillier M: BMX and its role in inflammation, cardiovascular disease, and cancer. Int Rev Immunol. 2012, 31: 166-173. 10.3109/08830185.2012.663838.
Gentile A, D’Alessandro L, Lazzari L, Martinoglio B, Bertotti A, Mira A, Lanzetti L, Comoglio PM, Medico E: Met-driven invasive growth involves transcriptional regulation of Arhgap12. Oncogene. 2008, 27: 5590-5598. 10.1038/onc.2008.173.
The authors declare that they have no competing interests.
JW conceived of the study and revised the manuscript. XX and LX drafted the manuscript. FG performed statistical analysis. JY and MZ revised the manuscript. YZ and LT performed validation experiments. All authors read and approved the final manuscript.