Aberrant Overexpression of Satellite Repeats in Pancreatic and Other Epithelial Cancers

Authors
David T. Ting,1* Doron Lipson,2* Suchismita Paul,1 Brian W. Brannigan,1 Sara Akhavanfard,1 Erik J. Coffman,1 Gianmarco Contino,1 Vikram Deshpande,1 A. John Iafrate,1 Stan Letovsky,2 Miguel N. Rivera,1 Nabeel Bardeesy,1 Shyamala Maheswaran,1 Daniel A. Haber
06-20-2011
12:00pm
PST
Categories
RNA & Disease
Speaker
Ian Vaughn
Abstract
Satellite repeats in heterochromatin are transcribed into noncoding RNAs that have been linked to gene silencing and maintenance of chromosomal integrity. Using digital gene expression analysis, we showed that these transcripts are greatly overexpressed in mouse and human epithelial cancers. In 8 of 10 mouse pancreatic ductal adenocarcinomas (PDACs), pericentromeric satellites accounted for a mean 12% (range 1 to 50%) of all cellular transcripts, a mean 40-fold increase over that in normal tissue. In 15 of 15 human PDACs, alpha satellite transcripts were most abundant and HSATII transcripts were highly specific for cancer. Similar patterns were observed in cancers of the lung, kidney, ovary, colon, and prostate. Derepression of satellite transcripts correlated with overexpression of the long interspersed nuclear element 1 (LINE-1) retrotransposon and with aberrant expression of neuroendocrine-associated genes proximal to LINE-1 insertions. The overexpression of satellite transcripts in cancer may reflect global alterations in heterochromatin silencing and could potentially be useful as a biomarker for cancer detection.

Genome-wide sequencing approaches have revealed an increasing set of transcribed noncoding sequences (ncRNA), including “pervasive transcription” by heterochromatic regions of the genome linked to transcriptional silencing and chromosomal integrity (1, 2). In the mouse, heterochromatin is composed of centric (minor) and pericentric (major) satellite repeats that are required for formation of the mitotic spindle complex and faithful chromosome segregation (3), whereas human satellite repeats have been divided into multiple classes with similar functions (4). Accumulation of satellite transcripts in mouse and human cell lines results from DNA demethylation, heat shock, or the induction of apoptosis, and their overexpression has been associated with genomic instability (5, 6).
Stress-induced transcription of satellites in cultured cells has also been linked to the activation of retroelements encoding RNA polymerase activity, such as long interspersed nuclear element 1 (LINE-1) (L1TD1) (7, 8). The global expression of repetitive ncRNAs in primary tumors has not been analyzed owing to the bias of microarray platforms toward annotated coding sequences and the specific exclusion of repeat sequences from standard analytic programs. We used a next-generation digital gene expression (DGE) method (9) to obtain a comprehensive view of the transcriptome of primary tumors. We first evaluated mouse pancreatic ductal adenocarcinomas (PDACs) generated through pancreas-targeted expression of activated Kras and loss of Tp53 (10). These tumors are histopathological and genetic mimics of human PDAC, which almost universally displays mutations in the KRAS oncogene and shows frequent loss of the TP53 tumor suppressor gene. Notably, 47% of transcripts sequenced in the first PDAC (468,359 transcripts per million; tpm) were unannotated and mapped to the major mouse satellite, which contributes only 0.02 to 0.4% of transcripts in normal pancreas or liver. In the tumor, satellite reads were found in both sense and antisense directions and were absent from purified polyadenylated RNA. Satellite transcripts in the tumor were >100 times as abundant as in normal tissue and 3600 times as abundant as mRNA transcripts of the Gapdh (glyceraldehyde-3-phosphate dehydrogenase) housekeeping gene. We extended DGE analysis to additional mouse tumors with diverse genotypes: increased satellite expression was noted in 7 of 9 PDACs, 2 of 3 colon cancers, and 2 of 2 lung cancers (range 12,236 to 160,186 tpm) (Fig. 1A and table S1). In primary tumors overexpressing satellites, the composite distribution of all RNA reads among coding, ribosomal, and other nc transcripts differed significantly from that of normal tissues (Fig. 1B), suggesting that the cellular transcriptional machinery is affected by the massive expression of satellites. Genomic amplification of satellites did not account for the exceptional abundance of these transcripts, as determined by next-generation DNA digital copy n