Study provides first-ever comprehensive profile of lncRNAs in human cancers.

Growing insights into a large part of the genome, the so-called dark matter of DNA, have fundamentally changed the way scientists approach the study of disease. The human genome contains about 20,000 protein-coding genes, which account for less than 2 percent of the total; by contrast, about 70 percent of the genome is transcribed into non-coding RNA. Nevertheless, a systematic characterization of these segments, called long non-coding RNAs (lncRNAs), and of their alterations in human cancer is still lacking. Most studies of genomic alterations in cancer have focused on the minuscule portion of the human genome that encodes protein.

Now, an international team led by researchers at the University of Pennsylvania has mined these RNA sequences more fully to identify non-protein-coding segments whose expression is linked to 13 different types of cancer. The open-access study is published in the journal Cancer Cell.

Previous studies show that cancer is a genetic disease involving multi-step changes in the genome. The human genome contains 20,000 protein-coding genes (PCGs), representing less than 2% of the total genome, whereas up to 70% of the human genome is transcribed into RNA, yielding many thousands of non-coding RNAs. Long non-coding RNAs (lncRNAs) are operationally defined as transcripts longer than 200 nucleotides (nt), and their transcriptional control is subject to typical histone modification-mediated regulation. Importantly, rapidly accumulating evidence indicates that lncRNAs are associated with chromatin-modifying complexes and guide epigenetic regulation in both physiological and pathological conditions. Recent studies suggest that lncRNAs are involved in the initiation and progression of cancer and that they are highly deregulated in tumours. The team state that, with non-coding RNA sequences constituting almost three quarters of the human genome, there is a great need to characterize genomic, epigenetic, and other alterations of long non-coding segments. The current study fills this significant gap in cancer research.

The current study analyzed lncRNAs at transcriptional, genomic, and epigenetic levels in over 5,000 tumor specimens across 13 cancer types obtained from The Cancer Genome Atlas (TCGA) and in 935 cancer cell lines from the Cancer Cell Line Encyclopedia (CCLE). Results show that lncRNA alterations are highly tumour- and cell line-specific compared to those of protein-coding genes. The lab also note that lncRNA alterations are often associated with changes in epigenetic modifiers that act directly on gene expression. The group state that they believe the results from this multidimensional analysis provide a rich resource for researchers to investigate the dysregulation of lncRNAs and to identify lncRNAs with diagnostic and therapeutic potential.

The researchers also developed two bioinformatics-based platforms to identify cancer-associated lncRNAs and explore their biological functions. The lab explain that the first is a searchable database that combines clinical information with lncRNA molecular alterations to generate short lists of candidate lncRNAs to study. They add that the molecular profiling data used for this are linked to clinical and drug response annotations from two sources. The TCGA was chosen for its high-quality, multiple-level profiles of human primary tumor specimens and its detailed clinical notes for a broad selection of human cancer specimens. The CCLE was chosen as the best available resource for molecular profiles of cancer cell lines and details about their responses to drugs.

The second platform, which predicts the biological function of lncRNAs, successfully identified a novel oncogenic lncRNA called BCAL8. The data show that when BCAL8 is overexpressed it promotes the cell cycle, which in turn controls cell division. This part of the study provided a proof of concept for the lncRNA search strategy, and a customizable database in which other investigators can look for lncRNAs of interest and investigate their function. This database is called the Cancer LncRNome Atlas and is administered by the Abramson Cancer Center at Penn.

The team surmise that their findings provide convincing evidence that dysregulation of lncRNAs takes place at multiple levels in the cancer genome and that these alterations are strikingly cancer-type specific. Looking ahead, the researchers state that they have laid the critical groundwork for developing lncRNA-based tools to diagnose and treat cancer in new ways. They conclude that they expect their work to enable additional important lncRNA discoveries.

Source: Perelman School of Medicine at the University of Pennsylvania

The Expression of lncRNAs Is a Specific Biomarker in Cancer

To evaluate the potential value of lncRNAs as biomarkers in cancer, we first asked whether the expression signature of lncRNAs can differentiate between tumors and their corresponding normal tissues. In all nine tumor types in which both tumor and normal tissues were available, we were able to use unsupervised cluster analysis to differentiate normal tissues from tumors. Although the expression of lncRNAs in tumor demonstrated diverse patterns, the expression in normal tissue was relatively homogeneous and could be clearly separated from the expression patterns in tumor tissues (Figures 5A, 5B, and S4A).

To further examine the value of lncRNAs as biomarkers, we chose to study breast cancer, because it is a heterogeneous cancer type with well-characterized pathological and molecular subtypes. We selected 817 breast tumors for which the molecular subtype had been defined by the University of California, Santa Cruz, Cancer Genome Browser. A cluster analysis showed that the unsupervised lncRNA expression subtypes demonstrated a high correlation with the defined PAM50 subtypes and also had a high correlation with clinical subtypes (Figure 5C). In particular, almost all of the basal-like/triple-negative breast tumors were clustered together and clearly separated from other tumor and normal tissue samples.

Importantly, it has been reported that lncRNA expression is strikingly tissue and cell type specific compared with PCGs in normal tissues (Cabili et al., 2011; Mercer et al., 2008; Ravasi et al., 2006). We decided to compare the tissue specificity among lncRNAs, PCGs, and pseudogenes in cancer. We used an entropy-based metric that relies on Jensen-Shannon (JS) divergence to calculate specificity scores (Cabili et al., 2011) for each gene in breast specimens and found that the expression of lncRNA demonstrated the highest subtype specificity, followed by pseudogenes, while PCGs demonstrated the least subtype specificity (Figure 5D). About 18.27% of lncRNAs showed subtype specificity, whereas only 10.55% of PCGs were subtype specific (Figure 5E). To rule out the possibility that the higher specificity of lncRNAs is a result of their lower abundance, we calculated the specificity scores of highly expressed transcripts from these three different types of genes. Again, lncRNA showed a higher tissue specificity than PCG and pseudogenes (Figure 5D).

We also sought to determine if the expression signatures of lncRNAs are also cancer type specific using RNA-seq profiles from the Cancer Cell Line Encyclopedia (CCLE) in 935 human tumor cell lines (Table S6). As shown in Figure 5F, tumors of epithelia, melanoma, hematological, and neurological origins formed distinctive clusters on the basis of lncRNA expression. Sarcoma tumors displayed a diffuse lncRNA expression pattern, which may be explained by the fact that this type of tumor arises from various tissues. Using the JS divergence calculation, we compared the tissue specificity of lncRNAs, PCGs, and pseudogenes. Similar to our findings regarding subtype specificity in TCGA, the JS divergence measurements across cell lines of different origins revealed that lncRNA are more tissue specific than PCGs and pseudogenes (Figure 5G). Finally, we compared cancer type specificity across cell lines from 22 cancer types, and consistent results were observed (Figure S4B). These studies suggest that lncRNAs have the potential to serve as ...
Many Cancer-Associated SNPs Are Located in lncRNA Loci. A genome-wide view of the most significant cancer-associated index SNPs. The peaks in each track are proportional to the p values between the chromosomal locations of the index-SNPs. Comprehensive Genomic Characterization of Long Non-coding RNAs across Human Cancers. Zhang et al 2015.
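
The entropy-based specificity score mentioned in the excerpt above can be illustrated with a short sketch. This is not the authors' code: it assumes the common formulation attributed to Cabili et al. (2011), in which a gene's expression across tissues or subtypes is normalized to a probability distribution, compared with an idealized profile expressed in only one tissue, and scored as one minus the square root of the JS divergence, taking the maximum over tissues. The function names are hypothetical, and the published pipeline may transform expression values (for example with a logarithm) before normalizing, which this sketch does not do.

```python
import math


def entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so bounded in [0, 1])."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2


def specificity_score(expression):
    """Return (score, index) for one gene's expression across tissues.

    Assumed formulation: score = max over tissues t of
    1 - sqrt(JS(p, e_t)), where p is the normalized expression
    profile and e_t is a profile expressed only in tissue t.
    A gene with no expression anywhere gets (0.0, None).
    """
    total = sum(expression)
    if total <= 0:
        return 0.0, None
    p = [x / total for x in expression]
    best_score, best_index = -1.0, None
    for i in range(len(p)):
        one_hot = [1.0 if j == i else 0.0 for j in range(len(p))]
        # Clamp at zero to guard against tiny negative rounding errors.
        divergence = max(js_divergence(p, one_hot), 0.0)
        score = 1.0 - math.sqrt(divergence)
        if score > best_score:
            best_score, best_index = score, i
    return best_score, best_index


if __name__ == "__main__":
    # Hypothetical expression values across four tissues.
    print(specificity_score([0.0, 0.0, 10.0, 0.0]))  # expressed in one tissue
    print(specificity_score([5.0, 5.0, 5.0, 5.0]))   # evenly expressed
```

Under this formulation, a gene expressed in a single tissue scores at the top of the range and an evenly expressed gene scores much lower, which is the sense in which the excerpt describes lncRNAs as more specific than PCGs.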

Michelle Petersen

Michelle is a health industry veteran who taught and worked in the field before training as a science journalist.

Featured by numerous prestigious brands and publishers, she specializes in clinical trial innovation--expertise she gained while working in multiple positions within the private sector, the NHS, and Oxford University.
