BMIR Research in Progress: Hong Zheng “Benchmark of lncRNA Quantification for RNA-Seq of Cancer Samples”

When:
February 1, 2018 @ 12:00 pm – 1:00 pm
Where:
MSOB, Conference Room X-275
1265 Welch Rd
Stanford, CA 94305
USA
Cost:
Free
Contact:
Marta Vitale-Soto

Hong Zheng,
Postdoctoral Scholar,
BMIR, Stanford University

ABSTRACT:
Long non-coding RNAs (lncRNAs) have emerged as important regulators of various biological processes. Many studies have exploited public resources such as The Cancer Genome Atlas to study lncRNAs in cancer, so it is crucial to choose the optimal method for quantifying lncRNA expression. In this benchmarking study, we compared the performance of the pseudoalignment methods Kallisto and Salmon with that of the alignment-based methods HTSeq, featureCounts, and RSEM in lncRNA quantification by applying them to a simulated RNA-Seq dataset and a pan-cancer dataset. Pseudoalignment-based methods detect more lncRNAs than alignment-based methods and correlate highly with the simulated ground truth, while alignment-based methods underestimate the expression of some lncRNAs, including the cancer-relevant lncRNAs TERC and ZEB2-AS1. Overall, 10-16% of lncRNAs are detected in the samples, with antisense RNAs and lincRNAs the two most abundant categories. A higher proportion of antisense RNAs is detected than of lincRNAs. Moreover, antisense RNAs, as well as lncRNAs with fewer transcripts, fewer than three exons, or lower sequence uniqueness, are more discordant with the ground truth. Full transcriptome annotation, including both protein-coding and noncoding RNAs, greatly improves the specificity of lncRNA quantification. In summary, our recommended strategy for RNA-Seq analysis of lncRNAs is to use a pseudoalignment method, Kallisto or Salmon, in combination with full transcriptome annotation.
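As a rough illustration of the kind of comparison the abstract describes, the sketch below scores one quantifier's estimates against a simulated ground truth using Spearman rank correlation and a simple detection rate. It is not the study's actual pipeline: the function names, the toy expression values, and the 1 TPM detection threshold are all assumptions made for this example.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def detection_rate(tpm, threshold=1.0):
    """Fraction of genes at or above the threshold (1 TPM is an assumed cutoff)."""
    return sum(v >= threshold for v in tpm) / len(tpm)


# Hypothetical toy values for five lncRNAs (not data from the study).
truth = [0.0, 2.0, 5.0, 10.0, 40.0]      # simulated ground-truth TPM
estimate = [0.0, 1.5, 6.0, 9.0, 35.0]    # one quantifier's estimated TPM

print(spearman(truth, estimate))
print(detection_rate(estimate))
```

In a real benchmark, the same two scores would be computed per method and per lncRNA category, so that methods can be compared on both agreement with the ground truth and the number of lncRNAs they detect.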