1 Lund Group, BRIC Research Groups, BRIC, Københavns Universitet
2 Department of Anthropology, Faculty of Social Sciences, Københavns Universitet
3 Molekylær medicinsk afdeling (MOMA)
4 Det Samfundsvidenskabelige Fakultet
5 unknown
6 Graduate School of Health and Medical Sciences, Faculty of Health and Medical Sciences, Københavns Universitet
7 Department of Anthropology, Faculty of Social Sciences, Københavns Universitet
8 Lund Group, BRIC Research Groups, BRIC, Københavns Universitet
9 Graduate School of Health and Medical Sciences, Faculty of Health and Medical Sciences, Københavns Universitet
Over the past decade, mammalian genomes have been shown to be pervasively transcribed, and thousands of noncoding (nc) transcripts have been identified. It is currently unclear to what extent these transcripts are functionally important, as experimental evidence of function exists for only a small fraction. Here, we characterize the expression and evolutionary conservation properties of 12,115 known and novel nc transcripts, including structural RNAs, long nc RNAs (lncRNAs), antisense RNAs, EvoFold predictions, ultraconserved elements, and expressed nc regions. Expression levels are evaluated across 12 human tissues using a custom-designed microarray, supplemented with RNAseq. Conservation levels are evaluated at both the base level and the syntenic level. We combine these measures with epigenetic mark annotations to identify subsets of novel nc transcripts that show characteristics similar to those of known functional ncRNAs. Few novel nc transcripts show both high expression and high conservation levels. Overall, however, we observe that expression correlates positively with both conservation and epigenetic annotations, suggesting that a subset of the expressed transcripts is under purifying selection and likely functional. The identified subsets of expressed and conserved novel nc transcripts may form the basis for further functional characterization.
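As an illustration of the kind of analysis summarized above, the following minimal Python sketch computes a rank correlation between per-transcript expression and conservation scores and then selects transcripts that score high on both. It is not the authors' pipeline: the use of Spearman correlation, the transcript identifiers, the toy scores, the thresholds, and the helper names `ranks`, `spearman`, and `select_candidates` are all assumptions made for the example.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def select_candidates(transcripts, min_expr, min_cons):
    """Return ids of transcripts at or above both (hypothetical) thresholds."""
    return [
        t["id"]
        for t in transcripts
        if t["expression"] >= min_expr and t["conservation"] >= min_cons
    ]

# Toy data (invented for illustration, not taken from the study).
transcripts = [
    {"id": "nc1", "expression": 8.2, "conservation": 0.91},
    {"id": "nc2", "expression": 2.1, "conservation": 0.10},
    {"id": "nc3", "expression": 6.5, "conservation": 0.55},
    {"id": "nc4", "expression": 3.0, "conservation": 0.72},
    {"id": "nc5", "expression": 7.4, "conservation": 0.20},
]

rho = spearman(
    [t["expression"] for t in transcripts],
    [t["conservation"] for t in transcripts],
)
candidates = select_candidates(transcripts, min_expr=6.0, min_cons=0.5)
print(f"Spearman rho = {rho:.2f}")
print("High expression and conservation:", candidates)
```

A rank-based measure is used here only because expression intensities and conservation scores are on different scales; any thresholds would in practice need to be calibrated, for example against known functional ncRNAs.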