1 Paediatrics, Herlev and Gentofte Hospital, The Capital Region of Denmark
Long non-coding RNAs are a new class of non-coding RNAs that are at the crosshairs in many human diseases such as cancers, cardiovascular disorders, inflammatory and autoimmune disease like Inflammatory Bowel Disease (IBD) and Type 1 Diabetes (T1D). Nearly 90% of the phenotype-associated single-nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) lie outside of the protein coding regions, and map to the non-coding intervals. However, the relationship between phenotype-associated loci and the non-coding regions including the long non-coding RNAs (lncRNAs) is poorly understood. Here, we systemically identified all annotated IBD and T1D loci-associated lncRNAs, and mapped nominally significant GWAS/ImmunoChip SNPs for IBD and T1D within these lncRNAs. Additionally, we identified tissue-specific cis-eQTLs, and strong linkage disequilibrium (LD) signals associated with these SNPs. We explored sequence and structure based attributes of these lncRNAs, and also predicted the structural effects of mapped SNPs within them. We also identified lncRNAs in IBD and T1D that are under recent positive selection. Our analysis identified putative lncRNA secondary structure-disruptive SNPs within and in close proximity (+/-5 kb flanking regions) of IBD and T1D loci-associated candidate genes, suggesting that these RNA conformation-altering polymorphisms might be associated with diseased-phenotype. Disruption of lncRNA secondary structure due to presence of GWAS SNPs provides valuable information that could be potentially useful for future structure-function studies on lncRNAs.