1 Department of Molecular Biology and Genetics - Molecular Genetics and Systems Biology, Department of Molecular Biology and Genetics, Science and Technology, Aarhus University
2 Molekylær Genetik og Systembiologi, Faculty of Agricultural Sciences, Aarhus University
3 Department of Clinical Medicine - Molekylær Medicinsk afdeling (MOMA), Department of Clinical Medicine, Health, Aarhus University
4 Wageningen University, Animal Breeding and Genomics Centre
5 The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh
6 Department of Animal Sciences, University of Illinois at Urbana-Champaign
7 Department of Clinical Medicine - Molekylær Medicinsk afdeling (MOMA), Department of Clinical Medicine, Health, Aarhus University
8 Department of Molecular Biology and Genetics - Molecular Genetics and Systems Biology, Department of Molecular Biology and Genetics, Science and Technology, Aarhus University
The release of Sus scrofa genome assembly 10 supports improvement of the pig genome annotation and in-depth transcriptome analyses using next-generation sequencing technologies. In this study we analyze RNA-seq reads from a tissue collection comprising 10 separate tissues from Duroc boars and 10 fetal tissues from “Pinky”, a clone of Tabasco that was used for genome sequencing and assembly. The “Pinky” tissues were sequenced as a single mixed cDNA library and the remaining tissues as individual libraries, all on the Illumina sequencing platform. Using the TopHat short-read RNA-seq alignment software, we mapped the reads to genome assembly 10. We then extracted contig sequences of gene transcripts using the Cufflinks software. Based on this information we identified expressed genes that are present in the genome assembly. We roughly estimated the proportion of these genes that were previously known by sequence comparison to known genes. Similarly, we searched for genes that are expressed in the tissues but absent from the genome assembly by aligning the reads that did not map to the genome against known gene transcripts. For the genes that Cufflinks predicted to have alternative transcript variants, we counted the occurrences of the various types of alternative splicing events. Finally, we compared the coding sequences represented in the genome and in the transcriptome to identify possible short sequence variations.
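
As an illustration of what counting one type of alternative splicing event can involve, the following minimal Python sketch detects single skipped (cassette) exons between isoforms of one gene. It is not the method used in this study: the function names, the event definition and the toy exon coordinates are assumptions made for illustration, and the exon lists are assumed to have been parsed already from an assembler's transcript annotation.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple

# An exon is a (start, end) pair; a transcript is a list of exons sorted by start.
Exon = Tuple[int, int]


def skipped_exons(including: List[Exon], skipping: List[Exon]) -> Set[Exon]:
    """Return exons of `including` that `skipping` splices over.

    Simplified, assumed definition: an exon is skipped when the other isoform
    has one intron running from the end of the upstream exon to the start of
    the downstream exon. Runs of several skipped exons are not handled.
    """
    events: Set[Exon] = set()
    for left, middle, right in zip(including, including[1:], including[2:]):
        for a, b in zip(skipping, skipping[1:]):
            # Same donor and acceptor site, but no exon in between.
            if a[1] == left[1] and b[0] == right[0]:
                events.add(middle)
    return events


def count_exon_skipping(gene_isoforms: Dict[str, List[Exon]]) -> int:
    """Count distinct skipped exons over all ordered isoform pairs of a gene."""
    events: Set[Exon] = set()
    for (_, t1), (_, t2) in permutations(gene_isoforms.items(), 2):
        events |= skipped_exons(t1, t2)
    return len(events)


# Hypothetical toy isoforms, not data from the study.
isoforms = {
    "tx1": [(100, 200), (300, 400), (500, 600)],
    "tx2": [(100, 200), (500, 600)],
}
print(count_exon_skipping(isoforms))
```

Other event types, such as alternative donor or acceptor sites and intron retention, would each need their own comparison rule on the same exon coordinates.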