1 Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark
2 Department of Systems Biology, Technical University of Denmark
3 Comparative Microbial Genomics, Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark
4 Department of Bio and Health Informatics, Technical University of Denmark
During the past few years, innovations in DNA sequencing technology have led to an explosion in available DNA sequence information. This has revolutionized biological research and promoted the development of high-throughput analysis methods that can take advantage of the vast amount of sequence data. For this purpose, DNA microarray technology has gained enormous popularity due to its ability to measure the presence or the activity of thousands of genes simultaneously. Microarrays for high-throughput data analyses are not limited to a few organisms but may be applied to everything from bacteria to higher eukaryotes, and new applications are constantly being reported. In this PhD thesis, various applications for DNA microarrays are explored, and research results are presented in which the use of microarray data has been essential. The thesis comprises three main topics: gene expression analysis, analysis of chromosomal aberrations, and DNA sequence dependent gene expression.

First, this thesis describes how the gene expression profiles of children with acute lymphoblastic leukemia may be used to improve the diagnosis of these patients and potentially improve their treatment. Next, a new method is presented that utilizes a large repository of gene expression microarray data to derive functional associations between, for instance, a mutant and a compendium of gene expression responses. By this approach, an extensive functional characterization of a given mutant or experimental factor, such as compound treatment, may be obtained. The same characterization could otherwise be time-consuming and require extensive biological knowledge of the investigated system.

Solid tumors are often characterized by a multitude of chromosomal aberrations in which parts of the chromosomes have been lost or additional copies have been gained. By targeting microarrays at chromosomal DNA, it is possible to measure the so-called DNA copy number and thereby obtain a DNA copy number profile of each chromosome. Numerous analysis methods have been published that aim to identify the exact breakpoints where DNA has been gained or lost. In this thesis, three popular methods are compared, and a realistic simulation model is presented for generating artificial data with known breakpoints and known DNA copy number. By using simulated data, we obtain a realistic evaluation of each method's ability to analyze DNA copy number data. Moreover, our study shows that analysis methods developed for cancer research may also be applied successfully to DNA copy number profiles from bacterial genomes. Here, however, the purpose is to characterize variations in the gene content of various strains of a bacterial species, e.g. Escherichia coli, with regard to genes involved in pathogenesis.

Finally, this thesis presents results demonstrating that the gene expression level is sequence dependent, that is, it depends on both DNA structure and codon usage bias. Here, microarray data were used to verify predictions of highly expressed genes. Moreover, the codon bias of microbial genomes was found to constitute an environmental signature. For example, soil bacteria have very similar codon bias.
Wassenaar, Gertrude Maria; Lund, Ole; Ussery, David