Annotation and characterization of Class 2 transposable elements in cereal grass genomes
With improvements in DNA sequencing technology, more and more species have been, and will be, sequenced. To date, one of the most striking genomic discoveries is that the genomes of most higher eukaryotes are largely composed of sequences derived from transposable elements (TEs). My research interests are related to Class 2 (DNA) TEs and especially one type called MITEs. MITEs have been hypothesized to be important players in gene regulation and genome evolution because of their high copy number and preferential association with genes. The availability of genomic sequences from many plant species provides the raw material to determine the location of TEs relative to plant genes at the genome level. Accomplishing this requires computer programs that can accurately discover TEs in large genomic databases. Although numerous TE annotation programs have been developed, none are very successful at both identifying and characterizing new Class 2 elements. For this reason I developed my own tools. This dissertation is composed of four chapters. Chapter 1 provides an introduction to the TE classification system and currently available TE discovery algorithms and programs. Chapter 2 describes a multifunctional pipeline named TARGeT (Tree Analysis of Related Genes and Transposons) that can use either a DNA or a protein sequence as the query to identify, retrieve and characterize homologs from DNA sequence databases. Chapter 3 describes MITE-Hunter, a TE discovery program that can find small nonautonomous DNA TEs (especially MITEs) in genomic datasets. MITE-Hunter was evaluated by applying it to the rice genome and comparing its results with rice TE databases as well as results generated by similar programs. Chapter 4 describes DNA TEs annotated using TARGeT and MITE-Hunter and their distribution in sequenced genomes from the grass clade.
Of greatest interest is the finding that MITEs cluster in the promoters of a large fraction of annotated genes, and are especially enriched within 1 kb of the transcription start site. Putative mechanisms for the observed enrichment and the potential impact of MITEs on the evolution of gene expression are discussed.
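To make the promoter-proximity measure concrete, the following is a minimal sketch of how the fraction of genes with an element in the 1 kb upstream of the transcription start site might be computed. It is an illustration only, not the method used in the dissertation: the `Gene` and `Element` records, the function names, and the simple overlap rule are all assumptions introduced here, and coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    # Hypothetical record; field names are assumptions for this sketch.
    chrom: str
    tss: int      # transcription start site (1-based)
    strand: str   # "+" or "-"


@dataclass(frozen=True)
class Element:
    # Hypothetical record for an annotated TE (for example, a MITE).
    chrom: str
    start: int    # 1-based, inclusive
    end: int      # inclusive, start <= end


def upstream_window(gene, size=1000):
    """Return the (low, high) coordinates of the region upstream of the TSS.

    Upstream means lower coordinates on the "+" strand and higher
    coordinates on the "-" strand.
    """
    if gene.strand == "+":
        return max(1, gene.tss - size), gene.tss - 1
    return gene.tss + 1, gene.tss + size


def fraction_with_upstream_element(genes, elements, size=1000):
    """Fraction of genes with at least one element overlapping the window."""
    # Group elements by chromosome so each gene is only compared with
    # elements on its own chromosome.
    by_chrom = {}
    for element in elements:
        by_chrom.setdefault(element.chrom, []).append(element)

    hits = 0
    for gene in genes:
        low, high = upstream_window(gene, size)
        candidates = by_chrom.get(gene.chrom, [])
        # Two inclusive intervals overlap when each starts before the other ends.
        if any(e.start <= high and e.end >= low for e in candidates):
            hits += 1
    return hits / len(genes) if genes else 0.0
```

A genome-scale analysis would likely replace the linear scan with an interval index and would compare the observed fraction against a background expectation before calling it enrichment; both are omitted here for brevity.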