Computational search of RNA pseudoknots and structural variations in genomes
Secondary-structure homologs of non-coding RNAs (ncRNAs) can be detected effectively in genomes using a covariance model (CM) and its associated dynamic programming algorithms. However, the computational difficulty of aligning an RNA sequence to a pseudoknot structure has prevented high-throughput search for RNA pseudoknot structures in sequences. Because appropriate models of ncRNA structural evolution are lacking, accurate search for distant RNA structural homologs also remains difficult. At the core of both problems is sequence-structure alignment, which is computationally intensive for complex structures. Using a conformational graph model that we built to incorporate all stem and loop interactions, including the crossing stem pattern of pseudoknots, the sequence-structure alignment problem can be modeled as a subgraph isomorphism problem. Because ncRNA structures, including pseudoknots, have naturally small tree width, the problem of searching genomes for ncRNAs with pseudoknot structures can be solved efficiently by a dynamic programming algorithm based on tree decomposition of the graph. Furthermore, the sequence-structure alignment problem for distant RNA structural homolog search can be modeled as a graph homomorphism problem. The tree decomposition based dynamic programming algorithm, equipped with a new technique called the NULL stem, is applied to solve the RNA structural variation search problem more effectively. In this dissertation, we developed two search frameworks, RNATOPS and its extension RNAv, both based on a general conformational graph model. Our genome search tests demonstrate that RNATOPS has an advantage over Infernal and other methods in accuracy and computational efficiency when searching for ncRNA pseudoknot structures in genomes, and that RNAv, which can also detect pseudoknots, has an advantage over Infernal in detecting some distant homologs.
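The claim that conformational graphs have small tree width, even with crossing stems, can be illustrated with a minimal sketch. It is not the RNATOPS implementation. It assumes a simplified undirected graph in which each stem arm is a vertex, consecutive arms along the sequence are joined by a loop edge, and the two arms of a stem are joined by a stem edge. The function names `conformational_graph` and `greedy_treewidth_upper_bound` are hypothetical, and the greedy min-degree elimination used here is a standard heuristic that yields only an upper bound on tree width.

```python
from itertools import combinations


def conformational_graph(arms):
    """Build a simplified, undirected conformational graph (an assumption
    for illustration, not the dissertation's exact model).

    arms: stem labels in 5'->3' order; each label appears exactly twice,
    once per arm. Vertex i is the i-th arm along the sequence.
    """
    adj = {i: set() for i in range(len(arms))}
    # Loop edges: connect arms that are consecutive along the sequence.
    for i in range(len(arms) - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    # Stem edges: connect the two arms that pair to form one stem.
    first_arm = {}
    for i, label in enumerate(arms):
        if label in first_arm:
            j = first_arm[label]
            adj[i].add(j)
            adj[j].add(i)
        else:
            first_arm[label] = i
    return adj


def greedy_treewidth_upper_bound(adj):
    """Upper-bound the tree width by greedy min-degree elimination.

    Repeatedly remove a vertex of minimum degree and turn its neighbours
    into a clique; the largest neighbourhood seen is the width of the
    tree decomposition implied by that elimination order.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    width = 0
    while adj:
        v = min(adj, key=lambda u: (len(adj[u]), u))
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        for a, b in combinations(nbrs, 2):
            adj[a].add(b)
            adj[b].add(a)
        for n in nbrs:
            adj[n].discard(v)
    return width


# A simple pseudoknot: stems A and B cross (arm order A, B, A, B).
pseudoknot = conformational_graph(["A", "B", "A", "B"])
pseudoknot_width = greedy_treewidth_upper_bound(pseudoknot)
```

Under these assumptions the crossing-stem graph still admits a decomposition of very small width, which is the property that keeps a tree decomposition based dynamic program tractable: its cost grows exponentially only in the width, not in the number of stems.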