Computing images of protein cores for protein threading
Protein threading is one of the most important computational approaches to protein tertiary structure prediction. The main computational step in threading is aligning a query protein sequence (of unknown structure) to each existing protein structure template. A commonly used technique for sequence-structure alignment is to preprocess the query sequence against the template structure before combinatorial, linear-programming, or graph-theoretic algorithms are applied to the alignment. The preprocessing finds the top candidate positions in the query sequence for each of the cores (i.e., rigid secondary structure elements) in the structure template. Whether these top candidates include the true counterpart in the query sequence of each template core determines the overall score of the alignment, and consequently whether the threading algorithm can identify the correct structure for the query sequence. This work focuses on threading preprocessing and presents three methods for it: direct scan, global alignment based scan, and anchor-based hybrid scan. Experimental evaluation on the Dali dataset shows that the global alignment based scan and the anchor-based hybrid scan outperform the direct scan when the template and the query are close in length, whereas the direct scan performs better when their lengths differ substantially. Overall, the evaluation suggests that accurate energy functions are critical to effective threading preprocessing.
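The abstract does not spell out how the scans are implemented, so the following is only a minimal sketch of what a direct scan might look like, under stated assumptions: a core is represented by its template residues, every window of the same length in the query is scored, and the top k placements are kept as candidates. The names `window_score` and `direct_scan` are hypothetical, and the identity-count score is a toy stand-in for the energy functions the work refers to.

```python
import heapq


def window_score(core_residues, window):
    """Toy score (assumption): count of identical residues.

    A real threading preprocessor would use an energy function here.
    """
    return sum(1 for a, b in zip(core_residues, window) if a == b)


def direct_scan(query, core_residues, k=3, score=window_score):
    """Return the k best (score, start) placements of one core in the query.

    Slides a window of the core's length along the query and scores
    every start position independently of the other cores.
    """
    n = len(core_residues)
    candidates = [
        (score(core_residues, query[i:i + n]), i)
        for i in range(len(query) - n + 1)
    ]
    # Highest score first; ties are broken by the smaller start position.
    return heapq.nsmallest(k, candidates, key=lambda c: (-c[0], c[1]))


if __name__ == "__main__":
    # Hypothetical query sequence and core, for illustration only.
    query = "MKVLAAGIVKLAAG"
    core = "LAAG"
    for s, start in direct_scan(query, core, k=3):
        print(f"start={start} score={s} window={query[start:start + len(core)]}")
```

Because each core is scanned on its own, a sketch like this ignores the ordering of cores and the gaps between them. That may be one reason a purely local scan stays usable when the template and query lengths differ greatly, but it is an interpretation rather than something the abstract states.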