Gene families as soft cliques with backbones: Amborella contrasted with other flowering plants
Date
2014-10-17
Author
Zheng, Chunfang
Kononenko, Alexey
Leebens-Mack, Jim
Lyons, Eric
Sankoff, David
Abstract
Background
Chaining is a major problem in constructing gene families.
Results
We define a new kind of cluster on graphs with strong and weak edges: soft cliques with backbones (SCWiB). This differs from other definitions in how it controls the "chaining effect", by ensuring clusters satisfy a tolerant edge density criterion that takes into account cluster size. We implement algorithms for decomposing a graph of similarities into SCWiBs. We compare examples of output from SCWiB and the Markov Cluster Algorithm (MCL), and also compare some curated Arabidopsis thaliana gene families with the results of automatic clustering. We apply our method to 44 published angiosperm genomes with annotation, and discover that Amborella trichopoda is distinct from all the others in having substantially and systematically smaller proportions of moderate- and large-size gene families.
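The abstract does not state the exact SCWiB criterion, so the following is only an illustrative sketch of the general idea, under assumptions that are not taken from the paper: a candidate cluster is accepted if its strong edges alone connect all members (the "backbone") and if the fraction of member pairs joined by any edge, strong or weak, meets a size-dependent threshold. The function names, the threshold values in `required_density`, and the edge representation are all hypothetical.

```python
from collections import deque


def is_connected(nodes, edges):
    """Return True if `nodes` form one connected component using only `edges`."""
    nodes = set(nodes)
    if not nodes:
        return False
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adjacency[u].add(v)
            adjacency[v].add(u)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return seen == nodes


def edge_density(nodes, strong_edges, weak_edges):
    """Fraction of node pairs inside `nodes` joined by a strong or weak edge."""
    nodes = set(nodes)
    present = {
        frozenset(edge)
        for edge in list(strong_edges) + list(weak_edges)
        if len(set(edge)) == 2 and set(edge) <= nodes
    }
    possible = len(nodes) * (len(nodes) - 1) // 2
    return len(present) / possible if possible else 1.0


def required_density(size):
    """Hypothetical size-dependent tolerance (assumed values, not from the paper):
    small clusters must be complete, larger ones may miss some edges."""
    return 1.0 if size <= 3 else 0.6


def is_soft_clique_with_backbone(nodes, strong_edges, weak_edges):
    """Assumed acceptance test: connected strong backbone plus tolerant density."""
    backbone_ok = is_connected(nodes, strong_edges)
    density_ok = edge_density(nodes, strong_edges, weak_edges) >= required_density(len(set(nodes)))
    return backbone_ok and density_ok


genes = ["a", "b", "c", "d"]
chain = [("a", "b"), ("b", "c"), ("c", "d")]  # strong edges forming a pure chain

# A bare chain is connected but too sparse, so it is rejected.
chain_only = is_soft_clique_with_backbone(genes, chain, [])
# The same chain supported by weak edges is dense enough to be accepted.
chain_with_support = is_soft_clique_with_backbone(genes, chain, [("a", "c"), ("b", "d")])
```

In this sketch the density test is what limits chaining: a long path of strong similarities does not qualify as a cluster unless enough additional similarities exist among its members.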
Conclusions
We offer several possible evolutionary explanations for this result.