Inferring 3D chromatin structure using a multiscale approach based on quaternions
MetadataShow full item record
Abstract Background The knowledge of the spatial organisation of the chromatin fibre in cell nuclei helps researchers to understand the nuclear machinery that regulates dna activity. Recent experimental techniques of the type Chromosome Conformation Capture (3c, or similar) provide high-resolution, high-throughput data consisting in the number of times any possible pair of dna fragments is found to be in contact, in a certain population of cells. As these data carry information on the structure of the chromatin fibre, several attempts have been made to use them to obtain high-resolution 3d reconstructions of entire chromosomes, or even an entire genome. The techniques proposed treat the data in different ways, possibly exploiting physical-geometric chromatin models. One popular strategy is to transform contact data into Euclidean distances between pairs of fragments, and then solve a classical distance-to-geometry problem. Results We developed and tested a reconstruction technique that does not require translating contacts into distances, thus avoiding a number of related drawbacks. Also, we introduce a geometrical chromatin chain model that allows us to include sound biochemical and biological constraints in the problem. This model can be scaled at different genomic resolutions, where the structures of the coarser models are influenced by the reconstructions at finer resolutions. The search in the solution space is then performed by a classical simulated annealing, where the model is evolved efficiently through quaternion operators. The presence of appropriate constraints permits the less reliable data to be overlooked, so the result is a set of plausible chromatin configurations compatible with both the data and the prior knowledge. Conclusions To test our method, we obtained a number of 3d chromatin configurations from hi-c data available in the literature for the long arm of human chromosome 1, and validated their features against known properties of gene density and transcriptional activity. Our results are compatible with biological features not introduced a priori in the problem: structurally different regions in our reconstructions highly correlate with functionally different regions as known from literature and genomic repositories.