Recent studies have shown that chromosomes in many organisms are partitioned into discrete chromatin domains of different types. In mammals, chromosomes form compartments and, at a smaller scale, topologically associating domains (TADs). TADs define functional domains of gene regulation. The full range of TAD functions is not yet known, but some studies have shown that disrupting TADs can lead to disease, because changing the 3D structure of a chromosome can disrupt gene regulation. The structure of TADs is likewise not fully understood and remains an active area of research.
Prof. Angsheng Li, a prominent Chinese scientist from the State Key Laboratory of Software Development Environment, Beihang University, and his team have developed a new method for decoding topologically associating domains from ultra-low resolution Hi-C data using structural entropy. Hi-C is one of the chromosome conformation capture techniques, a family of molecular biology methods (including 3C, 4C, 5C, and Hi-C) used to analyze the spatial organization of chromatin in a cell. Submegabase-size TADs have been observed in high-throughput chromatin interaction (Hi-C) data. However, detecting TADs accurately usually requires ultra-deep sequencing or sophisticated normalization procedures.
To address this, Prof. Angsheng Li has proposed a fast, normalization-free method for decoding the domains of chromosomes (deDoc) based on structural information theory, a theory proposed earlier by Prof. Li and a coworker. The method treats the Hi-C contact matrix as the representation of a graph, and deDoc partitions that graph into segments with minimal structural entropy. The team also showed that structural entropy can be used to determine a proper bin size for Hi-C data. Applying deDoc to Hi-C data collected from 10 single cells, Prof. Li and his team were able to detect megabase-size TAD-like domains. This result indicates that the modular structure of genome spatial organization is common to even a small cohort of single cells. The algorithm should help in conducting further investigations of chromosomal domains on a larger scale.
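The idea of scoring a partition of the contact graph by its structural entropy can be illustrated with a small sketch. This is not the deDoc implementation. It assumes the two-dimensional structural entropy of a partitioned graph as it is commonly written in structural information theory, and it only compares a few hand-picked partitions instead of searching for the minimum. The function name `structural_entropy`, the `boundaries` argument, and the 6-bin toy matrix are all hypothetical and chosen for illustration.

```python
import math


def structural_entropy(contacts, boundaries):
    """Two-dimensional structural entropy of a contiguous partition.

    contacts:   symmetric matrix (list of lists) of non-negative contact
                weights; diagonal entries are ignored (assumption).
    boundaries: sorted start indices of the segments, beginning with 0.
    """
    n = len(contacts)
    # Weighted degree of every bin, ignoring self-contacts.
    degree = [sum(contacts[i][j] for j in range(n) if j != i) for i in range(n)]
    total = sum(degree)  # total volume of the graph (twice the edge weight)
    edges = list(boundaries) + [n]
    entropy = 0.0
    for start, end in zip(edges, edges[1:]):
        volume = sum(degree[start:end])
        if volume == 0:
            continue  # an empty segment contributes nothing
        # Weight of contacts with exactly one endpoint inside the segment.
        cut = sum(
            contacts[i][j]
            for i in range(start, end)
            for j in range(n)
            if j < start or j >= end
        )
        # Uncertainty of locating a bin inside its segment.
        for i in range(start, end):
            if degree[i] > 0:
                entropy -= degree[i] / total * math.log2(degree[i] / volume)
        # Uncertainty of locating the segment itself.
        entropy -= cut / total * math.log2(volume / total)
    return entropy


# Toy contact matrix (made up): two 3-bin domains with strong contacts
# inside each domain (weight 10) and weak contacts between them (weight 1).
toy = [
    [0 if i == j else (10 if i // 3 == j // 3 else 1) for j in range(6)]
    for i in range(6)
]

aligned = structural_entropy(toy, [0, 3])  # boundary between the two domains
shifted = structural_entropy(toy, [0, 2])  # boundary placed one bin too early
whole = structural_entropy(toy, [0])       # no partition at all

print(aligned, shifted, whole)
```

On this toy matrix the partition whose boundary matches the two dense blocks has the lowest entropy of the three, which is the behavior a minimal-entropy segmentation relies on. A real method would also need an efficient search over partitions, which this sketch leaves out.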
About the author:
Prof. Angsheng Li is from the State Key Laboratory of Software Development Environment, Beihang University. He was Research Professor at the Institute of Software, Chinese Academy of Sciences, from 1999 to 2018. He was born in 1964. He received his first degree in Mathematics from Yunnan Normal University in 1984 and his PhD from the Institute of Software, Chinese Academy of Sciences, in 1993. He has worked for the Institute of Software, Chinese Academy of Sciences, since finishing his PhD in 1993. From 1998 to 2002, he was a visiting and research fellow at the University of Leeds, working with Professor Barry Cooper (an academic descendant of Alan Turing) in Computability Theory. In 2003, he was awarded the Distinguished Young Investigator award of the National Natural Science Foundation of China. In 2008, he was selected by the Hundred Talent Program of the Chinese Academy of Sciences. From 2008 to 2009, he was a visiting scientist in the Computer Science Department, working with Professor Juris Hartmanis (the founder of Computational Complexity Theory). In 2012, he was invited as a visiting fellow by the Isaac Newton Institute for Mathematical Sciences. His research areas include Computability Theory, Computational Theory, Network Theory and Information Science. In Computability Theory, he solved a 40-year-old open problem proposed by Lachlan. In the theory of information and computation, he proposed the notion of the encoding tree of a graph, the metrics of structural entropy of graphs, and methods for compressing and decoding the information of graphs, and he established the fundamental theory of structural information theory. The new theory provides principles for networks and for data analysis. This theory resolved the grand challenges proposed by Shannon and Brooks in 1953 and 2003, respectively.