Trends in information theory-based chemical structure codification
  • Authors: Stephen J. Barigye (1)
    Yovani Marrero-Ponce (1) (2) (3)
    Facundo Pérez-Giménez (3)
    Danail Bonchev (4)
  • Keywords: Information theory; Chemical structure; Statistical pattern; Information indices; Shannon's entropy; Mutual, conditional and joint entropies
  • Journal: Molecular Diversity
  • Publication year: 2014
  • Publication date: August 2014
  • Volume: 18
  • Issue: 3
  • Pages: 673-686
  • Full-text size: 1,235 KB
  • Author affiliations:
    1. Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Universidad Central "Martha Abreu" de Las Villas, 54830, Santa Clara, Villa Clara, Cuba
    2. Facultad de Química Farmacéutica, Universidad de Cartagena, Cartagena de Indias, Bolívar, Colombia
    3. Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
    4. Center for the Study of Biological Chemistry, Virginia Commonwealth University, P. O. Box 842030, Richmond, VA, 23284-2030, USA
  • ISSN: 1573-501X
Abstract
This report offers a chronological review of the most relevant applications of information theory in the codification of chemical structure information, through the so-called information indices. These indices are derived from the analysis of the statistical patterns of molecular structure representations, which include primitive global chemical formulae, chemical graphs, and matrix representations. Finally, new approaches that attempt to go "back to the roots" of information theory, in order to integrate other information-theoretic measures in chemical structure coding, are discussed.
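
As an illustration only (not taken from the paper), the idea of deriving an index from the statistical pattern of a global chemical formula can be sketched with Shannon's entropy over the elemental composition. The function name `shannon_entropy` and the ethanol example are assumptions made for this sketch; the indices reviewed in the article are defined over many other partitions (graph vertices, distances, matrix entries).

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy in bits of a partition, given the size of each class."""
    counts = [n for n in counts if n > 0]  # empty classes contribute nothing
    total = sum(counts)
    # H = -sum(p_i * log2(p_i)), with p_i = n_i / total
    return -sum((n / total) * math.log2(n / total) for n in counts)


# Hypothetical input: atoms of ethanol (C2H6O) grouped by element.
ethanol = Counter({"C": 2, "H": 6, "O": 1})

# Mean information content per atom of the elemental composition.
print(f"{shannon_entropy(ethanol.values()):.4f} bits per atom")
```

A formula whose atoms all belong to one element gives an entropy of zero, while a more even spread of atoms across elements gives a larger value, which is the sense in which such an index encodes compositional diversity.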
