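     In Chapter 3, we establish a new normalization method for one-dimensional spectral data in metabolomics: Clustering Partial Integral Normalization (CPIN). Normalization essentially amounts to finding a reasonable reference standard against which changes in metabolites can be measured. We use hierarchical clustering to obtain candidate reference groups, balance the similarity and consistency of each reference group, and use OPLS to improve that consistency. We describe the procedure and rationale of CPIN in detail and demonstrate its effectiveness on two data sets.
     In Chapter 4, we study dimension reduction and visualization of one-dimensional spectral data in metabolomics, using kernel methods to generalize conventional linear dimension reduction methods to their nonlinear counterparts. We first give rigorous mathematical derivations of several linear dimension reduction methods commonly used for NMR data (such as PCA, LDA, PLS, and OPLS). We then combine them with the kernel trick to extend the PLS and OPLS methods to the empirical kernel space. Finally, we use a real NMR data set to illustrate the dimension reduction performance of these methods and the effect of the kernel function's parameter settings on classification and dimension reduction.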
Nuclear magnetic resonance (NMR) is widely used in physics, chemistry, biology, medicine, and other scientific fields. Biological NMR provides methodological support for life science research at the molecular, cellular, and whole-organism levels, especially in protein structure and dynamics and in metabonomics. We concentrate on an interdisciplinary field: bio-NMR data analysis and mathematical modeling. The experimental data and phenomena were provided by our collaborators at the Wuhan Magnetic Resonance Center.
     The thesis consists of six parts. The first chapter introduces the background on data analysis and bio-NMR relevant to our work.
     In the second chapter, we prove a central limit theorem for the Rényi conditional entropy of order α (0<α<1) and obtain a sharp rate of convergence. The result follows from a careful analysis of the Rényi conditional entropy between the distribution of the normalized sum of i.i.d. random variables and the Gaussian distribution. This rate of convergence is then applied to model selection and model diagnosis.
     In the third chapter, we propose a new normalization method for one-dimensional spectral data in metabolomics: CPIN (clustering partial integral normalization). The key idea of normalization is to select a group of bins as a reference against which variations in metabolites are measured. We use hierarchical clustering to obtain candidate groups, balance the trade-off between similarity and diversity, and improve consistency with OPLS. The procedure and rationale of CPIN are described in detail, and its validity is demonstrated on two groups of samples with 1H spectra.
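The core scaling step behind normalizing against a reference group of bins can be sketched as follows. This is a minimal illustration under stated assumptions, not the CPIN procedure itself: the function name `partial_integral_normalize` and the reference bin indices are hypothetical, and the selection of the reference group by hierarchical clustering and OPLS is not reproduced here.

```python
def partial_integral_normalize(spectra, reference_bins):
    """Scale each spectrum so that its integral over the reference bins equals 1.

    spectra        -- list of spectra, each a list of binned intensities
    reference_bins -- indices of the bins used as the reference group
                      (assumed given; choosing them is the hard part)
    """
    normalized = []
    for spectrum in spectra:
        ref_sum = sum(spectrum[i] for i in reference_bins)
        if ref_sum <= 0:
            raise ValueError("reference bins must have a positive total intensity")
        normalized.append([x / ref_sum for x in spectrum])
    return normalized


# Two toy spectra: the second is the first diluted by a factor of 2,
# except for bin 3, where the intensity has genuinely changed.
spectra = [
    [4.0, 6.0, 10.0, 5.0],
    [2.0, 3.0, 5.0, 10.0],
]
reference_bins = [0, 1, 2]  # hypothetical reference group
normalized = partial_integral_normalize(spectra, reference_bins)
# After scaling, bins 0-2 agree across both spectra and only bin 3 differs.
```

Dividing by the sum over a subset of bins, rather than over the whole spectrum, keeps a large change in a non-reference bin (bin 3 above) from distorting the scale factor.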
     Chapter four discusses dimension reduction and visualization of NMR spectra of metabolites. We generalize conventional linear dimension reduction methods to corresponding nonlinear methods by means of kernel methods. We give rigorous mathematical derivations of the dimension reduction methods widely used for NMR data in metabonomics (such as PCA, LDA, PLS, and OPLS), and then extend PLS and OPLS to kernel space. We use real NMR data of metabolites to show the validity of the proposed nonlinear dimension reduction methods.
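Kernel extensions of linear methods typically start from a kernel matrix computed on the samples and then centered in feature space. The sketch below shows that step with a Gaussian (RBF) kernel, which is an assumed choice since the abstract does not name the kernel used. The function names, the width parameter `sigma`, and the toy samples are all hypothetical.

```python
import math


def rbf_kernel_matrix(samples, sigma):
    """Return K with K[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    n = len(samples)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            K[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return K


def center_kernel_matrix(K):
    """Center a symmetric kernel matrix, i.e. subtract the feature-space mean."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


# Three toy samples standing in for binned spectra.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K_narrow = rbf_kernel_matrix(samples, sigma=1.0)
K_wide = rbf_kernel_matrix(samples, sigma=10.0)
K_centered = center_kernel_matrix(K_narrow)
```

A larger `sigma` pushes every off-diagonal entry toward 1, so all samples look alike. A smaller one pushes them toward 0, so every sample looks isolated. This is one way in which the kernel parameter setting affects the resulting low-dimensional picture.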
     The fifth chapter presents mathematical modeling of the dynamics of biological macromolecules based on magnetic resonance experiments. For the E. coli sugar phosphotransferase system, we establish a dynamic model of the proteins using ordinary differential equations and elaborate on the weak interactions between proteins. The model captures the underlying biological mechanisms revealed by new NMR experiments. Specifically, it explains the relationship between phosphate group transfer efficiency and the dissociation constant through a simple reaction model, and it shows the significance of weak interactions among proteins. We further establish a mathematical model of the transport system that includes a binary channel, which can predict the phosphate group transfer efficiency of the 2-pathway and 3-pathway cases.
     Chapter six summarizes the present work and lists some open problems for further research.
