A Novel Method of Gene Regulatory Network Structure Inference from Gene Knock-Out Expression Data
  • English title: A Novel Method of Gene Regulatory Network Structure Inference from Gene Knock-Out Expression Data
  • Authors: Xiang Chen; Min Li; Ruiqing Zheng; Siyu Zhao; Jianxin Wang; Fang-Xiang Wu; Yaohang Li
  • English keywords: gene regulatory networks; network inference; path consistency algorithm
  • Journal title (Chinese): QHDY
  • Journal title (English): Journal of Tsinghua University, Natural Science Edition (English Edition)
  • Institutions: the School of Computer Science and Engineering, Central South University; the Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan; the Department of Computer Science, Old Dominion University
  • Publication date: 2019-04-10
  • Publisher: Tsinghua Science and Technology
  • Year: 2019
  • Issue: v.24
  • Funding: supported in part by the National Natural Science Foundation of China (Nos. 61622213 and 61732009); the 111 Project (No. B18059); the Hunan Provincial Science and Technology Program (No. 2018WK4001)
  • Language: English
  • Page: QHDY201904008
  • Page count: 9
  • CN: 04
  • ISSN: 11-3745/N
  • Classification number: 78-86
Abstract
Inferring the structure of Gene Regulatory Networks (GRNs) from gene expression data has been a challenging problem in systems biology. Identifying the complicated regulatory relationships among genes is critical for understanding regulatory mechanisms in cells. Various methods based on information theory have been developed to infer GRNs. However, these methods introduce many redundant regulatory relationships during network inference because of external noise in the original data, topological sparseness of the network structure, and non-linear dependency among genes. Their performance decreases dramatically as the network size increases. In this paper, a novel network structure inference method named Loc-PCA-CMI is proposed. It first identifies local overlapping gene clusters, and then infers the local network structure of each cluster with a Path Consistency Algorithm based on Conditional Mutual Information (PCA-CMI). The final GRN structure, expressed as dependence among genes, is obtained as an ensemble of the local network structures. Loc-PCA-CMI was evaluated on DREAM3 knock-out datasets, and its performance was compared with that of other information theory-based network inference methods, including ARACNE, MRNET, PCA-CMI, and PCA-PMI. Experimental results demonstrate that our novel method Loc-PCA-CMI outperforms the other four methods on the DREAM3 datasets, especially on the size 50 and size 100 networks.
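The pipeline outlined in the abstract (local clusters, per-cluster path consistency with conditional mutual information, then an ensemble) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes Gaussian-distributed expression values so that CMI has a closed form in covariance determinants, takes the overlapping clusters as a given input, and combines local networks by a simple union of edges. The function names, the `threshold` parameter, and the `max_order` cut-off are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def _logdet_cov(data, idx):
    """Log-determinant of the sample covariance of the selected columns."""
    cov = np.atleast_2d(np.cov(data[:, idx], rowvar=False))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def gaussian_cmi(data, i, j, cond=()):
    """CMI(i; j | cond) under a Gaussian assumption (plain MI if cond is empty)."""
    cond = list(cond)
    if not cond:
        return 0.5 * (_logdet_cov(data, [i]) + _logdet_cov(data, [j])
                      - _logdet_cov(data, [i, j]))
    return 0.5 * (_logdet_cov(data, [i] + cond) + _logdet_cov(data, [j] + cond)
                  - _logdet_cov(data, cond) - _logdet_cov(data, [i, j] + cond))


def path_consistency(data, genes, threshold, max_order=1):
    """Start from a complete graph on `genes` and prune weak edges order by order."""
    edges = {frozenset(p) for p in itertools.combinations(genes, 2)}
    for order in range(max_order + 1):
        current = set(edges)  # snapshot so pruning order does not matter
        for edge in sorted(current, key=sorted):
            i, j = sorted(edge)
            # Genes adjacent to both i and j in the snapshot graph.
            shared = [k for k in genes
                      if frozenset((i, k)) in current
                      and frozenset((j, k)) in current]
            if len(shared) < order:
                continue
            # Keep the strongest dependence over all conditioning sets.
            score = max(gaussian_cmi(data, i, j, subset)
                        for subset in itertools.combinations(shared, order))
            if score < threshold:
                edges.discard(edge)
    return edges


def infer_from_clusters(data, clusters, threshold, max_order=1):
    """Assumed ensemble rule: union of the edges found in each local cluster."""
    network = set()
    for cluster in clusters:
        network |= path_consistency(data, list(cluster), threshold, max_order)
    return network
```

Here `data` is a samples-by-genes NumPy array and each cluster is a list of column indices. On a simulated chain in which gene 0 drives gene 1 and gene 1 drives gene 2, the indirect 0-2 edge survives the order-0 mutual information test but should be pruned once gene 1 is conditioned on, which is the kind of redundant relationship the abstract refers to.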