Biomedical Named Entity Recognition and Biomedical Text Classification
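Abstract
In recent years, as biomedical text has appeared on a large scale, text mining techniques that process text automatically have become increasingly important, for example automatically classifying massive biomedical text collections, mining biological named entities of interest from text, and studying the relationships among those entities. Recognizing biological named entities in biomedical text is the most basic part of all biological data mining and a key step in converting unstructured data into structured data. This dissertation studies the key techniques for named entity recognition in biomedical text and for automatic classification of biological text. The main results are as follows: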
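     1. A feature selection algorithm based on improved binary particle swarm optimization is studied. Binary particle swarm optimization is a variant of discrete particle swarm optimization; unlike traditional real-valued particle swarm optimization, every variable takes the value 0 or 1. The feature selection algorithm based on improved binary particle swarm optimization uses a flip angle to control the evolution of the swarm, searches the multidimensional space for the optimal binary solution of the objective function, and obtains the best feature weight vector. Features with weight 0 are redundant, and features with weight 1 are effective.
     2. A feature selection algorithm based on membrane particle swarm optimization is studied. Membrane particle swarm optimization exploits the layered structure and message-passing mechanism of a membrane system and deploys particle swarm optimization as the sub-algorithm of each region. Unlike traditional particle swarm optimization, this dissertation decomposes the search velocity into a local search velocity and a global search velocity. All outer regions of the membrane system use the local search velocity to search for local optima, while the innermost region uses the global search velocity to search for the global optimum. Every outer region passes its best solution to the adjacent inner region, every inner region passes its worst solution to the adjacent outer region, and the innermost region likewise passes its worst solution to its adjacent outer region. The algorithm is considered converged when solution passing between regions has stopped for a period of time or the iteration count reaches its limit, and the best solution in the innermost region is taken as the final solution. Membrane particle swarm optimization is used to search the multidimensional space for the optimal solution of the objective function and obtain the best feature weight vector; features whose weight exceeds a threshold are kept and features whose weight falls below it are removed, which eliminates redundant features.
     3. Parameter estimation for the conditional random field model is studied. To address the overfitting of traditional parameter estimation algorithms for conditional random fields, an improved particle swarm optimization algorithm is proposed and applied to conditional random field parameter estimation. The improved algorithm introduces the aggregation degree of the swarm to keep it from converging prematurely to a local optimum, uses the relative change rate of the log-likelihood between iterations to control convergence, and uses linearly varying inertia and learning factors to control the search range. The algorithm has good global search ability early in the search and good local search ability late in the search. It terminates when the relative change rate of the log-likelihood between iterations falls below a threshold or the iteration count reaches its limit. The log-likelihood of the conditional random field model serves as the objective function, the improved particle swarm optimization algorithm trains the conditional random field, and the parameter vector that maximizes the objective function is taken as the best parameters.
     4. A method for recognizing biological named entities in biomedical text with conditional random fields is studied. To address the label bias problem that Markov and similar models exhibit in named entity recognition, a method that recognizes biological named entities with feature-rich conditional random fields is proposed. First, the improved binary particle swarm optimization method selects the features of the conditional random field; then the improved particle swarm optimization algorithm trains the model; finally, using various auxiliary feature sets, the trained model recognizes biological named entities, labeling the nouns and phrases in biological text that denote them.
     5. A biomedical text classification method based on an extenics classifier is studied. To classify massive biomedical text automatically, a new text classification method based on an extenics classifier is proposed. The classifier represents each biomedical text with the vector space model and each category template with an extenics matrix. It computes the extenics correlation degree between the text and every category template to judge how similar the text is to each category, and assigns the text to the category with the largest correlation degree. To keep the extenics matrix at its best classification performance, the improved particle swarm optimization algorithm trains the weight coefficients of the text features of the different categories so that the sum of distances between text categories is maximized.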
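As a rough illustration of item 1, the sketch below runs the standard sigmoid-based binary particle swarm optimizer of Kennedy and Eberhart on a toy feature-selection problem. It is not the dissertation's improved variant: the flip-angle update is not specified in this abstract, so the textbook velocity-to-probability rule is swapped in. The names `binary_pso` and `toy_fitness`, every parameter value, and the toy objective are assumptions made for illustration.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=60, seed=0,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """Maximize `fitness` over 0/1 vectors with sigmoid-based binary PSO."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # The velocity is mapped to the probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def toy_fitness(mask, informative=(0, 2, 5)):
    """Hypothetical objective: reward informative features, penalize the rest."""
    hits = sum(mask[i] for i in informative)
    extras = sum(mask) - hits
    return hits - 0.5 * extras


best_mask, best_score = binary_pso(toy_fitness, n_features=8)
# Weight 1 keeps a feature; weight 0 marks it as redundant.
selected = [i for i, bit in enumerate(best_mask) if bit == 1]
```

In a real setting the fitness would score a classifier or tagger trained on the selected features, which is far more expensive than this toy objective.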
In recent years, with the growth of biomedical literature, it has become increasingly important to develop automatic text mining tools, for example for classifying massive biomedical literature, recognizing named entities of interest in text, and extracting the relationships between those named entities. Biomedical named entity recognition is the most basic part of all biomedical text mining and the primary step in transforming unstructured data into structured data. This dissertation focuses on the key technologies for biomedical named entity recognition and the classification of biomedical literature. The major contributions are outlined as follows:
     1. A feature selection method based on an improved binary particle swarm optimizer is studied. The binary particle swarm optimizer is a variant of the discrete particle swarm optimizer. Unlike the traditional real-valued particle swarm optimizer, each component of its solution is 1 or 0 instead of a real number. The feature selection algorithm based on the improved binary particle swarm optimizer uses a flip angle to control the swarm's evolution and searches the multidimensional space for the best binary solution of the fitness function, which yields the best feature weight vector. Features with weight 1 are selected and features with weight 0 are removed.
     2. A feature selection method based on a membrane particle swarm optimizer is studied. Using the hierarchical structure and message-passing mechanism of a membrane system, the membrane particle swarm optimizer assigns a particle swarm optimizer to every sub-region. Unlike the traditional particle swarm optimizer, this dissertation splits the velocity into a local velocity and a global velocity. All particle swarms in external regions search for local best solutions with the local velocity, and the particle swarm in the innermost region searches for the global best solution with the global velocity. The best solution in each external region is passed to the adjacent inner region, and the worst solution in each inner region is passed to the adjacent external region. The worst solution in the innermost region is likewise passed to its adjacent external region. Once solution passing has stopped for a period of time or the iteration limit is reached, the algorithm stops and the best solution in the innermost region is taken as the output. The membrane particle swarm optimizer is used to search for the best solution of the fitness function and obtain the best feature weight vector. Features whose weight is below a threshold are removed and features whose weight is above it are selected, which eliminates redundant features.
     3. Parameter estimation for the conditional random field model is studied. To address overfitting in traditional parameter estimation for conditional random fields, we propose an improved particle swarm optimizer and apply it to estimate the parameters of conditional random fields. In the improved optimizer, the aggregation degree of the swarm is used to prevent premature local convergence, the relative change ratio of the log-likelihood between iterations is used to end the iterations, and the inertia factor and learning factors vary linearly to control the search scope. The algorithm has better global search ability in the early stage and better local search ability in the later stage than the traditional particle swarm optimizer. Iteration stops once the relative change ratio of the log-likelihood between iterations falls below a threshold or the iteration limit is reached. We take the log-likelihood of the conditional random field as the objective function, train the conditional random field with the improved particle swarm optimizer, and search for the parameters that maximize the objective function.
     4. Biomedical named entity recognition in biomedical literature based on conditional random fields is studied. To address the label bias problem of Markov and similar models, we use conditional random fields with rich features to recognize biomedical named entities. We first select features with the improved binary particle swarm optimizer, then train the conditional random field with the improved particle swarm optimizer, and finally use the trained model with rich feature sets to recognize and label the nouns and phrases that denote biomedical named entities in biomedical literature.
     5. Classification of biomedical literature based on an extenics classifier is studied. To classify massive biomedical literature automatically, we propose a novel classification method, the extenics classifier. In the extenics classifier, each document is represented by the vector space model and each category model is represented by an extenics matrix. The extenics similarities between the document and all category models are calculated, and the document is assigned to the category with the maximum extenics similarity. To maximize the sum of distances between category models, the feature weights of the extenics matrix are trained with the improved particle swarm optimizer.
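Item 3 describes a swarm whose inertia and learning factors change linearly and which stops once the relative change of the log-likelihood between iterations falls below a threshold or an iteration cap is reached. The following is a minimal sketch under stated assumptions: a toy concave function stands in for the conditional random field log-likelihood, the aggregation-degree mechanism is omitted, and `patience` (the number of consecutive calm iterations required before stopping) and all numeric defaults are hypothetical choices, not the dissertation's settings.

```python
import random


def pso_maximize(objective, dim, n_particles=30, max_iters=200, tol=1e-8,
                 patience=10, w_range=(0.9, 0.4), c1_range=(2.5, 0.5),
                 c2_range=(0.5, 2.5), bound=5.0, seed=0):
    """Real-valued PSO with linearly varying inertia and learning factors."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    calm, iters_run = 0, 0

    for it in range(max_iters):
        iters_run = it + 1
        frac = it / max(1, max_iters - 1)
        # Linear schedules: wide exploration early, local refinement late.
        w = w_range[0] + (w_range[1] - w_range[0]) * frac
        c1 = c1_range[0] + (c1_range[1] - c1_range[0]) * frac
        c2 = c2_range[0] + (c2_range[1] - c2_range[0]) * frac
        prev = gbest_val
        for i in range(n_particles):
            for d in range(dim):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-bound, min(bound, v))
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
        # Relative change of the best objective value between iterations.
        rel_change = abs(gbest_val - prev) / max(abs(prev), 1e-12)
        calm = calm + 1 if rel_change < tol else 0
        if calm >= patience:
            break
    return gbest, gbest_val, iters_run


def toy_log_likelihood(params, target=(1.0, -2.0, 0.5)):
    """Hypothetical stand-in for a log-likelihood; its maximum is -1 at `target`."""
    return -sum((x - t) ** 2 for x, t in zip(params, target)) - 1.0


best_params, best_value, iters_run = pso_maximize(toy_log_likelihood, dim=3)
```

The `patience` counter is there because the swarm's best value often stays flat for a few iterations before improving again, so stopping on a single small change would end the search too early.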
