The fast growth of information technology and the rapid evolution of the Internet have brought us into an information-rich, rapidly updated age. With the emergence of various social networks in recent years in particular, massive amounts of text are produced and disseminated on the network every day. As the volume of information has grown, "information poverty" has given way to "information overload". The problem is no longer how to obtain information, but how to extract the required information quickly and efficiently from a large volume of it. As a key technology of great practical value, text classification can to a large extent bring order to disorganized information and make it easier for users to pinpoint the information they need and to distribute it. With the wide application of classification technology in information retrieval, public sentiment analysis, information filtering, news classification, digital libraries and other areas, the study of key techniques for text classification has become a frontier topic in information processing, with broad application prospects and important research significance. This dissertation is mainly concerned with text semantic representation and key techniques of hierarchical classification. The author's major contributions are outlined as follows:
     1. A text representation model based on the Text Semantic Graph is proposed. To address the loss of word semantic information caused by text representations based on word-frequency statistics, a new Chinese text semantic representation model, the Text Semantic Graph, is proposed; it takes into account the contextual semantics and background information of the words in the text. The method captures semantic relationships between words using Wikipedia as a knowledge base. Words with strong semantic relationships are combined into a word-package, represented by a graph node whose weight is determined by the total number and frequency of the words it contains. A contextual relationship between words in different word-packages is represented by a directed edge, weighted by the maximum weight of its adjacent nodes. The model largely retains the contextual information of each word while strengthening the semantic relationships between words.
     2. A hierarchical text classification method based on a virtual category tree is proposed. Existing hierarchical classification methods build the classification model top-down and learn the same sample data repeatedly; to address these problems, a new hierarchical text classification method based on a virtual category tree is proposed. The method builds classifiers bottom-up, which decreases the cost of repeated sample learning and reduces learning time. During top-down classification, the similarity between the preprocessed document vector and each associated classifier is calculated, and the maximum value determines the category to which the document belongs; this continues until the document is assigned to a leaf node.
     3. Incremental learning algorithms for hierarchical text classification are proposed. Based on an analysis of the learning problems posed by single-document adjustment and by new sample sets, incremental learning algorithms built on the hierarchical classification model are proposed for the two patterns. For single-document adjustment, the classifier at the leftmost mismatching node between the document's classification path and its actual path in the virtual category tree is retrained, and the virtual category tree model is then updated. For new sample sets, the feature space is updated incrementally using an incremental feature selection algorithm, and the weights are recalculated to improve the accuracy of the classification model.
     4. A performance evaluation method for hierarchical text classification is proposed. Conventional flat classification measures have limitations when used to evaluate hierarchical classification. After studying hierarchical classification methods based on a concept tree, a set of extended measures is put forward that describes performance more accurately by making use of the level of, and "affinity" among, the categories in a hierarchical structure. In addition, the Error Classification Concentration Ratio (ECCR) is defined based on the distribution of misclassified samples. Besides evaluating the classification result, ECCR can guide the selection of training samples so that the training set is more representative.
     5. A text information processing model is designed. According to a text intelligenceprocessing application mode, a process model of text information processing is designed,including four stages of text information collection, hotspot aggregation andclassification, full text information retrieval and text information integrated compilation.On this basis, text information processing system is developed. The system can realizethe text information pre-processing, analysis processing and integrated compilation. Itprovides a software platform for information workers to improve the efficiency ofinformation processing.
