Research on Key Technologies of Entity Retrieval in the Nonferrous Metals Domain
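Abstract
The Internet holds a large amount of nonferrous metals data, most of it in structured, semi-structured, or unstructured form. Fast, convenient, and accurate access to these data is in great demand and of significant value to the nonferrous metals industry and to the wider commercial market. At present there is no entity retrieval system, in China or abroad, dedicated to the nonferrous metals domain. Based on the characteristics of this domain and the key difficulties of entity retrieval within information retrieval, this thesis studies the key technologies involved in nonferrous metals entity retrieval: entity recognition, entity relation extraction, entity evidence document identification, and entity ranking. The main contributions are as follows:
     (1) Entities such as products, minerals, and organizations in the nonferrous metals domain have complex, heavily nested structures. To handle them, an entity recognition model based on a deep neural network (DNN) architecture is proposed. The model treats entity recognition as a sequence labeling problem. To exploit both the tight binding between the characters of an entity and domain-specific features, the model first uses word embedding pre-training to represent each input Chinese character as a low-dimensional dense vector, which serves as the DNN input. Layer-wise pre-training of the DNN's multiple hidden layers then automatically extracts optimal feature vectors, which are used to train the entity classifier. Finally, supervised training of a neural language model at the output layer performs the recognition. Experiments show that the proposed model achieves good results on the product, mineral, and organization entity types defined in this thesis.
     (2) Based on how product, mineral, and organization entities relate to one another within documents, an entity relation extraction model based on a deep belief network (DBN) architecture is proposed. The model first represents each relation instance as word embedding vectors, which serve as the DBN input. Layer-wise training of the DBN's hidden layers then yields effective feature vectors for relation instance pairs, which are used to train the relation classifier. Finally, while the classifier is being trained, labeled relation instances are used with a back propagation (BP) network to continually optimize the parameters of the whole DBN, which improves relation classification. Experiments show that the method performs well on three relation types between nonferrous metals entities: same-category, production-and-sales, and subordination.
     (3) An undirected graph model for identifying nonferrous metals entity evidence documents is constructed. The method first analyzes the independent page features of evidence documents (words, URL links, and entity metadata) together with the associations among candidate evidence documents (links and content). It then incorporates both the independent page features and the inter-page associations into an undirected graph to build the identification model. Finally, gradient descent is used to learn the feature weights of the model, and Gibbs sampling is used to identify the evidence documents. Experiments show that the method performs well.
     (4) A deep-learning-based ranking model for nonferrous metals entities is proposed. The model first applies the multi-layer nonlinear transformations of a deep network to map the query vector, the entity metadata vector, the entity relation vector, and the entity's candidate documents into the same low-dimensional semantic space. It then computes, in that space, the similarity between the candidate document vector and each of the query, metadata, and relation vectors. Finally, the three semantic similarities are fused into the final ranking score. Experiments show that the model performs well on the entity ranking task in the nonferrous metals domain.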
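The sequence-labeling framing in item (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the BIO tag scheme, the `ORG` and `PROD` labels, the window size, and the sample sentence are all assumptions chosen for illustration. The sketch only shows how character-level inputs and per-character tags could be prepared before any embedding or DNN training takes place.

```python
def spans_to_bio(text, spans):
    """Convert entity spans into one BIO tag per character.

    spans is a list of (start, end, label) tuples with end exclusive.
    The BIO scheme is an assumption; the abstract does not name a tag set.
    """
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return tags


def char_windows(text, size=1, pad="<PAD>"):
    """Return the context window of characters centered on each character."""
    padded = [pad] * size + list(text) + [pad] * size
    return [padded[i:i + 2 * size + 1] for i in range(len(text))]


# Illustrative sentence: an organization name followed by a product name.
text = "云南铜业生产电解铜"
spans = [(0, 4, "ORG"), (6, 9, "PROD")]  # hypothetical annotations

tags = spans_to_bio(text, spans)
windows = char_windows(text, size=1)

for ch, tag, win in zip(text, tags, windows):
    print(ch, tag, win)
```

Each window would then be mapped to embedding vectors and concatenated to form the network input, with the tag as the training target.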
There is a large amount of nonferrous metals data on the Internet, most of it in structured, semi-structured, or unstructured form. Obtaining these data quickly, conveniently, and accurately is very useful to the nonferrous metals industry and even to the whole commodity market. At present, there is no dedicated entity retrieval system for the nonferrous metals field. Based on the characteristics of nonferrous metals and the key difficulties of entity retrieval in information retrieval, this thesis carries out a series of studies on the key technologies used in the retrieval process: entity recognition, entity relation extraction, entity evidence document recognition, and entity ranking. The distinctive achievements are as follows:
     (1) Entities in the nonferrous metals field, such as products, organizations, and place names, have complex and strongly nested structures. To address this, a nonferrous metals entity recognition model based on a deep neural network (DNN) framework is proposed. The model treats entity recognition as a sequence labeling task. To exploit the close combination of characters within an entity and the features of the domain, Chinese characters are taken as the model input. Word embedding is first used to map each input character to an embedding vector, and the vectors are concatenated into an initial combined feature vector that is fed to the DNN. Pre-training of the DNN's multiple hidden layers (a text window denoising autoencoder model) automatically extracts optimal feature vectors, which serve as the training features of the entity classifier. Entity recognition experiments in the nonferrous metals field show that this model outperforms a conditional random field model and a neural network model.
     (2) To recognize associations among entities such as products and minerals that appear in the same document page, a Chinese entity relation extraction model based on deep belief networks (DBN) is proposed for the nonferrous metals field. First, relation instances are represented as word embedding vectors and used as the DBN input. DBN training then yields the most stable feature vectors, which serve as the input of a back propagation (BP) network for supervised relation extraction. Finally, an annotated standard corpus is used with the BP network to fine-tune the model and train the entity relation classifier. Experimental results show that the proposed method performs well on extracting three relation types in the nonferrous metals field: same-kind, production-and-sales, and subordination.
     (3) An undirected graph model for recognizing evidence documents of nonferrous metals entities is proposed. The method first analyzes independent page features of the evidence documents, such as words, URL links, and entity metadata, together with the link and content associations among candidate evidence documents. It then builds the undirected graph model by incorporating the independent page features and the inter-page relations into an undirected graph. Finally, gradient descent is used to learn the feature weights of the model, and Gibbs sampling is used to recognize the evidence documents. Experimental results show that the proposed method performs well.
     (4) A nonferrous metals entity ranking model based on deep learning is proposed. First, entity queries, candidate documents, entity metadata, and entity relations are each mapped into the same semantic space through the nonlinear transformations of a deep network. Then the similarities between the candidate document and each of the query, the entity metadata, and the entity relations are computed from their concept vectors in the low-dimensional semantic space. Finally, the semantic similarities between the candidate document and these three vectors are fused into the final ranking score. The experiments show that the model performs well on the nonferrous metals entity ranking task.
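The scoring step in item (4) can be sketched as follows. This is a hedged illustration, not the thesis's model: the abstract does not say which similarity measure or fusion rule is used, so cosine similarity and a weighted sum with equal weights are assumptions, and the function names and toy vectors are hypothetical. The vectors are assumed to have already been projected into the shared low-dimensional semantic space by the deep network.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ranking_score(doc, query, metadata, relation, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Fuse the document's similarity to the query, metadata, and relation vectors.

    The weighted sum is an assumed fusion rule, not one stated in the abstract.
    """
    sims = (cosine(doc, query), cosine(doc, metadata), cosine(doc, relation))
    return sum(w * s for w, s in zip(weights, sims))


# Toy 2-dimensional vectors standing in for projected representations.
query, metadata, relation = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
candidates = {"doc_a": [1.0, 0.0], "doc_b": [-1.0, 0.0]}

scores = {name: ranking_score(vec, query, metadata, relation)
          for name, vec in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In a learned model the weights would more plausibly be trained together with the network than fixed by hand; the equal weights here only keep the arithmetic easy to follow.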
