Research on Auxiliary Problems in Structural Learning

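English title: Researches on the Auxiliary Problems in Structural Learning
Author: Zhang Taozheng (张韬政)
Degree level: Doctoral
Discipline: Signal and Information Processing
Chinese keywords: 结构学习 ; 交互结构最优化 ; 辅助问题 ; 目标问题 ; 相关性正交性 ; 领域自适应性质
English keywords: structural learning ; Alternating Structure Optimization (ASO) ; Auxiliary Problems (APs) ; Target Problems (TPs) ; relevancy ; orthogonality ; domain adaptation
Degree year: 2011
Supervisor: Zhong Yixin (钟义信)
Discipline code: 081002
Degree-granting institution: Beijing University of Posts and Telecommunications (北京邮电大学)
Thesis submission date: 2011-07-03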

Abstract

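Multi-task learning is a form of transfer learning: against the background of a given domain, it uses knowledge provided by related tasks in that domain to solve new tasks. The field has developed rapidly in recent years. In particular, the recently proposed structural learning framework uses the Alternating Structure Optimization (ASO) algorithm to learn the predictive-space structure shared by multiple tasks, and it has achieved good experimental results in many applications. Within this framework the auxiliary problems are a key factor in the final performance, yet to our knowledge they have received little study so far.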
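     This thesis studies the relevancy between auxiliary problems and target problems and the orthogonality of auxiliary problem selection, and draws several useful conclusions; it further discusses the domain adaptation property of the structural learning framework. It also verifies these principles and properties on cASO, the convex-optimization variant of ASO. Finally, it applies the technique to a concrete natural language processing task, Chinese semantic role labeling, and the results confirm that the conclusions hold. The main contributions are as follows: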
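     1. We examine how to measure the relevancy between auxiliary problems and target problems in the structural learning framework. Taking Chinese chunking as an example, we construct four different types of auxiliary problems and carry out extensive experiments and analysis. These yield useful conclusions on the relevancy principle for constructing auxiliary problems: auxiliary problems that identify head nouns are more relevant to the target problem than the other types of auxiliary problems.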
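     2. We propose an orthogonality principle for selecting auxiliary problems in the structural learning framework. Both theoretical analysis and experiments show that if the weight matrices of several different types of auxiliary problems are orthogonal or approximately orthogonal, their combination generally performs better than the individual types did before combination, and at least as well. Even when the total number of auxiliary problems is fixed, this conclusion still holds provided the proportions of the different types in the combination are chosen appropriately. We also give empirical conclusions on choosing a suitable total number of auxiliary problems. In short, the orthogonality principle is credible.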
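     3. We focus on the domain adaptation property of the structural learning framework. Both theoretical analysis and experiments show that when the unlabeled data used to construct auxiliary problems come from several source domains with different distributions, the classification results of the framework remain satisfactory, even if the data distributions of these source domains all differ from that of the target domain. In sum, the structural learning framework has a good domain adaptation property.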
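     4. We study the convex variant of the Alternating Structure Optimization algorithm (cASO, convex ASO). Experiments show that the relevancy and orthogonality principles for constructing auxiliary problems proposed here also hold for cASO, and that cASO likewise has the domain adaptation property. Moreover, cASO classifies better than ASO under the same experimental settings.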
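     5. Semantic role labeling is widely used in question answering, machine translation, information extraction and other areas. We apply the structural learning framework and the relevancy and orthogonality principles for constructing auxiliary problems to this task. The experimental results show that these principles are reasonable and feasible.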
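
To make the idea of a predictive structure shared across tasks more concrete, here is a minimal sketch of the SVD-based step usually associated with ASO. It is an illustration under stated assumptions, not the thesis's implementation: ridge regression stands in for whatever loss the auxiliary predictors were actually trained with, and the names `fit_linear_predictors`, `shared_structure`, `augment`, `reg` and `h` are all hypothetical.

```python
import numpy as np


def fit_linear_predictors(X, Y, reg=1e-2):
    """Fit one linear predictor per auxiliary problem (ridge regression, an assumption).

    X: (n, p) feature matrix built from unlabeled data.
    Y: (n, m) labels of m auxiliary problems, e.g. values in {-1, +1}.
    Returns W of shape (p, m), one weight column per auxiliary problem.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(p), X.T @ Y)


def shared_structure(W, h):
    """Return Theta (h, p): the top-h left singular vectors of W, as rows."""
    if h > min(W.shape):
        raise ValueError("h must not exceed min(p, m)")
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :h].T


def augment(X, theta):
    """Append the h shared-structure features Theta x to the original features."""
    return np.hstack([X, X @ theta.T])
```

In this sketch a target-problem classifier would then be trained on `augment(X_target, theta)`, so that what was learned from the auxiliary problems reaches the target problem only through `theta`.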
Multi-task learning is a methodology that takes a certain domain as background and uses the knowledge derived from related tasks to solve target problems in that domain. It belongs to the field of transfer learning and has developed vigorously in recent years. In particular, the structural learning framework applies the Alternating Structure Optimization (ASO) algorithm to learn the structure of the predictor space shared by multiple related tasks, with satisfying experimental results in many applications. However, the final results depend largely on the quality of the auxiliary problems (APs), and to our knowledge little research addresses them.
     We focus on two principles for AP selection, the principle of relevancy and the principle of orthogonality, and obtain some valuable conclusions. We also discuss the domain adaptation property of structural learning. Furthermore, we validate that these principles and this property also hold for the improved convex ASO (cASO) algorithm. Finally, we apply them to a specific natural language processing (NLP) task, Chinese semantic role labeling. The experimental results demonstrate that the conclusions are credible and feasible. The major innovations of this thesis are as follows.
     1. We study how to measure the relevancy between APs and target problems (TPs) in the structural learning framework, taking Chinese syntactic chunking as an example. Four types of APs are created. Extensive experiments and analyses yield some valuable conclusions: APs that predict the head nouns of sentences are more relevant to the TPs than the other types of APs.
     2. We propose a new principle, the principle of orthogonality, for AP selection. We first analyze it theoretically and then carry out experiments on Chinese syntactic chunking. Both validate the following: if the weight matrices of different types of APs are orthogonal or approximately orthogonal, their combination performs at least as well as any of its component types. Even when the total number of APs is fixed, the same conclusion holds provided the ratios of the different AP types in the combination are appropriate. Moreover, we draw some conclusions on how to select an appropriate total number of APs. In short, the principle of orthogonality is credible.
     3. We study the domain adaptation property of structural learning. Theoretical analyses and experimental results both indicate that performance remains satisfying when the unlabeled data used to create the APs come from different source domains. The conclusion still holds even if the data distributions of the source domains and the target domain (TPs) are quite different. That is, structural learning has the domain adaptation property.
     4. We study the convex ASO (cASO) algorithm. The experimental results show that the principles of relevancy and orthogonality proposed in this thesis still hold for cASO, which also has the domain adaptation property. Moreover, cASO outperforms ASO under the same experimental settings.
     5. Semantic role labeling (SRL) has a wide range of applications in NLP, such as question answering (QA), machine translation (MT), and information extraction (IE). We apply the principles of relevancy and orthogonality proposed above to it. Experimental results demonstrate that these principles are reasonable and credible.
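
The orthogonality principle in point 2 speaks of weight matrices that are "orthogonal or approximately orthogonal". The abstract does not say how this is measured, so the sketch below assumes one plausible measure: the Frobenius norm of the cross-product of two weight matrices, normalized to lie in [0, 1]. The names `orthogonality_gap` and `combine` are hypothetical and are not taken from the thesis.

```python
import numpy as np


def orthogonality_gap(W_a, W_b):
    """Assumed measure of how far two AP weight matrices are from orthogonal.

    W_a: (p, m_a) weights of one AP type, one column per auxiliary problem.
    W_b: (p, m_b) weights of another AP type over the same p features.
    Returns ||W_a^T W_b||_F / (||W_a||_F * ||W_b||_F), a value in [0, 1];
    0 means every column of W_a is orthogonal to every column of W_b.
    """
    denom = np.linalg.norm(W_a) * np.linalg.norm(W_b)
    if denom == 0:
        raise ValueError("weight matrices must be non-zero")
    return np.linalg.norm(W_a.T @ W_b) / denom


def combine(*weight_matrices):
    """Stack the weight columns of several AP types into one matrix."""
    return np.hstack(weight_matrices)
```

Under this assumption, a gap near 0 would mark two AP types as candidates for combination, and the stacked matrix from `combine` would be the input to the shared-structure step.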
