Using Lexical and Thematic Knowledge for Name Disambiguation
  • Authors: Jinpeng Wang (21)
    Wayne Xin Zhao (21)
    Rui Yan (21)
    Haitian Wei (22)
    Jian-Yun Nie (23)
    Xiaoming Li (21)
  • Keywords: Name Disambiguation; Lexical and Thematic Knowledge
  • Series: Lecture Notes in Computer Science
  • Publication year: 2012
  • Volume: 7675
  • Issue: 1
  • Pages: 89-102
  • Full-text size: 249 KB
  • Affiliations:
    21. Department of Computer Science and Technology, Peking University, China
    22. School of International Trade and Economics, University of International Business and Economics, China
    23. Département d'Informatique et de Recherche Opérationnelle, Université de Montréal, Montreal, H3C 3J7, Québec, Canada
  • ISSN: 1611-3349
Abstract
In this paper we present a novel approach to name disambiguation based on two types of semantic information: lexical and thematic. We propose using translation-based language models to resolve the synonymy problem in word matching, and a topic-based ranking function to capture the rich thematic context of names. We test three ranking functions that combine lexical relatedness and thematic relatedness. Experiments on a Wikipedia data set and the TAC-KBP 2010 data set show that the proposed method is effective for name disambiguation.
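
The abstract gives no formulas, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes a standard translation-based language model for lexical relatedness (a context word can be "translated" from a related word in the candidate entity's text), the negative KL divergence between topic distributions for thematic relatedness, and a simple linear interpolation of the two. The function names, the toy translation table, the topic vectors, and the weights `lam` and `alpha` are all hypothetical, and the interpolation is not necessarily one of the three ranking functions the paper tests.

```python
import math
from collections import Counter


def translation_lm_loglik(context, entity_text, trans_prob, background, lam=0.5):
    """Log-likelihood of the mention context under a translation-based
    language model of the entity text (assumed form, smoothed with a
    background model)."""
    counts = Counter(entity_text)
    total = sum(counts.values())
    loglik = 0.0
    for w in context:
        # P(w | entity) = sum over entity words t of P(w | t) * P_ml(t | entity)
        p_trans = sum(
            trans_prob.get((w, t), 0.0) * c / total for t, c in counts.items()
        )
        p_bg = background.get(w, 1e-6)  # small floor for unseen words
        loglik += math.log((1.0 - lam) * p_trans + lam * p_bg)
    return loglik


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete topic distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def combined_score(context, context_topics, entity_text, entity_topics,
                   trans_prob, background, alpha=0.5):
    """Linear interpolation of lexical and thematic relatedness (an assumption)."""
    lexical = translation_lm_loglik(context, entity_text, trans_prob, background)
    thematic = -kl_divergence(context_topics, entity_topics)
    return alpha * lexical + (1.0 - alpha) * thematic


# Toy data, invented for illustration only.
trans_prob = {
    ("hoops", "basketball"): 0.4,       # synonym match via translation
    ("basketball", "basketball"): 0.6,  # self-translation
    ("team", "team"): 1.0,
}
background = {"hoops": 0.01, "team": 0.01}
candidates = {
    "athlete": (["basketball", "team", "nba"], [0.7, 0.3]),
    "professor": (["machine", "learning", "research"], [0.1, 0.9]),
}
context = ["hoops", "team"]
context_topics = [0.8, 0.2]

ranked = sorted(
    candidates,
    key=lambda name: combined_score(
        context, context_topics, candidates[name][0], candidates[name][1],
        trans_prob, background,
    ),
    reverse=True,
)
print(ranked)
```

In this toy setup the word "hoops" never appears in either candidate's text, so an exact-match model could not use it; the translation table is what lets it contribute evidence for the first candidate.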
