Position-Aligned Translation Model for Citation Recommendation
  • Authors: Jing He (20)
    Jian-Yun Nie (20)
    Yang Lu (21)
    Wayne Xin Zhao (21)
  • Journal: Lecture Notes in Computer Science
  • Publication year: 2012
  • Volume: 7608
  • Issue: 1
  • Pages: 264-276
  • Full-text size: 240KB
  • Affiliations:
    20. Université de Montréal, Canada
    21. Peking University, China
Abstract
The goal of a citation recommendation system is to suggest references for a snippet in an article or a book, which is useful for both authors and readers. The citation recommendation problem can be cast as an information retrieval problem in which the query is the snippet from an article and the relevant documents are the cited articles. In practice, the citation snippet and the cited articles may be described in different terms, which makes the citation recommendation task difficult. Translation models are useful for bridging the vocabulary gap between queries and documents in information retrieval. Such a model can be trained on a collection of query-document pairs, which are assumed to be parallel. However, such training data contains much noise: a relevant document usually contains some relevant parts along with irrelevant ones. In particular, the citation snippet may mention only some parts of the cited article's content. To cope with this problem, we propose a method for training translation models on such noisy data, called the position-aligned translation model. This model tries to align the query to the most relevant parts of the document, so that the estimated translation probabilities rely more on them. We test this model on a citation recommendation task for scientific papers. Our experiments show that the proposed method significantly improves on previous retrieval methods based on translation models.
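
To make the idea of bridging the vocabulary gap concrete, here is a minimal sketch of a generic word-to-word translation language model for retrieval, scoring a document by P(q | d) = prod over query words w of sum over document words t of P(w | t) P(t | d). It is not the position-aligned model the abstract proposes, whose details are not given here. The function name `translation_score`, the `smoothing` constant, and the toy translation table are all assumptions made for illustration.

```python
import math
from collections import Counter


def translation_score(query, doc, trans, smoothing=1e-6):
    """Return log P(query | doc) under a word-to-word translation model.

    trans maps (query_word, doc_word) -> P(query_word | doc_word).
    smoothing is a hypothetical floor that avoids log(0) for unmatched words.
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    log_prob = 0.0
    for w in query:
        p_w = 0.0
        for t, count in doc_counts.items():
            p_t_given_d = count / doc_len  # maximum-likelihood P(t | d)
            p_w += trans.get((w, t), 0.0) * p_t_given_d
        log_prob += math.log(p_w + smoothing)
    return log_prob


# Hypothetical toy translation table, not estimated from any real data.
toy_trans = {
    ("citation", "citation"): 0.6,
    ("citation", "reference"): 0.3,
    ("recommend", "recommend"): 0.5,
    ("recommend", "suggest"): 0.4,
}

snippet = ["citation", "recommend"]
doc_a = ["reference", "suggest", "paper"]  # related, but shares no query word
doc_b = ["protein", "folding", "paper"]    # unrelated

score_a = translation_score(snippet, doc_a, toy_trans)
score_b = translation_score(snippet, doc_b, toy_trans)
```

In this toy setup `doc_a` outranks `doc_b` even though it shares no word with the snippet, because the translation table links "reference" to "citation" and "suggest" to "recommend". A position-aware variant would presumably weight or restrict the document words `t` to the passages most relevant to the snippet, but that step is only described at a high level in the abstract and is not sketched here.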