Query Subtopic Mining Exploiting Word Embedding for Search Result Diversification
  • Keywords: Subtopic mining; Word embedding; Diversification; Novelty
  • Publication title: Lecture Notes in Computer Science
  • Publication year: 2016
  • Volume: 9994
  • Issue: 1
  • Pages: 308-314
  • Series title: Information Retrieval Technology
  • ISBN: 978-3-319-48051-0
Abstract
Understanding users' search intents by mining query subtopics is a challenging task and a prerequisite for search result diversification. This paper proposes mining query subtopics by exploiting word embeddings and a short-text similarity measure. We extract candidate subtopics from multiple sources and introduce a new ranking method, based on a new novelty estimate, that faithfully represents the possible search intents of the query. To estimate subtopic relevance, we introduce new semantic features based on word embeddings and bipartite-graph-based ranking. To estimate the novelty of a subtopic, we propose a method that combines contextual and categorical similarities. Experimental results on the NTCIR subtopic mining datasets show that the proposed approach outperforms the baselines, known previous methods, and the official participants of the subtopic mining tasks.
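
The abstract does not give the formulas, so the following is only an illustrative sketch of the general idea of ranking candidate subtopics by relevance and novelty with word embeddings. It is not the paper's method: short texts are compared by the cosine of their averaged word vectors, and novelty is approximated with a plain MMR-style greedy selection instead of the paper's combination of contextual and categorical similarities. The function names, the `trade_off` parameter, and the two-dimensional toy vectors are all assumptions made for illustration.

```python
import math


def embed(text, vectors):
    """Average the vectors of the known words in text (None if no word is known)."""
    words = [w for w in text.lower().split() if w in vectors]
    if not words:
        return None
    dim = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is missing or has zero length."""
    if u is None or v is None:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_subtopics(query, candidates, vectors, trade_off=0.5):
    """Greedy MMR-style ranking (an assumed stand-in for the paper's novelty estimate).

    Each step picks the candidate maximising
    trade_off * relevance - (1 - trade_off) * redundancy,
    where redundancy is the highest similarity to an already ranked subtopic.
    """
    query_vec = embed(query, vectors)
    embedded = {c: embed(c, vectors) for c in candidates}
    remaining = list(candidates)
    ranked = []

    def score(candidate):
        relevance = cosine(query_vec, embedded[candidate])
        redundancy = max(
            (cosine(embedded[candidate], embedded[s]) for s in ranked),
            default=0.0,
        )
        return trade_off * relevance - (1 - trade_off) * redundancy

    while remaining:
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


# Hypothetical toy embeddings: axis 0 ~ "vehicle" sense, axis 1 ~ "animal" sense.
toy_vectors = {
    "jaguar": [1.0, 1.0],
    "car": [1.0, 0.0],
    "price": [1.0, 0.1],
    "dealer": [1.0, 0.2],
    "animal": [0.0, 1.0],
    "habitat": [0.1, 1.0],
}
toy_candidates = ["car price", "car dealer", "animal habitat"]
toy_ranking = rank_subtopics("jaguar", toy_candidates, toy_vectors)
```

With these toy vectors, the second position goes to "animal habitat" rather than the near-duplicate "car price", which is the diversification effect a novelty term is meant to produce. In practice the vectors would come from a trained embedding model, and the relevance score would include the additional features the abstract mentions.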
