A Word Semantic Relatedness Calculation Model Based on a Semantic Relationship Graph
  • English title: A Model for Calculating Semantic Relatedness of Words Considering Semantic Relationship Graph
  • Authors: ZHANG Yang-Sen (张仰森); ZHENG Jia (郑佳); LI Jia-Yuan (李佳媛); Institute of Intelligent Information Processing, Beijing Information Science and Technology University
  • Keywords: semantic relatedness; semantic relationship graph; HowNet; dependency semantic relation; semantic similarity
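  • Chinese journal title: MOTO
  • English journal title: Acta Automatica Sinica
  • Institution: Institute of Intelligent Information Processing, Beijing Information Science and Technology University (北京信息科技大学智能信息处理研究所)
  • Publication date: 2018-01-15
  • Publisher: Acta Automatica Sinica (自动化学报)
  • Year: 2018
  • Issue: v.44
  • Funding: Supported by the National Natural Science Foundation of China (61370139, 61602044)
  • Language: Chinese
  • Pages: MOTO201801008
  • Page count: 12
  • CN: 01
  • ISSN: 11-2109/TP
  • Classification number: 89-100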
Abstract
Semantic computation over words is one of the important problems in natural language processing. Current research concentrates on word semantic similarity and pays too little attention to methods for computing word semantic relatedness. To address this, we propose a word semantic relatedness calculation model that combines a semantic dictionary with a corpus. First, taking HowNet and a large-scale corpus as the basis, we formulate semantic relation extraction rules and use them to extract a large number of semantic dependency relations. Then, we construct a semantic relationship graph in which these relations are stored as semantic relation triples. Finally, we apply graph theory to process the relations in the semantic relationship graph and design a word semantic relatedness calculation model based on that graph. Experimental results show that the proposed model performs well on word semantic relatedness computation: its Spearman rank correlation coefficient on the WordSimilarity-353 dataset reaches 0.5358, a significant improvement in computing the semantic relatedness of Chinese words.
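The pipeline outlined in the abstract (relation triples, a graph built from them, a graph-based relatedness score, and evaluation by Spearman rank correlation) can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not give their extraction rules or scoring formula, so the score `1 / (1 + hops)`, the function names, and the sample triples and relation labels below are all assumptions chosen only for illustration.

```python
from collections import defaultdict, deque


def build_graph(triples):
    """Store (head, relation, tail) triples as an undirected adjacency map."""
    graph = defaultdict(set)
    for head, _relation, tail in triples:
        graph[head].add(tail)
        graph[tail].add(head)
    return graph


def shortest_distance(graph, source, target):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    if source == target:
        return 0
    if source not in graph or target not in graph:
        return None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def relatedness(graph, w1, w2):
    """Assumed scoring rule: 1 / (1 + hops), and 0.0 when no path exists."""
    dist = shortest_distance(graph, w1, w2)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rho as the Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical triples; the relation labels are illustrative only.
TRIPLES = [
    ("doctor", "agent", "treat"),
    ("treat", "patient", "patient"),
    ("hospital", "location", "treat"),
    ("apple", "isa", "fruit"),
]
GRAPH = build_graph(TRIPLES)
```

In an evaluation of the kind the abstract reports, `relatedness` would be computed for each word pair in a benchmark and `spearman` applied to the model scores and the human ratings. Treating the graph as undirected and unweighted is a simplification; a fuller model would presumably weight edges by relation type or frequency.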
