Enhancing the sentence similarity measure by semantic and syntactico-semantic knowledge

Authors: Wafa Wali; Bilel Gargouri; Abdelmajid Ben Hamadou
Keywords: Sentence similarity; Lexical semantic knowledge; Syntactico-semantic knowledge; LMF; ISO 24613; Standardized dictionaries
Journal: Vietnam Journal of Computer Science
Publication year: 2017
Publication date: February 2017
Volume: 4
Issue: 1
Pages: 51-60
Full-text size: 2289 KB
Journal category: Information Systems and Communication Service; Artificial Intelligence (incl. Robotics); Computer Ap
Journal subjects: Information Systems and Communication Service; Artificial Intelligence (incl. Robotics); Computer Applications; e-Commerce/e-business; Computer Systems Organization and Communication Networks; Computa
Publisher: Springer Berlin Heidelberg
ISSN: 2196-8896

Abstract

Measuring sentence similarity is useful in various research fields, such as artificial intelligence, knowledge management, and information retrieval. Several methods have been proposed to measure sentence similarity based on syntactic and/or semantic knowledge. Most of these proposals are evaluated on English sentences, and their accuracy can decrease when they are applied to other languages. Moreover, their results are unsatisfactory because much relevant semantic knowledge, such as semantic class and thematic role, and syntactico-semantic knowledge, such as semantic predicates, is not taken into account. Admittedly, this kind of knowledge is rare in most lexical resources. Recently, the International Organization for Standardization (ISO) published the Lexical Markup Framework (LMF), ISO 24613, a standard for the development of lexical resources. This standard provides, for each meaning of a lexical entry, all the semantic and syntactico-semantic knowledge in a fine-grained structure. Taking advantage of the availability of LMF-standardized dictionaries, we propose in this paper a generic method that enhances the measure of sentence similarity by applying semantic and syntactico-semantic knowledge. An experiment was carried out on Arabic, because this language is processed within our research team and an LMF-standardized Arabic dictionary is available in which the semantic and syntactico-semantic knowledge is accessible and well structured. The experiments yielded better results, showing a high correlation with human ratings.
