Extract conceptual graphs from plain texts in patent claims
详细信息查看全文 | 推荐本文 |
摘要
This paper develops techniques to extract conceptual graphs from a patent claim using syntactic information (POS, and dependency tree) and semantic information (background ontology). Due to plenteous technical domain terms and lengthy sentences prevailing in patent claims, it is difficult to apply a NLP Parser directly to parse the plain texts in the patent claim. This paper combines techniques such as finite state machines, Part-Of-Speech tags, conceptual graphs, domain ontology and dependency tree to convert a patent claim into a formally defined conceptual graph. The method of a finite state machine splits a lengthy patent claim sentence into a set of shortened sub-sentences so that the NLP Parser can parse them one by one effectively. The Part-Of-Speech and dependency tree of a patent claim are used to build the conceptual graph based on the pre-established domain ontology. The result shows that 99%sub-sentences split from 1700 patent claims can be efficiently parsed by the NLP Parser. There are two types of nodes in a conceptual graph, the concept and the relation nodes. Each concept or relation can be extracted directly from a patent claim and each relation can link with a fixed number of concepts in a conceptual graph. From 100 patent claims, the average precision and recall of a concept class mapping from the patent claim to domain ontology are 96%and 89%, respectively, and the average precision and recall for Real relation class mapping are 97%and 98%, respectively. For the concept linking of a relation, the average precision is 79%. Based on the extracted conceptual graphs from patents, it would facilitate automated comparison and summarization among patents for judgment of patent infringement.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700