Experts community memory for entity similarity functions recommendation
Abstract
Similarity search (or similar entity search) is the process of finding all entities similar to a given entity (e.g., a person, a document, or an image). Although many techniques for similarity analysis have been proposed, little work has addressed which of these techniques is most suitable for a given similarity search task. Choosing the right similarity function is important because the task is highly domain- and data-dependent. In this article, we provide an approach for recommending which similarity functions (e.g., edit distance or Jaccard similarity) should be used for measuring the similarity between two entities. The approach employs an incremental knowledge acquisition technique to capture domain experts' knowledge about similarity functions and their usage contexts (e.g., entity class, attribute name, and keywords). In addition, for situations where domain experts have little or no knowledge about the datasets, we analyze the features of the datasets and suggest similarity functions based on the identified features. We also demonstrate the feasibility and effectiveness of the proposed approach on several real-world datasets from different domains.
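
To make the two similarity functions named in the abstract concrete, the following is a minimal Python sketch of edit distance (Levenshtein) and token-based Jaccard similarity, together with a toy lookup keyed by usage context. The `RULES` table, the `recommend` function, and the example contexts are hypothetical and introduced purely for illustration; they are not the knowledge acquisition technique described in the article.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace-separated token sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # convention chosen here: two empty inputs are identical
    return len(ta & tb) / len(ta | tb)


# Hypothetical context-to-function rules (assumption, not from the article).
RULES = {
    ("person", "name"): "edit_distance",
    ("document", "title"): "jaccard_similarity",
}


def recommend(entity_class: str, attribute: str,
              default: str = "jaccard_similarity") -> str:
    """Return the rule for this context, or a default when none matches."""
    return RULES.get((entity_class.lower(), attribute.lower()), default)
```

In this sketch, a short attribute such as a person's name is matched at the character level, while a longer attribute such as a document title is compared as a set of tokens. A context with no matching rule falls back to the default.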