Exploiting Multiple Resources for Word-Phrase Semantic Similarity Evaluation
  • Authors: Xiaoqiang Jin (21)
    Chengjie Sun (21)
    Lei Lin (21)
    Xiaolong Wang (21)
  • Keywords: word; phrase semantic similarity; Support Vector Machine
  • Publication title: Lecture Notes in Computer Science
  • Publication year: 2014
  • Volume: 8801
  • Issue: 1
  • Pages: 46-57
  • Full-text size: 196 KB
  • Author affiliations: Xiaoqiang Jin (21)
    Chengjie Sun (21)
    Lei Lin (21)
    Xiaolong Wang (21)

    21. School of Computer Science and Technology, Harbin Institute of Technology, China
  • ISSN: 1611-3349
Abstract
Previous research on computing semantic similarity has mainly focused on documents, sentences, or concepts. In this paper, we study the semantic similarity of words and compositional phrases. The task is to judge whether a word and a short sequence of words are semantically similar. Drawing on a structured resource (WordNet), a semi-structured resource (Wikipedia), and an unstructured resource (the Web), we extract a rich set of features to represent each word-phrase pair. We treat the task as a binary classification problem and use a Support Vector Machine to decide, for a given word-phrase pair, whether the word and the phrase are similar. Experiments are conducted on SemEval 2013 Task 5a. Our method achieves an accuracy of 82.9%, outperforming the best system that participated in the task (80.3%). These results demonstrate the effectiveness of the proposed approach.
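
The classification setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three features (`wordnet_sim`, `wiki_sim`, `web_sim`) and every numeric value are hypothetical, and a plain linear SVM trained by batch subgradient descent on the hinge loss stands in for whatever SVM configuration the paper actually used.

```python
# Illustrative only: each word-phrase pair is represented by three hypothetical
# similarity scores in [0, 1]: (wordnet_sim, wiki_sim, web_sim).
# All values below are made up for the sketch; they are not from the paper.
TRAIN_X = [
    [0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.7, 0.7, 0.8],   # similar pairs
    [0.1, 0.2, 0.3], [0.2, 0.1, 0.1], [0.3, 0.2, 0.2],   # dissimilar pairs
]
TRAIN_Y = [1, 1, 1, -1, -1, -1]  # +1 = similar, -1 = not similar


def decision_value(w, b, x):
    """Signed score of the linear classifier: w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=2000):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of the L2 regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Only examples inside the margin contribute to the hinge subgradient.
            if yi * decision_value(w, b, xi) < 1:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (similar) or -1 (not similar) for one feature vector."""
    return 1 if decision_value(w, b, x) >= 0 else -1


w, b = train_linear_svm(TRAIN_X, TRAIN_Y)
train_predictions = [predict(w, b, x) for x in TRAIN_X]
```

In practice the feature vector would be much richer than three scores, and a library SVM with a tuned kernel and regularisation would likely replace the hand-written trainer. The sketch only shows how a word-phrase pair reduces to a feature vector with a binary label.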
