A novel word embedding learning model using the dissociation between nouns and verbs
Abstract
In recent years, there has been research on using semantic knowledge and global statistical features to guide the learning of word embeddings. Although syntactic knowledge also plays a very important role in natural language understanding, its effect on word embedding learning has not yet been well investigated. Inspired by the dissociation between nouns and verbs (DNV) in language acquisition observed in neuropsychology, we propose a novel model for learning word embeddings using DNV, named the Continuous Dissociation between Nouns and Verbs Model (CDNV). CDNV uses a three-layer feed-forward neural network to integrate DNV, derived from automatically tagged noun/verb information, into the word embedding learning process while still preserving the word order of the local context. The advantage of CDNV is that it can learn high-quality word embeddings with relatively low time complexity. Experimental results show that: (1) CDNV takes about 1.5 h to learn word embeddings on a corpus of billions of words, which is comparable with CBOW and Skip-gram and more efficient than the other models; (2) the nearest neighbors of some representative words derived from the word embeddings learned by CDNV are more reasonable than those derived from other word embeddings; (3) the improvement in F1 measure obtained with CDNV word embeddings on NER and chunking is greater than that obtained with other word embeddings.
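To make the idea of a feed-forward network that combines noun/verb tags with an order-preserving context more concrete, the following is a minimal illustrative sketch in plain Python. It is not the CDNV architecture itself, whose details are not given in this abstract. The class name `TinyTaggedContextModel`, the tag set `TAGS`, the layer sizes, and the choice to add a tag vector to each word vector before concatenation are all assumptions made for illustration.

```python
import math
import random

# Hypothetical tag set: nouns and verbs are distinguished from everything else.
TAGS = {"noun": 0, "verb": 1, "other": 2}


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyTaggedContextModel:
    """Input layer -> hidden layer -> softmax output (assumed layout)."""

    def __init__(self, vocab_size, dim=4, window=2, hidden=8, seed=0):
        rng = random.Random(seed)
        self.window = window
        self.emb = make_matrix(vocab_size, dim, rng)      # word embeddings
        self.tag_emb = make_matrix(len(TAGS), dim, rng)   # noun/verb/other vectors
        in_dim = 2 * window * dim                         # one slot per context position
        self.w_hidden = make_matrix(hidden, in_dim, rng)
        self.w_out = make_matrix(vocab_size, hidden, rng)

    def input_layer(self, context_ids, context_tags):
        """Concatenate (word + tag) vectors position by position.

        Concatenation, unlike averaging, keeps the order of the context words.
        """
        x = []
        for wid, tag in zip(context_ids, context_tags):
            word_vec = self.emb[wid]
            tag_vec = self.tag_emb[TAGS[tag]]
            x.extend(w + t for w, t in zip(word_vec, tag_vec))
        return x

    def forward(self, context_ids, context_tags):
        """Return a probability distribution over the vocabulary for the center word."""
        x = self.input_layer(context_ids, context_tags)
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.w_hidden]
        scores = [sum(w * hi for w, hi in zip(row, h)) for row in self.w_out]
        return softmax(scores)


# Usage: four context word ids (two on each side) with their auto-assigned tags.
model = TinyTaggedContextModel(vocab_size=5)
probs = model.forward([0, 1, 3, 4], ["noun", "verb", "other", "noun"])
```

In this sketch the forward pass costs one hidden-layer and one output-layer matrix product per example; training, and any output-layer speed-up such as negative sampling, is left out.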
