N3LDG: A Lightweight Deep Learning Library for Natural Language Processing
  • Title (English): N3LDG: A Lightweight Neural Network Library for Natural Language Processing
  • Authors: WANG Qiansheng (王潜升); YU Nan (余南); ZHANG Meishan (张梅山); HAN Zijia (韩子嘉); FU Guohong (付国宏)
  • Affiliation: School of Computer Science and Technology, Heilongjiang University
  • Keywords: deep learning library; natural language processing (NLP); lightweight; CUDA
  • Journal: 北京大学学报(自然科学版) (Acta Scientiarum Naturalium Universitatis Pekinensis), CNKI journal code BJDZ
  • Online publication date: 2018-08-22
  • Year: 2019
  • Volume/Issue: Vol. 55, No. 291 (Issue 01)
  • Pages: 116-122 (7 pages)
  • CN: 11-2442/N
  • Funding: National Natural Science Foundation of China (61672211, 61602160); Natural Science Foundation of Heilongjiang Province (F2016036)
  • Language: Chinese
  • CNKI record ID: BJDZ201901015
Abstract
We propose N3LDG, a lightweight deep learning library for natural language processing. N3LDG supports constructing computation graphs dynamically and automatically batches their execution. Experiments show that N3LDG constructs and executes computation graphs efficiently when training convolutional neural networks (CNN), bidirectional LSTMs (Bi-LSTM), and tree-structured LSTMs (Tree-LSTM). When training these models on CPU, N3LDG is faster than PyTorch; when training the CNN and Tree-LSTM models on GPU, N3LDG is again faster than PyTorch.
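To make the two core ideas concrete, the sketch below shows in C++ how a dynamically built computation graph can be batched automatically: nodes are grouped by graph depth and operation type, and each group is then executed as one conceptual kernel call. This is only a minimal illustration of the technique the abstract describes, not N3LDG's actual API; every name in it (Node, Graph, add, execute) is hypothetical.

    #include <algorithm>
    #include <cstdio>
    #include <map>
    #include <memory>
    #include <string>
    #include <vector>

    // One node of a dynamically constructed computation graph.
    struct Node {
        std::string op;             // operation type, e.g. "lookup", "tanh"
        std::vector<Node*> inputs;  // dependencies recorded at construction time
        int depth = 0;              // longest path from an input leaf
        bool computed = false;
    };

    struct Graph {
        std::vector<std::unique_ptr<Node>> nodes;

        // Dynamically append a node; depth = 1 + max depth of its inputs.
        Node* add(const std::string& op, std::vector<Node*> inputs) {
            auto n = std::make_unique<Node>();
            n->op = op;
            n->inputs = std::move(inputs);
            for (Node* in : n->inputs)
                n->depth = std::max(n->depth, in->depth + 1);
            nodes.push_back(std::move(n));
            return nodes.back().get();
        }

        // Execute level by level; within a level, nodes sharing the same
        // operation type are collected and run as a single batched call.
        void execute() {
            std::map<int, std::map<std::string, std::vector<Node*>>> levels;
            for (auto& n : nodes) levels[n->depth][n->op].push_back(n.get());
            for (auto& [depth, groups] : levels)
                for (auto& [op, batch] : groups) {
                    std::printf("depth %d: one batched call for %zu '%s' nodes\n",
                                depth, batch.size(), op.c_str());
                    for (Node* n : batch) n->computed = true;  // toy "compute"
                }
        }
    };

    int main() {
        Graph g;
        // Three independent token embeddings each feeding a tanh layer:
        // same depth and same type, so each level batches into one call.
        Node* e1 = g.add("lookup", {});
        Node* e2 = g.add("lookup", {});
        Node* e3 = g.add("lookup", {});
        g.add("tanh", {e1});
        g.add("tanh", {e2});
        g.add("tanh", {e3});
        g.execute();
    }

In a real implementation, each batched group would launch one vectorized CPU routine or one CUDA kernel over the group's concatenated inputs instead of one call per node; that per-batch execution is the mechanism behind the training-speed comparisons summarized above.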
References
[1]Young T,Hazarika D,Poria S,et al.Recent trends in deep learning based natural language processing[EB/OL].(2017-08-09)[2018-04-01].https://arxiv.org/abs/1708.02709
    [2]The Theano Development Team,Al-Rfou R,Alain G,et al.Theano:a python framework for fast computation of mathematical expressions[EB/OL].(2016-03-09)[2018-04-01].https://arxiv.org/abs/1605.02688
    [3]Yu D,Eversole A,Seltzer M,et al.An introduction to computational networks and the computational network toolkit.Microsoft Technical Report MSR-TR-2014-112.Singapore,2014
    [4]Jia Y,Shelhamer E,Donahue J,et al.Caffe:convolutional architecture for fast feature embedding//Proceedings of the 22nd ACM international conference on Multimedia.Orlando,2014:675-678
    [5]Abadi M,Barham P,Chen J,et al.Tensorflow:a system for large-scale machine learning//OSDI.Savannah,2016:265-283
    [6]Paszke A,Gross S,Chintala S,et al.Automatic differentiation in pytorch//NIPS 2017 Autodiff Workshop:The Future of Gradient-based Machine Learning Software and Techniques.Long Beach,2017:1-4
    [7]Bahdanau D,Cho K,Bengio Y.Neural machine translation by jointly learning to align and translate[EB/OL].(2014-09-01)[2018-04-01].https://arxiv.org/abs/1409.0473
    [8]Chen Z,Droppo J,Li J,et al.Progressive joint modeling in unsupervised single-channel overlapped speech recognition.IEEE/ACM Transactions on Audio,Speech and Language Processing(TASLP),2018,26(1):184-196
    [9]Simonyan K,Zisserman A.Very deep convolutional networks for large-scale image recognition[EB/OL].(2014-09-04)[2018-04-01].https://arxiv.org/abs/1409.1556
    [10]Chen L C,Papandreou G,Kokkinos I,et al.Deeplab:semantic image segmentation with deep convolutional nets,atrous convolution,and fully connected CRFs.IEEE Transactions on Pattern Analysis and Machine Intelligence,2018,40(4):834-848
    [11]Looks M,Herreshoff M,Hutchins D L,et al.Deep learning with dynamic computation graphs[EB/OL].(2017-02-07)[2018-04-01].https://arxiv.org/abs/1702.02181
    [12]Zhang M,Yang J,Teng Z,et al.LibN3L:a lightweight package for neural NLP//LREC.Paris,2016:225-229
    [13]Neubig G,Goldberg Y,Dyer C.On-the-fly operation batching in dynamic computation graphs[EB/OL].(2017-05-22)[2018-04-01].https://arxiv.org/abs/1705.07860
    [14]Neubig G,Dyer C,Goldberg Y,et al.Dynet:the dynamic neural network toolkit[EB/OL].(2017-01-15)[2018-04-01].https://arxiv.org/abs/1701.03980
    [15]Chetlur S,Woolley C,Vandermersch P,et al.Cudnn:efficient primitives for deep learning[EB/OL].(2014-10-03)[2018-04-01].https://arxiv.org/abs/1410.0759
    [16]Guennebaud G,Jacob B.Eigen[EB/OL].[2018-04-01].http://eigen.tuxfamily.org
    [17]Knowlton K C.A fast storage allocator.Communications of the ACM,1965,8(10):623-624
    [18]Fatica M,LeGresley P,Buck I,et al.High performance computing with CUDA.Tutorial in IEEE Supercomputing,2007,18(6):397-412
    [19]Tai K S,Socher R,Manning C D.Improved semantic representations from tree-structured long short-term memory networks[EB/OL].(2015-02-28)[2018-04-01].https://arxiv.org/abs/1503.00075
