Learning Multiple Timescales in Recurrent Neural Networks
  • Keywords: Recurrent Neural Networks; Sequence learning; Multiple timescales; Leaky activation; Clocked activation
  • Series: Lecture Notes in Computer Science
  • Year: 2016
  • Volume: 9886
  • Pages: 132-139
  • References:
    1. Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5(2), 157–166 (1994)
    2. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
    3. Koutník, J., Greff, K., Gomez, F., Schmidhuber, J.: A clockwork RNN. In: Proceedings of ICML-2014, pp. 1863–1871 (2014)
    4. Jaeger, H., Lukoševičius, M., Popovici, D., Siewert, U.: Optimization and applications of echo state networks with leaky-integrator neurons. Neural Netw. 20(3), 335–352 (2007)
    5. Bengio, Y., Boulanger-Lewandowski, N., Pascanu, R.: Advances in optimizing recurrent networks. In: Proceedings of ICASSP-2013, pp. 8624–8628 (2013)
    6. Wermter, S., Panchev, C., Arevian, G.: Hybrid neural plausibility networks for news agents. In: Proceedings of AAAI-1999, pp. 93–98 (1999)
    7. Pascanu, R., Gulcehre, C., Cho, K., Bengio, Y.: How to construct deep recurrent neural networks. arXiv preprint arXiv:1312.6026v5 (2014)
    8. Mikolov, T., Joulin, A., Chopra, S., Mathieu, M., Ranzato, M.: Learning longer memory in recurrent neural networks. arXiv preprint arXiv:1412.7753v2 (2015)
    9. Wermter, S.: Hybrid Connectionist Natural Language Processing. Chapman and Hall, Thompson International, London (1995)
    10. Arevian, G., Panchev, C.: Robust text classification using a hysteresis-driven extended SRN. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D.P. (eds.) ICANN 2007. LNCS, vol. 4669, pp. 425–434. Springer, Heidelberg (2007)
    11. Graves, A.: Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850 (2013)
    12. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of AISTATS-2010, pp. 249–256 (2010)
    13. Jozefowicz, R., Zaremba, W., Sutskever, I.: An empirical exploration of recurrent network architectures. In: Proceedings of ICML-2015, pp. 2342–2350 (2015)
    14. Kullback, S.: Information Theory and Statistics. Wiley, New York (1959)
  • Authors: Tayfun Alpay, Stefan Heinrich, Stefan Wermter
  • Affiliation: Department of Informatics, Knowledge Technology, University of Hamburg, Vogt-Kölln-Straße 30, 22527 Hamburg, Germany
  • Book title: Artificial Neural Networks and Machine Learning – ICANN 2016
  • ISBN: 978-3-319-44778-0
  • Subject category: Computer Science
  • Subject topics: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
Abstract
Recurrent Neural Networks (RNNs) are powerful architectures for sequence learning. Recent advances in mitigating the vanishing gradient problem have led to improved results and increased research interest. Among recent proposals are architectural innovations that allow the emergence of multiple timescales during training. This paper explores a number of such architectures on sequence generation and prediction tasks with long-term relationships. We compare the Simple Recurrent Network (SRN) and Long Short-Term Memory (LSTM) with the recently proposed Clockwork RNN (CWRNN), Structurally Constrained Recurrent Network (SCRN), and Recurrent Plausibility Network (RPN) with regard to their capability to learn multiple timescales. Our results show that partitioning hidden layers under distinct temporal constraints enables the learning of multiple timescales, which contributes to the understanding of the fundamental conditions that allow RNNs to self-organize towards accurate temporal abstractions.
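For illustration, the following minimal NumPy sketch (not the authors' code; all function names, weights, and dimensions are assumptions) contrasts the two timescale mechanisms named in the keywords: leaky activation, where a mixing factor alpha sets each unit's effective timescale, and clocked activation in the style of the Clockwork RNN, where hidden modules are recomputed only at fixed periods and otherwise hold their previous state.

```python
# Illustrative sketch of the two timescale mechanisms discussed above;
# not the paper's implementation.
import numpy as np

def leaky_step(h_prev, x, W_x, W_h, b, alpha=0.1):
    """Leaky-integrator update (cf. reference 4): alpha in (0, 1] sets the
    unit's timescale. Small alpha yields slow units that integrate over
    long histories; alpha = 1 recovers the standard SRN update."""
    return (1.0 - alpha) * h_prev + alpha * np.tanh(W_x @ x + W_h @ h_prev + b)

def clockwork_step(h_prev, x, W_x, W_h, b, periods, t):
    """Clocked update (cf. reference 3): the hidden state is partitioned
    into modules; module i is recomputed only when t is divisible by its
    period and otherwise carries its previous value forward. For brevity
    each module is a single unit, and the CWRNN's additional constraint
    that recurrent connections run only from slower to faster modules is
    omitted here."""
    h_cand = np.tanh(W_x @ x + W_h @ h_prev + b)   # candidate full update
    mask = (t % np.asarray(periods)) == 0           # which modules tick at t
    return np.where(mask, h_cand, h_prev)

# Toy usage: four modules with exponentially increasing periods.
rng = np.random.default_rng(0)
H, D = 4, 3
W_x = 0.1 * rng.normal(size=(H, D))
W_h = 0.1 * rng.normal(size=(H, H))
b, h = np.zeros(H), np.zeros(H)
for t, x in enumerate(rng.normal(size=(16, D)), start=1):
    h = clockwork_step(h, x, W_x, W_h, b, periods=[1, 2, 4, 8], t=t)
```

Both mechanisms slow a unit's effective dynamics, either continuously (small alpha) or discretely (long periods), which is what lets a partitioned hidden layer represent several timescales at once.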
