A study of common optimization algorithms for deep learning (深度学习常用优化算法研究)
  • Author: Jia Tong (贾桐)
  • Affiliation: National Computer System Engineering Research Institute of China (华北计算机系统工程研究所)
  • Keywords: optimization algorithm; machine learning; deep learning
  • Journal: Information Technology and Network Security (信息技术与网络安全); journal code: WXJY
  • Publication date: 2019-07-10
  • Year/Issue: 2019, No.07 (v.38; No.507)
  • Language: Chinese
  • Pages: 46-50 (5 pages)
  • CN: 10-1543/TP
  • Database record ID: WXJY201907008
Abstract
With the rapid growth in the computing power of Central Processing Units (CPUs) and Graphics Processing Units (GPUs), and the exponential increase in the scale of data collected, deep learning has flourished, demonstrating strong capabilities in areas such as image recognition, natural language understanding, and speech recognition. Training a deep learning model is usually formulated as an unconstrained optimization problem, which must then be solved by a concrete optimization algorithm. Through comparative analysis, this paper introduces the heuristic optimization algorithms based on stochastic gradient descent that are commonly used in deep learning, compares their advantages and disadvantages, and summarizes practical tips and caveats for their use.
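The full text is not included in this record. Purely as an illustration of the SGD-family heuristics the abstract refers to (and not code taken from the paper), the sketch below contrasts a plain stochastic gradient descent step with an Adam step on a toy unconstrained problem; all function names and hyperparameter values are assumptions chosen for the demo.

```python
import numpy as np

def sgd_step(w, grad, lr=0.1):
    """Vanilla SGD: move against the (stochastic) gradient."""
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: momentum (m) plus per-coordinate step scaling (v)."""
    m = beta1 * m + (1 - beta1) * grad       # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad**2    # second-moment estimate
    m_hat = m / (1 - beta1**t)               # bias correction for early steps
    v_hat = v / (1 - beta2**t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Toy unconstrained problem: minimize f(w) = ||w||^2, whose gradient is 2w.
w_sgd = w_adam = np.array([3.0, -2.0])
m = v = np.zeros_like(w_adam)
for t in range(1, 101):
    w_sgd = sgd_step(w_sgd, 2 * w_sgd)
    w_adam, m, v = adam_step(w_adam, 2 * w_adam, m, v, t)
print(w_sgd, w_adam)  # both approach the minimizer [0, 0]
```

In real training the gradient would come from a minibatch of data rather than a closed-form function; the point of the comparison is that Adam combines the two ideas this kind of survey typically covers, momentum and per-parameter adaptive learning rates, on top of the basic SGD update.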