An Initialization Algorithm for GMMHMM Hidden States and Gaussian Mixture Components

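English title (as published): An Initialization Algorithm for GMMHMM's Implicit State and Gauss Mixture Components
Authors: 张军超; 蒋强荣
Authors (English): ZHANG Jun-chao; JIANG Qiang-rong (College of Computer, Beijing University of Technology)
Keywords: hidden Markov model; GMM mixture components; hidden states; self-adaptation
Keywords (English, as published): hidden Markov model; GMM mixed component; implicit state; self-adaption
Journal code: RJDK
Journal: Software Guide
Institution: College of Computer, Beijing University of Technology
Publication date: 2018-12-26 13:27
Publisher: Software Guide (软件导刊)
Year: 2019
Volume/issue: v.18; No.195
Funding: Key Discipline Development Project in Computer Application Technology, Beijing University of Technology (040000514118029)
Language: Chinese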
Record ID: RJDK201901020
Page count: 5
Issue of the year: 01
CN: 42-1671/TP
Pages: 87-91

Abstract

Conventional applications of the hidden Markov model usually treat the number of hidden states and the number of mixture components as equal. To remove this limitation, describe the problem more objectively, fit the model to the actual data distribution, and set the parameters more precisely so that the algorithm performs as well as possible, this paper proposes an algorithm based on Gaussian mixture distributions, clustering, and the OEHS criterion that adapts to the data distribution and determines the parameters automatically. Hidden Markov models are trained with the EM algorithm, but EM converges only to a local optimum and depends heavily on its initial values; the two parameters are therefore given initial settings chosen with the aim of escaping local optima. In experiments comparing the method with conventional random initialization, the proposed algorithm obtains better results.
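
The abstract's point that EM depends on its starting values can be illustrated with a small sketch. This is not the paper's algorithm: it does not reproduce the OEHS criterion, it fits a plain one-dimensional Gaussian mixture rather than a full GMMHMM, and the component count is fixed by hand. The helper names `kmeans_1d` and `em_gmm_1d`, the synthetic data, and the variance floor are all assumptions made for illustration. The sketch only compares EM started from randomly chosen data points with EM started from cluster centres.

```python
import math
import random


def kmeans_1d(xs, k, iters=20):
    """Plain 1-D k-means; returns k centres (used here only to seed EM)."""
    ordered = sorted(xs)
    # Deterministic seeding: spread the initial centres over the sorted data.
    centres = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: abs(x - centres[c]))
            groups[nearest].append(x)
        # Keep the old centre if a cluster happens to be empty.
        centres = [sum(g) / len(g) if g else centres[j] for j, g in enumerate(groups)]
    return centres


def log_normal(x, mu, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def em_gmm_1d(xs, means, iters=50, var_floor=1e-3):
    """EM for a 1-D Gaussian mixture, started from the given means.

    Returns (weights, means, variances, log-likelihood per iteration).
    """
    k, n = len(means), len(xs)
    centre = sum(xs) / n
    spread = sum((x - centre) ** 2 for x in xs) / n
    weights, means, variances = [1.0 / k] * k, list(means), [spread] * k
    history = []
    for _ in range(iters):
        # E-step: responsibilities, computed with log-sum-exp for stability.
        resp, total = [], 0.0
        for x in xs:
            logs = [math.log(weights[j]) + log_normal(x, means[j], variances[j])
                    for j in range(k)]
            top = max(logs)
            norm = top + math.log(sum(math.exp(v - top) for v in logs))
            total += norm
            resp.append([math.exp(v - norm) for v in logs])
        history.append(total)
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)  # guard against empty components
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj,
                var_floor,
            )
    return weights, means, variances, history


# Synthetic data (an assumption): three well-separated 1-D clusters.
rng = random.Random(0)
data = [rng.gauss(c, 1.0) for c in (-5.0, 0.0, 5.0) for _ in range(100)]

random_start = rng.sample(data, 3)   # conventional random initialization
clustered_start = kmeans_1d(data, 3)  # clustering-based initialization

for name, start in (("random", random_start), ("k-means", clustered_start)):
    w, mu, var, hist = em_gmm_1d(data, start)
    print(name, "final log-likelihood:", round(hist[-1], 2),
          "means:", [round(m, 2) for m in sorted(mu)])
```

Comparing the two printed log-likelihoods over several random seeds gives a rough feel for how much the starting point matters; on easy data like this the two runs may well coincide, and the gap tends to grow as clusters overlap or the number of components increases.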

