Research on Several Problems in the Algorithmic Theory of High-Order Hidden Markov Models
Abstract
The hidden Markov model (HMM), a statistical model built on a doubly stochastic process, rests on a reliable foundation in probability and statistics and has a powerful mathematical structure. It has been widely applied in speech recognition, character recognition, biological sequence analysis, financial data analysis, image processing, computer vision, and other fields, but these applications are based mainly on the algorithms and theory of the first-order HMM. The first-order model assumes that both the state transition probabilities and the symbol emission probabilities depend only on the current state and are independent of earlier states and previously emitted symbols. Although this assumption is effective to some extent in practice and simplifies the associated algorithmic theory, it also leaves the first-order HMM with certain deficiencies: it cannot describe some real processes, such as speech and the ordering of DNA bases, with sufficient accuracy.
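In the notation of Rabiner's tutorial [26], with hidden states q_t taking values in {S_1, ..., S_N} and observations o_t, these first-order assumptions read

\[ a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i, q_{t-1}, \ldots, q_1) = P(q_{t+1} = S_j \mid q_t = S_i), \]
\[ b_j(o_t) = P(o_t \mid q_t = S_j, q_{t-1}, \ldots, o_{t-1}, \ldots) = P(o_t \mid q_t = S_j). \]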
To overcome these deficiencies, researchers have proposed improvements to the first-order HMM from several perspectives. One of them is the high-order HMM, obtained by allowing the state transition probabilities or the symbol emission probabilities to depend on states further back in time. Compared with the first-order model, a high-order HMM has a richer temporal structure and can incorporate more statistical characteristics; it thus compensates, to a degree, for the deficiencies of the first-order HMM and can describe some real processes more accurately. Despite this potential, high-order HMMs have so far found few applications. One reason is that their algorithmic theory is still incomplete and effective algorithms are lacking. For high-order HMMs to be widely applied, this theory must be further developed and refined.
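For a model of order k, for instance, the extended transition probability is commonly written as

\[ a_{i_1 i_2 \cdots i_k j} = P(q_{t+1} = S_j \mid q_t = S_{i_k}, q_{t-1} = S_{i_{k-1}}, \ldots, q_{t-k+1} = S_{i_1}), \]

so that each transition is conditioned on the k most recent states rather than on the current state alone; the emission probabilities can be extended analogously.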
There are two basic approaches to studying the algorithms of high-order HMMs. The first is the extension approach: derive the high-order algorithms directly, following the principles and techniques of the first-order algorithms. The second is the model reduction approach: transform the high-order HMM into an equivalent first-order HMM and then build the high-order algorithms with the standard first-order techniques. This dissertation uses both approaches to study several important problems in the algorithmic theory of high-order HMMs, further develops and refines that theory, and thereby lays a theoretical foundation for practical applications of high-order HMMs. The main results are as follows:
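To illustrate the model reduction idea in the simplest setting, the sketch below folds a k-th order transition tensor over n states into the transition matrix of a first-order chain whose composite states are k-tuples of consecutive original states. This is a minimal illustration under generic assumptions, not the dissertation's (or Hadar's) construction; the function name and array layout are ours.

    from itertools import product
    import numpy as np

    def reduce_to_first_order(A_high, n, k):
        """Fold a k-th order transition tensor A_high[i_1, ..., i_k, j],
        giving P(next state j | previous k states i_1..i_k) over n states,
        into a first-order matrix over the n**k composite states
        (tuples of k consecutive original states)."""
        composites = list(product(range(n), repeat=k))    # all k-tuples
        index = {c: s for s, c in enumerate(composites)}  # tuple -> row
        A1 = np.zeros((n ** k, n ** k))
        for c in composites:
            for j in range(n):
                succ = c[1:] + (j,)   # slide the k-state window by one
                A1[index[c], index[succ]] = A_high[c + (j,)]
        return A1, index

Each row of A1 is a probability distribution whenever the corresponding slices of A_high are, and transitions are possible only between composite states whose windows overlap correctly.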
(1) A class of third-order HMMs is proposed in which both the state transition probabilities and the symbol emission probabilities depend on the current state and the two preceding states. Following the principles and techniques of the first-order algorithms, the extension approach is used to define the corresponding forward, backward, and Viterbi variables and to derive algorithms for the three basic problems of the third-order HMM: the forward-backward algorithm for the evaluation problem, the Viterbi algorithm for the decoding problem, and the Baum-Welch algorithms for the learning problem with a single observation sequence and with multiple observation sequences. The multiple observation sequences may be mutually independent or statistically correlated, and combinatorial weights are used to characterize this dependence-independence property, so the multiple-sequence Baum-Welch algorithm applies to a more general training set. To illustrate this generality, two special cases of the algorithm are given, for sequences that are mutually independent and for sequences that are uniformly dependent. In addition, the relationship between third-order and first-order HMMs is studied: a first-order HMM equivalent to the third-order model is constructed, and an equivalence theorem between the two is stated and proved.
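As a concrete sketch of the extended forward recursion, one common convention defines alpha_t(i, j, k) = P(o_1 ... o_t, q_{t-2} = S_i, q_{t-1} = S_j, q_t = S_k); the dissertation's exact variable definitions, initialization, and scaling may differ. Under that convention:

    import numpy as np

    def forward_third_order(A, B, alpha0, obs):
        """Evaluation problem for a third-order HMM (a sketch).
        A[i, j, k, l] = P(next state S_l | last three states S_i, S_j, S_k);
        B[i, j, k, o] = P(symbol o | states S_i, S_j, S_k at times t-2, t-1, t);
        alpha0[i, j, k] = assumed joint probability of the first three states
        and the first three observations; obs holds the remaining observation
        symbols o_4, o_5, ...  Returns P(all observations)."""
        alpha = alpha0
        for o in obs:
            # alpha'[j,k,l] = (sum_i alpha[i,j,k] * A[i,j,k,l]) * B[j,k,l,o]
            alpha = np.einsum('ijk,ijkl->jkl', alpha, A) * B[:, :, :, o]
        return alpha.sum()

The backward and Viterbi variables extend in the same way, with maximization in place of summation for the Viterbi recursion.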
(2) For a class of HMMs of arbitrary order, the model reduction approach, based on Hadar's equivalent transformation method and Li's combinatorial method, is used to establish the Baum-Welch algorithm for training a high-order HMM with multiple observation sequences. The parameter reestimation formulas are obtained and are computed with the standard first-order techniques. As before, the multiple observation sequences may be mutually independent or statistically correlated, and combinatorial weights characterize this dependence-independence property, so the algorithm applies to a more general training set. To illustrate this generality, two special cases of the reestimation formulas are given, for sequences that are mutually independent and for sequences that are uniformly dependent.
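Schematically, and following the abstract's description of Li's combinatorial method [91], the likelihood of a training set O = {O^(1), ..., O^(K)} is written with combinatorial weights w_k, and the reestimated parameters become weighted averages of per-sequence statistics, e.g.

\[ P(O \mid \lambda) = \sum_{k=1}^{K} w_k \, P(O^{(k)} \mid \lambda), \qquad \bar{a}_{uv} = \frac{\sum_{k=1}^{K} w_k \sum_{t} \xi_t^{(k)}(u, v)}{\sum_{k=1}^{K} w_k \sum_{t} \gamma_t^{(k)}(u)}, \]

where xi and gamma are the usual pairwise and single-state occupancy probabilities computed per sequence on the reduced first-order model; the dissertation's exact formulas may differ. Independence and uniform dependence of the sequences then correspond to particular choices of the weights w_k.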
(3) With reference to the concept of mixture models, a class of mixture high-order HMMs is proposed. Following the model reduction approach, the mixture high-order HMM is transformed by Hadar's equivalent transformation method into an equivalent mixture first-order HMM; the Baum-Welch algorithm of the mixture high-order HMM is then built from the Baum-Welch algorithm of the mixture first-order HMM, its parameter reestimation formulas are given, and these formulas are computed with the standard first-order techniques.
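In the usual finite-mixture form (cf. [138]), and consistent with the concept invoked here, a mixture high-order HMM Λ combines component high-order HMMs λ_m with mixing weights c_m:

\[ P(O \mid \Lambda) = \sum_{m=1}^{M} c_m \, P(O \mid \lambda_m), \qquad c_m \ge 0, \quad \sum_{m=1}^{M} c_m = 1. \]

After each component is transformed by Hadar's method, each P(O | λ_m) is computable with first-order techniques, which is what makes the first-order Baum-Welch machinery applicable; the dissertation's exact parameterization may differ.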
(4) For a class of HMMs of arbitrary order, the decoding problem is analyzed with both the extension approach and the model reduction approach. In the extension approach, a Viterbi variable is defined for the high-order HMM and an extended Viterbi algorithm is established from the dynamic programming principle; this algorithm finds the optimal state path of the high-order HMM directly. In the model reduction approach, the high-order HMM is transformed by Hadar's equivalent transformation method into an equivalent first-order HMM, the optimal path of that first-order model is found with the standard Viterbi algorithm, and the optimal path of the original high-order HMM is then recovered indirectly from the relationship between the two optimal paths.
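A minimal log-space sketch of the reduction route: run the standard first-order Viterbi algorithm on the equivalent model, then read the high-order path off the composite states. The names below (and the composites list from the reduction sketch above) are illustrative, not the dissertation's.

    import numpy as np

    def viterbi(logA, logpi, logB_obs):
        """Standard first-order Viterbi in log space (a sketch).
        logA[s, s'] = log transition probability, logpi[s] = log initial
        probability, logB_obs[t, s] = log P(o_t | composite state s)."""
        T, S = logB_obs.shape
        delta = logpi + logB_obs[0]
        psi = np.zeros((T, S), dtype=int)
        for t in range(1, T):
            scores = delta[:, None] + logA      # score ending in each state
            psi[t] = scores.argmax(axis=0)      # best predecessor per state
            delta = scores.max(axis=0) + logB_obs[t]
        path = [int(delta.argmax())]
        for t in range(T - 1, 0, -1):           # backtrack
            path.append(int(psi[t][path[-1]]))
        return path[::-1]

    # Each composite state is a k-tuple of original states, so the original
    # optimal state at time t can be read off as its last component, e.g.:
    #   original_path = [composites[s][-1] for s in path]

This indirect route trades a larger composite state space for the ability to reuse unmodified first-order code; the extended Viterbi algorithm avoids the detour by maximizing over the high-order Viterbi variable directly.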
References
[1] MARKOV A A. Rasprostranenie zakona bol'shih chisel na velichiny, zavisyaschie drug ot druga[J]. Izvestiya Fiziko-matematicheskogo Obschestva pri Kazanskom Universitete, 2-ya Seriya, 1906, Tom 15(94): 135-156.
[2] BASHARIN G P, LANGVILLE A N, NAUMOV V A. The life and work of A. A. Markov[J]. Linear Algebra and its Applications, 2004, 386: 3-26.
[3] BERNSTEIN S N. Sur l'extension du théorème limite du calcul des probabilités etc.[J]. Mathematische Annalen, 1926, Bd. 97: 1-59.
[4] KEMENY J G, SNELL J L. Finite Markov Chains[M]. New Jersey: Van Nostrand, 1960.
[5] STRATONOVICH R L. Conditional Markov processes[J]. Theory of Probability and its Applications, 1960, 5(2): 156-178.
[6] CHUNG K L. Markov Chains with Stationary Transition Probabilities[M]. Berlin: Springer-Verlag, 1967.
[7] IKEDA N, NAGASAWA M, WATANABE S. Branching Markov processes I[J]. Journal of Mathematics of Kyoto University, 1968, 8(2): 233-278.
[8] IKEDA N, NAGASAWA M, WATANABE S. Branching Markov processes II[J]. Journal of Mathematics of Kyoto University, 1968, 8(3): 365-410.
[9] IKEDA N, NAGASAWA M, WATANABE S. Branching Markov processes III[J]. Journal of Mathematics of Kyoto University, 1969, 9(1): 95-160.
[10] ROSENBLATT M. Markov Processes: Structure and Asymptotic Behavior[M]. New Jersey: Springer-Verlag, 1971.
[11] SILVERSTEIN M L. Symmetric Markov Processes[M]. Berlin: Springer-Verlag, 1974.
[12] DYNKIN E B, YUSHKEVICH A A. Controlled Markov Processes[M]. New York: Springer-Verlag, 1979.
[13] JANSSEN J, LIMNIOS N. Semi-Markov Models and Applications[M]. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1999.
[14] STROOCK D W. An Introduction to Markov Processes[M]. Berlin: Springer-Verlag, 2005.
[15] PARDOUX E. Markov Processes and Applications: Algorithms, Networks, Genome and Finance[M]. United Kingdom: John Wiley & Sons, 2008.
[16] BAUM L E, PETRIE T. Statistical inference for probabilistic functions of finite state Markov chains[J]. The Annals of Mathematical Statistics, 1966, 37(6): 1554-1563.
[17] BAUM L E, EAGON J A. An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology[J]. Bulletin of the American Mathematical Society, 1967, 73(3): 360-363.
[18] BAUM L E, SELL G R. Growth transformations for functions on manifolds[J]. Pacific Journal of Mathematics, 1968, 27(2): 211-227.
[19] BAUM L E, PETRIE T, SOULES G, et al. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains[J]. The Annals of Mathematical Statistics, 1970, 41(1): 164-171.
[20] BAUM L E. An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes[J]. Inequalities, 1972, 3(1): 1-8.
[21] LEVINSON S E, RABINER L R, SONDHI M M. An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition[J]. The Bell System Technical Journal, 1983, 62(4): 1035-1074.
[22] BAHL L R, JELINEK F, MERCER R L. A maximum likelihood approach to continuous speech recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1983, PAMI-5(2): 179-190.
[23] RABINER L R, LEVINSON S E. A speaker-independent, syntax-directed, connected word recognition system based on hidden Markov models and level building[J]. IEEE Transactions on Acoustics, Speech and Signal Processing, 1985, ASSP-33(3): 561-573.
[24] LEVINSON S E. Continuously variable duration hidden Markov models for automatic speech recognition[J]. Computer Speech & Language, 1986, 1(1): 29-45.
[25] LEE K F. Automatic Speech Recognition: The Development of the SPHINX System[M]. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1989.
[26] RABINER L R. A tutorial on hidden Markov models and selected applications in speech recognition[J]. Proceedings of the IEEE, 1989, 77(2): 257-286.
[27] LEE K F, HON H W. Speaker-independent phone recognition using hidden Markov models[J]. IEEE Transactions on Acoustics, Speech and Signal Processing, 1989, 37(11): 1641-1648.
[28] LEE K F. Context-dependent phonetic hidden Markov models for speaker-independent continuous speech recognition[J]. IEEE Transactions on Acoustics, Speech and Signal Processing, 1990, 38(4): 599-609.
[29] JUANG B H, RABINER L R. Hidden Markov models for speech recognition[J]. Technometrics, 1991, 33(3): 251-272.
[30] TAO C G. A generalization of discrete hidden Markov model and of Viterbi algorithm[J]. Pattern Recognition, 1992, 25(11): 1381-1387.
[31] VASEGHI S V. State duration modelling in hidden Markov models[J]. Signal Processing, 1995, 41(1): 31-41.
[32] RABINER L R, JUANG B H. Fundamentals of Speech Recognition[M]. Englewood Cliffs, New Jersey: Prentice-Hall, 1993.
[33] (Chinese-language monograph, 1995; remaining bibliographic details garbled beyond recovery in the source.)
[34] GALES M, YOUNG S. The application of hidden Markov models in speech recognition[J]. Foundations and Trends in Signal Processing, 2007, 1(3): 195-304.
[35] KUNDU A, HE Y, BAHL P. Recognition of handwritten word: first and second order hidden Markov model based approach[J]. Pattern Recognition, 1989, 22(3): 283-297.
[36] VLONTZOS J A, KUNG S Y. Hidden Markov models for character recognition[J]. IEEE Transactions on Image Processing, 1992, 1(4): 539-543.
[37] CHEN M Y, KUNDU A, ZHOU J. Off-line handwritten word recognition using a hidden Markov model type stochastic network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1994, 16(5): 481-496.
[38] VELTMAN S R, PRASAD R. Hidden Markov models applied to on-line handwritten isolated character recognition[J]. IEEE Transactions on Image Processing, 1994, 3(3): 314-318.
[39] BUNKE H, ROTH M, SCHUKAT-TALAMAZZINI E G. Off-line cursive handwriting recognition using hidden Markov models[J]. Pattern Recognition, 1995, 28(9): 1399-1413.
[40] HU J Y, BROWN M K, TURIN W. HMM based on-line handwriting recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996, 18(10): 1039-1045.
[41] BALDI P, CHAUVIN Y, HUNKAPILLER T, et al. Hidden Markov models of biological primary sequence information[J]. Proceedings of the National Academy of Sciences of the United States of America, 1994, 91(3): 1059-1063.
[42] HUGHEY R, KROGH A. Hidden Markov models for sequence analysis: extension and analysis of the basic method[J]. Computer Applications in the Biosciences, 1996, 12(2): 95-107.
[43] DURBIN R, EDDY S, KROGH A, et al. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids[M]. Cambridge: Cambridge University Press, 1998.
[44] KOSKI T. Hidden Markov Models for Bioinformatics[M]. Dordrecht, The Netherlands: Kluwer Academic Publishers, 2001.
[45] (Chinese-language journal article, 2001; remaining bibliographic details garbled beyond recovery in the source.)
[46] GU Y H, SHI D H, WANG Y F. Brief introduction to self-adapting hidden Markov model program for multiple sequences alignment[J]. Journal of Shanghai University, 2001, 5(2): 93-95.
[47] BIRNEY E. Hidden Markov models in biological sequence analysis[J]. IBM Journal of Research and Development, 2001, 45(3-4): 449-454.
[48] YOON B J. Hidden Markov models and their applications in biological sequence analysis[J]. Current Genomics, 2009, 10(6): 402-415.
[49] BHAR R, HAMORI S. Hidden Markov Models: Applications to Financial Economics[M]. Dordrecht, The Netherlands: Kluwer Academic Publishers, 2004.
[50] MAMON R, ELLIOTT R. Hidden Markov Models in Finance[M]. New York: Springer, 2007.
[51] RIMEY R D, BROWN C M. Controlling eye movements with hidden Markov models[J]. International Journal of Computer Vision, 1991, 7(1): 47-65.
[52] VSTOVSKY G V, VSTOVSKAYA A V. A class of hidden Markov models for image processing[J]. Pattern Recognition Letters, 1993, 14(5): 391-396.
[53] LIN H C, WANG L L, YANG S N. Color image retrieval based on hidden Markov models[J]. IEEE Transactions on Image Processing, 1997, 6(2): 332-339.
[54] AAS K, EIKVIL L, HUSEBY R B. Applications of hidden Markov chains in image analysis[J]. Pattern Recognition, 1999, 32(4): 703-713.
[55] RABINER L R, JUANG B H. An introduction to hidden Markov models[J]. IEEE ASSP Magazine, 1986, 3(1): 4-16.
[56] EDDY S R. Hidden Markov models[J]. Current Opinion in Structural Biology, 1996, 6(3): 361-365.
[57] EPHRAIM Y, MERHAV N. Hidden Markov processes[J]. IEEE Transactions on Information Theory, 2002, 48(6): 1518-1569.
[58] EDDY S R. What is a hidden Markov model?[J]. Nature Biotechnology, 2004, 22(10): 1315-1316.
[59] CAPPE O, MOULINES E, RYDEN T. Inference in Hidden Markov Models[M]. New York: Springer, 2005.
[60] WANG Y F. Bioinformatics: Intelligent Algorithms and Their Applications[M]. Beijing: Chemical Industry Press, 2006: 84-104.
[61] BILMES J A. What HMMs can do[J]. IEICE Transactions on Information and Systems, 2006, E89-D(3): 1-24.
[62] ZHU M, GUO C S. Hidden Markov models and their latest applications and development[J]. Computer Systems & Applications, 2010, 19(7): 255-259, 216.
[63] STAMP M. A revealing introduction to hidden Markov models[R]. [S.l.]: [s.n.], 2004. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.136.137.
[64] SHI O Y, YANG J, TIAN X. Identifying CpG islands with hidden Markov models based on MATLAB[J]. Computer Applications and Software, 2008, 25(11): 214-215, 231.
[65] BIRD A P. CpG islands as gene markers in the vertebrate nucleus[J]. Trends in Genetics, 1987, 3: 342-347.
[66] LARSEN F, GUNDERSEN G, LOPEZ R, et al. CpG islands as gene markers in the human genome[J]. Genomics, 1992, 13(4): 1095-1107.
[67] TAKAI D, JONES P A. Comprehensive analysis of CpG islands in human chromosomes 21 and 22[J]. Proceedings of the National Academy of Sciences of the United States of America, 2002, 99(6): 3740-3745.
[68] JONES N C, PEVZNER P A. An Introduction to Bioinformatics Algorithms[M]. WANG Y F, et al., trans. Beijing: Chemical Industry Press, 2007: 67-100.
[69] DEVIJVER P A. Baum's forward-backward algorithm revisited[J]. Pattern Recognition Letters, 1985, 3(6): 369-373.
[70] AUSTIN S, SCHWARTZ R, PLACEWAY P. The forward-backward search algorithm[C]//Proceedings of the 1991 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 1991) (Toronto). Piscataway: IEEE Press, 1991, 5: 697-700.
[71] COLLINS M. The forward-backward algorithm[R]. [S.l.]: [s.n.], 2001. http://www.systems.caltech.edu/EE/Courses/EE127/EE127C/handout/FBA.pdf.
[72] YU S Z, KOBAYASHI H. An efficient forward-backward algorithm for an explicit-duration hidden Markov model[J]. IEEE Signal Processing Letters, 2003, 10(1): 11-14.
[73] AZUMA A, MATSUMOTO Y. A generalization of forward-backward algorithm[J]. Transactions of the Japanese Society for Artificial Intelligence, 2010, 25(3): 494-503.
[74] KHREICH W, GRANGER E, MIRI A, et al. On the memory complexity of the forward-backward algorithm[J]. Pattern Recognition Letters, 2010, 31(2): 91-99.
[75] FORNEY G D. The Viterbi algorithm[J]. Proceedings of the IEEE, 1973, 61(3): 268-278.
[76] VITERBI A J. A personal history of the Viterbi algorithm[J]. IEEE Signal Processing Magazine, 2006, 23(4): 120-142.
[77] HARTLEY H O. Maximum likelihood estimation from incomplete data[J]. Biometrics, 1958, 14(2): 174-194.
[78] DEMPSTER A P, LAIRD N M, RUBIN D B. Maximum likelihood from incomplete data via the EM algorithm[J]. Journal of the Royal Statistical Society, Series B, 1977, 39(1): 1-38.
[79] WU C F J. On the convergence properties of the EM algorithm[J]. Annals of Statistics, 1983, 11(1): 95-103.
[80] MOON T K. The expectation-maximization algorithm[J]. IEEE Signal Processing Magazine, 1996, 13(6): 47-60.
[81] MCLACHLAN G J, KRISHNAN T. The EM Algorithm and Extensions[M]. Canada: John Wiley & Sons, 1997.
[82] BILMES J A. A gentle tutorial on the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models[R]. [S.l.]: [s.n.], 1997. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.161.4358.
[83] MINKA T P. Expectation-maximization as lower bound maximization[R]. [S.l.]: [s.n.], 1998. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.8562.
[84] NEAL R, HINTON G. A view of the EM algorithm that justifies incremental, sparse, and other variants[M]//JORDAN M I. Learning in Graphical Models. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1998: 355-370.
[85] DELLAERT F. The expectation maximization algorithm[R]. [S.l.]: [s.n.], 2002. http://hdl.handle.net/1853/3281.
[86] DO C B, BATZOGLOU S. What is the expectation maximization algorithm?[J]. Nature Biotechnology, 2008, 26(8): 897-899.
[87] BORMAN S. The expectation maximization algorithm: a short tutorial[R]. [S.l.]: [s.n.], 2009. http://www.seanborman.com/publications.
[88] COLLINGS I B, RYDEN T. A new maximum likelihood gradient algorithm for on-line hidden Markov model identification[C]//Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 1998) (Seattle). Piscataway: IEEE Press, 1998, 4: 2261-2264.
[89] TURNER R. Direct maximization of the likelihood of a hidden Markov model[J]. Computational Statistics & Data Analysis, 2008, 52(9): 4147-4160.
[90] CYBENKO G, CRESPI V. Learning hidden Markov models using nonnegative matrix factorization[J]. IEEE Transactions on Information Theory, 2011, 57(6): 3963-3970.
[91] LI X L. Training hidden Markov models with multiple observations: a combinatorial method[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(4): 371-377.
[92] JURAFSKY D, MARTIN J H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition[M]. New Jersey: Prentice Hall, 2000.
[93] HE Y. Extended Viterbi algorithm for second order hidden Markov process[C]//Proceedings of the 1988 IEEE International Conference on Pattern Recognition (ICPR 1988) (Rome). Piscataway: IEEE Press, 1988, 2: 718-720.
[94] KRIOUILE A, MARI J F, HATON J P. Some improvements in speech recognition algorithms based on HMM[C]//Proceedings of the 1990 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 1990) (Albuquerque). Piscataway: IEEE Press, 1990, 1: 545-548.
[95] WATSON B, TSOI A C. Second order hidden Markov models for speech recognition[C]//Proceedings of the Fourth Australian International Conference on Speech Science and Technology (SST 1992) (Brisbane). [S.l.]: [s.n.], 1992: 146-151.
[96] MARI J F, HATON J P. Automatic word recognition based on second-order hidden Markov models[C]//Proceedings of the Third International Conference on Spoken Language Processing (ICSLP 1994) (Yokohama, Japan). [S.l.]: [s.n.], 1994: 247-250.
[97] MARI J F, FOHR D, JUNQUA J C. A second-order HMM for high performance word and phoneme-based continuous speech recognition[C]//Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1996) (Atlanta). Piscataway: IEEE Press, 1996, 1: 435-438.
[98] MARI J F, HATON J P, KRIOUILE A. Automatic word recognition based on second-order hidden Markov models[J]. IEEE Transactions on Speech and Audio Processing, 1997, 5(1): 22-25.
[99] SHI X X, WANG T J, HE Z Y. A learning algorithm for second-order hidden Markov models and their relation to first-order hidden Markov models[J]. Journal of Applied Sciences, 2001, 19(1): 29-32.
[100] MARI J F, LE BER F. Temporal and spatial data mining with second-order hidden Markov models[J]. Soft Computing, 2006, 10(5): 406-414.
[101] THEDE S M, HARPER M P. A second-order hidden Markov model for part-of-speech tagging[C]//Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL 1999) (Stroudsburg). [S.l.]: [s.n.], 1999: 175-182.
[102] LIANG Y M, HUANG D G. Chinese part-of-speech tagging based on a full second-order hidden Markov model[J]. Computer Engineering, 2005, 31(10): 177-179.
[103] ZHOU S X, LIN Y P, WANG Y N, et al. Text information extraction based on second-order hidden Markov models[J]. Acta Electronica Sinica, 2007, 35(11): 2226-2231.
[104] DU S P. Baum-Welch algorithm of second-order hidden Markov models with multiple observation sequences[J]. Journal of Biomathematics, 2007, 22(4): 685-690.
[105] GAO S, QIN D G, FENG T N, et al. Improved second-order HMM algorithms with application to miRNA target gene prediction[J]. Journal of Applied Sciences, 2010, 28(3): 307-312.
[106] DENG L, AKSMANOVIC M, SUN X D, et al. Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states[J]. IEEE Transactions on Speech and Audio Processing, 1994, 2(4): 507-520.
[107] DU PREEZ J A. Algorithms for high order hidden Markov modeling[C]//Proceedings of the 1997 South African Symposium on Communications and Signal Processing (COMSIG 1997) (Grahamstown, South Africa). Piscataway: IEEE Press, 1997: 101-106.
[108] DU PREEZ J A, WEBER D. Comparison of the Order Reducing (ORED) and Fast Incremental Training (FIT) algorithms for training high order hidden Markov models[C]//Proceedings of the 1997 South African Symposium on Communications and Signal Processing (COMSIG 1997) (Grahamstown, South Africa). Piscataway: IEEE Press, 1997: 47-52.
[109] DU PREEZ J A. Efficient training of high-order hidden Markov models using first-order representations[J]. Computer Speech and Language, 1998, 12(1): 23-29.
[110] DU PREEZ J A, WEBER D. High-order hidden Markov modelling[C]//Proceedings of the 1998 South African Symposium on Communications and Signal Processing (COMSIG 1998) (Rondebosch, South Africa). Piscataway: IEEE Press, 1998: 197-202.
[111] CHING W K, FUNG E S, NG M K. Higher-order hidden Markov models with applications to DNA sequences[C]//Lecture Notes in Computer Science, vol. 2690. Berlin: Springer-Verlag, 2003: 535-539.
[112] LEE L M, LEE J C. A study on high-order hidden Markov models and applications to speech recognition[C]//Lecture Notes in Computer Science, vol. 4031. Berlin: Springer-Verlag, 2006: 682-690.
[113] LEE L M. High-order hidden Markov model and application to continuous Mandarin digit recognition[J]. Journal of Information Science and Engineering, 2011, 27(6): 1919-1930.
[114] HADAR U, MESSER H. High-order hidden Markov models: estimation and implementation[C]//Proceedings of the IEEE/SP 15th Workshop on Statistical Signal Processing (SSP 2009) (Cardiff). Piscataway: IEEE Press, 2009: 249-252.
[115] WANG Z K, YANG X Q. Birth-Death Processes and Markov Chains[M]. Beijing: Science Press, 2006: 1-166.
[116] CHEN X R. Probability Theory and Mathematical Statistics[M]. Hefei: University of Science and Technology of China Press, 2009.
[117] DAWID A P. Conditional independence in statistical theory[J]. Journal of the Royal Statistical Society, Series B, 1979, 41(1): 1-31.
[118] MCSHANE E J. Jensen's inequality[J]. Bulletin of the American Mathematical Society, 1937, 43(8): 521-527.
[119] BELLMAN R E. An introduction to the theory of dynamic programming[R]. [S.l.]: [s.n.], 1953. http://www.rand.org/pubs/reports/R245.html.
[120] DREYFUS S. Richard Bellman on the birth of dynamic programming[J]. Operations Research, 2002, 50(1): 48-51.
[121] BELLMAN R. On the theory of dynamic programming[J]. Proceedings of the National Academy of Sciences of the United States of America, 1952, 38(8): 716-719.
[122] BELLMAN R. The theory of dynamic programming[J]. Bulletin of the American Mathematical Society, 1954, 60(6): 503-515.
[123] BELLMAN R. Dynamic Programming[M]. New Jersey: Princeton University Press, 1957.
[124] BLACKWELL D. Discrete dynamic programming[J]. The Annals of Mathematical Statistics, 1962, 33(2): 719-726.
[125] HELD M, KARP R M. A dynamic programming approach to sequencing problems[J]. Journal of the Society for Industrial and Applied Mathematics, 1962, 10(1): 196-210.
[126] BELLMAN R. Dynamic programming[J]. Science, 1966, 153(3731): 34-37.
[127] HOWARD R A. Dynamic programming[J]. Management Science, 1966, 12(5): 317-348.
[128] YAKOWITZ S. Dynamic programming applications in water resources[J]. Water Resources Research, 1982, 18(4): 673-696.
[129] PUTERMAN M L. Markov Decision Processes: Discrete Stochastic Dynamic Programming[M]. Canada: John Wiley & Sons, 1994.
[130] EDDY S R. What is dynamic programming?[J]. Nature Biotechnology, 2004, 22(7): 909-910.
[131] HU Y Q. Fundamentals of Operations Research and Its Applications (5th ed.)[M]. Beijing: Higher Education Press, 2008: 198-219.
[132] QIN Y Y. Bellman's principle of optimality: on dynamic programming (I)[J]. Mathematica Applicata, 1994, 7(3): 349-354.
[133] SINHA S, TOMPA M. A statistical method for finding transcription factor binding sites[C]//Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology (California). California: AAAI Press, 2000: 344-354.
[134] GONG G L. A Course in Applied Stochastic Processes, with Stochastic Models in Algorithms and Intelligent Computing[M]. Beijing: Tsinghua University Press, 2004: 255.
[135] REDNER R A, WALKER H F. Mixture densities, maximum likelihood and the EM algorithm[J]. SIAM Review, 1984, 26(2): 195-239.
[136] CHEN J H. Optimal rate of convergence for finite mixture models[J]. Annals of Statistics, 1995, 23(1): 221-233.
[137] BROOKS S P, MORGAN B J T, RIDOUT M S, et al. Finite mixture models for proportions[J]. Biometrics, 1997, 53(3): 1097-1115.
[138] MCLACHLAN G J, PEEL D. Finite Mixture Models[M]. Canada: John Wiley & Sons, 2000.
[139] UEDA N, NAKANO R, GHAHRAMANI Z, et al. SMEM algorithm for mixture models[J]. Neural Computation, 2000, 12(9): 2109-2128.
[140] BOHNING D, SEIDEL W. Editorial: recent developments in mixture models[J]. Computational Statistics & Data Analysis, 2003, 41(3-4): 349-357.
[141] ZHANG B B, ZHANG C S, YI X. Competitive EM algorithm for finite mixture models[J]. Pattern Recognition, 2004, 37(1): 131-144.
[142] LAW M H C, FIGUEIREDO M A T, JAIN A K. Simultaneous feature selection and clustering using mixture models[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(9): 1154-1166.
[143] PARK B J, LORD D. Application of finite mixture models for vehicle crash data analysis[J]. Accident Analysis & Prevention, 2009, 41(4): 683-691.
[144] DIAS J G, VERMUNT J K, RAMOS S. Mixture hidden Markov models in finance research[M]//FINK A, LAUSEN B, SEIDEL W, et al. Advances in Data Analysis, Data Handling and Business Intelligence. Berlin: Springer-Verlag, 2010: 451-459.
[145] DU S P. Baum-Welch algorithm for mixture second-order hidden Markov models[J]. Journal of Yunnan University (Natural Sciences Edition), 2006, 28(2): 98-102.
[146] VITERBI A J. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm[J]. IEEE Transactions on Information Theory, 1967, 13(2): 260-269.
[147] OMURA J K. On the Viterbi decoding algorithm[J]. IEEE Transactions on Information Theory, 1969, 15(1): 177-179.
[148] VITERBI A J. Convolutional codes and their performance in communication systems[J]. IEEE Transactions on Communications Technology, 1971, 19(5): 751-772.
[149] FORNEY G D. Convolutional codes II: maximum-likelihood decoding[J]. Information and Control, 1974, 25(3): 222-266.
