Real-Time Speech Segmentation Algorithm Based on Granular Computing

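English title: Real-time Segmentation Algorithm of Speech Signal Based on Granular Computing
Author: Hao Jing (郝静)
Degree level: Master's
Discipline: Signal and Information Processing
Chinese keywords: 语音分段; 实时性; 粒计算; 重要度; 决策规则
English keywords: speech segmentation; real-time; granular computing; significance; decision rules
Degree year: 2008
Supervisor: Zhang Gang (张刚)
Discipline code: 081002
Degree-granting institution: Taiyuan University of Technology
Thesis submission date: 2008-05-01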

Abstract

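Real-time speech segmentation algorithms that combine multiple parameters are of great importance in speech signal processing. The main aim of this thesis is to find an effective method that combines several characteristic parameters of the speech signal to detect, in real time, the abrupt-change points in a stream of continuous speech samples, and to segment the speech signal at those points.
     The thesis first analyzes several characteristic parameters commonly used in speech segmentation algorithms and proposes real-time improved versions of their computation. The frame length is 20 samples, and each frame is processed together with 140 history samples in a 160-sample window. On this basis, a real-time autocorrelation speech segmentation algorithm is proposed. Experiments show that the improved method can process speech signals in real time, that the extracted parameters capture the abrupt-change information of the speech signal, and that segmentation accuracy exceeds 80% for speech signals at different signal-to-noise ratios.
     Building on real-time extraction of the characteristic parameters, the thesis studies granular computing theory, applies it to speech segmentation, and proposes a speech segmentation algorithm based on granular computing. The algorithm exploits the ability of granular computing to handle imprecise, fuzzy, uncertain, partially true, and massive information to analyze eight characteristic parameters. It obtains their mutual relationships and their significance for speech segmentation, and finally forms decision rules with which segmentation decisions are made. Experiments show that the algorithm combines the strengths of multiple characteristic parameters and that its decision rules find most segmentation points, but false and missed detections occur and the decision accuracy is not high.
     To address the false and missed detections, and after extensive experimental analysis, the thesis proposes improvements to the granular-computing-based algorithm in two respects. First, the acquisition of the characteristic parameters is improved in order to remove noise interference and make the decision rules more accurate. Second, the decision process is improved by adding autocorrelation and energy parameters to the decision rules, forming a dual-path decision rule that assists the decision and reduces missed and false detections. With these improvements, decision accuracy exceeds 90% for speech data at different signal-to-noise ratios, and the missed-detection and false-detection rates drop markedly, making this a fairly satisfactory multi-parameter real-time speech segmentation algorithm.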
Real-time multi-parameter speech segmentation is of great importance in the field of speech signal processing. This research seeks an effective method that combines various characteristic parameters of the speech signal to detect abrupt changes in continuous speech, and then segments the speech signal at those change points.
     First, after analyzing several commonly used characteristic parameters, this thesis presents a real-time improved algorithm that processes a window of 160 samples, consisting of the current 20-sample frame and 140 history samples. Based on this algorithm, it also presents the Real-time Autocorrelation Speech Signal Segmentation Algorithm. Experiments show that the improved algorithm can process speech in real time and that the extracted parameters characterize the abrupt changes of the speech signal. For speech signals at various signal-to-noise ratios, segmentation accuracy exceeds 80 percent.
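The windowing scheme described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: only the 20-sample frame, the 140 history samples, and the 160-sample window come from the abstract. The function names, the zero-padded start, the choice of short-time energy and lag-1 normalized autocorrelation as features, and the `lag` parameter are assumptions made for the example.

```python
from collections import deque

FRAME_LEN = 20      # new samples per frame (from the abstract)
HISTORY_LEN = 140   # retained past samples (from the abstract)
WINDOW_LEN = FRAME_LEN + HISTORY_LEN  # 160-sample analysis window


def short_time_energy(window):
    """Sum of squared samples in the window."""
    return sum(x * x for x in window)


def normalized_autocorr(window, lag):
    """Autocorrelation at `lag` divided by the zero-lag value (0.0 for silence)."""
    r0 = sum(x * x for x in window)
    if r0 == 0:
        return 0.0
    r = sum(window[i] * window[i + lag] for i in range(len(window) - lag))
    return r / r0


def stream_features(samples, lag=1):
    """Yield (energy, autocorr) once per 20-sample frame.

    Assumption: the window starts zero-padded, so the first frames see
    silence in the history part until 140 real samples have arrived.
    """
    window = deque([0.0] * WINDOW_LEN, maxlen=WINDOW_LEN)
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_LEN):
        # Appending 20 new samples pushes the 20 oldest ones out.
        window.extend(samples[start:start + FRAME_LEN])
        w = list(window)
        yield short_time_energy(w), normalized_autocorr(w, lag)
```

Because each step appends only 20 samples and the deque discards the oldest 20, the features are available one frame after the samples arrive, which is what makes a scheme of this kind usable on a live stream.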
     Building on real-time parameter extraction and an analysis of granular computing theory, this thesis applies the theory to speech segmentation technology and presents the Speech Signal Segmentation Algorithm Based on Granular Computing. The algorithm uses the ability of granular computing to process imprecise, fuzzy, uncertain, partially true, and massive information to analyze eight characteristic parameters of the speech signal. It obtains the relationships among the parameters and the significance of each one, and finally forms decision rules for segmenting the speech signal. Experiments show that the algorithm combines the benefits of many kinds of characteristic parameters and that its decision rules find most segmentation points; however, missed and false decisions remain, and the decision accuracy is not high.
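The abstract does not say how the significance of a parameter is computed. One common choice in rough-set-style granular computing is the drop in dependency degree when an attribute is removed, and the sketch below assumes that definition. The toy decision table, the attribute names `energy` and `zcr`, and the decision column `boundary` are hypothetical and stand in for the eight discretized parameters of the thesis.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Dependency degree: share of rows whose attrs-granule has one decision value."""
    granules = defaultdict(list)
    for row in rows:
        granules[tuple(row[a] for a in attrs)].append(row[decision])
    consistent = sum(len(d) for d in granules.values() if len(set(d)) == 1)
    return consistent / len(rows)


def significance(rows, attrs, decision):
    """Significance of each attribute: dependency lost when it is dropped."""
    full = dependency(rows, attrs, decision)
    return {
        a: full - dependency(rows, [b for b in attrs if b != a], decision)
        for a in attrs
    }


# Hypothetical discretized decision table (values are illustrative only).
TABLE = [
    {"energy": 0, "zcr": 0, "boundary": 0},
    {"energy": 0, "zcr": 1, "boundary": 0},
    {"energy": 1, "zcr": 0, "boundary": 1},
    {"energy": 1, "zcr": 1, "boundary": 1},
]

sig = significance(TABLE, ["energy", "zcr"], "boundary")
```

In this toy table the decision follows `energy` alone, so dropping `energy` destroys all consistency while dropping `zcr` changes nothing. Attributes with zero significance are candidates for removal before decision rules are read off the remaining granules.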
     To resolve the missed and false decisions, and after extensive experiments, this thesis presents two improvements to the Speech Signal Segmentation Algorithm Based on Granular Computing. First, the acquisition of the characteristic parameters is improved in order to eliminate noise interference and increase the accuracy of the decision rules. Second, autocorrelation and energy parameters are added to the decision rules to form a dual-path decision rule, which assists the decision and reduces missed and false decisions. After these improvements, decision accuracy exceeds 90 percent for speech data at different signal-to-noise ratios, and the rates of missed and false decisions decrease markedly. The improved algorithm is therefore a fairly satisfactory real-time multi-parameter speech segmentation algorithm.
