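Research on Mandarin Speech Perception in Cochlear Implants Using an Improved Speech Processing Scheme Based on Auditory-Perception Wavelet Packets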
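Abstract
A cochlear implant is a device that restores auditory perception in deaf patients by using functional electrical stimulation to excite the auditory nerve directly. Current speech coding strategies, however, remain inadequate for Chinese tones: tone perception and production are poor, and perception of melody and music is weak.
     To address implant users' poor tone perception, this thesis studies the decisive role that the acquisition rate of time-domain and frequency-domain information plays in tone perception, and proposes a wavelet packet decomposition strategy based on auditory perception. The strategy applies multi-resolution analysis to partition the time-frequency plane non-uniformly and capture the useful time-frequency information, thereby improving patients' perception of pitch and sound quality.
     To address the limited number of implanted electrodes, a maximum-channel-energy selection criterion is adopted, on the condition that time-frequency information is captured completely and interference among electrode channels is avoided. Frequency channels that carry little speech information are discarded, which both preserves the transmission of useful time-frequency information and avoids interference between electrode channels.
     To address the masking of Chinese tones by in-band noise and related artifacts produced by high-rate pulse modulation, a modulation depth selection scheme is proposed. It greatly weakens in-band noise interference and thus raises the acquisition rate of Chinese tone information.
     The results show that combining auditory-perception-based wavelet packet decomposition with modulation depth selection effectively suppresses in-band noise, captures richer time-frequency information, and strengthens patients' tone perception.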
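The non-uniform partition of the time-frequency plane mentioned in the abstract can be illustrated with a small wavelet packet sketch. This is only an illustration under stated assumptions: the abstract does not name the wavelet or the decomposition tree, so the Haar filter, the function names (`haar_split`, `packet_decompose`, `example_depth`), and the choice to split the low branch to depth 3 while leaving the high branch at depth 1 are all hypothetical.

```python
import math

def haar_split(x):
    """One Haar analysis step: return (approximation, detail), each half the length."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(x) - 1, 2)
    approx = [(x[i] + x[i + 1]) * s for i in pairs]
    detail = [(x[i] - x[i + 1]) * s for i in pairs]
    return approx, detail

def packet_decompose(x, depth_for, path=""):
    """Split node `path` recursively until it reaches the depth chosen by depth_for."""
    if len(path) >= depth_for(path) or len(x) < 2:
        return {path: list(x)}
    approx, detail = haar_split(x)
    bands = packet_decompose(approx, depth_for, path + "a")
    bands.update(packet_decompose(detail, depth_for, path + "d"))
    return bands

def example_depth(path):
    """Hypothetical tree: the low-pass branch is split to depth 3, the high-pass branch to depth 1."""
    if path == "" or path.startswith("a"):
        return 3
    return 1

# 16-sample test signal (assumed, for illustration only)
signal = [math.sin(0.3 * n) + 0.5 * math.sin(2.0 * n) for n in range(16)]
bands = packet_decompose(signal, example_depth)
for name, coeffs in bands.items():
    print(name, len(coeffs))
```

With this tree the lower half of the spectrum is covered by four narrow bands and the upper half by one wide band, which is the sense in which the partition is non-uniform. The path labels are in natural (filter) order, which is not the same as increasing frequency order once a detail branch has been split.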
A cochlear implant (CI) is a medical device that restores auditory perception in profoundly or totally deaf people via functional electrical stimulation. At present, however, the speech coding strategies for Chinese tones still have shortcomings, such as poor perception of tone, melody, and music.
     To address patients' poor tone perception, and given the decisive role that information retrieval in both the time domain and the frequency domain plays in tone perception, we propose a wavelet packet decomposition strategy based on auditory perception. It uses multi-resolution analysis to divide the time-frequency space non-uniformly, so that time-frequency information is captured effectively and patients' pitch perception and perceived speech quality improve.
     To address the limited number of cochlear implant electrodes, we apply a maximum-channel-energy criterion to select the effective channels, subject to obtaining complete time-frequency information and preventing mutual interference among electrode channels. Channels that contain little speech information are removed, which ensures the transmission of effective information and avoids interference between the electrode channels.
     To address the masking of Chinese tones by noise caused by high-rate pulse modulation, we propose a modulation depth selection scheme, which greatly weakens the interference of in-band noise and thus improves access to Chinese tone information.
     The results show that combining wavelet packet decomposition based on auditory perception with modulation depth selection effectively suppresses in-band noise and enhances tone perception.
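The maximum-channel-energy criterion described in the abstract can be sketched as a per-frame "keep the strongest channels" rule. This is a minimal sketch, not the thesis's implementation: the abstract gives neither the number of channels, the number kept, nor the frame length, so the function names (`select_max_energy_channels`, `mask_unselected`) and the toy frame below are assumptions.

```python
def select_max_energy_channels(band_frames, n_keep):
    """Return (indices of the n_keep highest-energy channels, per-channel energies).

    band_frames holds one list of samples per frequency channel for a single frame.
    """
    energies = [sum(v * v for v in band) for band in band_frames]
    ranked = sorted(range(len(energies)), key=lambda i: energies[i], reverse=True)
    return sorted(ranked[:n_keep]), energies

def mask_unselected(band_frames, keep):
    """Zero every channel that was not selected, so it would drive no electrode."""
    keep = set(keep)
    return [list(band) if i in keep else [0.0] * len(band)
            for i, band in enumerate(band_frames)]

# Toy frame with four channels and two samples each (assumed values)
frame = [[0.1, 0.1], [1.0, 0.9], [0.0, 0.05], [0.5, 0.6]]
keep, energies = select_max_energy_channels(frame, n_keep=2)
masked = mask_unselected(frame, keep)
print(keep)
print(masked)
```

Selecting by frame energy is one common reading of a maximum-energy criterion; a real processor would apply it to the envelope of each analysis band and map the surviving channels to electrodes.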
