Speech Reconstruction Technology Based on Bone-Conducted Signals

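Author: Li Jing (李静)
Degree level: Master's
Discipline: Underwater Acoustic Engineering
Chinese keywords: bone-conducted signal ; speech reconstruction ; harmonic correction ; weighting coefficients ; prototype
English keywords: Speech Signal Conducted by Bone ; Speech Reconstruction ; Harmonic Correction ; Power Constant ; Prototype
Degree year: 2004
Advisor: Sheng Meiping (盛美萍)
Discipline code: 082403
Degree-granting institution: Northwestern Polytechnical University (西北工业大学)
Thesis submission date: 2004-03-01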

Abstract

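With the development of mobile communication, speech communication in environments with high background noise has become a pressing problem, and how to remove the influence of background noise on speech communication more effectively is attracting growing attention.
     Unlike conventional speech enhancement and noise reduction techniques that rely on the characteristics of the noise, such as adaptive noise cancellation and spectral subtraction, this thesis takes bone-conducted speech as its object of study. Combining theory with experiment, it carries out an exploratory study of the acoustic characteristics of bone-conducted signals, proposes a speech reconstruction technique based on bone-conducted signals, and completes the corresponding software and hardware development. The thesis covers three aspects: ① analysis of the characteristics of bone-conducted speech; ② speech reconstruction from bone-conducted signals in theory; ③ development of a prototype.
     First, starting from the acoustic principles of bone-conducted speech, the thesis analyzes its various characteristics in depth. It analyzes the correlation between the bone-conducted signal and the speech signal, as well as the relationship between spectrum and timbre, identifies some regularities, proposes speech reconstruction based on spectral correction, and obtains the weighting coefficients for spectral correction of the bone-conducted signal through a large number of statistical experiments.
     Next, the thesis discusses design methods and implementation approaches for speech reconstruction based on bone-conducted signals. It analyzes the sinusoidal model of speech production and, on that basis, reconstructs speech from the bone-conducted signal; in addition, drawing on digital signal processing, it reconstructs speech from the bone-conducted signal by harmonic correction and by a time-varying digital filter.
     Finally, by comparing the three reconstruction methods above, the thesis proposes two reconstruction methods suited to the different needs of flight communication and general ground mobile communication. For general ground mobile communication, a prototype that is compact, low in cost, and practical was then developed on the basis of analog electronics.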
With the development of mobile communication, speech communication in environments with high background noise has become a problem that urgently needs to be solved. Because the speech signal and the noise occupy the same frequency band, it is very difficult to separate the speech from the background noise, and this difficulty is attracting the attention of more and more researchers.
    This thesis differs from traditional speech enhancement methods that are based on noise characteristics, such as adaptive noise cancellation and spectral subtraction. Here the bone-conducted speech signal is taken as the object of study, and an exploratory study of its acoustic characteristics is carried out by combining theory with experiment. A speech reconstruction technique based on the bone-conducted speech signal is then proposed, and the corresponding software and hardware are developed. The thesis covers three aspects: 1. analyzing the characteristics of the bone-conducted speech signal; 2. reconstructing speech from the bone-conducted speech signal in theory; 3. designing a prototype.
    The first part analyzes the characteristics of the bone-conducted speech signal. First, several of its characteristics are analyzed starting from its acoustic principles. Second, the correlation between the speech signal and the bone-conducted signal, and the relation between spectrum and timbre, are analyzed, and some regularities are found; on this basis, a speech reconstruction method based on spectral correction is proposed. Third, the weighting coefficients for the spectral correction are obtained from a large number of statistical experiments.
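    The abstract does not say how the weighting coefficients are computed or applied, so the following is only a minimal sketch of one plausible reading: each weight is the ratio of the average magnitude spectrum of ordinary speech to that of the bone-conducted signal, taken over paired frames, and correction multiplies each spectral bin of a bone-conducted frame by its weight. The function names, the per-bin granularity, and the naive DFT are all assumptions made for illustration and are not taken from the thesis.

```python
import cmath

def dft(frame):
    """Naive discrete Fourier transform of one short frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT; returns the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def estimate_weights(speech_frames, bone_frames, eps=1e-12):
    """Hypothetical estimator: one weight per DFT bin, computed as the
    ratio of summed magnitude spectra over paired frames."""
    n = len(speech_frames[0])
    speech_sum = [0.0] * n
    bone_sum = [0.0] * n
    for speech, bone in zip(speech_frames, bone_frames):
        for k, (s, b) in enumerate(zip(dft(speech), dft(bone))):
            speech_sum[k] += abs(s)
            bone_sum[k] += abs(b)
    # eps guards against division by zero in empty bins
    return [s / (b + eps) for s, b in zip(speech_sum, bone_sum)]

def correct_frame(bone_frame, weights):
    """Scale each spectral bin of a bone-conducted frame by its weight."""
    return idft([w * b for w, b in zip(weights, dft(bone_frame))])
```

    In this sketch the weights change only the magnitude of each bin and leave the phase of the bone-conducted signal untouched. A practical system would more likely use an FFT with overlapping windowed frames and smooth the weights across frequency bands.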
    The second part addresses the design and implementation of speech reconstruction based on the bone-conducted speech signal. The sinusoidal model of speech production is analyzed and used to reconstruct speech from the bone-conducted signal. In addition, drawing on digital signal processing, reconstruction by harmonic correction and by a time-varying digital filter is carried out.
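    As a rough illustration of the sinusoidal model and of harmonic correction, the sketch below synthesizes a voiced frame as a sum of cosines and rescales the amplitude of each harmonic before synthesis. It assumes a strictly harmonic frame, with components at integer multiples of a known pitch, which is a simplification of the general sinusoidal model. The function names and the gain values in the usage note are hypothetical; the abstract gives no parameters.

```python
import math

def correct_harmonics(amplitudes, gains):
    """Apply one (hypothetical) correction gain to each harmonic amplitude."""
    return [a * g for a, g in zip(amplitudes, gains)]

def synthesize(f0, amplitudes, phases, sample_rate, n_samples):
    """Sum of cosines at f0, 2*f0, 3*f0, ... with the given
    amplitudes and initial phases (harmonic sinusoidal model)."""
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        samples.append(sum(
            a * math.cos(2 * math.pi * (k + 1) * f0 * t + p)
            for k, (a, p) in enumerate(zip(amplitudes, phases))
        ))
    return samples
```

    For example, `synthesize(120.0, correct_harmonics([1.0, 0.2, 0.05], [1.0, 2.0, 4.0]), [0.0, 0.0, 0.0], 8000, 160)` would produce a 20 ms frame in which the upper harmonics have been boosted relative to the fundamental. In a real system the pitch, amplitudes, and phases would be estimated frame by frame from the bone-conducted signal.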



    The last part compares and analyzes the three reconstruction methods above. Two of them are selected, one suited to in-flight communication and one suited to general terrestrial mobile communication. Based on analog electronics, a prototype is then designed for terrestrial mobile communication that is small, inexpensive, and practical.

