Design and Implementation of an Automated Monitoring and Control System

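Author: Shen Xin (沈新)
Degree level: Master's
Discipline: Computer Application Technology
Keywords: automation; multiple sound cards; audio data; device driver; remote control
Degree year: 2007
Advisor: Huang Diming (黄迪明)
Discipline code: 081203
Degree-granting institution: University of Electronic Science and Technology of China (电子科技大学)
Thesis submission date: 2007-04-01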

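Abstract

Monitoring speech signals is a common task in telecommunications, medicine, the military, and many other fields. The work may require operators to face a noisy environment for long periods, and, depending on the nature of the service data, important data must be recorded automatically. In early monitoring systems the entire workflow, from front-end reception to signal monitoring at the terminal, had to be carried out by hand. The labor intensity was high, the noisy environment harmed operators' health, and the physical limitations of the recording equipment made the data unreliable.

For speech signals, the main audio device in most computer systems is a single sound card; even when several cards are installed, almost every operating system restricts the user to one of them as the active audio device. In some settings a monitoring station must use several sound cards at once to acquire multiple channels of audio, which raises the problem of running multiple sound cards simultaneously in a single computer. Because several cards yield several channels of data, channel switching must also be supported. Audio captured by a sound card usually contains noise, and in practice the working conditions often mean that the noise is varied and has a high decibel level, so the software system to be developed must handle complex audio data. To safeguard data quality and improve operators' working environment, the system must decide whether the captured audio contains a signal at all.

The system must handle multi-channel, high-volume audio data, and acquisition and processing are either performed separately on each monitoring computer or distributed to other computers for joint monitoring. In a wide-area network, remote hosts can request real-time monitoring or on-demand playback of archived recordings, so data multicast is also one of the topics studied for this system.

Building on an analysis of the drawbacks of manual monitoring and on the actual working conditions of a particular organization, the author planned and designed a hardware system with computers at its core, proposed a new automated mode of monitoring, and achieved a relatively high degree of automated speech-signal monitoring through a combination of hardware and software. The author's main contributions are:
     1. Designed and implemented techniques for acquiring, transmitting, and processing audio data, including speech endpoint detection;
     2. Designed and implemented hardware device drivers for the Windows operating system;
     3. Designed and implemented control of multiple sound-card devices under Windows;
     4. Designed and implemented remote control of clustered devices.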
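
The decision about whether captured audio contains a signal, together with the endpoint detection named in item 1, can be illustrated with a short-time energy threshold. The abstract does not say which algorithm the thesis uses, so this is only a common baseline sketch. The frame length, the noise-only lead-in, the energy ratio, and the hangover count are all assumed values, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each full, non-overlapping frame."""
    count = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(count)
    ]


def detect_endpoints(
    samples: Sequence[float],
    frame_len: int = 160,   # assumed: 20 ms frames at an 8 kHz sampling rate
    noise_frames: int = 5,  # assumed: the first frames contain only noise
    ratio: float = 4.0,     # assumed: signal energy exceeds 4x the noise floor
    hangover: int = 3,      # assumed: quiet frames tolerated inside a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of segments judged to contain sound."""
    energies = frame_energies(samples, frame_len)
    if len(energies) <= noise_frames:
        return []
    noise_floor = sum(energies[:noise_frames]) / noise_frames
    threshold = max(noise_floor * ratio, 1e-12)

    segments: List[Tuple[int, int]] = []
    start = None  # frame index where the current segment began
    quiet = 0     # consecutive below-threshold frames inside a segment
    for idx, energy in enumerate(energies):
        if energy > threshold:
            if start is None:
                start = idx
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover:
                end = idx - quiet + 1  # index of the first quiet frame
                segments.append((start * frame_len, end * frame_len))
                start, quiet = None, 0
    if start is not None:
        end = len(energies) - quiet
        segments.append((start * frame_len, end * frame_len))
    return segments


if __name__ == "__main__":
    rate = 8000
    # Synthetic one-second test signal: weak 50 Hz hum plus a 440 Hz burst.
    signal = [0.01 * math.sin(2 * math.pi * 50 * i / rate) for i in range(rate)]
    for i in range(3200, 4800):
        signal[i] += 0.5 * math.sin(2 * math.pi * 440 * i / rate)
    print(detect_endpoints(signal))
```

A fixed energy ratio is the simplest choice and tends to break down when the noise is loud or varied, which is the situation the abstract describes. A practical system would likely need more robust features than frame energy alone.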

Monitoring speech signals is a kind of work commonly found in telecommunications, medical care, military affairs, and other fields. Operators are often exposed to noise for long periods. Moreover, some kinds of data must be recorded automatically because of their special nature. In early systems, people had to intervene in the whole workflow, from signal reception to signal monitoring. Such work easily leaves operators tired and harms their health, and the physical limitations of the recording devices also make the data unreliable. As for speech signals, the main audio device in most computer systems is a sound card. Even when a computer has more than one sound card, the operating system almost always allows only one of them to be used as the active device. In some organizations, however, several sound cards must work at the same time in one workstation computer in order to collect multiple channels of audio data. Accordingly, we must also be able to switch between audio channels. The data collected by a sound card often contains noise, and this situation is more serious in the author's organization. To ensure data quality and improve working conditions, we must take measures to detect and reduce the noise in the data. After being collected, the audio data must be processed and possibly transferred over the intranet. Because other application systems also use the intranet to transfer data, we must consider multicast as well.
     The author proposes a new solution for automated monitoring, based on the design of both hardware and software systems with computers at its core. In this way, automated monitoring is achieved to a certain extent. The main results of the author's work are as follows:
     1. Design and implementation of audio data acquisition, transmission, and processing. We solved the problem of detecting the presence of sound amid noise.
     2. Design and implementation of hardware drivers for Windows.
     3. Design and implementation of multiple sound card control in Windows.
     4. Design and implementation of remote control for device clusters.
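
The multicast distribution of audio data mentioned above can be sketched with standard UDP multicast sockets. The abstract gives no wire format, group address, or port, so the six-byte header (channel number plus sequence number) and every function name here are hypothetical choices made only for illustration.

```python
import socket
import struct
from typing import Tuple

# Hypothetical header: channel id (uint16) and sequence number (uint32),
# both in network byte order. The thesis abstract does not define a format.
HEADER = struct.Struct("!HI")


def pack_frame(channel: int, seq: int, pcm: bytes) -> bytes:
    """Prefix one block of raw audio bytes with the illustrative header."""
    return HEADER.pack(channel, seq) + pcm


def unpack_frame(datagram: bytes) -> Tuple[int, int, bytes]:
    """Split a datagram back into channel id, sequence number, and audio bytes."""
    channel, seq = HEADER.unpack_from(datagram)
    return channel, seq, datagram[HEADER.size:]


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq value (group address + local interface) for joining."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_sender(ttl: int = 1) -> socket.socket:
    """UDP socket for sending to a multicast group; ttl=1 stays on the local subnet."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def open_receiver(group: str, port: int) -> socket.socket:
    """UDP socket bound to the port and joined to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(
        socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group)
    )
    return sock
```

A sender would call `open_sender().sendto(pack_frame(channel, seq, pcm), (group, port))` once per audio block, and each monitoring host would read from `open_receiver(group, port)` and pass the result to `unpack_frame`. The sequence number lets a receiver notice lost or reordered datagrams, since UDP multicast guarantees neither delivery nor ordering.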
