Network Remote Control System for Robotic Surgery: Speech Recognition and Manipulator Control

English title: Speech Recognition and Manipulator Control of Network Remote Control System in Robot Operation
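Author: Zhou Zhenhui (周振辉)
Degree level: Master's
Discipline: Computer Application Technology
Keywords: Network Control; Robot Surgical Operation; Speech Recognition
Degree year: 2006
Advisor: Fu Gang (富钢)
Discipline code: 081203
Degree-granting institution: Shenyang Institute of Aeronautical Engineering (沈阳航空工业学院)
Thesis submission date: 2006-01-25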

Abstract

Network use is a hallmark of industrial modernization, and automated machine control is a product of modern industrial civilization. In fields such as automotive engineering, medicine, and aerospace, computing and mechanical control are closely integrated. Drawing on speech recognition technology, this thesis describes how doctors at the server and client computers on a local area network (LAN) communicate with each other by audio and video, and how a microphone is used to command a remote robot arm to perform the motions it is capable of, so that robotic surgery can be remotely controlled over the network by voice. Using Microsoft Visual C++ 6.0 and a PLC, the work designs and implements a LAN-based control system for a 6-degree-of-freedom (6-DOF) robot arm. Using the Microsoft Speech API 5.1 (SAPI), and drawing on the parts of that speech development toolkit relevant to the project, it implements the speech recognition part of the network remote control system for robotic surgery, with Visual C++ 6.0 as the development platform.
Practice shows that the network-based, voice-controlled robotic surgery control system was very successful in a laboratory environment. Its robustness in noisy environments still needs to be strengthened, but the work provides a valuable reference for applying speech recognition to telemedicine.
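
The abstract outlines a pipeline in which a recognized spoken phrase becomes a motion command that is sent over the LAN to the arm controller. The following is a minimal sketch of that one step, not the thesis's implementation: the vocabulary, the `JointCommand` structure, the 5-degree step size, and the `J<joint>:<delta>` text message format are all hypothetical, chosen only for illustration. The recognizer (SAPI 5.1 in the thesis) and the network and PLC layers are assumed to sit outside this code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical motion command for one joint of a 6-DOF arm.
struct JointCommand {
    int joint;         // joint index, 1..6
    int deltaDegrees;  // signed step, in degrees
};

// Look up a recognized phrase in a small fixed vocabulary (an assumption;
// the thesis does not list its command words). Returns false and leaves
// `out` untouched when the phrase is not a known command.
bool lookupCommand(const std::string& phrase, JointCommand& out) {
    static const std::map<std::string, JointCommand> table = {
        {"base left",     {1, -5}},
        {"base right",    {1,  5}},
        {"shoulder up",   {2,  5}},
        {"shoulder down", {2, -5}},
        {"elbow up",      {3,  5}},
        {"elbow down",    {3, -5}},
    };
    std::map<std::string, JointCommand>::const_iterator it = table.find(phrase);
    if (it == table.end()) {
        return false;
    }
    out = it->second;
    return true;
}

// Encode a command as one line of text, "J<joint>:<delta>\n", suitable for
// writing to a socket. The format is illustrative only.
std::string encodeCommand(const JointCommand& cmd) {
    assert(cmd.joint >= 1 && cmd.joint <= 6);
    return "J" + std::to_string(cmd.joint) + ":" +
           std::to_string(cmd.deltaDegrees) + "\n";
}

// Client-side step: phrase in, wire message out. An empty string means
// "send nothing", so an unrecognized phrase never moves the arm.
std::string phraseToMessage(const std::string& phrase) {
    JointCommand cmd;
    if (!lookupCommand(phrase, cmd)) {
        return std::string();
    }
    return encodeCommand(cmd);
}
```

In this sketch, `phraseToMessage("base left")` yields the line `J1:-5`, while a phrase outside the vocabulary yields an empty string. Restricting motion to a closed command table is one plausible way to limit the effect of misrecognitions in noisy conditions, which the abstract identifies as the system's weak point.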

