Text Post-Processing for Speech Recognition Based on Natural Language Processing
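Abstract
Research on post-processing for speech recognition is becoming increasingly diverse, and linguistic knowledge is receiving growing attention in this work. Linguistic knowledge should be applied more deeply, and existing and emerging methods from natural language understanding should be used to improve the performance of speech recognition systems.
     Guided by this view, this thesis studies methods for detecting and correcting errors in sentences produced by speech recognition, targeting "CityGuide", the representative demonstration system of the "Olympic Multilingual Comprehensive Information Service" project. The approach is based on natural language understanding, drawing on syntax, semantics, and pragmatics, with particular attention to how much pragmatic information contributes to recognition accuracy. The main work and results are:
     1. A natural language understanding module is introduced after the speech recognition engine on the intelligent mobile terminal. Notably, a pragmatic algorithm and several auxiliary algorithms are added to the original syntactic and semantic algorithms, raising the speech recognition accuracy from about 52% to 70%.
     2. An experimental version of the demonstration system has been designed, implemented, and tested on a smartphone. The work also attempts to bring in the speech engine of the intelligent mobile platform, so that speech recognition is followed directly by error correction using natural language understanding. The system currently supports mainly single-sentence speech input in two languages, Chinese and English.
     3. A scheme based on meta-search technology is proposed for collecting, learning, building, updating, and optimizing an online corpus knowledge base. Because language is inherently fuzzy and uncertain, the thesis examines the use of fuzzy theory in text classification: it proposes a trapezoidal membership function method to fuzzify classification results and introduces fuzzy entropy to evaluate the performance of fuzzified text classification. This overcomes the shortcomings of the original experimental system's corpus: small scale, narrow domain coverage, limited sources, and lack of timeliness.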
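The abstract names three knowledge sources (syntax, semantics, pragmatics) but does not say how they are combined. As a minimal sketch only, and not the thesis's algorithm, one common way to use such sources is to rescore candidate sentences with a weighted sum. The `Candidate` class, the score values, the weights, and the example sentences below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate sentence with three hypothetical scores in [0, 1]."""
    text: str
    syntax: float      # assumed grammaticality score
    semantics: float   # assumed semantic-coherence score
    pragmatics: float  # assumed fit to the dialogue or task context


def combined_score(c, weights=(0.3, 0.3, 0.4)):
    """Weighted sum of the three scores; the weights are illustrative."""
    w_syn, w_sem, w_prag = weights
    return w_syn * c.syntax + w_sem * c.semantics + w_prag * c.pragmatics


def pick_best(candidates, weights=(0.3, 0.3, 0.4)):
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda c: combined_score(c, weights))


# Made-up candidates for a city-guide style query.
candidates = [
    Candidate("where is the nearest subway station", 0.9, 0.8, 0.9),
    Candidate("wear is the nearest subway station", 0.4, 0.3, 0.9),
    Candidate("where is the nearest someway station", 0.8, 0.3, 0.2),
]
best = pick_best(candidates)
print(best.text)
```

Giving pragmatics the largest weight here only mirrors the abstract's emphasis on pragmatic information; in practice the weights would have to be tuned on held-out data.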
Post-processing for speech recognition is currently being studied from a growing variety of angles, and linguistic knowledge is receiving more and more attention in this research. Linguistic knowledge should be applied in greater depth, and the various existing and emerging methods of natural language understanding should be used to improve the performance of speech recognition systems.
     As part of the national 863 project "Olympics-oriented Multilingual Intelligent Information Service System", this thesis mainly studies text correction of ASR (Automatic Speech Recognition) results in a demo system called CityGuide. The work is based on the theory of natural language understanding, drawing on three aspects (syntax, semantics, and pragmatics) and focusing on the contribution of pragmatic information to raising the correct rate. The main research work and achievements are:
     1. A new CI-based NLU module is added after the ASR module in IMP. Initial tests show that this module improves the precision of the ASR result to some extent. In tests on the CityGuide corpus, adding pragmatic and other information improved the precision of ASR from 52% to 70%.
     2. A demo system for this module has been implemented in IMP, and initial testing is complete. Further effort went into bringing an ASR program into IMP so that recognition and correction are connected directly. The system currently accepts one spoken sentence at a time, in either Chinese or English.
     3. An optimization scheme based on meta-search technology is proposed for acquiring, learning, building, and updating an online corpus knowledge base. To address the ambiguity and uncertainty of language, the thesis discusses the application of fuzzy theory to text classification, proposes a trapezoidal membership function to fuzzify the classification results, and introduces the concept of fuzzy entropy to assess the performance of fuzzy text classification. This overcomes the shortcomings of the original experimental system's corpus: its small scale, limited domain, insufficiently rich sources, and lack of timeliness.
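The third item mentions a trapezoidal membership function and fuzzy entropy without giving formulas. The sketch below shows the standard textbook forms, under stated assumptions: the breakpoints `a < b <= c < d` are hypothetical parameters, and the entropy is the normalized De Luca-Termini definition, which may differ from the one used in the thesis.

```python
import math


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear between.

    Assumes a < b <= c < d.
    """
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    if x <= c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling edge


def fuzzy_entropy(memberships):
    """Normalized De Luca-Termini fuzzy entropy of a non-empty list.

    Returns 0 when every membership is crisp (0 or 1) and 1 when every
    membership equals 0.5, the most ambiguous case.
    """
    def h(mu):
        if mu <= 0.0 or mu >= 1.0:
            return 0.0
        return -(mu * math.log(mu) + (1.0 - mu) * math.log(1.0 - mu))

    n = len(memberships)
    return sum(h(mu) for mu in memberships) / (n * math.log(2))


# Fuzzify some made-up classifier scores with hypothetical breakpoints.
scores = [0.1, 0.3, 0.5, 0.7]
degrees = [trapezoid(s, 0.2, 0.4, 0.6, 0.8) for s in scores]
print(degrees, fuzzy_entropy(degrees))
```

A lower entropy means the fuzzified assignments are closer to crisp decisions, which is one way such a measure can be used to compare classification results.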
