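Computer-Aided Analysis and Evaluation of Pronunciation Quality in Language Learning Systems

English title: Computer Analysis-based Pronunciation Quality Assessment in Language Learning System
Author: Sun Yan (孙妍)
Degree level: Master's
Discipline: Microelectronics and Solid-State Electronics
Chinese keywords: 口语跟读 (read-after-me speaking practice); 发音质量 (pronunciation quality); 语音信号处理 (speech signal processing); 计算机辅助评价 (computer-aided evaluation)
English keywords: spoken-language imitation; pronunciation quality; speech signal processing; computer analysis-based evaluation
Degree year: 2007
Advisor: Yang Xiaofei (杨晓非)
Discipline code: 080903
Degree-granting institution: Huazhong University of Science and Technology (华中科技大学)
Thesis submission date: 2007-01-01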

Abstract

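The focus of modern English teaching has shifted from developing students' grammar and reading comprehension to developing their listening, speaking, and overall ability to use English, and spoken-language training and its evaluation are receiving growing attention. At the same time, with the rapid development of computer and signal processing technology and the sharp increase in the number of people taking spoken English tests, there is an urgent need to research and develop computer-aided testing systems for spoken English.
     This thesis focuses on computer-aided methods for evaluating pronunciation quality in read-after-me spoken English exercises. A test taker selects any speech sample from the speech training library and repeats it after the sample's standard pronunciation. At run time the system extracts feature information from both utterances and compares them, evaluating pronunciation quality by computing the Euclidean distance between the feature parameters of the standard template and those of the training template. The techniques involved include speech signal preprocessing, speech clustering algorithms, speech feature parameter extraction, vector quantization, and distortion measure analysis.
     The main interim results are as follows. First, to reduce the system's misjudgment rate, the thesis adopts a new feature extraction algorithm that brings machine scores as close as possible to expert scores. Second, when computing the vector distortion measure between the standard template and the signal under test, it applies a special mathematical model that gives the final result quantitatively on a 100-point scale. The results may also serve as a reference for other types of language testing systems.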
The focus of modern English education has shifted from students' grammar and reading comprehension skills to their listening and speaking skills, and oral English training and its evaluation have attracted more and more attention. At the same time, because of the rapid development of network technology and speech signal processing technology, as well as the sharp increase in the number of oral English examinees, it is necessary to build and develop a computer-aided oral English test system.
     In this project, one important function of computer analysis-based pronunciation quality assessment was implemented: repeating after a standard pronunciation. Trainees choose a sample from an established speech training bank according to their needs. When the test starts, the system extracts feature coefficients from both the standard and the trainee's utterance. By comparing them, that is, by calculating the Euclidean distance between the feature coefficients of the standard and those of the trainee's utterance, it judges the pronunciation quality. The underlying techniques include speech signal pre-processing, speech clustering algorithms, feature extraction, and distortion measurement.
     The key innovations are as follows. First, to decrease the system's error rate and bring its scores closer to experts' judgments, a new feature coefficient was adopted. Second, a special method for calculating the Euclidean distance between the standard and the trainee's utterance was proposed, so that a reasonable final score can be given. These results may also benefit other areas of language testing to some degree.
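The pipeline described in the abstract (pre-processing, feature extraction, clustering into a vector-quantization codebook, Euclidean distortion, and a score out of 100) can be illustrated with a minimal Python sketch. The abstract does not specify the thesis's feature coefficients or its scoring model, so everything concrete here is an assumption made for illustration: log band energies stand in for the features, a plain k-means loop stands in for the clustering algorithm, an exponential mapping stands in for the scoring model, and the names `frame_features`, `train_codebook`, `vq_distortion`, `to_score`, and the `scale` constant are hypothetical.

```python
import numpy as np

def frame_features(signal, frame_len=256, hop=128, n_bands=12):
    """Pre-emphasize, frame, and window a signal; return log band energies.

    Assumption: log band energies are a stand-in for the thesis's
    (unspecified) feature coefficients.
    """
    x = np.asarray(signal, dtype=float)
    x = np.append(x[0], x[1:] - 0.97 * x[:-1])           # pre-emphasis
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hamming(frame_len)               # windowed frames
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # power spectrum
    bands = np.array_split(power, n_bands, axis=1)        # equal-width bands
    energies = np.stack([b.sum(axis=1) for b in bands], axis=1)
    return np.log(energies + 1e-10)

def train_codebook(features, size=8, iters=20, seed=0):
    """Cluster reference frames with k-means to form a VQ codebook."""
    rng = np.random.default_rng(seed)
    codebook = features[rng.choice(len(features), size, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(size):
            members = features[nearest == k]
            if len(members):                              # skip empty clusters
                codebook[k] = members.mean(axis=0)
    return codebook

def vq_distortion(features, codebook):
    """Mean Euclidean distance from each frame to its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())

def to_score(distortion, scale=10.0):
    """Map a distortion to 0..100 (hypothetical mapping, not the thesis's model)."""
    return float(100.0 * np.exp(-distortion / scale))

# Synthetic stand-ins for recordings: two-tone signals with a small noise floor.
sr = 8000
t = np.arange(sr // 2) / sr
rng = np.random.default_rng(0)

def tone(f1, f2, amp=1.0):
    clean = amp * (np.sin(2 * np.pi * f1 * t) + 0.5 * np.sin(2 * np.pi * f2 * t))
    return clean + 0.01 * rng.standard_normal(len(t))

reference = tone(500, 1200)                               # "standard" utterance
codebook = train_codebook(frame_features(reference))
good_score = to_score(vq_distortion(frame_features(tone(510, 1210, amp=0.9)), codebook))
poor_score = to_score(vq_distortion(frame_features(tone(1900, 3000)), codebook))
print(f"close imitation: {good_score:.1f}, poor imitation: {poor_score:.1f}")
```

Comparing test frames against a codebook rather than frame by frame avoids having to time-align the two utterances, which is one common reason to pair clustering with a distortion measure. In a real system the `scale` constant would need to be calibrated against expert scores.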

