Research Model of Infant Voice Information Processing and Recognition

Original title (Chinese): 婴幼儿语音信息处理与识别研究模型
Authors: ZUO Zheng-dong (左正东); WAN Guang-cai (万光彩); DU Jia-xuan (杜佳轩)
Keywords: Mel-frequency cepstral coefficients (MFCC); Bayesian discriminant; speech semantic recognition; pitch frequency (fundamental frequency); MATLAB; EXCEL
Journal: Journal of Shanxi Normal University (Natural Science Edition) (山西师范大学学报(自然科学版))
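Affiliation: Faculty of Finance, Anhui University of Finance and Economics
Publication date: 2019-06-17
Publisher: Journal of Shanxi Normal University (Natural Science Edition)
Year: 2019
Issue: 02
Language: Chinese
Pages: 90-97
Page count: 8
CN: 14-1263/N
ISSN: 1009-4490
Classification number: TN912.3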

Abstract

This paper addresses the recognition and processing of infant speech. Using signal-analysis parameters such as Mel-frequency cepstral coefficients (MFCC), it studies the data acquisition and preprocessing of infant vocal emotion information and the corresponding feature-extraction methods. Matlab, Excel, Widi, and TT Composer were used for the computations. After a comprehensive comparison, the selected parameters and methods were applied to pattern recognition of two targets: the infant's sex, and the emotional information conveyed by the infant's physical and mental state. The paper gives the technical implementation and experimental test results, and reports good recognition performance. For recordings of a man and a woman singing the same song, the speech signals are analyzed in the time domain from the perspective of gender differences: spectrograms, energy plots, and correlation-function plots are drawn to compare the male and female voices, and gender-related differences are visible in the early part of the signal. On this basis, MFCC extraction yields a 48110 × 24 MFCC feature matrix for the male voice and another for the female voice. Bayesian discriminant analysis is then used to classify speech by gender, and the resulting model is applied to identify infants' voices.
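The pipeline the abstract describes (frame-level MFCC features followed by a Bayesian discriminant) can be illustrated with a minimal sketch. The paper reports using MATLAB; this example uses Python with NumPy instead, and it is not the authors' implementation. Everything specific here is an assumption made for illustration: the 8 kHz sampling rate, 25 ms frames with a 10 ms hop, 26 mel filters, the 0.97 pre-emphasis coefficient, Gaussian class-conditional densities with per-class covariance, and the hypothetical `synth_voice` helper, which generates harmonic tones as stand-ins for real recordings. Only the feature width of 24 is taken from the abstract.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):
            fb[i - 1, k] = (k - left) / max(centre - left, 1)
        for k in range(centre, right):
            fb[i - 1, k] = (right - k) / max(right - centre, 1)
    return fb

def mfcc(signal, sr, n_mfcc=24, n_filters=26, frame_len=0.025, hop=0.010, n_fft=512):
    """Return an (n_frames, n_mfcc) matrix of MFCC features."""
    # Pre-emphasis boosts high frequencies (0.97 is an assumed, common value).
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    flen, fhop = int(round(frame_len * sr)), int(round(hop * sr))
    n_frames = 1 + (len(emphasized) - flen) // fhop
    idx = np.arange(flen)[None, :] + fhop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(flen)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_energy = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II of the log filterbank energies gives the cepstrum.
    basis = np.cos(np.pi * np.outer(np.arange(n_mfcc), np.arange(n_filters) + 0.5) / n_filters)
    return log_energy @ basis.T

class BayesDiscriminant:
    """Gaussian Bayes discriminant with one mean and covariance per class."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.params_ = []
        for c in self.classes_:
            Xc = X[y == c]
            mu = Xc.mean(axis=0)
            # Small ridge term keeps the covariance invertible.
            cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
            _, logdet = np.linalg.slogdet(cov)
            self.params_.append((mu, np.linalg.inv(cov), logdet, np.log(len(Xc) / len(X))))
        return self

    def predict(self, X):
        scores = []
        for mu, icov, logdet, logprior in self.params_:
            d = X - mu
            maha = np.einsum("ij,jk,ik->i", d, icov, d)
            # Log posterior up to a constant shared by all classes.
            scores.append(-0.5 * (maha + logdet) + logprior)
        return self.classes_[np.argmax(np.stack(scores, axis=1), axis=1)]

def synth_voice(f0, sr=8000, seconds=2.0, seed=0):
    """Hypothetical stand-in for a recording: a harmonic tone plus noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(sr * seconds)) / sr
    tone = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in range(1, 6))
    return tone + 0.05 * rng.standard_normal(t.size)

def demo(sr=8000):
    """Train on two synthetic 'voices', then label fresh clips by frame vote."""
    feats = {0: mfcc(synth_voice(120.0, sr, seed=1), sr),   # lower-pitched class
             1: mfcc(synth_voice(220.0, sr, seed=2), sr)}   # higher-pitched class
    X = np.vstack([feats[0], feats[1]])
    y = np.concatenate([np.zeros(len(feats[0]), int), np.ones(len(feats[1]), int)])
    model = BayesDiscriminant().fit(X, y)
    results = {}
    for c, f0 in ((0, 120.0), (1, 220.0)):
        frame_labels = model.predict(mfcc(synth_voice(f0, sr, seed=10 + c), sr))
        results[c] = int(np.bincount(frame_labels, minlength=2).argmax())
    return results

if __name__ == "__main__":
    print(demo())
```

Each clip is labeled by majority vote over its frames, which is one simple way to turn frame-level decisions into a per-recording decision; the abstract does not say how the authors aggregated frames. With real recordings, the synthetic tones would be replaced by loaded audio, and the class priors and covariance model would need to be checked against the data.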
