Robust Acoustic Feature Extraction Algorithm for Driving Environment of Driverless Cars
  • Authors: MAO Jin; LI Lin-cong; LIU Kai; DU Jin-fu; CUI Ya-hui (School of Mechanical and Precision Instrument Engineering, Xi'an University of Technology)
  • Keywords: automotive engineering; driverless car; robust acoustic feature extraction; acoustic event detection; harmonic Mel-frequency cepstral coefficient (HMFCC)
  • Journal: China Journal of Highway and Transport (中国公路学报); CNKI journal code: ZGGL
  • Affiliation: School of Mechanical and Precision Instrument Engineering, Xi'an University of Technology
  • Publication date: 2019-06-15
  • Year/Issue: 2019, Vol. 32, Issue 06 (No. 190 overall)
  • Funding: National Natural Science Foundation of China, Young Scientists Fund (61701397, 51705419); Natural Science Basic Research Plan of Shaanxi Province (2017JQ5011); China Postdoctoral Science Foundation (2019M653702, 2018M633540)
  • Language: Chinese
  • Record ID: ZGGL201906018
  • Pages: 173-179 (7 pages)
  • CN: 61-1313/U
Abstract
During driving, a driverless car must construct a model of its surroundings through both visual and auditory perception, and acoustic event detection is the core of the auditory perception system. In the driving environment, acoustic event detection faces complex and strong noise, especially wind noise generated while the vehicle is moving. The Mel-frequency cepstral coefficient (MFCC), the acoustic feature most commonly used in acoustic event detection, is highly sensitive to noise interference. To address this problem, this paper proposes a robust acoustic feature extraction algorithm, the harmonic Mel-frequency cepstral coefficient (HMFCC), for target classification of acoustic events. The algorithm combines a harmonic model of the acoustic signal with the MFCC algorithm to extract the formant frequencies of the target signal and improve the traditional Mel filter bank, thereby enhancing the mid- and high-frequency components of the target signal in the HMFCC. Experimental data were collected under identical conditions except for the wind level; one third of the data was used to train a support vector machine classifier, and the remaining data were used to verify the classification results. The results show that detection based on HMFCC features achieves high precision and recall under different wind-noise conditions, and that the gap in classification performance between HMFCC and MFCC is pronounced in both low-noise and strong-noise environments. In the low-noise environment, HMFCC-based classification of several acoustic events reaches an average precision of 82.66% and an average recall of 84.15%, whereas MFCC-based classification achieves only 73.93% and 74.61%. As wind noise increases, MFCC classification accuracy degrades severely, with average precision and recall of only 54.15% and 44.95%, while HMFCC still achieves 72.16% and 69.87% in the strong-noise environment. In the driving environment, HMFCC features not only improve classification accuracy but also prove insensitive to noise.
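The abstract describes HMFCC as MFCC with a harmonic-model stage that locates formant frequencies and re-weights the Mel filter bank to emphasize the mid- and high-frequency content of the target signal. Below is a minimal Python sketch of that idea, assuming the re-weighting simply boosts the Mel filters whose center frequencies sit near dominant spectral peaks (a crude stand-in for the paper's harmonic/formant model); the function names and the `boost` parameter are hypothetical, not taken from the paper.

```python
import numpy as np
from scipy.fft import dct
from scipy.signal import find_peaks

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sr):
    """Standard triangular Mel filter bank; returns (filters, center freqs in Hz)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising slope
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling slope
    return fb, mel_to_hz(mel_pts[1:-1])

def hmfcc(frame, sr, n_fft=512, n_filters=26, n_ceps=13, boost=2.0):
    """HMFCC-style coefficients for one audio frame (hypothetical sketch).

    Mel filters whose center frequency lies near a spectral peak are
    scaled by `boost`, emphasizing harmonic structure before the
    log and DCT steps of the usual MFCC pipeline.
    """
    spec = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), n_fft)) ** 2
    fb, centers = mel_filter_bank(n_filters, n_fft, sr)
    peaks, _ = find_peaks(spec, height=0.1 * spec.max())  # crude "formant" picks
    peak_hz = peaks * sr / n_fft
    weights = np.ones(n_filters)
    for i, c in enumerate(centers):
        if peak_hz.size and np.abs(peak_hz - c).min() < 4 * sr / n_fft:
            weights[i] = boost
    energies = np.maximum((weights[:, None] * fb) @ spec, 1e-10)
    return dct(np.log(energies), type=2, norm='ortho')[:n_ceps]
```

With `boost=1.0` this reduces to an ordinary MFCC computation, which makes it straightforward to compare the two features on the same recordings.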
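The evaluation protocol stated in the abstract (train a support vector machine on one third of the data, test on the rest, score by precision and recall) maps directly onto scikit-learn. The sketch below uses random placeholder features in place of the recorded acoustic events, and the RBF kernel and macro averaging are assumptions the abstract does not specify.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

# Placeholder data: one 13-dim HMFCC (or MFCC) vector per clip, 4 event classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 13))
y = rng.integers(0, 4, size=300)

# One third of the clips train the classifier; the rest verify the results.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=1 / 3, random_state=0, stratify=y)

clf = SVC(kernel="rbf").fit(X_train, y_train)
pred = clf.predict(X_test)
print("precision:", precision_score(y_test, pred, average="macro"))
print("recall:   ", recall_score(y_test, pred, average="macro"))
```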
