Abstract
Personalizing the head-related transfer function (HRTF) is one of the key technologies for implementing a virtual audio display (VAD). This paper proposes an HRTF personalization method based on sparse representation and a radial basis function (RBF) neural network. LASSO regression is applied to compute sparse coefficients of a subject's anthropometric features and of the subject's HRTF data separately, and a neural network then models the mapping between the two sets of coefficients. Pearson correlation analysis is used to select, as the training set, the samples most strongly correlated with the test sample, so the method can estimate a personalized HRTF with relatively little training. Simulation experiments show that, compared with an existing sparse representation method, the proposed method needs a smaller training set and achieves a lower estimation error.
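Two of the steps named in the abstract, Pearson-based training-set selection and LASSO sparse coding, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not say which vectors are correlated, whether the ranking uses signed or absolute correlation, or how the LASSO dictionary is built, so the following are all assumptions: the function names (`pearson`, `select_training_set`, `lasso_coefficients`), the use of signed correlation between anthropometric feature vectors, the choice of selected training subjects' feature vectors as dictionary atoms, and the coordinate-descent solver. The RBF network that maps anthropometric coefficients to HRTF coefficients is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A constant vector has no defined correlation; treat it as uncorrelated.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_training_set(train_features, test_features, k):
    """Indices of the k training subjects most correlated with the test subject.

    Assumption: signed correlation between anthropometric feature vectors.
    """
    scores = [pearson(f, test_features) for f in train_features]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of the L1 penalty)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def lasso_coefficients(atoms, target, lam, n_iter=200):
    """Coordinate descent for
        min_b 0.5 * ||target - sum_j b[j] * atoms[j]||^2 + lam * ||b||_1.

    Assumption: each atom is one training subject's feature vector.
    """
    b = [0.0] * len(atoms)
    residual = list(target)  # target minus the current reconstruction
    for _ in range(n_iter):
        for j, atom in enumerate(atoms):
            norm_sq = sum(a * a for a in atom)
            if norm_sq == 0.0:
                continue
            # Inner product of atom j with the residual that excludes
            # atom j's own current contribution.
            rho = sum(a * (r + b[j] * a) for a, r in zip(atom, residual))
            new_bj = soft_threshold(rho, lam) / norm_sq
            delta = new_bj - b[j]
            residual = [r - delta * a for r, a in zip(residual, atom)]
            b[j] = new_bj
    return b
```

In a pipeline of this kind, `select_training_set` would first narrow the database to the subjects most similar to the test subject, and `lasso_coefficients` would then be run on the selected subjects only. A larger `lam` drives more coefficients to exactly zero, which is what makes the representation sparse.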