A Sparse Convolutive Non-negative Matrix Factorization Method with Asymmetric Cost Function

Original title (Chinese): 非对称代价函数的稀疏卷积非负矩阵分解方法
Authors: ZHANG Qian-min (张倩敏); TAO Liang (陶亮); ZHOU Jian (周健); WANG Hua-bin (王华彬)
Keywords: sparse convolutive non-negative matrix factorization; asymmetric cost function; Itakura-Saito distance; speech intelligibility
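Journal (Chinese): XXCN
Journal (English): Journal of Signal Processing
Affiliation: Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University
Publication date: 2015-01-25
Publisher: 信号处理 (Journal of Signal Processing)
Year: 2015
Issue: v.31; No.185
Funding: National Natural Science Foundation of China (61372137, 61301295, 61003131); Anhui Provincial Natural Science Foundation (1308085QF100, 1408085MF113)
Language: Chinese
Page: XXCN201501014
Page count: 8
CN: 01
ISSN: 11-2406/TN
Classification number: 99-106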

Abstract

A sparse convolutive non-negative matrix factorization method based on an asymmetric cost function is proposed. The method uses the Itakura-Saito distance as the objective cost function to measure the discrepancy between the target matrix and its reconstruction, so that smaller matrix elements incur smaller reconstruction errors; the cost function is also scale-invariant. To evaluate its advantage in reconstructing weak speech components, the proposed algorithm is applied to spectral decomposition and reconstruction of whispered speech: the whispered-speech bases and their coefficients are derived by the algorithm and then used to reconstruct the whispered speech. Experimental results show that, compared with convolutive non-negative matrix factorization algorithms based on the Euclidean distance and on the Kullback-Leibler (K-L) divergence, the proposed algorithm reconstructs weak speech components better, and the reconstructed speech signal has higher intelligibility.
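
The scale-invariance and asymmetry claims in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of the three cost functions that the abstract compares; the function names, the test values, and the 10% over-estimate are assumptions chosen for illustration, and the paper's full objective (including its sparsity term and update rules) is not given in this record and is not reproduced here.

```python
import numpy as np

def euclidean_cost(v, lam):
    """Squared Euclidean distance between target v and reconstruction lam."""
    return 0.5 * np.sum((v - lam) ** 2)

def kl_divergence(v, lam):
    """Generalized Kullback-Leibler divergence for strictly positive inputs."""
    return np.sum(v * np.log(v / lam) - v + lam)

def is_terms(v, lam):
    """Element-wise Itakura-Saito terms: v/lam - log(v/lam) - 1."""
    r = v / lam
    return r - np.log(r) - 1.0

def is_divergence(v, lam):
    """Itakura-Saito divergence, summed over all elements."""
    return np.sum(is_terms(v, lam))

# Illustrative (assumed) data: one strong and one weak spectral component,
# each over-estimated by the same relative amount (10%).
v = np.array([1.0, 0.01])
lam = 1.1 * v

# Itakura-Saito depends only on the ratio v/lam, so both components
# contribute equally, and rescaling v and lam together changes nothing.
print("IS per element:      ", is_terms(v, lam))
print("IS, original scale:  ", is_divergence(v, lam))
print("IS, scaled by 100:   ", is_divergence(100 * v, 100 * lam))

# The Euclidean cost is dominated by the strong component and changes with scale.
print("Euclidean per element:", 0.5 * (v - lam) ** 2)
print("KL, original scale:  ", kl_divergence(v, lam))
print("KL, scaled by 100:   ", kl_divergence(100 * v, 100 * lam))

# Asymmetry: swapping target and reconstruction changes the IS value.
print("IS(v, lam) vs IS(lam, v):", is_divergence(v, lam), is_divergence(lam, v))
```

Under these definitions the weak component carries the same weight as the strong one in the Itakura-Saito cost, whereas in the Euclidean cost its contribution is several orders of magnitude smaller, which is one way to read the abstract's claim about weak speech components.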

