Human mouth-state recognition based on learned discriminative dictionary and sparse representation combined with homotopy
  • Authors: Jianhua Wu; Jinqiang Zhu; Qiegen Liu; Ye Zhang
  • Keywords: Mouth-state recognition; Label consistent K-SVD; Discriminative dictionary; Homotopy; Sparse representation
  • Journal: Multimedia Tools and Applications
  • Publication year: 2015
  • Publication date: December 2015
  • Volume: 74
  • Issue: 23
  • Pages: 10697-10711
  • Full-text size: 1,023 KB
  • Affiliations: Jianhua Wu (1)
    Jinqiang Zhu (1)
    Qiegen Liu (1)
    Ye Zhang (1)

    1. Department of Electronic Information Engineering, Nanchang University, Nanchang, 330031, China
  • Journal category: Computer Science
  • Journal subjects: Multimedia Information Systems
    Computer Communication Networks
    Data Structures, Cryptology and Information Theory
    Special Purpose and Application-Based Systems
  • Publisher: Springer Netherlands
  • ISSN: 1573-7721
Abstract
To detect the number of audio sources and improve the speech recognition capability of an intelligent robot auditory system, this paper studies the recognition of human mouth states (open or closed). It proposes a mouth-state recognition algorithm that combines a learned discriminative dictionary with sparse representation and homotopy. The label consistent K-SVD (LC-KSVD) algorithm is used to learn a single discriminative over-complete dictionary and an optimal linear classifier simultaneously, and the homotopy algorithm is used at the sparse decomposition stage. Experiments are carried out on a database built from region-of-interest (ROI) images localized in and extracted from face images downloaded from Google. Compared with several state-of-the-art methods, the proposed method obtains higher classification rates (CRs), takes less time to recognize a test sample, and shows good noise immunity. In particular, it attains superior performance when training samples are extremely limited, even with only one sample per class.
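
For orientation, the pipeline named in the abstract can be sketched in equations. This record does not give the paper's own formulation, so what follows is an assumption: the training objective is the standard LC-KSVD one from Jiang, Lin and Davis (2011), and the test-time step is the generic ℓ1-regularized problem that homotopy solvers target. All symbols are illustrative: Y holds the training samples, D is the dictionary, X the sparse codes, Q the target discriminative codes, A a linear transform, H the label matrix, W the linear classifier, T the sparsity level, and α, β, λ are weights.

```latex
% Training (standard LC-KSVD objective; assumed, not quoted from this paper):
\min_{D,\,A,\,W,\,X}\;
  \|Y - DX\|_F^2 \;+\; \alpha\,\|Q - AX\|_F^2 \;+\; \beta\,\|H - WX\|_F^2
  \quad \text{s.t.}\quad \|x_i\|_0 \le T \;\; \forall i

% Test-time sparse decomposition of a sample y (l1 form solved by homotopy):
\hat{x} \;=\; \arg\min_{x}\; \tfrac{1}{2}\,\|y - Dx\|_2^2 \;+\; \lambda\,\|x\|_1

% Predicted mouth state from the learned linear classifier:
\hat{c} \;=\; \arg\max_{j}\; \bigl(W\hat{x}\bigr)_j
```

Under this reading, the dictionary and the classifier come out of a single joint optimization, and only the sparse decomposition of each test sample uses the homotopy solver.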
