Research on Content-Based Audio Information Retrieval Technology
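Abstract
With the rapid development of modern information, multimedia, and network technologies, the volume of multimedia data has grown sharply. To make full use of existing audio resources, content-based audio information retrieval has attracted increasing attention. Audio data exists in static and dynamic forms, and retrieval operates at either the expression level or the semantic level. Different forms of audio data and different retrieval levels call for different retrieval methods. Despite a large body of related research, many problems in audio retrieval remain unsolved. The main ones are: the performance of most retrieval algorithms drops markedly in the presence of noise; audio data is high-dimensional and sequential in time, which makes index construction very difficult; dynamic audio retrieval has received little study; and because semantic information is hard to extract from music in audio form, research on semantic-level retrieval is difficult and progresses slowly. Overall, audio retrieval technology is still at the stage of experimental exploration and lacks practical techniques and systems.
     To address these problems, this dissertation studies audio retrieval in the following areas:
     1. For expression-level retrieval of static audio, a fuzzy-histogram audio retrieval method based on the principal loudness component feature is proposed. The histogram model is optimized according to the statistical distribution of the loudness data, and a fuzzy histogram is adopted to further improve the model's robustness to noise and to perturbations in loudness values. During retrieval, the active search algorithm is used to speed up the search. Experimental results show that the method has good noise robustness.
     2. For expression-level indexing of static audio, an indexing method based on the fuzzy histogram of the principal loudness component is proposed. When audio data is represented by this histogram, the histogram similarity of two audio segments of different lengths correctly reflects whether one contains the other, provided the ratio of their lengths does not exceed a certain limit. Building on this property, an index that combines a binary tree with linked lists is proposed. During retrieval, the range of index levels to search is chosen according to the length of the retrieval target and the upper limit on the length ratio. Experimental results show that the index greatly increases retrieval speed.
     3. For expression-level retrieval of dynamic audio, a segment-based real-time audio retrieval method is proposed. The method divides the retrieval target into a sequence of segments and uses a retrieval window to control which segments take part in the search. The work examines flexible criteria for deciding when a target has been detected, a control strategy for fast retrieval, a mathematical model for estimating the retrieval response lag, and a fast multi-target retrieval method based on audio classification. Experimental results show that the method is fast and easy to control, responds with little delay, and is robust both to noise and to partially missing targets.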
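The idea in item 1 (a fuzzy histogram searched with active search) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation. The assumptions are: each frame is reduced to a single scalar in [0, 1] standing in for the principal loudness component, fuzzy membership is a linear split between the two nearest bin centers, similarity is histogram intersection, and the skip width uses a simplified bound in the spirit of active search. All function names and parameters are hypothetical.

```python
import math


def fuzzy_histogram(values, n_bins, lo, hi):
    """Normalized fuzzy histogram (illustrative, not the thesis model).

    Each value splits its unit weight linearly between the two nearest
    bin centers, so a small perturbation moves only a little mass
    between neighboring bins instead of flipping a whole count.
    """
    hist = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        # Position in units of bin width, measured from the first bin center.
        pos = (min(max(v, lo), hi) - lo) / width - 0.5
        pos = min(max(pos, 0.0), n_bins - 1.0)
        left = int(math.floor(pos))
        right = min(left + 1, n_bins - 1)
        frac = pos - left
        hist[left] += 1.0 - frac
        hist[right] += frac
    total = float(len(values))
    return [h / total for h in hist]


def intersection(h1, h2):
    """Histogram intersection similarity; 1.0 for identical histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def active_search(query, stream, n_bins=16, lo=0.0, hi=1.0, threshold=0.8):
    """Slide a query-length window over the stream, skipping ahead when
    the similarity is too low to reach the threshold soon.

    Shifting the window by one frame can raise the intersection by at
    most 1/d (d = window length), so offsets closer than
    d * (threshold - s) cannot reach the threshold and are skipped.
    """
    d = len(query)
    hq = fuzzy_histogram(query, n_bins, lo, hi)
    hits = []
    t = 0
    while t + d <= len(stream):
        window = stream[t:t + d]
        s = intersection(hq, fuzzy_histogram(window, n_bins, lo, hi))
        if s >= threshold:
            hits.append((t, s))
            t += 1
        else:
            # Small epsilon keeps the skip conservative under rounding.
            t += max(1, math.ceil(d * (threshold - s) - 1e-9))
    return hits


# Toy data: a short "loudness" pattern embedded in a flat background.
query = [0.1, 0.5, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4]
stream = [0.05] * 20 + query + [0.05] * 20
hits = active_search(query, stream, threshold=0.95)
print(hits)
```

In this toy run the embedded copy starts at offset 20 of `stream`. The skip rule lets the search jump over most of the flat background while still visiting every offset that could meet the threshold.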
The rapid development of modern information technology, multimedia technology and network technology has produced large and ever-growing stores of multimedia data. As a result, content-based audio information retrieval (CBAIR) has been attracting more and more attention as a way to make full use of existing audio information. Audio data can be static or dynamic, and audio retrieval can operate at the expression level or the semantic level. Different audio forms and different retrieval levels require different retrieval methods. Although much related research has been done, many difficulties in CBAIR remain unsolved. The major ones are the following: the performance of most existing retrieval methods deteriorates dramatically under noise; audio data is high-dimensional and time-sequential, which makes it very difficult to index; little work has been done on dynamic audio retrieval; and research on semantic-level retrieval of audio music is hard and progresses slowly because semantic information is difficult to extract from audio music. Generally speaking, CBAIR is still at the experimental stage and lacks practical techniques and systems.
     Centering on the problems in CBAIR, this dissertation studies the following topics:
     1. For expression-level retrieval of static audio, a fuzzy histogram audio retrieval method based on the principal loudness component is developed. In designing the histogram model, the statistical distribution of loudness is used to optimize the model. In addition, a fuzzy histogram is used to improve the robustness of the model to noise and to small changes in loudness value. The active search method is used for histogram retrieval. Experimental results show that the method has good robustness to noise.
     2. For expression-level indexing of static audio, a novel indexing method based on the fuzzy histogram of the principal loudness component is presented. When audio data is expressed by the fuzzy histogram of the principal loudness component, the similarity between their histograms can correctly reflect
