Capturing the Temporal Domain in Echonest Features for Improved Classification Effectiveness

Authors: Alexander Schindler (17) (18)
Andreas Rauber (17)
Publication title: Lecture Notes in Computer Science
Year: 2014
Volume: 1
Issue: 1
Pages: 214-227
Full text size: 381 KB
Author affiliations:

17. Department of Software Technology and Interactive Systems, Vienna University of Technology, Vienna, Austria
18. Intelligent Vision Systems, AIT Austrian Institute of Technology, Vienna, Austria
ISSN: 1611-3349

Abstract

This paper proposes Temporal Echonest Features to harness the information available in the beat-aligned vector sequences of the features provided by The Echo Nest. Rather than aggregating these sequences by simple averaging, the statistics of their temporal variations are analyzed and used to represent the audio content. We evaluate performance on four traditional music genre classification test collections and compare the results to state-of-the-art audio descriptors. Experiments reveal that exploiting the temporal variability of beat-aligned vector sequences, and combining different descriptors, improves classification accuracy. Compared with established conventional audio descriptors used as benchmarks, Temporal Echonest Features perform well, often significantly outperforming their predecessors, and can be used effectively for large-scale music genre classification.
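
The contrast between simple averaging and temporal statistics can be illustrated with a minimal sketch. The abstract does not list which statistics the paper uses, so the set chosen here (mean, variance, median, minimum, maximum, plus the mean and variance of absolute frame-to-frame differences) is an assumption, as are the function names `mean_only` and `temporal_summary` and the toy input data.

```python
from statistics import fmean, median, pvariance


def mean_only(frames):
    """Baseline: average each feature dimension over all frames."""
    return [fmean(values) for values in zip(*frames)]


def temporal_summary(frames):
    """Summarise a sequence of per-segment feature vectors.

    frames: list of equal-length lists of floats, one list per segment.
    Returns a flat list holding, for each dimension, the mean, variance,
    median, minimum and maximum of its values, followed by the mean and
    variance of the absolute differences between consecutive frames.
    The choice of statistics is an illustrative assumption.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    summary = []
    for values in zip(*frames):  # one tuple of values per dimension
        deltas = [abs(b - a) for a, b in zip(values, values[1:])]
        summary.extend([
            fmean(values), pvariance(values), median(values),
            min(values), max(values),
            fmean(deltas), pvariance(deltas),
        ])
    return summary


# Hypothetical one-dimensional sequences with identical averages.
steady = [[0.5], [0.5], [0.5], [0.5]]
oscillating = [[0.0], [1.0], [0.0], [1.0]]

print(mean_only(steady), mean_only(oscillating))
print(temporal_summary(steady))
print(temporal_summary(oscillating))
```

Both toy sequences have the same per-dimension average, so an averaging-based descriptor cannot tell them apart, while the variance and difference statistics separate them. That is the intuition behind replacing simple aggregation with temporal statistics.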