Content-based audio classification and retrieval by support vector machines

Cited by: 320
Authors
Guo, GD [1]
Li, SZ
Affiliations
[1] Univ Wisconsin, Dept Comp Sci, Madison, WI 53706 USA
[2] Microsoft Res China, Beijing 100080, Peoples R China
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2003 / Vol. 14 / No. 01
Keywords
audio classification; binary tree; content-based retrieval; distance-from-boundary (DFB); pattern recognition; support vector machines (SVMs)
DOI
10.1109/TNN.2002.806626
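Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligent science and technology];
Abstract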
Support vector machines (SVMs) have recently been proposed as a new learning algorithm for pattern recognition. In this paper, SVMs with a binary tree recognition strategy are used to tackle the audio classification problem. We illustrate the potential of SVMs on a common audio database, which consists of 409 sounds in 16 classes. We compare SVM-based classification with other popular approaches. For audio retrieval, we propose a new metric, called distance-from-boundary (DFB). When a query audio is given, the system first finds a boundary inside which the query pattern is located. Then, all the audio patterns in the database are sorted by their distances to this boundary. All boundaries are learned by the SVMs and stored together with the audio database. Experimental comparisons for audio retrieval are presented to show the superiority of this novel metric over other similarity measures.
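The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes linear boundaries given as hypothetical (w, b) pairs rather than boundaries learned by SVMs, it assumes the enclosing boundary is the one with the largest signed distance for the query, and it assumes patterns are ranked from deepest inside the boundary to farthest outside. The names `signed_distance`, `dfb_rank`, and all sample values are made up for illustration.

```python
import math


def signed_distance(x, w, b):
    """Signed distance from point x to the hyperplane w.x + b = 0.

    Positive values are treated as 'inside' the boundary (an assumption).
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def dfb_rank(query, database, boundaries):
    """Rank database items by distance to the boundary enclosing the query."""
    # Step 1: pick the boundary for the query. Here: the boundary with the
    # largest signed distance (assumed selection rule).
    name, (w, b) = max(
        boundaries.items(),
        key=lambda kv: signed_distance(query, *kv[1]),
    )
    # Step 2: sort every stored pattern by its signed distance to that
    # boundary, deepest inside first (assumed ordering).
    ranked = sorted(
        database,
        key=lambda item: signed_distance(item[1], w, b),
        reverse=True,
    )
    return name, [label for label, _ in ranked]


# Hypothetical stored boundaries: class_a is x0 > 1, class_b is x0 < -1.
boundaries = {
    "class_a": ([1.0, 0.0], -1.0),
    "class_b": ([-1.0, 0.0], -1.0),
}
# Hypothetical feature vectors standing in for audio patterns.
database = [
    ("s1", [3.0, 0.0]),
    ("s2", [2.0, 1.0]),
    ("s3", [-2.0, 0.0]),
    ("s4", [0.0, 0.0]),
]
query = [2.5, 0.5]

result = dfb_rank(query, database, boundaries)
print(result)
```

Because the boundaries are computed once and stored with the database, a query in this sketch only costs one distance evaluation per stored pattern plus a sort.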
Pages: 209-215
Page count: 7