An overview of audio information retrieval

Cited by: 177
Authors
Foote, J [1]
Affiliation
[1] Natl Univ Singapore, Inst Syst Sci, Singapore 119597, Singapore
Keywords
DOI
10.1007/s005300050106
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The problem of audio information retrieval is familiar to anyone who has returned from vacation to find an answering machine full of messages. While there is not yet an "AltaVista" for the audio data type, many workers are finding ways to automatically locate, index, and browse audio using recent advances in speech recognition and machine listening. This paper reviews the state of the art in audio information retrieval, and presents recent advances in automatic speech recognition, word spotting, speaker and music identification, and audio similarity with a view towards making audio less "opaque". A special section addresses intelligent interfaces for navigating and browsing audio and multimedia documents, using automatically derived information to go beyond the tape recorder metaphor.
Pages: 2-10
Number of pages: 9