Summarizing audiovisual contents of a video program

Cited by: 13
Author
Gong, YH [1]
Affiliation
[1] NEC Labs Amer Inc, Cupertino, CA 95014 USA
Keywords
video summarization; audiovisual summarization; partial audiovisual alignment; bipartite graph; minimum spanning tree; maximum bipartite matching
DOI
10.1155/S1110865703211082
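Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808 [Electrical engineering]; 0809 [Electronic science and technology]
Abstract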
In this paper, we focus on video programs that are intended to disseminate information and knowledge, such as news, documentaries, and seminars, and present an audiovisual summarization system that summarizes the audio and visual contents of the given video separately and then integrates the two summaries with a partial alignment. The audio summary is created by selecting spoken sentences that best present the main content of the audio speech, while the visual summary is created by eliminating duplicates and redundancies and preserving visually rich contents in the image stream. The alignment operation aims to synchronize each spoken sentence in the audio summary with its corresponding speaker's face and to preserve the rich content in the visual summary. A bipartite graph-based audiovisual alignment algorithm is developed to efficiently find the best alignment solution that satisfies these alignment requirements. With the proposed system, we strive to produce a video summary that (1) provides a natural visual and audio content overview, and (2) maximizes the coverage of both the audio and visual contents of the original video without having to sacrifice either of them.
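The keywords list maximum bipartite matching as one ingredient of the alignment step. As a rough illustration only, the sketch below shows a generic augmenting-path matching in Python. It is not the paper's algorithm: the node names (`sentence_1`, `shot_A`, and so on), the `candidates` table, and the function `max_bipartite_matching` are all hypothetical, and the sketch assumes that each spoken sentence comes with a list of image segments it could be aligned with.

```python
from typing import Dict, List, Set


def max_bipartite_matching(candidates: Dict[str, List[str]]) -> Dict[str, str]:
    """Return a maximum matching {left: right} found with augmenting paths.

    `candidates` maps each left node (here: a hypothetical spoken sentence)
    to the right nodes (hypothetical image segments) it may be paired with.
    """
    owner: Dict[str, str] = {}  # right node -> left node currently holding it

    def try_assign(left: str, visited: Set[str]) -> bool:
        for right in candidates[left]:
            if right in visited:
                continue
            visited.add(right)
            # Take the right node if it is free, or if its current owner
            # can be moved to another right node.
            if right not in owner or try_assign(owner[right], visited):
                owner[right] = left
                return True
        return False

    for left in candidates:
        try_assign(left, set())
    return {left: right for right, left in owner.items()}


# Hypothetical example data, not taken from the paper.
candidates = {
    "sentence_1": ["shot_A", "shot_B"],
    "sentence_2": ["shot_A"],
    "sentence_3": ["shot_B", "shot_C"],
}
matching = max_bipartite_matching(candidates)
```

In this toy input, `sentence_2` can only use `shot_A`, so the search moves `sentence_1` to `shot_B` and places `sentence_3` on `shot_C`, pairing all three sentences with distinct segments. The partial alignment described in the abstract has further requirements (speaker faces, preservation of visually rich content) that this sketch does not model.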
Pages: 160-169
Page count: 10