Summarizing scientific articles: Experiments with relevance and rhetorical status

Cited by: 262
Authors
Teufel, S
Moens, M
Affiliations
[1] Univ Cambridge, Comp Lab, Cambridge CB3 0FD, England
[2] Rhetor Syst, Edinburgh EH8 9LS, Midlothian, Scotland
[3] Univ Edinburgh, Edinburgh EH8 9LS, Midlothian, Scotland
DOI
10.1162/089120102762671936
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article we propose a strategy for the summarization of scientific articles that concentrates on the rhetorical status of statements in an article: Material for summaries is selected in such a way that summaries can highlight the new contribution of the source article and situate it with respect to earlier work. We provide a gold standard for summaries of this kind consisting of a substantial corpus of conference articles in computational linguistics annotated with human judgments of the rhetorical status and relevance of each sentence in the articles. We present several experiments measuring our judges' agreement on these annotations. We also present an algorithm that, on the basis of the annotated training material, selects content from unseen articles and classifies it into a fixed set of seven rhetorical categories. The output of this extraction and classification system can be viewed as a single-document summary in its own right; alternatively, it provides starting material for the generation of task-oriented and user-tailored summaries designed to give users an overview of a scientific field.
Pages: 409-445
Page count: 37