A progressive sentence selection strategy for document summarization

Cited by: 15
Authors
Ouyang, You [1]
Li, Wenjie [1]
Zhang, Renxian [1]
Li, Sujian [2]
Lu, Qin [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
[2] Peking Univ, Minist Educ, Key Lab Computat Linguist, Beijing, Peoples R China
Keywords
Document summarization; Saliency and coverage; Progressive sentence selection; Asymmetric sentence relationship
DOI
10.1016/j.ipm.2012.05.002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Saliency and coverage are two of the most important issues in document summarization. Most summarization methods give saliency top priority, and many studies have sought better sentence ranking methods to identify the salient sentences. Sentence selection strategies are also widely acknowledged to be important; they mainly aim to reduce redundancy among the selected sentences so that the summary covers more concepts. In this paper, we propose a novel sentence selection strategy that selects summary sentences progressively. We first ensure the coverage of the summary through an intuitive idea: only the uncovered concepts are considered when measuring the saliency of a sentence. Moreover, we use the subsuming relationship between sentences to define a conditional saliency measure, in place of the general saliency measures used in most existing methods. Based on these ideas, a progressive sentence selection strategy is developed to discover the "novel and salient" sentences. Compared with traditional methods, the proposed method integrates the saliency and coverage issues more closely. Experimental studies on the DUC data sets demonstrate the advantages of the progressive sentence selection strategy. © 2012 Published by Elsevier Ltd.
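The core idea of scoring a sentence only by the concepts not yet covered can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `progressive_select`, the representation of a sentence as a set of concept strings, and the per-concept weights are all assumptions made for illustration, and the conditional saliency measure based on the subsuming relationship is not modeled here.

```python
from typing import Dict, List, Set


def progressive_select(
    sentences: List[Set[str]],
    weights: Dict[str, float],
    max_sentences: int,
) -> List[int]:
    """Greedy sketch: pick sentences by the weight of their uncovered concepts.

    Hypothetical simplification: each sentence is a set of concepts and
    each concept has a fixed weight (both are illustrative assumptions).
    """
    covered: Set[str] = set()   # concepts already present in the summary
    chosen: List[int] = []      # indices of selected sentences, in order

    while len(chosen) < max_sentences:
        best_idx, best_score = -1, 0.0
        for i, concepts in enumerate(sentences):
            if i in chosen:
                continue
            # Saliency is measured on the uncovered concepts only.
            score = sum(weights.get(c, 0.0) for c in concepts - covered)
            if score > best_score:
                best_idx, best_score = i, score
        if best_idx < 0:
            break  # no remaining sentence adds uncovered weight
        chosen.append(best_idx)
        covered |= sentences[best_idx]

    return chosen


# Toy data (made up for illustration).
toy_sentences = [{"a", "b"}, {"a", "b", "c"}, {"d"}]
toy_weights = {"a": 3.0, "b": 2.0, "c": 1.0, "d": 2.5}
print(progressive_select(toy_sentences, toy_weights, 3))  # → [1, 2]
```

In the toy run, the first sentence is never chosen because all of its concepts are already covered once the second sentence is selected, which is the redundancy-avoiding behavior the abstract describes.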
Pages: 213-221
Page count: 9