A TFIDF Keyword Extraction Method Incorporating Word Distribution Information

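Cited by: 2
Authors
Xu Zhenqiang (徐振强) [1, 2]
Li Baoli (李保利) [1, 2]
Affiliations
[1] College of Information Science and Engineering, Henan University of Technology
[2] State Key Laboratory of Digital Publishing Technology
Keywords
extraction; TFIDF; word distribution; automatic indexing
DOI
Not available
Chinese Library Classification (CLC) number
TP391.1 [Text information processing]
Discipline classification codes
081203; 0835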
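Abstract
This paper reviews related work on keyword extraction and analyzes TFIDF-based keyword extraction algorithms. Drawing on features such as how evenly a word is distributed across a text and the position of its first occurrence, it proposes an improved TFIDF algorithm and gives the corresponding formulas. Comparative experiments were carried out on three corpora that differ in number of documents and average document length. The results indicate that the TFIDF keyword extraction method incorporating word distribution information is feasible and effective.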
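The abstract names two distribution features (evenness of a word's spread and the position of its first occurrence) but does not reproduce the paper's formulas. The sketch below is only one plausible way to fold such features into a TF-IDF score, not the authors' method. The normalized-entropy evenness measure, the linear position score, the multiplicative combination, the weights `alpha` and `beta`, and every function name are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(term, doc_tokens, corpus):
    """Plain TF-IDF: relative term frequency times a smoothed IDF."""
    if not doc_tokens:
        return 0.0
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    df = sum(1 for d in corpus if term in d)
    idf = math.log((1 + len(corpus)) / (1 + df)) + 1.0
    return tf * idf


def evenness(term, doc_tokens, n_segments=4):
    """Assumed spread score in [0, 1]: normalized entropy of the term's
    counts over equal-length segments of the document."""
    if not doc_tokens:
        return 0.0
    seg_len = math.ceil(len(doc_tokens) / n_segments)
    counts = [doc_tokens[i:i + seg_len].count(term)
              for i in range(0, len(doc_tokens), seg_len)]
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return abs(entropy) / math.log(len(counts))


def first_position(term, doc_tokens):
    """Assumed position score: 1.0 if the term opens the document,
    decreasing linearly with later first occurrence, 0.0 if absent."""
    if term not in doc_tokens:
        return 0.0
    return 1.0 - doc_tokens.index(term) / len(doc_tokens)


def score(term, doc_tokens, corpus, alpha=0.5, beta=0.5):
    """Illustrative combination; alpha and beta are made-up weights."""
    boost = (1.0 + alpha * evenness(term, doc_tokens)
             + beta * first_position(term, doc_tokens))
    return tfidf(term, doc_tokens, corpus) * boost


def extract_keywords(doc_tokens, corpus, top_k=3):
    """Rank the document's distinct tokens by the combined score."""
    ranked = sorted(set(doc_tokens),
                    key=lambda t: (-score(t, doc_tokens, corpus), t))
    return ranked[:top_k]


if __name__ == "__main__":
    doc = ["x", "p", "x", "q", "x", "r", "x", "y", "y", "y", "y", "s"]
    corpus = [doc, ["y", "x"]]
    print(extract_keywords(doc, corpus, top_k=2))
```

In this toy document, `x` and `y` have identical plain TF-IDF values, so any difference in their ranking comes entirely from the two assumed distribution features: `x` appears earlier and is spread over more segments than `y`.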
Pages: 59-63
Page count: 5