Credibility-inspired ranking for blog post retrieval

Cited by: 2

Authors:

Wouter Weerkamp

Maarten de Rijke

Affiliation:

[1] University of Amsterdam, ISLA

Source:

Information Retrieval | 2012 / Vol. 15

Keywords:

Credibility; Blog post retrieval; Reranking;

DOI:

Not available

Abstract:

Credibility of information refers to its believability or the believability of its sources. We explore the impact of credibility-inspired indicators on the task of blog post retrieval, following the intuition that more credible blog posts are preferred by searchers. Based on a previously introduced credibility framework for blogs, we define several credibility indicators, and divide them into post-level (e.g., spelling, timeliness, document length) and blog-level (e.g., regularity, expertise, comments) indicators. The retrieval task at hand is precision-oriented, and we hypothesize that the use of credibility-inspired indicators will positively impact precision. We propose to use ideas from the credibility framework in a reranking approach to the blog post retrieval problem: we introduce two simple ways of reranking the top n of an initial run. The first approach, Credibility-inspired reranking, simply reranks the top n of a baseline based on the credibility-inspired score. The second approach, Combined reranking, multiplies the credibility-inspired score of the top n results by their retrieval score, and reranks based on this score. Results show that Credibility-inspired reranking leads to larger improvements over the baseline than Combined reranking, but both approaches are capable of improving over an already strong baseline. For Credibility-inspired reranking the best performance is achieved using a combination of all post-level indicators. Combined reranking works best using the post-level indicators combined with comments and pronouns. The blog-level indicators expertise, regularity, and coherence do not contribute positively to the performance, although analysis shows that they can be useful for certain topics. Additional analysis shows that a relatively small value of n (15–25) leads to the best results, and that posts that move up the ranking due to the integration of reranking based on credibility-inspired indicators do indeed appear to be more credible than the ones that go down.
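
The two reranking approaches named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how individual indicators are aggregated into one credibility-inspired score, so the sketch assumes each result already carries a single non-negative credibility score and a non-negative retrieval score. The names `Result`, `credibility_rerank`, and `combined_rerank` are hypothetical, and the default `n = 20` is simply one value inside the 15–25 range the abstract reports.

```python
from typing import List, Tuple

# Hypothetical record: (doc_id, retrieval_score, credibility_score).
# Assumption: both scores are non-negative, so that their product
# preserves "higher is better".
Result = Tuple[str, float, float]


def credibility_rerank(run: List[Result], n: int = 20) -> List[Result]:
    """Reorder the top n of a baseline run by credibility score only.

    Results below rank n keep their baseline order. Python's sort is
    stable, so ties within the top n also keep their baseline order.
    """
    head = sorted(run[:n], key=lambda r: r[2], reverse=True)
    return head + run[n:]


def combined_rerank(run: List[Result], n: int = 20) -> List[Result]:
    """Reorder the top n by retrieval score times credibility score."""
    head = sorted(run[:n], key=lambda r: r[1] * r[2], reverse=True)
    return head + run[n:]
```

Both functions expect `run` to be sorted by retrieval score in descending order, as a baseline ranking would be. Only the top n positions can change, which keeps the effect on the ranking bounded and matches the precision-oriented framing of the task.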

Citation

Pages: 243–277

Page count: 34
