Predictors of high-quality answers

Cited by: 43
Authors
Blooma, Mohan John [1]
Goh, Dion Hoe-Lian [1]
Chua, Alton Yeow-Kuan
Affiliations
[1] Nanyang Technol Univ, Div Informat Studies, Wee Kim Wee Sch Commun & Informat, Singapore, Singapore
Keywords
Community-driven question answering; Quality; Yahoo! Answers; Internet; Search engines; Relevance criteria
DOI
10.1108/14684521211241413
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Purpose - The purpose of this study is to examine the predictors of high-quality answers in a community-driven question answering service (Yahoo! Answers).
Design/methodology/approach - The identified predictors were organised into two categories: social and content features. Social features refer to the community aspects of the users and are extracted from explicit user interaction and feedback. Content features refer to the intrinsic and extrinsic content quality of answers that could be used to select the high-quality answers. In total, the framework built in this study comprises 17 features from two categories. Based on a randomly selected dataset of 1,600 question-answer pairs from Yahoo! Answers, high-quality answer predictors were identified.
Findings - The results of the analysis showed the importance of content appraisal features over social and textual content features. The features identified as strongly associated with high-quality answers include positive votes, completeness, presentation, reliability and accuracy. Features weakly associated with high-quality answers were high frequency words, answer length, and best answers answered. Features related to the asker's user history were found not to be associated with high-quality answers.
Practical implications - This work could help in the reuse of answers for new questions. The study identified features that most influence the selection of high-quality answers. Hence, they could be used to select high-quality answers for answering similar questions posed by users in the future. When a new question is posed, similar questions are first identified, and the answers for these questions are extracted and routed to the proposed quality framework for identifying high-quality answers. Based on the overall quality index computed, the high-quality answer could be returned to the asker.
Originality/value - Previous studies in identifying high-quality answers were conducted using either of two approaches: first, using social and textual content features found in community-driven question answering services; and second, using content appraisal features by thorough assessment of answer quality provided by experts. However, no study had integrated both approaches. Hence, this study addresses this gap by developing an integrated generalisable framework to identify features that influence high-quality answers.
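The answer-reuse flow described under practical implications (find similar questions, collect their answers, score each answer, return the best one) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' framework: the abstract gives no formula for the overall quality index, so the equal weights in `WEIGHTS`, the token-overlap similarity in `jaccard`, the `min_similarity` threshold, and the names `Answer`, `quality_index` and `best_answer` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical equal weights. The abstract names these five features as
# strongly associated with high-quality answers but reports no weights.
WEIGHTS = {
    "positive_votes": 0.2,
    "completeness": 0.2,
    "presentation": 0.2,
    "reliability": 0.2,
    "accuracy": 0.2,
}


@dataclass
class Answer:
    question: str   # the archived question this answer was posted under
    text: str       # the answer text
    features: dict  # feature name -> score, assumed normalised to [0, 1]


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a stand-in for any question-matching method."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def quality_index(answer: Answer) -> float:
    """Weighted sum of feature scores; missing features count as 0."""
    return sum(w * answer.features.get(name, 0.0) for name, w in WEIGHTS.items())


def best_answer(new_question: str, archive: list, min_similarity: float = 0.3):
    """Return the highest-scoring answer among similar archived questions."""
    candidates = [
        a for a in archive
        if jaccard(new_question, a.question) >= min_similarity
    ]
    return max(candidates, key=quality_index, default=None)
```

In practice the weights would have to be estimated from rated question-answer pairs, and `best_answer` returns `None` when no archived question is similar enough to the new one.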
Pages: 383-400
Number of pages: 18