Automatic classification of academic web page types

Cited by: 18
Authors
Kenekayoro, Patrick [1]
Buckley, Kevan [1]
Thelwall, Mike [1]
Affiliations
[1] Univ Wolverhampton, Stat Cybermetr Res Grp, Wulfruna St, Wolverhampton WV1 1LY, W Midlands, England
Keywords
Webometrics; Link classification; Supervised learning; Decision tree induction; Support vector machines; SITE INTERLINKING; LINK ANALYSIS; INFORMATION; FRAMEWORK
DOI
10.1007/s11192-014-1292-9
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Counts of hyperlinks between websites can be unreliable for webometrics studies, so researchers have sought alternative counting methods or tried to identify why links in websites are created. Manual classification of individual links is infeasible for large webometrics studies, so a more efficient approach to identifying the reasons for link creation is needed to fully harness the potential of hyperlinks for webometrics research. This paper describes a machine learning method to automatically classify hyperlink source and target page types in university websites. The method achieved 78% accuracy for automatically classifying web page types and up to 74% accuracy for predicting link target page types from link source page characteristics.
Pages: 1015-1026
Number of pages: 12