Fullerene data mining using bibliometrics and database tomography

Cited by: 44
Authors
Kostoff, RN
Braun, T
Schubert, A
Toothman, DR
Humenik, JA
Affiliations
[1] Off Naval Res, Arlington, VA 22217 USA
[2] Eotvos Lorand Univ, Inst Inorgan & Analyt Chem, H-1443 Budapest, Hungary
[3] Hungarian Acad Sci Lib, Budapest, Hungary
[4] RSIS Inc, Mclean, VA USA
[5] Noesis Inc, Arlington, VA 22203 USA
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 2000, Vol. 40, No. 1
DOI
10.1021/ci990045n
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Database tomography (DT) is a textual database analysis system consisting of two major components: (1) algorithms for extracting multiword phrase frequencies and phrase proximities (physical closeness of the multiword technical phrases) from any type of large textual database, to augment (2) interpretative capabilities of the expert human analyst. DT was used to derive technical intelligence from a fullerenes database derived from the Science Citation Index and the Engineering Compendex. Phrase frequency analysis by the technical domain experts provided the pervasive technical themes of the fullerenes database, and phrase proximity analysis provided the relationships among the pervasive technical themes. Bibliometric analysis of the fullerenes literature supplemented the DT results with author/journal/institution publication and citation data. Comparisons of fullerenes results with past analyses of similarly structured near-earth space, chemistry, hypersonic/supersonic flow, aircraft, and ship hydrodynamics databases are made. One important finding is that many of the normalized bibliometric distribution functions are extremely consistent across these diverse technical domains and could reasonably be expected to apply to broader chemical topics than fullerenes that span multiple structural classes. Finally, lessons learned about integrating the technical domain experts with the data mining tools are presented.
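The two algorithmic steps named in the abstract, multiword phrase frequency and phrase proximity, can be illustrated with a minimal sketch. This is not the DT system's actual implementation, which the record does not describe. The function names, the regex tokenizer, and the fixed word window are assumptions made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_frequencies(docs, n=2):
    """Count every n-word phrase across a collection of documents."""
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def phrase_proximity(docs, theme, n=2, window=5):
    """Count n-word phrases lying within `window` words of each occurrence
    of `theme`, excluding phrases that overlap the theme itself."""
    theme_tokens = tokenize(theme)
    k = len(theme_tokens)
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        for i in range(len(tokens) - k + 1):
            if tokens[i:i + k] != theme_tokens:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + k + window)
            for j in range(lo, hi - n + 1):
                # keep only phrases entirely before or entirely after the theme
                if j + n <= i or j >= i + k:
                    counts[" ".join(tokens[j:j + n])] += 1
    return counts


docs = [
    "Carbon nanotubes and fullerene thin films were grown.",
    "Fullerene thin films show superconductivity.",
]
freq = phrase_frequencies(docs, n=2)
near = phrase_proximity(docs, "thin films", n=1, window=2)
```

In this sketch, the high-frequency phrases in `freq` stand in for the pervasive themes, and `near` lists the words that co-occur close to a chosen theme. In DT as described, the expert analyst interprets both outputs; the code only produces the raw counts.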
Pages: 19-39
Page count: 21