Text mining using database tomography and bibliometrics: A review

Cited by: 109
Authors
Kostoff, RN
Toothman, DR
Eberhart, HJ
Humenik, JA
Affiliations
[1] Office of Naval Research, Arlington, VA 22217 USA
[2] RSIS Inc, McLean, VA 22102 USA
[3] NOESIS Inc, Manassas, VA USA
Keywords
database tomography; text mining; bibliometrics; innovation; information retrieval; information extraction; cluster; taxonomies
DOI
10.1016/S0040-1625(01)00133-0
Chinese Library Classification (CLC) number
F [Economics]
Discipline classification code
02
Abstract
Database tomography (DT) is a textual database analysis system consisting of two major components: (1) algorithms for extracting multiword phrase frequencies and phrase proximities (physical closeness of the multiword technical phrases) from any type of large textual database, to augment (2) interpretative capabilities of the expert human analyst. DT has been used to derive technical intelligence from a variety of textual database sources, most recently the published technical literature as exemplified by the Science Citation Index (SCI) and the Engineering Compendex (EC). Phrase frequency analysis (the occurrence frequency of multiword technical phrases) provides the pervasive technical themes of the topical databases of interest, and phrase proximity analysis provides the relationships among the pervasive technical themes. In the structured published literature databases, bibliometric analysis of the database records supplements the DT results by identifying the recent most prolific topical area authors; the journals that contain numerous topical area papers; the institutions that produce numerous topical area papers; the keywords specified most frequently by the topical area authors; the authors whose works are cited most frequently in the topical area papers; and the particular papers and journals cited most frequently in the topical area papers. This review paper summarizes: (1) the theory and background development of DT; (2) past published and unpublished literature study results; (3) present application activities; (4) potential expansion to new DT applications. In addition, application of DT to technology forecasting is addressed. (C) 2001 Elsevier Science Inc. All rights reserved.
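The two computational steps named in the abstract, phrase frequency analysis and phrase proximity analysis, can be illustrated with a minimal sketch. This is not the DT system itself: the record does not specify its algorithms, so plain n-gram counting and a fixed word window are assumed here as stand-ins, and the function names, the `window` parameter, and the sample text are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def phrase_frequencies(tokens, n=2):
    """Count every run of n adjacent words (assumed stand-in for phrase frequency)."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def phrase_proximities(tokens, theme, n=2, window=3):
    """Count n-word phrases that start within `window` words of each
    occurrence of `theme` (assumed stand-in for phrase proximity)."""
    theme_tokens = theme.lower().split()
    k = len(theme_tokens)
    nearby = Counter()
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] != theme_tokens:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens) - n, i + k - 1 + window)
        for j in range(lo, hi + 1):
            # Keep only phrases that do not overlap the theme occurrence.
            if j + n <= i or j >= i + k:
                nearby[" ".join(tokens[j:j + n])] += 1
    return nearby


# Hypothetical sample text, used only to exercise the two functions.
sample = (
    "Text mining supports information retrieval. "
    "Text mining also supports information extraction. "
    "Bibliometrics complements text mining."
)
tokens = tokenize(sample)
freq = phrase_frequencies(tokens)
nearby = phrase_proximities(tokens, "text mining")

print(freq.most_common(3))
print(nearby.most_common(3))
```

In this sketch the frequency table surfaces the recurring phrases (the "pervasive themes" of the abstract), while the proximity table shows which other phrases tend to sit close to a chosen theme. Interpreting either table remains the job of the human analyst, as the abstract stresses.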
Pages: 223-253
Page count: 31