Taxonomy visualization in support of the semi-automatic validation and optimization of organizational schemas

Cited by: 5
Authors
Borner, Katy [1]
Hardy, Elisha [1]
Herr, Bruce [1]
Holloway, Todd [2]
Paley, W. Bradford
Affiliations
[1] Indiana Univ, SLIS, Wells Lib, Bloomington, IN 47405 USA
[2] Indiana Univ, Dept Comp Sci, Bloomington, IN 47405 USA
Funding
National Science Foundation (USA)
Keywords
patents; taxonomy; ontology; classification hierarchy; visualization
DOI
10.1016/j.joi.2007.03.002
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Never before in history has mankind produced and had access to so much data, information, knowledge, and expertise as today. To organize, access, and manage these valuable assets effectively, we use taxonomies, classification hierarchies, ontologies, controlled vocabularies, and other approaches. We create directory structures for our files. We use organizational hierarchies to structure our work environment. However, the design and continuous update of these organizational schemas with potentially thousands of class nodes organizing millions of entities is challenging for any human being. The taxonomy visualization and validation (TV) tool introduced in this paper supports the semi-automatic validation and optimization of organizational schemas such as file directories, classification hierarchies, taxonomies, or other structures imposed on a data set for organization, access, and naming. By showing the "goodness of fit" for a schema and the potentially millions of entities it organizes, the TV tool eases the identification and reclassification of misclassified information entities, the identification of classes that grow too large, the evaluation of the size and homogeneity of existing classes, the examination of the "well-formedness" of an organizational schema, and more. As a demonstration, the TV tool is applied to display and examine the United States Patent and Trademark Office patent classification, which organizes more than three million patents into about 160,000 distinct patent classes. The paper concludes with a discussion and an outlook to future work. (c) 2007 Elsevier Ltd. All rights reserved.
Pages: 214-225
Page count: 12