ChemDig: new approaches to chemically significant indexing and searching of distributed web collections

Cited by: 6
Authors
Gkoutos, GV [1]
Leach, C [1]
Rzepa, HS [1]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Chem, London SW7 2AY, England
DOI
10.1039/b110693g
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
We describe an extension of the ht://Dig robot-based internet indexing and search engine to include the retrieval of information held in a variety of molecular data formats as defined by chemical MIME types. This is achieved by invoking chemical meta-parsers, software agents designed to provide key meta-data about the content of the external chemical files. This meta-data can include, for example, the derived molecular formula, molecular mass and atom connection table (SMILES) where the content of the file allows this, and other types of content such as author information and supplied keywords. These terms can be automatically added to the searchable terms, and the search outputs can be automatically linked via database requests to other external databases containing chemical information. We report our experience in applying this robot to indexing five different remote sites. We discuss different mechanisms for storing and searching for the chemical content, ranging from simple keyword-based searches qualified by chemically significant Boolean terms, through chemical similarity searches, to our experiments in creating more highly structured content that expresses the chemical data using XML-based markup, with XSLT transforms used for filtering, searching and rendering the information.
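
The idea of a chemical meta-parser can be illustrated with a minimal sketch. It is not the ChemDig implementation and does not reproduce any ht://Dig interface. It assumes the simplest case, a file in XYZ format (chemical MIME type chemical/x-xyz) in which every atom is listed explicitly, and derives a Hill-order molecular formula and an approximate molecular mass that could be added to the searchable terms. The function names `hill_formula` and `parse_xyz_metadata` and the small `ATOMIC_MASS` table are hypothetical and chosen only for this illustration.

```python
from collections import Counter

# Approximate standard atomic masses for a few common elements.
# Assumption: an illustrative subset, not a complete periodic table.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007,
               "O": 15.999, "S": 32.06, "Cl": 35.45}


def hill_formula(counts):
    """Build a Hill-order formula: C, then H, then the rest alphabetically.

    If the molecule contains no carbon, all elements are alphabetical.
    """
    if "C" in counts:
        order = ["C"]
        if "H" in counts:
            order.append("H")
        order += sorted(s for s in counts if s not in ("C", "H"))
    else:
        order = sorted(counts)
    return "".join(s + (str(counts[s]) if counts[s] > 1 else "") for s in order)


def parse_xyz_metadata(text):
    """Derive index terms from the text of an XYZ file (hypothetical helper).

    XYZ layout: line 1 is the atom count, line 2 a free-text comment,
    then one line per atom: element symbol followed by x, y, z.
    """
    lines = text.strip().splitlines()
    atom_count = int(lines[0].split()[0])
    title = lines[1].strip() if len(lines) > 1 else ""
    atom_lines = lines[2:2 + atom_count]
    symbols = [line.split()[0].capitalize() for line in atom_lines]
    counts = Counter(symbols)

    # Report a mass only if every element is in the illustrative table.
    if all(s in ATOMIC_MASS for s in counts):
        mass = round(sum(ATOMIC_MASS[s] * n for s, n in counts.items()), 3)
    else:
        mass = None

    return {"title": title, "formula": hill_formula(counts), "mass": mass}


if __name__ == "__main__":
    example = """6
methanol
C 0.000 0.000 0.000
O 1.400 0.000 0.000
H -0.360 1.030 0.000
H -0.360 -0.510 0.890
H -0.360 -0.510 -0.890
H 1.720 0.900 0.000
"""
    print(parse_xyz_metadata(example))
```

Formats that leave hydrogens implicit, or that carry bond tables from which SMILES could be generated, would need a fuller chemistry toolkit than this sketch provides.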
Pages: 656-666
Number of pages: 11