ChemDig: new approaches to chemically significant indexing and searching of distributed web collections

Cited by: 6
Authors
Gkoutos, GV [1]
Leach, C [1]
Rzepa, HS [1]
Affiliation
[1] Univ London Imperial Coll Sci Technol & Med, Dept Chem, London SW7 2AY, England
DOI
10.1039/b110693g
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
We describe an extension of the ht://Dig robot-based internet indexing and search engine to include the retrieval of information contained in a variety of molecular data formats as defined by chemical MIME types. This is achieved by invoking chemical meta-parsers, software agents designed to provide key meta-data information about the content of the external chemical files. This meta-data can include, for example, derived molecular formula, molecular mass and atom connection table (SMILES) where the content of the file allows this, and other types of content such as author information and supplied keywords. These terms can be automatically added to the searchable terms, and the search outputs can be automatically linked via database requests to other external databases containing chemical information. We report our experience in applying this robot to indexing five different remote sites. We discuss different mechanisms for storing and searching for the chemical content, ranging from simple keyword-based searches qualified by chemically significant boolean terms, through chemical similarity searches, to our experiments in creating more highly structured content that expresses the chemical data using XML-based markup, with XSLT transforms used for filtering, searching and rendering the information.
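The meta-parser idea in the abstract (deriving a molecular formula and molecular mass from a chemical file so that they can be indexed as search terms) can be illustrated with a minimal sketch. This is not the ChemDig or ht://Dig code: the choice of the XYZ coordinate format, the function names `parse_xyz_symbols`, `hill_formula` and `chemical_metadata`, and the small atomic mass table are all assumptions made for illustration only.

```python
from collections import Counter

# Assumed, deliberately small table of standard atomic masses (g/mol).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007,
               "O": 15.999, "S": 32.06, "Cl": 35.45}

# Hypothetical sample input: a water molecule in XYZ format.
WATER_XYZ = """3
water
O 0.000  0.000  0.117
H 0.000  0.757 -0.471
H 0.000 -0.757 -0.471
"""


def parse_xyz_symbols(text):
    """Return the element symbols listed in an XYZ-format string."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].strip())
    # Line 2 is a free-text comment; the atom records follow it.
    return [line.split()[0] for line in lines[2:2 + n_atoms]]


def hill_formula(counts):
    """Build a formula in Hill order: C, then H, then the rest A-Z.

    Without carbon, every element is listed alphabetically.
    """
    if "C" in counts:
        rest = sorted(s for s in counts if s not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        order = sorted(counts)
    return "".join(s + (str(counts[s]) if counts[s] > 1 else "")
                   for s in order)


def chemical_metadata(text):
    """Derive indexable meta-data (formula and mass) from an XYZ string."""
    counts = Counter(parse_xyz_symbols(text))
    mass = sum(ATOMIC_MASS[s] * n for s, n in counts.items())
    return {"formula": hill_formula(counts), "mass": round(mass, 3)}


if __name__ == "__main__":
    print(chemical_metadata(WATER_XYZ))
```

In a setting like the one the abstract describes, a dictionary of this kind would be turned into additional searchable terms for the indexed document; how ChemDig actually passes such terms to the indexer is not stated here.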
Pages: 656-666
Number of pages: 11