Extraction of information from the text of chemical patents. 1. Identification of specific chemical names

Cited by: 22
Authors
Kemp, N [1 ]
Lynch, M [1 ]
Affiliation
[1] Univ Sheffield, Dept Informat Studies, Sheffield S10 2TN, S Yorkshire, England
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 1998 / Vol. 38 / No. 04
DOI
10.1021/ci980324v
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Much attention has been paid to translating isolated chemical names into forms such as connection tables, but less effort has been expended in identifying substance names in running text to make them available for processing. The requirement for automatic name identification becomes a more urgent priority today, not the least in light of the inherent importance of patents and the increasing complexity of newly synthesized substances and, with these, the need for error-free processing of information from patent and other documents. The elaboration of a methodology for isolating substance names in the text of English-language patents is described here, using, in part, the SGML (Standard Generalized Markup Language) of the patent text as an aid to this process. Evaluation of the procedures, which are still at an early stage of development, demonstrates that even simple methods can achieve very high degrees of success.
Pages: 544-551
Page count: 8
Related papers
26 items in total
[1]   EXTRACTION OF CHEMICAL-REACTION INFORMATION FROM PRIMARY JOURNAL TEXT [J].
AI, CS ;
BLOWER, PE ;
LEDWITH, RH .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1990, 30 (02) :163-169
[2]   [Anonymous], 1990, SGML HDB
[3]   [Anonymous], P 16 C COMP LING
[4]   AUTOMATIC INTERPRETATION OF THE TEXTS OF CHEMICAL PATENT ABSTRACTS .2. PROCESSING AND RESULTS [J].
CHOWDHURY, GG ;
LYNCH, MF .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1992, 32 (05) :468-473
[5]   AUTOMATIC INTERPRETATION OF THE TEXTS OF CHEMICAL PATENT ABSTRACTS .1. LEXICAL ANALYSIS AND CATEGORIZATION [J].
CHOWDHURY, GG ;
LYNCH, MF .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1992, 32 (05) :463-467
[6]   COMPUTER TRANSLATION OF IUPAC SYSTEMATIC ORGANIC-CHEMICAL NOMENCLATURE .5. STEROID NOMENCLATURE [J].
COOKEFOX, DI ;
KIRBY, GH ;
LORD, MR ;
RAYNER, JD .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1990, 30 (02) :128-132
[7]   COMPUTER TRANSLATION OF IUPAC SYSTEMATIC ORGANIC CHEMICAL NOMENCLATURE .1. INTRODUCTION AND BACKGROUND TO A GRAMMAR-BASED APPROACH [J].
COOKEFOX, DI ;
KIRBY, GH ;
RAYNER, JD .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1989, 29 (02) :101-105
[8]   COMPUTER TRANSLATION OF IUPAC SYSTEMATIC ORGANIC CHEMICAL NOMENCLATURE .2. DEVELOPMENT OF A FORMAL GRAMMAR [J].
COOKEFOX, DI ;
KIRBY, GH ;
RAYNER, JD .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1989, 29 (02) :106-112
[9]   COMPUTER TRANSLATION OF IUPAC SYSTEMATIC ORGANIC CHEMICAL NOMENCLATURE .3. SYNTAX ANALYSIS AND SEMANTIC PROCESSING [J].
COOKEFOX, DI ;
KIRBY, GH ;
RAYNER, JD .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1989, 29 (02) :112-118
[10]   COMPUTER TRANSLATION OF IUPAC SYSTEMATIC ORGANIC-CHEMICAL NOMENCLATURE .4. CONCISE CONNECTION TABLES TO STRUCTURE DIAGRAMS [J].
COOKEFOX, DI ;
KIRBY, GH ;
LORD, MR ;
RAYNER, JD .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1990, 30 (02) :122-127