Two biomedical sublanguages: a description based on the theories of Zellig Harris

Cited by: 120
Authors
Friedman, C
Kra, P
Rzhetsky, A
Affiliations
[1] Columbia Univ, Dept Med Informat, New York, NY 10032 USA
[2] CUNY Queens Coll, Dept Comp Sci, Flushing, NY 11367 USA
[3] Columbia Univ, Genome Ctr, New York, NY 10032 USA
DOI
10.1016/S1532-0464(03)00012-1
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Natural language processing (NLP) systems have been developed to provide access to the tremendous body of data and knowledge that is available in the biomedical domain in the form of natural language text. These NLP systems are valuable because they can encode and amass the information in the text so that it can be used by other automated processes to improve patient care and our understanding of disease processes and treatments. Zellig Harris proposed a theory of sublanguage that laid the foundation for natural language processing in specialized domains. He hypothesized that the informational content and structure form a specialized language that can be delineated in the form of a sublanguage grammar. The grammar can then be used by a language processor to capture and encode the salient information and relations in text. In this paper, we briefly summarize his language and sublanguage theories. In addition, we summarize our prior research, which is associated with the sublanguage grammars we developed for two different biomedical domains. These grammars illustrate how Harris' theories provide a basis for the development of language processing systems in the biomedical domain. The two domains and their associated sublanguages discussed are: the clinical domain, where the text consists of patient reports, and the biomolecular domain, where the text consists of complete journal articles. (C) 2003 Elsevier Science (USA). All rights reserved.
Pages: 222-235
Page count: 14