Language resources for Hebrew

Cited by: 52
Authors
Itai, Alon [2 ]
Wintner, Shuly [1 ]
Affiliations
[1] Univ Haifa, Dept Comp Sci, IL-31905 Haifa, Israel
[2] Technion Israel Inst Technol, Dept Comp Sci, IL-32000 Haifa, Israel
Funding
Israel Science Foundation
Keywords
language resources; Hebrew; corpora; lexicon; morphological processing; WordNet
DOI
10.1007/s10579-007-9050-8
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
We describe a suite of standards, resources and tools for computational encoding and processing of Modern Hebrew texts. These include an array of XML schemas for representing linguistic resources; a variety of text corpora, raw, automatically processed and manually annotated; lexical databases, including a broad-coverage monolingual lexicon, a bilingual dictionary and a WordNet; and morphological processors which can analyze, generate and disambiguate Hebrew word forms. The resources are developed under centralized supervision, so that they are compatible with each other. They are freely available and many of them have already been used for several applications, both academic and industrial.
Pages: 75-98
Page count: 24