Olera: Semisupervised web-data extraction with visual support

Cited by: 43
Authors
Chang, CH [1]
Kuo, SC [1]
Affiliations
[1] Natl Cent Univ, Dept Comp Sci & Informat Engn, Chungli, Taiwan
DOI
10.1109/MIS.2004.71
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
OLERA is a semisupervised information-extraction system that produces extraction rules from semistructured Web documents without requiring detailed annotation of the training documents. It performs well on program-generated Web pages, needing only a few training pages and limited user intervention.
Pages: 56-64
Page count: 9