Incremental validation of XML documents

被引:52
作者
Balmin, A
Papakonstantinou, Y
Vianu, V
机构
[1] IBM Corp, Almaden Res Ctr, San Jose, CA 95120 USA
[2] Univ Calif San Diego, San Diego, CA 92103 USA
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2004年 / 29卷 / 04期
关键词
update; validation; XML; algorithms; experimentation;
D O I
10.1145/1042046.1042050
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
We investigate the incremental validation of XML documents with respect to DTDs, specialized DTDs, and XML Schemas, under updates consisting of element tag renamings, insertions, and deletions. DTDs are modeled as extended context-free grammars. "Specialized DTDs" allow the decoupling of element types from element tags. XML Schemas are abstracted as specialized DTDs with limitations on the type assignment. For DTDs and XML Schemas, we exhibit an O(m log n) incremental validation algorithm using an auxiliary structure of size O(n), where n is the size of the document and m the number of updates. The algorithm does not handle the incremental validation of XML Schema wrt renaming of internal nodes, which is handled by the specialized DTDs incremental validation algorithm. For specialized DTDs, we provide an O(m log(2) n) incremental algorithm, again using an auxiliary structure of size O(n). This is a significant improvement over brute-force re-validation from scratch. We exhibit a restricted class of DTDs called local that arise commonly in practice and for which incremental validation can be done in practically constant time by maintaining only a list of counters. We present implementations of both general incremental validation and local validation on an XML database built on top of a relational database. Our experimentation includes a study of the applicability of local validation in practice, results on the calibration of parameters of the auxiliary data structure, and results on the performance comparison between the general incremental validation technique, the local validation technique, and brute-force validation from scratch.
引用
收藏
页码:710 / 751
页数:42
相关论文
共 12 条
[1]   One-unambiguous regular languages [J].
Bruggemann-Klein, A ;
Wood, D .
INFORMATION AND COMPUTATION, 1998, 142 (02) :182-206
[2]  
CHOI B, 2002, P 5 INT WORKSH WEB D, P43
[3]  
CLUET S, 1998, P ACM SIGMOD INT C M, P177
[4]  
Dong G., 1995, Proceedings of the Fourteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. PODS 1995, P139, DOI 10.1145/212433.220204
[5]   AUGMENTING PARSERS TO SUPPORT INCREMENTALITY [J].
GHEZZI, C ;
MANDRIOLI, D .
JOURNAL OF THE ACM, 1980, 27 (03) :564-579
[6]   OPTIMAL INCREMENTAL PARSING [J].
LARCHEVEQUE, JM .
ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS, 1995, 17 (01) :1-15
[7]   COMPLEXITY MODELS FOR INCREMENTAL COMPUTATION [J].
MILTERSEN, PB ;
SUBRAMANIAN, S ;
VITTER, JS ;
TAMASSIA, R .
THEORETICAL COMPUTER SCIENCE, 1994, 130 (01) :203-236
[8]   INCREMENTAL RECURSIVE DESCENT PARSING [J].
MURCHING, AM ;
PRASAD, YV ;
SRIKANT, YN .
COMPUTER LANGUAGES, 1990, 15 (04) :193-204
[9]  
PAPAKONSTANTINO.Y, 2000, P 19 ACM SIGMOD SIGA, P35
[10]   Dyn-FO: A parallel, dynamic complexity class [J].
Patnaik, S ;
Immerman, N .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1997, 55 (02) :199-209