Querying the World Wide Web

Cited by: 23
Authors
Mendelzon A.O. [1]
Mihaila G.A. [1]
Milo T. [2]
Affiliations
[1] Department of Computer Science and CSRI, University of Toronto, Toronto
[2] Computer Science Department, Tel Aviv University, Tel Aviv
Keywords
Index server; Web SQL; World Wide Web
DOI
10.1007/s007990050004
Abstract
The World Wide Web is a large, heterogeneous, distributed collection of documents connected by hypertext links. The most common technology currently used for searching the Web depends on sending information retrieval requests to "index servers" that index as many documents as they can find by navigating the network. One problem with this is that users must be aware of the various index servers (over a dozen of them are currently deployed on the Web), of their strengths and weaknesses, and of the peculiarities of their query interfaces. A more serious problem is that these queries cannot exploit the structure and topology of the document network. In this paper we propose a query language, WebSQL, that takes advantage of multiple index servers without requiring users to know about them, and that integrates textual retrieval with structure and topology-based queries. We give a formal semantics for WebSQL using a calculus based on a novel "virtual graph" model of a document network. We propose a new theory of query cost based on the idea of "query locality," that is, how much of the network must be visited to answer a particular query. We give an algorithm for characterizing WebSQL queries with respect to query locality. Finally, we describe a prototype implementation of WebSQL written in Java. © Springer-Verlag 1997.
Pages: 54-67
Page count: 13