11 entries in total
[1]
Efficient Similarity Joins for Near-Duplicate Detection[J]. ACM Transactions on Database Systems. 2011, 36 (3)
[4]
SEPIA: Estimating selectivities of approximate string predicates in large databases[J]. Liang Jin, Chen Li, Rares Vernica. The VLDB Journal. 2008 (5)
[7]
Record linkage: Similarity measures and algorithms. Koudas N, Sarawagi S, Srivastava D. Proc. of the ACM SIGMOD Int’l Conf. on Management of Data. 2006
[8]
Using q-grams in a DBMS for Approximate String Processing. Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Lauri Pietarinen, Divesh Srivastava. IEEE Data Engineering Bulletin. 2001
[9]
Finding similar files in a large file system. Manber U. 1994 Winter USENIX Technical Conference. 1994
[10]
SpotSigs: Robust and Efficient Near Duplicate Detection in Large Web Collections. Theobald M, Siddharth J, Paepcke A. Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2008