Date
2007
Dewey
Organisation des données
Sujet
databases; algebraic signature; Search algorithm
Conference name
33rd International Conference on Very Large Data Bases VLDB 2007
Conference date
09-2007
Conference city
Vienne
Conference country
Autriche
Book title
Proceedings of the 33rd International Conference on Very Large Data Bases, University of Vienna, Austria, September 23-27, 2007
Author
Koch, Christoph; Gehrke, Johannes; Garofalakis, Minos; Srivastava, Divesh; Aberer, Karl; Deshpande, Anand; Forescu, Daniela; Chan, Chee Yong; Ganti, Venkatesh; Kanne, Karl-Christian; Klas, Wolfgang; Neuhold, Erich J.
Publisher
AMC
Year
2007
ISBN
978-1-59593-649-3
Author
Litwin, Witold
Mokadem, Riad
Rigaux, Philippe
Schwartz, Thomas
Type
Communication / Conférence
Item number of pages
207-218
Abstract (EN)
We propose a novel string search algorithm for data stored
once and read many times. Our search method combines
the sublinear traversal of the record (as in Boyer Moore or
Knuth-Morris-Pratt) with the agglomeration of parts of the
record and search pattern into a single character – the algebraic signature – in the manner of Karp-Rabin. Our experiments show that our algorithm is up to seventy times faster
for DNA data, up to eleven times faster for ASCII, and up to
a six times faster for XML documents compared with an im-
plementation of Boyer-Moore. To obtain this speed-up, we
store records in encoded form, where each original character
is replaced with an algebraic signature.
Our method applies to records stored in databases in general and to distributed implementations of a Database As
Service (DAS) in particular. Clients send records for insertion and search patterns already in encoded form and servers
never operate on records in clear text. No one at a node can
involuntarily discover the content of the stored data.