Combining Chunk Boundary and Chunk Signature Calculations for Deduplication

Litwin, Witold; Long, Darrell; Schwarz, Thomas (2012), Combining Chunk Boundary and Chunk Signature Calculations for Deduplication, Revista IEEE Latin America, 10, 1, p. 1305-1311. 10.1109/TLA.2012.6142477

Type
Article accepted for publication or published
Date
2012
Journal name
Revista IEEE Latin America
Volume
10
Number
1
Publisher
IEEE
Pages
1305-1311
Publication identifier
10.1109/TLA.2012.6142477
Author(s)
Litwin, Witold
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Long, Darrell

Schwarz, Thomas
Abstract (EN)
Many modern, large-scale storage solutions offer deduplication, which can achieve impressive compression rates for many workloads, especially backups. When accepting new data for storage, deduplication checks whether parts of the data are already stored. If so, the system does not store those parts of the new data but replaces each with a reference to the location where the data already resides. A typical deduplication system breaks data into chunks, hashes each chunk, and uses an index to see whether the chunk has already been stored. Variable chunk systems offer better compression, but process the data byte-for-byte twice: first to calculate the chunk boundaries and then to calculate the hash. This limits the ingress bandwidth of a system. We propose a method that reuses the chunk boundary calculations to strengthen the collision resistance of the hash. This allows either a faster hashing method with fewer bytes or a much larger storage system (256 times larger by adding two bytes) with the same high assurance against chunk collisions and the resulting data loss.
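
The conventional two-pass pipeline that the abstract describes can be sketched as follows. This is only an illustration of the baseline, not the authors' combined method: the polynomial rolling hash, SHA-1, and every constant and function name (`WINDOW`, `MASK`, `MIN_CHUNK`, `chunk_boundaries`, `deduplicate`, `collision_bound`) are assumptions chosen for the example. The last function uses the standard birthday approximation, which is one way to read the "256 times by adding two bytes" figure: 16 extra hash bits offset a 2^8-fold increase in the number of chunks.

```python
import hashlib

# All constants below are illustrative assumptions, not values from the paper.
WINDOW = 16            # bytes covered by the rolling window
BASE = 257             # base of the polynomial rolling hash
MOD = (1 << 31) - 1    # modulus of the rolling hash
MASK = (1 << 6) - 1    # cut when the low 6 bits of the rolling hash are zero
MIN_CHUNK = 32         # smallest chunk the sketch will emit


def chunk_boundaries(data: bytes) -> list[int]:
    """Pass 1: scan every byte and return the end offset of each chunk."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    h = 0
    start = 0
    cuts = []
    for i, b in enumerate(data):
        if i >= WINDOW:
            # Drop the oldest byte so h covers only the last WINDOW bytes.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        if i + 1 - start >= MIN_CHUNK and (h & MASK) == 0:
            cuts.append(i + 1)
            start = i + 1
    if start < len(data):
        cuts.append(len(data))  # trailing partial chunk
    return cuts


def deduplicate(data: bytes, index: dict[bytes, bytes]) -> list[bytes]:
    """Pass 2: hash every chunk (a second scan of each byte), store unseen
    chunks in `index`, and return the list of chunk references."""
    refs = []
    start = 0
    for end in chunk_boundaries(data):
        chunk = data[start:end]
        digest = hashlib.sha1(chunk).digest()
        index.setdefault(digest, chunk)  # keep only the first copy
        refs.append(digest)
        start = end
    return refs


def collision_bound(num_chunks: int, hash_bits: int) -> float:
    """Birthday approximation n^2 / 2^(bits+1) of the collision probability."""
    return num_chunks ** 2 / 2 ** (hash_bits + 1)
```

Storing the same data twice leaves the index unchanged on the second call, because every chunk digest is already present. Under the approximation above, `collision_bound(256 * n, bits + 16)` equals `collision_bound(n, bits)`.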
Subjects / Keywords
Algebraic Signatures; Deduplication

Related items

Showing items related by title and author.

  • AS-Index: A Structure For String Search Using n-grams and Algebraic Signatures
    Constantin, Camelia; du Mouza, Cedric; Litwin, Witold; Rigaux, Philippe; Schwarz, Thomas (2016) Article accepted for publication or published
  • AS-Index: A Structure For String Search Using n-grams and Algebraic Signatures
    du Mouza, Cedric; Litwin, Witold; Rigaux, Philippe; Schwarz, Thomas (2009) Conference paper
  • AS-Index: A Structure for String Search Using n-Grams and Algebraic Signatures
    Rigaux, Philippe; Litwin, Witold; du Mouza, Cédric; Schwarz, Thomas (2009) Conference paper
  • Cumulative Algebraic Signatures for Fast String Search, Protection Against Incidental Viewing and Corruption of Data in an SDDS
    Litwin, Witold; Mokadem, Riad; Schwarz, Thomas (2007) Conference paper
  • Improved deduplication through parallel binning
    Zhang, Zhike; Bhagwat, Deepavali; Litwin, Witold; Long, Darrell; Schwarz, Thomas (2012) Conference paper