0.2.0 • Published 8 years ago

chunkchunk v0.2.0

License
ISC

Variable File Chunker

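Uses the BuzHash library to split files into feature-determined chunks: chunk boundaries are chosen from the file's content rather than at fixed offsets, so identical regions tend to produce identical chunks. Good for deduplication.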
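To illustrate the general idea of feature-determined chunking, here is a minimal sketch of a BuzHash-style rolling hash that cuts a buffer wherever the low bits of the hash are zero. It is not the BuzHash library's API and not ChunkChunk's actual implementation; `chunkBuffer`, `TABLE`, `WINDOW`, and the `mask` option are hypothetical names chosen for this example, and the lookup table is generated with a simple xorshift purely for demonstration.

```javascript
const { createHash } = require('crypto');

// Hypothetical parameters for this sketch (not taken from chunkchunk).
const WINDOW = 48; // number of trailing bytes the rolling hash covers

// Deterministic 256-entry table of pseudo-random 32-bit values (xorshift32).
const TABLE = new Uint32Array(256);
let seed = 0x9e3779b9;
for (let i = 0; i < 256; i++) {
  seed ^= seed << 13;
  seed ^= seed >>> 17;
  seed ^= seed << 5;
  TABLE[i] = seed >>> 0;
}

// Rotate a 32-bit value left by n bits (0 <= n < 32).
const rotl = (x, n) => ((x << n) | (x >>> (32 - n))) >>> 0;

// Split `data` into chunks of at most `max` bytes. A chunk ends early when it
// is at least `min` bytes long and the rolling hash matches the mask.
function chunkBuffer(data, { max = 40000, min = max / 2, mask = 0x1fff } = {}) {
  const chunks = [];
  let start = 0;
  let hash = 0;
  for (let i = 0; i < data.length; i++) {
    const len = i - start + 1;
    // Roll the hash forward: rotate, mix in the new byte...
    hash = rotl(hash, 1) ^ TABLE[data[i]];
    // ...and remove the byte that just left the window.
    if (len > WINDOW) hash ^= rotl(TABLE[data[i - WINDOW]], WINDOW % 32);
    hash >>>= 0;

    const atFeature = len >= min && (hash & mask) === 0;
    if (atFeature || len >= max || i === data.length - 1) {
      const buffer = data.subarray(start, i + 1);
      const digest = createHash('sha256').update(buffer).digest('base64');
      chunks.push({ hash: digest, buffer });
      start = i + 1;
      hash = 0; // restart the window for the next chunk
    }
  }
  return chunks;
}
```

Because the cut points in this sketch depend only on the bytes inside the window, an edit in one part of a file typically changes only the chunks near it, which is what makes this style of chunking useful for deduplication.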

const ChunkChunk = require('chunkchunk'),
      fs = require('fs'),
      file = fs.openSync('my/file', 'r'); // file descriptor, opened read-only

const chunkee = new ChunkChunk(file, { max: 40000 });
// create variable chunks of `my/file` up to a maximum size
// of 40k, with the default `min` chunk size of 50% of that.
const chunk = chunkee.nextChunk();
// { hash: 'sha256...', buffer: <Buffer ...> }
const chunks = chunkee.toEnd();
// [ {hash:.., buffer:<..>}, {...}]

fs.closeSync(file); // synchronous close, matching openSync
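As a rough sketch of how the `{ hash, buffer }` objects could be used for deduplication, the helpers below store each distinct chunk once, keyed by its hash, and keep an ordered list of hashes to rebuild the original data. `dedupe` and `restore` are hypothetical names for this example and are not part of chunkchunk.

```javascript
// Hypothetical helper: store each distinct chunk once, keyed by its hash.
// `chunks` is an array of { hash, buffer } objects, as shown above.
function dedupe(chunks, store = new Map()) {
  const recipe = []; // ordered hashes needed to rebuild the input
  for (const { hash, buffer } of chunks) {
    if (!store.has(hash)) store.set(hash, buffer);
    recipe.push(hash);
  }
  return { store, recipe };
}

// Hypothetical helper: rebuild the original bytes from a recipe and a store.
function restore(recipe, store) {
  return Buffer.concat(recipe.map((hash) => store.get(hash)));
}
```

Passing the same `store` to `dedupe` for several files would let chunks shared between those files be kept only once.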

Example Runs

The following is an example run of npm test chunking a ~120 KB .jpg of Batman, with a max chunk setting of 40k and a min of 20k. The first column is the number of bytes in each chunk; the second column is the base64-encoded SHA-256 of that chunk. The final line is the total number of bytes across all chunks.

[master] ~ npm test

> chunkchunk@0.2.0 test C:\code\experiments\chunkchunk
> node test/test

4 chunks in 0s 30.234033ms
39005   QQitkmhKkmWfrX+p59Nk49kctS1TrMHhpnFka08Bya4=
34113   aOsmGJhJUeWNPPVbsyfqMyRx1F28rwQnvWiwwN/qVDo=
21484   FX98NrA8OlKWzSGJpXIAslixSRU4QJBPBEVEkcc9EXA=
23960   BY970o+31e5szl0TIGuDtfnPbH41tzWq2WYcK0Pn+1c=
118562

TODO

  • Make ChunkChunk take a string instead of a file descriptor.
  • This library screams to be made a Transform Stream.
  • Add a config option for feature 'uniqueness'.