
Async methods #141

Closed
wants to merge 23 commits
Conversation

dduponchel
Collaborator

This pull request adds a lot of things. I planned to create it alongside streaming methods (to properly close #121) but it was taking too long (and rebasing these commits was becoming painful).

The highlights are:

  • rework ZipObject and CompressedObject
  • use pako's utf8 implementation (better and faster)
  • use pako's crc32 implementation (faster)
  • add asynchronous methods

The async methods won't fix #121: they remove the time limit but not the memory limit. The whole data is still loaded in memory.

There are six new async methods: asTextAsync, asBinaryAsync, asArrayBufferAsync, asUint8ArrayAsync, asNodeBufferAsync and JSZip#generateAsync.

zip.file("long_text.txt").asTextAsync(function (err, content) {
  if (err) {
    // handle error
    return;
  }
  // content contains the text
});

zip.generateAsync(function (err, content) {
  if (err) {
    // handle error
    return;
  }
  // content contains the zip file
});

I will update this pull request after #140 since its outcome will change my implementation.

edit: added pako's crc32 implementation.

These changes shouldn't change the end result much, but should help a little.

Optimize the call to file(name): instead of using filter (in O(n)), fetch the entry directly (in O(1)).
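The lookup optimization described in that commit can be sketched as follows. This is a minimal illustration with hypothetical names, not JSZip's actual internals: entries live in a plain object keyed by path, so file(name) can become a direct property access instead of a scan over every entry.

```javascript
// O(n): scan every entry to find a matching name (the old filter-based approach).
function fileByFilter(files, name) {
  var match = null;
  Object.keys(files).forEach(function (key) {
    if (key === name) {
      match = files[key];
    }
  });
  return match;
}

// O(1): fetch the entry directly by key (the optimized approach).
function fileByKey(files, name) {
  return files[name] || null;
}
```

Both return the same entry; only the cost differs as the number of entries grows.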
The index went too far, causing the optimizer to drop the compiled code.

Because of the try/catch, the JIT compiler can't do its magic. This patch moves some of the hot code into separate functions: those functions can be compiled, while the try/catch remains in the main loop.
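The try/catch pattern described above can be sketched like this (function names and the per-item work are illustrative, not the actual JSZip code). Older JIT compilers, such as pre-TurboFan V8, would not optimize a function containing try/catch, so the hot per-item work is hoisted into its own function while the try/catch stays in the outer loop.

```javascript
// Hot path: no try/catch here, so the JIT can compile and optimize it.
function processChunk(chunk) {
  var sum = 0;
  for (var i = 0; i < chunk.length; i++) {
    sum += chunk[i];
  }
  return sum;
}

// Cold path: the try/catch stays in the main loop, where it does not
// prevent optimization of the hot function above.
function processAll(chunks) {
  var results = [];
  for (var i = 0; i < chunks.length; i++) {
    try {
      results.push(processChunk(chunks[i]));
    } catch (e) {
      results.push(null);
    }
  }
  return results;
}
```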
When loading a zip file with a unicode path/comment, we were combining two costly operations: reading the data as a (binary) string and then converting the binary string to a real string. This patch skips the first string conversion.
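The idea can be sketched as decoding utf-8 directly from the raw bytes, with no intermediate binary string. Here TextDecoder stands in for pako's utf8 helper, and the function name is illustrative:

```javascript
// Decode a utf-8 path/comment straight from the entry's bytes,
// skipping the bytes -> binary string -> string round trip.
function decodePath(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}
```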
The previous code used a closure to keep a reference to the compressed content and pointers to the beginning/end of the file in this content. We then had some complicated code to extract the content and handle the corner cases.

This new version adds two new ("private", with a _) methods to ZipObject instances: _compress and _decompress. A ZipObject instance will have two states: compressed (with a CompressedObject) and decompressed (with a string, uint8array, etc.). With this patch, there is no more closure: for each file we create a sub{string,array} for the data and then create a CompressedObject.
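The two-state design can be sketched with a toy object. The real code works with a CompressedObject and an actual compression backend; here "compression" is a stand-in wrapper just to show how the instance switches between the two states:

```javascript
// Toy two-state object: holds either decompressed data or a
// compressed stand-in, never both (names mirror the PR's _compress
// and _decompress but the internals are illustrative).
function ZipObjectSketch(data) {
  this._data = data;       // decompressed state (a string here)
  this._compressed = null; // compressed state, when present
}

ZipObjectSketch.prototype._compress = function () {
  if (this._compressed === null) {
    this._compressed = { payload: this._data }; // stand-in CompressedObject
    this._data = null;                          // drop the decompressed form
  }
  return this._compressed;
};

ZipObjectSketch.prototype._decompress = function () {
  if (this._data === null) {
    this._data = this._compressed.payload; // restore the decompressed form
    this._compressed = null;               // drop the compressed form
  }
  return this._data;
};
```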
The transformation uint8array -> string (to check against the signature) can be costly if repeated for a lot of entries. This commit uses the reader to check the signature, speeding things up with a Uint8ArrayReader.
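The gain comes from comparing the signature byte-by-byte instead of converting the whole Uint8Array to a binary string first. A minimal sketch (the helper name is illustrative, not JSZip's reader API):

```javascript
// Local file header signature "PK\x03\x04" as raw bytes.
var LOCAL_FILE_HEADER = [0x50, 0x4b, 0x03, 0x04];

// Compare the signature directly against the bytes at the given offset,
// with no intermediate string conversion.
function hasSignature(bytes, offset, signature) {
  for (var i = 0; i < signature.length; i++) {
    if (bytes[offset + i] !== signature[i]) {
      return false;
    }
  }
  return true;
}
```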

TODO: fix the utf8 implementation to make this test pass :)
The separation between the generation of the CompressedObjects and the actual use of them will help for the next commit, which adds generateAsync().

The "content is empty" check can be done at insertion of the content instead of when compressing it.

These should help with performance issues.
@nathan-muir

As an alternative, I've been working on making it a little more friendly to the UI thread by using Web Workers, and Promises. Results/WIP are at https://github.com/nathan-muir/jszip/tree/inflate-worker

dduponchel mentioned this pull request on Jan 8, 2015
@Stuk
Owner

Stuk commented Feb 17, 2015

Superseded by #195

Stuk closed this on Feb 17, 2015

3 participants