
Async methods #141

Closed
wants to merge 23 commits
Conversation

dduponchel
Collaborator

This pull request adds a lot of things. I planned to create it alongside streaming methods (to properly close #121) but it was taking too long (and rebasing these commits was becoming painful).

The highlights are:

  • rework ZipObject and CompressedObject
  • use pako's utf8 implementation (better and faster)
  • use pako's crc32 implementation (faster)
  • add asynchronous methods

The async methods won't fix #121: they remove the time limit but not the memory limit. The whole data is still loaded in memory.

There are six new async methods: asTextAsync, asBinaryAsync, asArrayBufferAsync, asUint8ArrayAsync, asNodeBufferAsync and JSZip#generateAsync.

zip.file("long_text.txt").asTextAsync(function (err, content) {
  if (err) {
    // handle error
    return;
  }
  // content contains the text
});

zip.generateAsync(function (err, content) {
  if (err) {
    // handle error
    return;
  }
  // content contains the zip file
});

I will update this pull request after #140 since its outcome will change my implementation.

edit: added pako's crc32 implementation.

These changes shouldn't change the end result much, but should help a little.

Optimize the call to file(name): instead of using filter (in O(n)), fetch the entry directly (in O(1)).
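The lookup optimization described in that commit can be sketched as follows. This is a minimal illustration with hypothetical names, not JSZip's actual internals: entries live in a plain object keyed by path, so file(name) can become a direct property access instead of a scan over every entry.

```javascript
// O(n): scan every entry to find a matching name (the old filter-based approach).
function fileByFilter(files, name) {
  var match = null;
  Object.keys(files).forEach(function (key) {
    if (key === name) {
      match = files[key];
    }
  });
  return match;
}

// O(1): fetch the entry directly by key (the optimized approach).
function fileByKey(files, name) {
  return files[name] || null;
}
```

Both return the same entry; only the cost differs as the number of entries grows.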
The index went too far, causing the optimizer to drop the compiled code.

Because of the try/catch, the JIT compiler can't do its magic. This patch moves some of the hot code into separate functions: those functions can be compiled, while the try/catch remains in the main loop.
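The try/catch pattern described above can be sketched like this (function names and the per-item work are illustrative, not the actual JSZip code). Older JIT compilers, such as pre-TurboFan V8, would not optimize a function containing try/catch, so the hot per-item work is hoisted into its own function while the try/catch stays in the outer loop.

```javascript
// Hot path: no try/catch here, so the JIT can compile and optimize it.
function processChunk(chunk) {
  var sum = 0;
  for (var i = 0; i < chunk.length; i++) {
    sum += chunk[i];
  }
  return sum;
}

// Cold path: the try/catch stays in the main loop, where it does not
// prevent optimization of the hot function above.
function processAll(chunks) {
  var results = [];
  for (var i = 0; i < chunks.length; i++) {
    try {
      results.push(processChunk(chunks[i]));
    } catch (e) {
      results.push(null);
    }
  }
  return results;
}
```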
When loading a zip file with a unicode path/comment, we were combining two costly operations: reading the data as a (binary) string and then converting the binary string to a real string. This patch skips the first string conversion.
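The idea can be sketched as decoding utf-8 directly from the raw bytes, with no intermediate binary string. Here TextDecoder stands in for pako's utf8 helper, and the function name is illustrative:

```javascript
// Decode a utf-8 path/comment straight from the entry's bytes,
// skipping the bytes -> binary string -> string round trip.
function decodePath(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}
```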
The previous code used a closure to keep a reference to the compressed content and pointers to the beginning/end of the file in this content. We then had some complicated code to extract the content and handle the corner cases.

This new version adds two new ("private", with a _) methods to ZipObject instances: _compress and _decompress. A ZipObject instance will have two states: compressed (with a CompressedObject) and decompressed (with a string, uint8array, etc.). With this patch, there is no more closure: for each file we create a sub{string,array} for the data and then create a CompressedObject.
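The two-state design can be sketched with a toy object. The real code works with a CompressedObject and an actual compression backend; here "compression" is a stand-in wrapper just to show how the instance switches between the two states:

```javascript
// Toy two-state object: holds either decompressed data or a
// compressed stand-in, never both (names mirror the PR's _compress
// and _decompress but the internals are illustrative).
function ZipObjectSketch(data) {
  this._data = data;       // decompressed state (a string here)
  this._compressed = null; // compressed state, when present
}

ZipObjectSketch.prototype._compress = function () {
  if (this._compressed === null) {
    this._compressed = { payload: this._data }; // stand-in CompressedObject
    this._data = null;                          // drop the decompressed form
  }
  return this._compressed;
};

ZipObjectSketch.prototype._decompress = function () {
  if (this._data === null) {
    this._data = this._compressed.payload; // restore the decompressed form
    this._compressed = null;               // drop the compressed form
  }
  return this._data;
};
```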
The transformation uint8array -> string (to check against the signature) can be costly if repeated for a lot of entries. This commit uses the reader to check the signature, speeding things up with a Uint8ArrayReader.
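The gain comes from comparing the signature byte-by-byte instead of converting the whole Uint8Array to a binary string first. A minimal sketch (the helper name is illustrative, not JSZip's reader API):

```javascript
// Local file header signature "PK\x03\x04" as raw bytes.
var LOCAL_FILE_HEADER = [0x50, 0x4b, 0x03, 0x04];

// Compare the signature directly against the bytes at the given offset,
// with no intermediate string conversion.
function hasSignature(bytes, offset, signature) {
  for (var i = 0; i < signature.length; i++) {
    if (bytes[offset + i] !== signature[i]) {
      return false;
    }
  }
  return true;
}
```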

TODO: fix the utf8 implementation to make this test pass :)
The separation between the generation of the CompressedObjects and the actual use of them will help for the next commit, which adds generateAsync().

The "content is empty" check can be done at insertion of the content instead of when compressing it.

These should help with performance issues.
@nathan-muir

As an alternative, I've been working on making it a little more friendly to the UI thread by using Web Workers, and Promises. Results/WIP are at https://github.com/nathan-muir/jszip/tree/inflate-worker

dduponchel mentioned this pull request on Jan 8, 2015
@Stuk
Owner

Stuk commented Feb 17, 2015

Superseded by #195

Stuk closed this on Feb 17, 2015

3 participants