Performance improvement and potential tools to adopt #35
Did you clone the project and run it directly in Node.js? I want to confirm how you ran the whole thing. Can you share the source file here, or privately via email?
Yes, I copied the source code to run it. What is your email address? You can base64-encode it and reply to me.
I can confirm that this file will OOM during the unpacking process.
The following issue is probably also tangentially relevant, as parsing a large file may end up with files getting silently lost along the way:
Yes, they are relevant.
My suggestion here would be a rather major change, so it's not worth looking into too deeply unless it turns out there is no good/efficient way to manage this with the current AST parsers/etc; but it relates to something I stumbled across/was thinking about the other day. These links/resources probably aren't exhaustive, but I figured I would share them as a starting point in case this is a path worth looking into at some stage:
I've decided to suspend this request for a while. This feature will be implemented in the new web IDE instead of the current playground. In the meantime, you can use the CLI to run on a local machine.
@pionxzh Oo, a new web IDE sounds interesting. Are you able to tell us more about your plan/roadmap for it, and what you're hoping it will allow/enable?
I tried the sample with the new CLI. It took ~2GB and 60 seconds for the unpacker to pass (and it failed to unpack into multiple files 😞; I will check further!).
For unminify, it took 3.4 hours... to finish the whole process. Memory usage ranges from 100MB to 1GB to 2GB depending on the rule. I will improve the CLI to write files more frequently during the unminify process, and add a
With the CLI you may need to prompt the user to pass Node's V8 configuration
@0xdevalias I have created a package /packages/ide which provides a VSCode-like experience. I want to use it to replace the existing playground. I am still trying to figure out the idea haha.. Module mapping won't be exposed; instead, you can directly edit the filename in the file explorer. And two tabs will be provided for the compare view, which you can drag and organize in any form you like. The editor would be Monaco, which offers a better editing experience. Features like download, better unminify handling, error reporting... But I might drop the idea and simply put Monaco into the existing playground.
@StringKe We can add some hints in the CLI usage description. But I didn't hit an OOM during the whole process with your sample.
@pionxzh Oo.. nice! I created a new issue for this, and will add my 2c over there so as not to clutter this thread too much:
@pionxzh With the
I think you don't need additional work on the CLI to export the flamegraph? It should be the same as how people measure a normal Node.js project.
@pionxzh True true.. I think I got confused when you were talking about adding a
I asked ChatGPT, and it gave some good tool suggestions/how-tos, included in the expandable below:
Some notes from ChatGPT on profiling
tl;dr:
Running the previous tools on my sample webpack'd code (Ref):

⇒ cd ./unpacked/_next/static/chunks
⇒ npx 0x --open -- node ../../../../node_modules/@wakaru/unpacker/dist/cli.cjs 496.js -o ./496-v2
🔥 Profiling
Generated 51 modules from 496.js to 496-v2 (5,408.6ms)
🔥 Flamegraph generated in
file:///Users/REDACTED/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/62569.0x/flamegraph.html

⇒ npx 0x --open -- node ../../../../node_modules/@wakaru/unminify/dist/cli.cjs ./496-v2/*.js -o ./496-v2-unminified
🔥 Profiling
• Transforming 496-v2-unminified/496-v2/module-10604.js (1,681.9ms)
• Transforming 496-v2-unminified/496-v2/module-13282.js (650.8ms)
• Transforming 496-v2-unminified/496-v2/module-17915.js (1,994.3ms)
• Transforming 496-v2-unminified/496-v2/module-180.js (183ms)
• Transforming 496-v2-unminified/496-v2/module-19051.js (158ms)
• Transforming 496-v2-unminified/496-v2/module-21437.js (911.3ms)
• Transforming 496-v2-unminified/496-v2/module-2368.js (551.6ms)
• Transforming 496-v2-unminified/496-v2/module-24148.js (226.5ms)
• Transforming 496-v2-unminified/496-v2/module-25094.js (1,217ms)
• Transforming 496-v2-unminified/496-v2/module-25345.js (674.5ms)
• Transforming 496-v2-unminified/496-v2/module-25422.js (451.2ms)
• Transforming 496-v2-unminified/496-v2/module-30931.js (975.5ms)
• Transforming 496-v2-unminified/496-v2/module-31721.js (551.3ms)
• Transforming 496-v2-unminified/496-v2/module-32165.js (625.8ms)
• Transforming 496-v2-unminified/496-v2/module-32689.js (424.6ms)
• Transforming 496-v2-unminified/496-v2/module-36112.js (486.7ms)
• Transforming 496-v2-unminified/496-v2/module-36716.js (1,069.3ms)
• Transforming 496-v2-unminified/496-v2/module-37541.js (113.7ms)
• Transforming 496-v2-unminified/496-v2/module-38631.js (151.8ms)
• Transforming 496-v2-unminified/496-v2/module-44925.js (64ms)
• Transforming 496-v2-unminified/496-v2/module-45036.js (27,773.9ms)
• Transforming 496-v2-unminified/496-v2/module-46110.js (839.3ms)
[un-jsx] children from attribute and arguments are both present: jsx(g.Z, {
# ..snip..
• Transforming 496-v2-unminified/496-v2/module-48101.js (1,249.6ms)
[un-jsx] children from attribute and arguments are both present: jsx("div", { className: "text-yellow-500", children: e }, t)
[un-jsx] children from attribute and arguments are both present: jsx("div", { className: "text-red-500", children: e }, t)
• Transforming 496-v2-unminified/496-v2/module-49910.js (876.8ms)
• Transforming 496-v2-unminified/496-v2/module-5046.js (283.5ms)
• Transforming 496-v2-unminified/496-v2/module-56244.js (872.3ms)
• Transforming 496-v2-unminified/496-v2/module-57311.js (3,180.6ms)
• Transforming 496-v2-unminified/496-v2/module-57924.js (428.4ms)
• Transforming 496-v2-unminified/496-v2/module-61119.js (917.9ms)
• Transforming 496-v2-unminified/496-v2/module-62440.js (160.7ms)
[un-jsx] children from attribute and arguments are both present: jsx(w.Z, {
# ..snip..
[un-jsx] children from attribute and arguments are both present: jsx("div", {
# ..snip..
[un-jsx] children from attribute and arguments are both present: jsx(Et, { x: i, children: e }, t)
• Transforming 496-v2-unminified/496-v2/module-63390.js (5,776.7ms)
[un-jsx] children from attribute and arguments are both present: jsx(m.E.div, {
# ..snip..
[un-jsx] children from attribute and arguments are both present: jsxs(N.Z, {
# ..snip..
[un-jsx] children from attribute and arguments are both present: jsx(N.Z, {
# ..snip..
• Transforming 496-v2-unminified/496-v2/module-63727.js (4,760.3ms)
• Transforming 496-v2-unminified/496-v2/module-66523.js (447.6ms)
• Transforming 496-v2-unminified/496-v2/module-69403.js (309.9ms)
• Transforming 496-v2-unminified/496-v2/module-697.js (266.3ms)
• Transforming 496-v2-unminified/496-v2/module-74437.js (263.9ms)
• Transforming 496-v2-unminified/496-v2/module-75179.js (274.9ms)
• Transforming 496-v2-unminified/496-v2/module-75515.js (181.2ms)
• Transforming 496-v2-unminified/496-v2/module-75527.js (4,662ms)
• Transforming 496-v2-unminified/496-v2/module-76559.js (382.7ms)
• Transforming 496-v2-unminified/496-v2/module-77442.js (367.2ms)
• Transforming 496-v2-unminified/496-v2/module-85449.js (322.9ms)
• Transforming 496-v2-unminified/496-v2/module-86433.js (529.2ms)
• Transforming 496-v2-unminified/496-v2/module-86526.js (97.1ms)
• Transforming 496-v2-unminified/496-v2/module-86573.js (1,461.6ms)
• Transforming 496-v2-unminified/496-v2/module-870.js (268ms)
• Transforming 496-v2-unminified/496-v2/module-87105.js (6,067.3ms)
• Transforming 496-v2-unminified/496-v2/module-87316.js (194.1ms)
• Transforming 496-v2-unminified/496-v2/module-90076.js (1,509.4ms)
• Transforming 496-v2-unminified/496-v2/module-94860.js (621ms)
• Transforming 496-v2-unminified/496-v2/module-97732.js (1,447.6ms)
🔥 Flamegraph generated in
file:///Users/REDACTED/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/62934.0x/flamegraph.html
⇒ npx clinic doctor -- node ../../../../node_modules/@wakaru/unpacker/dist/cli.cjs 496.js -o ./496-v2
To generate the report press: Ctrl + C
Generated 51 modules from 496.js to 496-v2 (4,701.4ms)
Analysing data
Generated HTML file is file:///Users/REDACTED/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/.clinic/65647.clinic-doctor.html

⇒ npx clinic doctor -- node ../../../../node_modules/@wakaru/unminify/dist/cli.cjs ./496-v2/*.js -o ./496-v2-unminified
To generate the report press: Ctrl + C
• Transforming 496-v2-unminified/496-v2/module-10604.js (891.1ms)
# ..snip..
Analysing data
Generated HTML file is file:///Users/REDACTED/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/.clinic/65956.clinic-doctor.html

⇒ npx clinic flame -- node ../../../../node_modules/@wakaru/unminify/dist/cli.cjs ./496-v2/*.js -o ./496-v2-unminified
To generate the report press: Ctrl + C
• Transforming 496-v2-unminified/496-v2/module-10604.js (1,591.3ms)
# ..snip..
[==        ] Analysing data
RangeError: Invalid string length
at JSON.stringify (<anonymous>)
at zeroEks (/Users/devalias/dev/0xdevalias/chatgpt-source-watch/node_modules/0x/index.js:56:51)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
I will try
The benchmark shows that
We have several options to consider:
It's important to note that while
@pionxzh Was there a reason you chose
I feel like it could be an ideal choice:
I feel like having rules that only work on the CLI and not the web UI will increase maintenance effort, and could easily lead to the two becoming out of sync, which sounds like a bad direction for the project. Similarly, since you also say that you wouldn't use it for more complex rules/rewrites, even if we did have a web-capable build of it, I wonder if it's worth fragmenting the rules and how they're implemented between two tools/methods/styles, as then contributors/maintainers would need to know how to make simpler rules in one tool and more complex rules in another. If this was a path that was followed, here are some potentially relevant resources that seem to suggest
Edit: Looking at the
@pionxzh Curious, are your benchmarking tools/etc added to this repo? Or just a local thing?
@pionxzh When you say "doesn't provide the matcher", what do you mean?
It's also seemingly commonly used as the basis for 'code navigation systems', which could be useful in the web-IDE side of things:
This 'magic string'?
@pionxzh No worries :)
@pionxzh Is that graph showing
Wasm is known to have some overheads that make it slower than native bindings, so that makes sense.
@pionxzh nods that makes sense. I haven't worked with it properly before, so I can't give any real-world experience with it; but from reading through the documentation, it sounds like it's pretty powerful.
@pionxzh You might have missed this question earlier.
Oh, you can check the benchmark at /benches. Run with
@pionxzh Oh true! I totally missed that somehow (committed 3cd85f6). Thanks :)
@pionxzh Haha. Sounds good, thanks! :) There was another parser project I stumbled across while looking
Oh I know this project, will check them. I think now the requirements would be
There is a wasm binding, but it's not so ready to use. It's currently built for their playground. Not sure, will check more on these options.
@pionxzh nods, these make sense and sound reasonable.
@pionxzh Though curious, what is this aspect needed for?
@pionxzh Yeah, sounds like it's not very maintained currently; or at the very least, it's not currently published: The playground's
There doesn't seem to be a
But the
Some other historical context: From a brief skim of the issues, some of these might be relevant/of interest also:
The issue on benchmarking linked to
Boa's benchmarks also look similar, and seem to use a tool called
Since I got curious about the specifics of
And digging into the code (as it might be useful to understand aspects of it if going down the
This might also be an interesting/useful read:
There's a similar 'performance'-related issue over on the
While the full thread may be of interest, here are some particularly relevant snippets based on some of the alternative parsers/etc they have explored:
And here are some ideas I've had/been exploring around the potential for how to 'ease the overhead' of crossing back and forth between the JS/Rust side of things; and to avoid needing to re-write the entire project in Rust to benefit from rust-based parsers:
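One shape that "easing the boundary overhead" idea can take is batching: rather than paying the JS-to-native serialization cost once per module, collect everything and cross the boundary a single time. A sketch below; `nativeProcessBatch` is a hypothetical binding used only to illustrate the pattern, not a real API of any of the parsers discussed here:

```javascript
// Sketch: one round-trip across the JS/native boundary for all modules,
// instead of one call (and one serialization) per module.
// `nativeProcessBatch` is a hypothetical binding, not a real API.
function processAll(modules, nativeProcessBatch) {
  // Serialize everything once, call once, fan the results back out.
  const payload = JSON.stringify(modules.map(m => m.code))
  const results = nativeProcessBatch(payload)
  return modules.map((m, i) => ({ id: m.id, code: results[i] }))
}
```

The win depends on how much of the per-call cost is fixed overhead (crossing, serialization) versus actual work; batching only amortizes the former.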
Thanks for the insight. https://github.com/meriyah/meriyah looks interesting. I will spend some time looking into it.
Actually that's how ast-grep works. Here is a quote from Benchmark TypeScript Parsers: Demystify Rust Tooling Performance:
btw, we are kind of off topic in this long thread lol. Let me update the title.
@pionxzh No worries :)
@pionxzh Sounds good. Would be interested to hear what you think of it once you do.
@pionxzh Oh true. I was thinking it probably did, particularly after skimming the source the other day (Ref), but I wasn't 100% sure still.
@pionxzh That's super interesting and neat to know. For future reference, here's a link to the non-Medium-paywalled version of that article (Ref). It was a really interesting read in general! The results of the benchmarks for synchronous performance were pretty interesting; I definitely didn't expect
It was also really interesting in the async parsing to see just how much
These notes towards the end were also interesting/worth paying attention to:
Cool to see this note at the very end aligns with one of the thoughts I had too!
@pionxzh Haha yeah, I was thinking that recently as well, it certainly seems to have evolved into more of a general 'performance optimisation' exploration. Was skimming through the
I have a 5 MB file that needs to be processed. I've tried running it directly in Node, and it takes up to 21 GB of memory in some transform processing.
However, some of the processing may fail, so maybe it would be possible to save the data that was processed successfully, and use it again next time?