Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Static Asset Generation Performance #12922

Merged
merged 4 commits into from
Jan 10, 2025

Conversation

adamchal
Copy link
Contributor

@adamchal adamchal commented Jan 7, 2025

Changes

Improve static asset generation performance by removing the await in the for loop. This change will allow the p-queue concurrency to limit the number of parallel tasks. Previously, the await in the for loop would cause all asset generation tasks to run serially.

core/build/generate.ts:

for (const [originalPath, transforms] of staticImageList) {
-	await generateImagesForPath(originalPath, transforms, assetsCreationPipeline, queue);
+	queue.add(() => generateImagesForPath(originalPath, transforms, assetsCreationPipeline));
}
await queue.onIdle();
  • Do not await in this for loop so that tasks can execute based on the p-queue concurrency.
  • Do not pass queue to generateImagesForPath.

assets/build/generate.ts:

for (const [_, transform] of transformsAndPath.transforms) {
-	await queue.add(async () => generateImage(transform.finalPath, transform.transform));
+	await generateImage(transform.finalPath, transform.transform);
}
  • Continue processing transforms for a single image serially.
  • Do not queue.add here, prevents unintentional adding to the queue after we expect the queue to be complete await queue.onIdle();

Fixes #12845

Design Decision:

We have 3 source images (A.png, B.png, C.png) and 3 transforms for each:

A1.png A2.png A3.png
B1.png B2.png B3.png
C1.png C2.png C3.png

Option 1

Enqueue all transforms indiscriminantly

|_A1.png   |_B2.png   |_C1.png
|_B3.png   |_A2.png   |_C3.png
|_C2.png   |_A3.png   |_B1.png
  • Advantage: Maximum parallelism, saturate CPU
  • Disadvantage: Spike in context switching

Option 2

Enqueue all transforms, but constrain processing order by source image

|_A3.png   |_B1.png   |_C2.png
|_A1.png   |_B3.png   |_C1.png
|_A2.png   |_B2.png   |_C3.png
  • Advantage: Maximum parallelism, saturate CPU (same as Option 1) in hope to avoid context switching
  • Disadvantage: Context switching still occurs and performance still suffers

Option 3

Enqueue each source image, but perform the transforms for that source image sequentially

\_A1.png   \_B1.png   \_C1.png
 \_A2.png   \_B2.png   \_C2.png
  \_A3.png   \_B3.png   \_C3.png
  • Advantage: Less context switching
  • Disadvantage: If you have a low number of source images with high number of transforms then this is suboptimal.

Best Option:

Option 3. Most projects will have a higher number of source images with a few transforms on each. Even though Option 2 should be faster and should prevent context switching, this was not observed in nascent tests. Context switching was high and the overall performance was half of Option 3.

If looking to optimize further, please consider the following:

  • Avoid queue.add() in an async for loop. Notice the await queue.onIdle(); after this loop. We do not want to create a scenario where tasks are added to the queue after queue.onIdle() resolves. This can break tests and create annoying race conditions.
  • Exposing a concurrency property in astro.config.mjs to allow users to override Node’s os.cpus().length default.
  • Create a proper performance benchmark for asset transformations of projects in varying sizes of source images and transforms.

Testing

No additional tests required.

Docs

Static asset generation performance should increase by os.cpus().length in most Astro projects with assets.

Copy link

changeset-bot bot commented Jan 7, 2025

🦋 Changeset detected

Latest commit: 11ce57e

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions github-actions bot added the pkg: astro Related to the core `astro` package (scope) label Jan 7, 2025
Copy link

codspeed-hq bot commented Jan 7, 2025

CodSpeed Performance Report

Merging #12922 will not alter performance

Comparing adamchal:asset-generation-performance (1821ae0) with main (673a518)

Summary

✅ 6 untouched benchmarks

@adamchal
Copy link
Contributor Author

adamchal commented Jan 7, 2025

@ascorbic I’m skeptical that this change would affect any tests, but 5 CI tests are failing. A quick investigation reveals that some fixture assets are ENOENT. I’m looking for historical success with these tests but figured I’d check quickly with you first before digging deeper into the tests.

As you mentioned, because we await queue.onIdle(); the transformed asset should be finalized before any test would run. My fear is that this assumption is wrong, but that would likely be a dark tunnel of race conditions that just happens to work if we generate images more slowly (one-by-one).

@ascorbic
Copy link
Contributor

ascorbic commented Jan 8, 2025

Yes, this does make me concerned that it's not waiting correctly. One way to check could be to push all of the generateImagesForPath responses into an array and await Promise.all at the end. If that fixes it then it probably means that the idle wait isn't catching everything

@adamchal
Copy link
Contributor Author

adamchal commented Jan 8, 2025

Bingo! It was the queue.add() being called in the async for loop in assets/build/generate.ts:

for (const [_, transform] of transformsAndPath.transforms) {
	await queue
		.add(async () => generateImage(transform.finalPath, transform.transform))
		.catch((e) => {
			throw e;
		});
}

The queue.add() here was happening after the await queue.onIdle(); executed in core/build/generate.ts.
I resolved by performing the queue.add() in core/build/generate.ts and removing the queue parameter in generateImagesForPath(). I added comments explaining these design choices and pitfalls encountered for future us.

@adamchal adamchal force-pushed the asset-generation-performance branch from ebedb9f to 3064167 Compare January 8, 2025 21:43
@ascorbic
Copy link
Contributor

ascorbic commented Jan 9, 2025

OK, great. Well found. I'll review this in detail today, because it will need a careful look because it's a critical piece.

@ascorbic
Copy link
Contributor

ascorbic commented Jan 9, 2025

Could you update the PR description to match what you finally did, so we can refer to it in future? Including most of the wording from the comment would cover most of it.

@ascorbic ascorbic merged commit faf74af into withastro:main Jan 10, 2025
16 checks passed
@astrobot-houston astrobot-houston mentioned this pull request Jan 10, 2025
@adamchal adamchal deleted the asset-generation-performance branch January 10, 2025 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pkg: astro Related to the core `astro` package (scope)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Image processing parallelity
2 participants