Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate object mappings incrementally #3100

Merged
merged 2 commits into from
Oct 11, 2024
Merged

Generate object mappings incrementally #3100

merged 2 commits into from
Oct 11, 2024

Conversation

teor2345
Copy link
Contributor

@teor2345 teor2345 commented Oct 8, 2024

This PR generates object mapping when each block is added to the archiver, rather than only at the end of a full segment. When a part of a block added in Archiver::with_initial_state(), those mappings are generated when the next block is added.

This reduces load bursts on the mapping indexer, and reduces object submission to mapping latency.

Piece generation still requires a full segment.

Cleanups

  • Simplify segment item access

Code contributor checklist:

@teor2345 teor2345 added the improvement it is already working, but can be better label Oct 8, 2024
@teor2345 teor2345 self-assigned this Oct 8, 2024
teor2345

This comment was marked as outdated.

nazar-pc

This comment was marked as resolved.

@teor2345

This comment was marked as resolved.

@nazar-pc

This comment was marked as resolved.

@teor2345 teor2345 changed the title Refactor archiving code for incremental mapping generation Generate object mappings incrementally Oct 10, 2024
@teor2345

This comment was marked as resolved.

@teor2345

This comment was marked as resolved.

Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the approach I was imagining, thanks!
Mostly have nits about internal APIs and suggested improvements to comments. The only thing I don't understand is the change of with_initial_state() API.

Maybe we could set a 40 minute timeout for the test, instead of the default 6 hours?
I’ve only ever seen them take about 25 minutes.

We use self-hosted runners that are much more powerful than the basic official runners, but our CI is configured to not depend on our runners in any way and to remain fully functional in forks (so contributors can run all the tests, produce custom builds, etc.). Latest run in my fork took ~1h25m, so I think decreasing time limit for tests and clippy to 2h is probably fine, but I wouldn't go as low as 40 minutes.

crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
Copy link
Contributor Author

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ll give these changes a shot now.

crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
Copy link
Contributor Author

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be what you want?

crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
crates/subspace-archiving/src/archiver.rs Outdated Show resolved Hide resolved
Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think produce_object_mappings should process all segments like it did before. This simplifies assumptions removing the need to write comments about it, this also allows us to not return ArchiveBlockOutcome from constructor, maintaining previous API.

@teor2345
Copy link
Contributor Author

Ok, this should do it the way you want now.

Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you!

@teor2345 teor2345 added this pull request to the merge queue Oct 11, 2024
Merged via the queue into main with commit f8b6abf Oct 11, 2024
9 checks passed
@teor2345 teor2345 deleted the incremental-map branch October 11, 2024 01:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement it is already working, but can be better
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants