refactor: store large checkpoints as multipart upload #5026

Merged
merged 14 commits into from
Jan 30, 2025

sergiupopescu199
Contributor

@sergiupopescu199 sergiupopescu199 commented Jan 27, 2025

Description of change

Add the ability to store large checkpoints as multipart uploads on a remote store (e.g. AWS S3).
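
As a rough illustration of the idea (not the code merged in this PR), a multipart upload splits a payload into fixed-size parts once it exceeds some size threshold. The sketch below is std-only Rust; `UploadPlan`, `plan_upload`, the threshold, and the part size are all hypothetical names and values chosen for the example.

```rust
/// Hypothetical upload strategy; names are illustrative, not taken from the PR.
#[derive(Debug, PartialEq)]
enum UploadPlan {
    /// The payload is small enough to send in one request.
    Single,
    /// The payload is split into `(offset, length)` parts.
    Multipart(Vec<(usize, usize)>),
}

/// Decide how to upload `total_len` bytes, given an assumed size threshold
/// and an assumed part size (both in bytes).
fn plan_upload(total_len: usize, threshold: usize, part_size: usize) -> UploadPlan {
    assert!(part_size > 0, "part_size must be non-zero");
    if total_len <= threshold {
        return UploadPlan::Single;
    }
    let mut parts = Vec::new();
    let mut offset = 0;
    while offset < total_len {
        // The final part may be shorter than `part_size`.
        let len = part_size.min(total_len - offset);
        parts.push((offset, len));
        offset += len;
    }
    UploadPlan::Multipart(parts)
}

fn main() {
    const MIB: usize = 1024 * 1024;
    // Assumed values: 5 MiB threshold and 5 MiB parts.
    let plan = plan_upload(12 * MIB, 5 * MIB, 5 * MIB);
    println!("{:?}", plan);
}
```

Each `(offset, length)` pair would correspond to one part sent to the remote store; only the last part is allowed to be shorter than the chosen part size.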

Links to any relevant issues

fixes #4983

Type of change

Choose a type of change, and delete any options that are not relevant.

  • Enhancement (a non-breaking change which adds functionality)

How the change has been tested

  • Followed the instructions in docker/iota-data-ingestion/README.md to set up localstack and the other components needed for local testing
  • Updated the BlobWorker so that checkpoint 0 is uploaded with a size of around 13GB:
#[async_trait]
impl Worker for BlobWorker {
    async fn process_checkpoint(&self, checkpoint: CheckpointData) -> Result<()> {
        let bytes = Blob::encode(&checkpoint, BlobEncoding::Bcs)?.to_bytes();
        let location = Path::from(format!(
            "{}.chk",
            checkpoint.checkpoint_summary.sequence_number
        ));

        // Test-only override: swap checkpoint 0 for a very large dummy payload.
        let bytes = if checkpoint.checkpoint_summary.sequence_number == 0 {
            vec![0; 13000 * 1024 * 1024] // 13,000 MiB (around 13GB) of zeros
        } else {
            bytes
        };

        self.upload_blob(
            bytes,
            checkpoint.checkpoint_summary.sequence_number,
            location,
        )
        .await?;

        Ok(())
    }
}
  • Ran the local node and the data-ingestion-core binary
  • Checked whether any timeouts were encountered
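
For a sense of scale, the dummy payload above is 13,000 MiB (13,631,488,000 bytes), which is larger than the 5 GB that S3 accepts in a single PUT request, so a real S3 bucket would only accept it as a multipart upload. A small sketch of the part-count arithmetic follows, assuming a hypothetical part size of 5 MiB (S3's documented minimum for non-final parts); the part size actually used by the PR is not stated here, and `part_count` is an illustrative helper.

```rust
const MIB: u64 = 1024 * 1024;
/// S3's documented upper bound on the number of parts per multipart upload.
const S3_MAX_PARTS: u64 = 10_000;
/// S3's documented minimum size for every part except the last.
const S3_MIN_PART_SIZE: u64 = 5 * MIB;

/// Number of parts needed to cover `total_len` bytes (ceiling division).
fn part_count(total_len: u64, part_size: u64) -> u64 {
    (total_len + part_size - 1) / part_size
}

fn main() {
    // Same size as the test payload: 13000 * 1024 * 1024 bytes.
    let payload = 13_000 * MIB;
    // Assumption: parts of exactly 5 MiB.
    let parts = part_count(payload, S3_MIN_PART_SIZE);
    println!("{payload} bytes -> {parts} parts");
    // Even at the minimum part size the payload stays under the part limit.
    assert!(parts <= S3_MAX_PARTS);
}
```

With these assumed values the payload needs 2,600 parts, well under the 10,000-part limit.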

Change checklist

Tick the boxes that are relevant to your changes, and delete any items that are not.

  • I have followed the contribution guidelines for this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation

@sergiupopescu199 sergiupopescu199 requested review from a team as code owners January 27, 2025 09:23

@iota-ci iota-ci added the infrastructure and sc-platform labels Jan 27, 2025
Contributor

@kodemartin kodemartin left a comment

Hey @sergiupopescu199, it looks good. I left a couple of suggestions to consider/discuss.

Contributor

@kodemartin kodemartin left a comment

Looks good, with one comment that needs addressing and one more optional nitpick.

Member

@samuel-rufi samuel-rufi left a comment

Nice PR, @sergiupopescu199! From my side, just some additional refinements to the documentation.

Contributor

@tomxey tomxey left a comment

lgtm 🌍

Contributor

@kodemartin kodemartin left a comment

lgtm 🎈 great job to the author, and the reviewers!

Labels
infrastructure (Issues related to the Infrastructure Team)
sc-platform (Issues related to the Smart Contract Platform group)
Development

Successfully merging this pull request may close these issues.

Implement multi-part S3 upload in iota-data-ingestion
6 participants