Txpool: setup repo and implement batcher task #34

Closed · wants to merge 7 commits from plafer/txpool

Conversation

@plafer plafer (Contributor) commented Oct 12, 2023

Work towards: #35

This PR implements the "batcher task": the task that creates transaction batches. It only implements the core logic behind a trait; channels and gRPC will come in a later PR, as will transaction verification.

The main difficulty for testing was simply generating a dummy StarkProof. I ended up copying the fib_small example from winterfell and running the prover in each test. Ideally, winterfell would either expose a dummy StarkProof to use in tests, or derive Default. For now though, this approach works fine.
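Concretely, the test helper looks roughly like this (a sketch only; `FibSmallProver`, `build_trace`, and `test_proof_options` stand in for the pieces copied from winterfell's `fib_small` example):

use winterfell::StarkProof;

/// Generates a throwaway proof for tests. The proof proves a tiny Fibonacci
/// computation rather than a transaction, but it has the right type to stand
/// in for one.
pub fn dummy_proof() -> StarkProof {
    // The three helpers below are placeholders for the code copied from
    // winterfell's `fib_small` example.
    let prover = FibSmallProver::new(test_proof_options());
    let trace = build_trace(16);
    prover.prove(trace).expect("failed to generate dummy proof")
}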

@bobbinth bobbinth (Contributor) left a comment

Thank you! This is a great start. I left a few comments inline. The main one is about the overall structure covering both batch and block production (I did look at #37 but didn't review it in depth).

Agreed that we need a better way to generate dummy proven transactions. I think the trick with using Winterfell is fine for now, but we'll need to come up with a better approach at some point.

Comment on lines 1 to 4
[package]
name = "txpool"
version = "0.1.0"
edition = "2021"
Contributor

nit: I would copy the format of this section from how we do it in other places (e.g., here).

For the crate name, I'd use miden-txpool. But based on the fact that we are combining tx pool and block producer into a single component, I wonder if we should name this something else altogether.

Contributor Author

58fa047

Changed the name to miden-txpool for now, as I see the block producer eventually going into its own crate. But I'll try to think of a better short-term name.

#[cfg(test)]
pub mod test_utils;

pub type TxBatch = Vec<ProvenTransaction>;
Contributor

I think we'll need to differentiate between a "proposed transaction batch" and a "proven transaction batch". A proposed batch is just a vector of transactions, as you have here, and I'm not sure we need a special type for it. The proven batch is a separate data structure (which we should probably flesh out soon).

In my mind, the responsibility of the Batcher (though maybe we should call it BatchProducer or BatchBuilder) is to take a vector of transactions and output a proven transaction batch (for now, we won't need to do the actual proofs, but we still need to build the structure).
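For illustration, the distinction could look something like this (the fields on TransactionBatch are just placeholders; the actual structure still needs to be worked out, and ProvenTransaction is assumed to be in scope):

/// A proposed batch is just the list of transactions selected for batching;
/// probably no dedicated type is needed beyond an alias like this.
pub type ProposedBatch = Vec<ProvenTransaction>;

/// A proven batch is a separate data structure which will eventually carry
/// the batch proof in addition to the aggregated transaction data.
pub struct TransactionBatch {
    /// Transactions aggregated into this batch.
    pub txs: Vec<ProvenTransaction>,
    // In the future: the batch proof, updated account state hashes, created and
    // consumed note commitments, etc.
}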

Comment on lines +10 to +30
/// Encapsulates the means to communicate with the Batcher task.
#[async_trait]
pub trait BatcherTaskHandle {
    type SendError: Debug;
    type ReceiveError: Debug;

    /// Blocks until it is time to send a batch.
    /// Usually backed by a `tokio::sync::Notify`.
    async fn wait_for_send_batch_notification(&self);

    /// Blocks until a new transaction is received.
    /// Usually backed by a channel or an RPC endpoint.
    async fn receive_tx(&self) -> Result<ProvenTransaction, Self::ReceiveError>;

    /// Sends a batch.
    /// This MUST only be called after a send-batch notification has been received.
    async fn send_batch(&self, txs: TxBatch) -> Result<(), Self::SendError>;
}
Contributor

Assuming I understood how this works correctly, I wonder if we should split this up a bit more.

In my mind, BatchBuilder could be a fairly simple interface - something like this:

pub trait BatchBuilder {
    /// This function would aggregate transactions into a batch, and in the future prove the batch.
    /// 
    /// This process would be "stateless" in that we won't need to connect to the Store component
    /// to get any extra info.
    /// 
    /// When we do the actual proving, batch builder would need to manage a set of distributed
    /// provers so that multiple batches could be proven concurrently on different machines.
    async fn build_batch(&self, txs: Vec<ProvenTransaction>) -> Result<TransactionBatch, SomeError>;
}

Similarly (and this is stepping into #37), I imagined BlockBuilder to look something like this:

pub trait BlockBuilder {
    /// This function would roughly do the following:
    /// - Connect to the Store component to get block inputs for building the next block;
    ///   this would include previous block header, partial Merkle trees for account and
    ///   nullifier DBs, maybe something else.
    /// - Create a new block, for now without the proof (but in the future it'll prove the
    ///   new block as well, probably delegating the proving to a different process/machine).
    /// - Update the store component by calling `apply_block()` on it.
    /// 
    /// Basically, by the time this function returns, a new block is created and added to
    /// the Store.
    fn build_block(&self, batches: Vec<TransactionBatch>) -> Result<Block, SomeError>;
}

These components could then be composed into a higher-level struct (I'll call it Operator, though a better name is probably needed). This struct could look something like this:

pub struct Operator<B, P, V>
where
    B: BatchBuilder,
    P: BlockBuilder,
    V: StateView,
{
    /// A data structure which contains [ProvenTransaction]s together with their statuses;
    /// specifically, we need to have the ability to mark transactions as in the process of
    /// being aggregated (into a batch) without removing them from the queue.
    tx_queue: TransactionQueue,
    /// A data structure which contains [TransactionBatch]s together with their statuses;
    /// specifically, we need to have the ability to mark batches as being put into a block
    /// without removing them from the queue.
    batch_queue: BatchQueue,
    /// A component which, given a list of transactions, outputs a transaction batch.
    batch_builder: B,
    /// A component which, given a list of transaction batches, outputs a block.
    block_builder: P,
    /// A data structure which maintains a view into the account and nullifier databases
    /// assuming all currently in-flight transactions have been successfully applied.
    state: V,
}

impl<B, P, S> Operator<B, P, S>
where
    B: BatchBuilder,
    P: BlockBuilder,
    S: StateView,
{
    // To be called externally (e.g., from the RPC component)
    pub async fn add_transaction(&self, tx: ProvenTransaction) -> Result<(), SomeError> {
        // tries to add the transaction to self.state and if that succeeds, adds the transaction
        // to self.tx_queue
    }

    // To be called periodically
    async fn build_batch(&self) -> Result<(), SomeError> {
        // pick a set of transactions from self.tx_queue (mark them as taken, but don't remove
        // from the queue); then, use self.batch_builder to build a transaction batch:
        // - if this is successful, remove transactions from self.tx_queue and add the batch to
        //   self.batch_queue
        // - if this fails, "return" the transactions to self.tx_queue (i.e., update their status
        //   to mark them as available for batching)
    }

    // To be called periodically
    fn build_block(&self) -> Result<(), SomeError> {
        // pick a set of batches from self.batch_queue (mark them as taken, but don't remove from
        //   the queue). Then, use self.block_builder to build a block:
        // - if this succeeds, remove the batches from self.batch_queue and call apply_block() on
        //   self.state.
        // - if this fails, "return" the batches to self.batch_queue (i.e., update their statuses
        //   to mark them as available for block production)
    }
}

One important component in the above is StateView, which could look something like this:

pub trait StateView {
    /// Adds a transaction to the state view. This would roughly do the following:
    /// - Check that tx.initial_account_hash matches the latest known account state; this may
    ///   require fetching latest known account state from the Store component (if it is not in
    ///   the local cache).
    /// - Check that none of the tx.consumed_notes have already been consumed; this will require
    ///   checking with the Store component, but also with the local cache.
    /// - Update the local cache so that tx.final_account_hash is now considered the latest
    ///   account state hash.
    /// - Add nullifiers from tx.consumed_notes to the local cache of nullifiers.
    fn add_transaction(&mut self, tx: &ProvenTransaction) -> Result<(), SomeError>;

    /// Removes the effect of the transaction from the local cache; we can probably skip this
    /// for now.
    fn remove_transaction(&mut self, tx: &ProvenTransaction) -> Result<(), SomeError>;

    /// Updates the state of the local cache (i.e., account and nullifier sets).
    fn apply_block(&mut self, block: &Block) -> Result<(), SomeError>;
}

The main purpose of the StateView is to keep a consistent view of the state of the chain assuming all in-flight transactions have been applied (regardless of whether the transactions have already been batched or not).
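To make this a bit more concrete, here is a very simplified, purely in-memory sketch of what an implementation might look like (the types below are placeholders for the real miden types, and a real implementation would also fall back to the Store when data is missing from the local cache):

use std::collections::{BTreeMap, BTreeSet};

// Placeholder types standing in for the real miden types.
type AccountId = u64;
type Digest = [u8; 32];
type Nullifier = [u8; 32];

struct InMemoryStateView {
    /// Latest known account state hashes, including the effects of in-flight transactions.
    account_states: BTreeMap<AccountId, Digest>,
    /// Nullifiers of all notes consumed by in-flight (or already committed) transactions.
    consumed_nullifiers: BTreeSet<Nullifier>,
}

impl InMemoryStateView {
    /// Simplified version of `StateView::add_transaction` taking the relevant
    /// transaction fields directly instead of a `&ProvenTransaction`.
    fn add_transaction(
        &mut self,
        account_id: AccountId,
        initial_account_hash: Digest,
        final_account_hash: Digest,
        consumed_notes: &[Nullifier],
    ) -> Result<(), String> {
        // Reject transactions built against a stale account state.
        if let Some(current) = self.account_states.get(&account_id) {
            if *current != initial_account_hash {
                return Err("transaction built against stale account state".into());
            }
        }
        // Reject transactions which consume an already-consumed note.
        if consumed_notes.iter().any(|n| self.consumed_nullifiers.contains(n)) {
            return Err("note already consumed by an in-flight transaction".into());
        }
        // Record the effects of the transaction in the local cache.
        self.account_states.insert(account_id, final_account_hash);
        self.consumed_nullifiers.extend(consumed_notes.iter().copied());
        Ok(())
    }
}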

I should say that the above code is mostly to illustrate various components and their relationships. There could be (and probably are) better ways of drawing boundaries between structs and traits and facilitating interactions between various components (e.g., maybe the task-based approach would work well here).

There could also be better ways of handling queue functionality (maybe based on something that tokio or related components already provide).
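For example, the transaction queue could be something as simple as a map guarded by a tokio mutex, where entries get marked as taken rather than removed (all names below are placeholders, and ProvenTransaction is assumed to be in scope):

use std::collections::BTreeMap;
use tokio::sync::Mutex;

type TxId = u64;

#[derive(PartialEq)]
enum TxStatus {
    /// Available to be picked up for batching.
    Queued,
    /// Currently being aggregated into a batch.
    InBatch,
}

/// A queue which supports marking entries as taken without removing them,
/// guarded by a tokio mutex so it can be shared across tasks.
struct TransactionQueue {
    txs: Mutex<BTreeMap<TxId, (ProvenTransaction, TxStatus)>>,
}

impl TransactionQueue {
    /// Marks up to `max` queued transactions as taken and returns their ids.
    async fn take_for_batching(&self, max: usize) -> Vec<TxId> {
        let mut txs = self.txs.lock().await;
        let mut ids = Vec::new();
        for (id, entry) in txs.iter_mut() {
            if ids.len() == max {
                break;
            }
            if entry.1 == TxStatus::Queued {
                entry.1 = TxStatus::InBatch;
                ids.push(*id);
            }
        }
        ids
    }

    /// Returns previously taken transactions to the queue (e.g., after a failed batch).
    async fn release(&self, ids: &[TxId]) {
        let mut txs = self.txs.lock().await;
        for id in ids {
            if let Some(entry) = txs.get_mut(id) {
                entry.1 = TxStatus::Queued;
            }
        }
    }

    /// Removes transactions that were successfully aggregated into a batch.
    async fn remove(&self, ids: &[TxId]) {
        let mut txs = self.txs.lock().await;
        for id in ids {
            txs.remove(id);
        }
    }
}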

@plafer plafer (Contributor Author) commented Oct 16, 2023

Closing as I will implement the TransactionQueue first. Salvaged parts from this PR are in #39.

@plafer plafer closed this Oct 16, 2023
@hackaugusto hackaugusto deleted the plafer/txpool branch October 17, 2023 10:14