Ethereum Data Extraction Library

Brecht Devos edited this page Apr 15, 2019 · 25 revisions

Introduction

We need an easy-to-use library for accessing the data published on-chain. For a general overview of some of the data structures see here.

Requirements

Allow Merkle tree recreation

A program/library needs to be able to easily recreate the Merkle tree by using the data provided by the library. This should be the most straightforward requirement. By playing back all the state changes from start to end, modifying the Merkle tree at every step, the recreated Merkle tree should match exactly with the on-chain Merkle tree for every block. In practice, recreatedMerkleTreeRoot == onchainMerkleTreeRoot for every block.
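The replay check described above can be sketched as follows. This is only an illustration: `ToyMerkleTree`, `toyHash`, `LeafUpdate`, `ReplayBlock`, and `replayAndVerify` are hypothetical names, and the hash is a non-cryptographic stand-in rather than the hash function used on-chain.

```typescript
// Toy stand-in hash (32-bit FNV-1a). NOT the hash used on-chain; it only
// keeps this sketch free of dependencies.
function toyHash(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Hypothetical fixed-depth binary Merkle tree that recomputes the root from
// all leaves. A real implementation would update only the affected path.
class ToyMerkleTree {
  private leaves: string[] = [];

  constructor(depth: number) {
    for (let i = 0; i < (1 << depth); i++) {
      this.leaves.push(toyHash("empty"));
    }
  }

  update(leafIdx: number, value: string): void {
    this.leaves[leafIdx] = toyHash(value);
  }

  root(): string {
    let level = this.leaves;
    while (level.length > 1) {
      const next: string[] = [];
      for (let i = 0; i < level.length; i += 2) {
        next.push(toyHash(level[i] + level[i + 1]));
      }
      level = next;
    }
    return level[0];
  }
}

// Hypothetical shapes for the data the library would hand out per block.
interface LeafUpdate { leafIdx: number; value: string; }
interface ReplayBlock {
  blockIdx: number;
  updates: LeafUpdate[];
  onchainMerkleTreeRoot: string;
}

// Replays all blocks in order and returns the blockIdx of the first block
// whose recreated root differs from the on-chain root, or -1 if all match.
function replayAndVerify(tree: ToyMerkleTree, blocks: ReplayBlock[]): number {
  for (const block of blocks) {
    for (const u of block.updates) {
      tree.update(u.leafIdx, u.value);
    }
    const recreatedMerkleTreeRoot = tree.root();
    if (recreatedMerkleTreeRoot !== block.onchainMerkleTreeRoot) {
      return block.blockIdx;
    }
  }
  return -1;
}
```

Returning the first mismatching block index, rather than a boolean, makes it easier to locate which state change the library exposed incorrectly.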

Allow Block Explorer integration

The library needs to provide all the data needed to build a block explorer for our sidechain. This one is more difficult because we don't want to start from scratch; we want to integrate or modify one of the existing open source blockchain explorers. How the explorer communicates with this library will depend on which software we choose.

EthVM created by MEW looks interesting, but is still being developed. There is an alpha release.

Expose on-chain requests

The library can also be used by, e.g., wallets, ring-matchers, and operators to monitor the sidechain. In particular, operators need to closely monitor the on-chain deposit and withdrawal events. The library needs this information anyway, so it may as well expose it.

Data Sources

The library will access the Ethereum blockchain to read in the necessary data:

  • Data available in the raw tx data from Ethereum blocks (from the operator (tx.origin))
  • Data from events emitted by the Exchange contract

Because it's not easy to call a contract function against an old state, we should not rely on contract functions to get old data.

If we notice while creating this library that we can't easily access certain data we'll have to fix that.

Some data needs simple additional processing to get the actual values used to update the Merkle tree. For example, we allow splitting the fee between the wallet and the ring-matcher. How much each party actually gets paid isn't directly available. The on-chain data-availability does include the fee amount paid by the order owner and the wallet split percentage, so it's trivial to calculate how much each party gets paid from that information.
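A minimal sketch of that fee calculation, assuming the split is an integer percentage from 0 to 100, that the wallet's share is rounded down, and that the ring-matcher receives the remainder. The function name `splitFee`, the `FeeSplit` type, and the rounding rule are assumptions for illustration; the library has to follow whatever rounding the protocol actually uses, otherwise the recreated Merkle tree will not match.

```typescript
// Hypothetical result type: how much of the fee each party receives.
interface FeeSplit {
  walletFee: bigint;
  ringMatcherFee: bigint;
}

// Splits `fee` between the wallet and the ring-matcher.
// Assumption: the wallet share is rounded down and the ring-matcher gets the
// remainder, so the two parts always add up to `fee` exactly.
function splitFee(fee: bigint, walletSplitPercentage: number): FeeSplit {
  if (
    !Number.isInteger(walletSplitPercentage) ||
    walletSplitPercentage < 0 ||
    walletSplitPercentage > 100
  ) {
    throw new RangeError("walletSplitPercentage must be an integer in [0, 100]");
  }
  if (fee < 0n) {
    throw new RangeError("fee must not be negative");
  }
  const walletFee = (fee * BigInt(walletSplitPercentage)) / 100n;
  return { walletFee, ringMatcherFee: fee - walletFee };
}
```

Computing the ring-matcher's share as `fee - walletFee` instead of a second percentage calculation guarantees that no amount is lost to rounding.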

On-chain data-availability format

Please see the docs in the code for commitBlock so that this information is documented in a single place.

Exchange Events

Please see here for all the events.

Most notable:

  • BlockCommitted: The tx data of the Ethereum block will need to be extracted to get all necessary data
  • BlockFinalized: The block was verified
  • Revert: All state changes starting from and including the reverted block need to be reverted
  • DepositRequested: The account info and the additional token amount for the given token
  • WithdrawalRequested: How much a user wants to withdraw (which can be more than his actual balance)

Not all events can be applied immediately. The DepositRequested and WithdrawalRequested events are only actually processed when they are included in a block. Which block that is can be determined by inspecting the data passed into commitBlock, but the data used for these updates is available in the events themselves.
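One way to handle this deferred processing is to queue requests when their event is seen and only hand them out once a committed block includes them. The sketch below shows this for deposits. The `DepositQueue` class, the `DepositRequest` fields, and the idea that a block consumes the next `count` requests in order are assumptions for illustration; the actual layout is defined by the commitBlock data.

```typescript
// Hypothetical shape of the data carried by a DepositRequested event.
interface DepositRequest {
  depositIdx: number;
  accountID: number;
  tokenID: number;
  amount: bigint;
}

class DepositQueue {
  private requests: DepositRequest[] = [];
  private numProcessed = 0;

  // Called for every DepositRequested event: store it, do not apply it yet.
  onDepositRequested(request: DepositRequest): void {
    this.requests.push(request);
  }

  // Called when a committed deposit block is parsed.
  // Assumption: the block processes the next `count` open requests in order.
  // Returns the requests that can now be applied to the Merkle tree.
  onDepositBlockCommitted(count: number): DepositRequest[] {
    if (this.numProcessed + count > this.requests.length) {
      throw new Error("block references deposit requests that were never seen");
    }
    const processed = this.requests.slice(
      this.numProcessed,
      this.numProcessed + count
    );
    this.numProcessed += count;
    return processed;
  }

  getNumDepositRequests(): number {
    return this.requests.length;
  }

  getNumOpenDepositRequests(): number {
    return this.requests.length - this.numProcessed;
  }
}
```

The same pattern would apply to WithdrawalRequested events. A Revert would additionally have to move `numProcessed` back for any requests that were included in reverted blocks, which this sketch leaves out.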

Request logic

Some of this logic will be needed to reconstruct values that were used to update the Merkle tree.

The logic required is generally simple. For code references, please refer to:

  • The simulator code used for validation. This is the easiest to read, but does not contain code to generate the Merkle proofs.
  • The operator code used for creating the blocks in the tests. Contains all the code needed to create valid blocks.

API

The library will need to store some state because the complete exchange state is only known when all Ethereum blocks since the creation of the Exchange are inspected.

The API could look a bit like this. Other ways to expose the data may make more sense, but this is the kind of data we expect to be exposed by the library in some way:

//
//    Data structures
//

// A block
struct Block
{
    blockIdx: number;
    blockType: BlockType;
    blockState: BlockState;
    operator: number;
    transactions: Transactions[];
}

// A token transfer (in the Merkle tree)
struct TokenTransfer
{
    from: number;
    to: number;
    token: number;
    amount: BN;
}

// A trade history update (in the Merkle tree)
struct TradeHistoryUpdate
{
    filled: BN;
    cancelled: boolean;
    orderID: number;
}

// A Ring Settlement transaction
struct RingSettlementTransaction
{
    ringMatcher: number;
    feeRecipient: number;
    tokenTransfers: TokenTransfer[];
    tradeHistoryUpdates: TradeHistoryUpdate[2];
};

// A Deposit transaction
struct DepositTransaction
{
    accountID: number;
    tokenID: number;
    amount: BN;
    pubKeyX: string;
    pubKeyY: string;
};

// An on-chain withdrawal transaction
struct OnchainWithdrawalTransaction
{
    shutdown: boolean;
    accountID: number;
    tokenID: number;
    amount: BN;
};

// An off-chain withdrawal transaction
struct OffchainWithdrawalTransaction
{
    accountID: number;
    tokenID: number;
    amount: BN;
    tokenTransfers: TokenTransfer[];
};

// A cancel transaction
struct CancelTransaction
{
    accountID: number;
    tokenID: number;
    tradeHistoryUpdate: TradeHistoryUpdate;
    tokenTransfers: TokenTransfer[];
};

//
//    Functions
//

// ** State **

// Sets up the state for an exchange
initialize(exchangeContractAddress: string, exchangeCreationEthereumBlockIdx: number)

// Runs over all Ethereum blocks until the specified ending Ethereum block idx
// so the internal state can be updated. As the starting point the
// previous ending Ethereum block idx is used.
processEthereumBlocksUpTo(endingEthereumBlockIdx: number)

// ** Blocks **

// Returns the number of blocks
getNumBlocks()

// Returns the number of finalized blocks
getNumFinalizedBlocks()

// Returns the specified block
getBlock(blockIdx: number)

// ** Deposit requests **

// Returns the number of deposit requests
getNumDepositRequests()

// Returns the number of open deposit requests
getNumOpenDepositRequests()

// Gets the data for the specified deposit request
getDepositRequestInfo(depositIdx: number)

// ** Withdrawal requests **

// Returns the number of on-chain withdrawal requests
getNumWithdrawalRequests()

// Returns the number of open withdrawal requests
getNumOpenWithdrawalRequests()

// Gets the data for the specified withdrawal request
getWithdrawalRequestInfo(withdrawalIdx: number)

// ** Accounts **

// Returns the number of accounts
getNumAccounts()

// Returns the account info
getAccountInfo(accountId: number)
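To illustrate how the state functions could fit together, here is a small in-memory sketch covering only block tracking. It is not the proposed implementation: `ExchangeState`, `EventSource`, `ExchangeEvent`, `SidechainBlock`, and the two block states are hypothetical, the event source stands in for reading events from an Ethereum node, and `processEthereumBlocksUpTo` is assumed to include the ending block.

```typescript
// Assumed block states; the real BlockState may have more values.
type BlockState = "COMMITTED" | "FINALIZED";

interface SidechainBlock {
  blockIdx: number;
  blockState: BlockState;
}

// Simplified view of three of the Exchange events listed earlier.
type ExchangeEvent =
  | { type: "BlockCommitted"; blockIdx: number }
  | { type: "BlockFinalized"; blockIdx: number }
  | { type: "Revert"; blockIdx: number };

// Hypothetical data source: the events emitted in one Ethereum block.
type EventSource = (ethereumBlockIdx: number) => ExchangeEvent[];

class ExchangeState {
  private blocks: SidechainBlock[] = [];
  private nextEthereumBlockIdx: number;
  private source: EventSource;

  constructor(exchangeCreationEthereumBlockIdx: number, source: EventSource) {
    this.nextEthereumBlockIdx = exchangeCreationEthereumBlockIdx;
    this.source = source;
  }

  // Continues from where the previous call stopped (ending block inclusive).
  processEthereumBlocksUpTo(endingEthereumBlockIdx: number): void {
    while (this.nextEthereumBlockIdx <= endingEthereumBlockIdx) {
      for (const event of this.source(this.nextEthereumBlockIdx)) {
        this.apply(event);
      }
      this.nextEthereumBlockIdx++;
    }
  }

  private apply(event: ExchangeEvent): void {
    switch (event.type) {
      case "BlockCommitted":
        this.blocks.push({ blockIdx: event.blockIdx, blockState: "COMMITTED" });
        break;
      case "BlockFinalized":
        for (const block of this.blocks) {
          if (block.blockIdx === event.blockIdx) block.blockState = "FINALIZED";
        }
        break;
      case "Revert":
        // Drop the reverted block and every block after it.
        this.blocks = this.blocks.filter((b) => b.blockIdx < event.blockIdx);
        break;
    }
  }

  getNumBlocks(): number {
    return this.blocks.length;
  }

  getNumFinalizedBlocks(): number {
    return this.blocks.filter((b) => b.blockState === "FINALIZED").length;
  }

  getBlock(blockIdx: number): SidechainBlock | undefined {
    return this.blocks.filter((b) => b.blockIdx === blockIdx)[0];
  }
}
```

Keeping the next Ethereum block index inside the state is what lets repeated calls to `processEthereumBlocksUpTo` pick up where the last one ended, as the comments on that function describe.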

Testing

This library can be integrated into the existing tests so that it can be tested for all existing protocol test cases.

Another good way for the library to be tested is, of course, to use the data to recreate the Merkle tree.