Skip to content
This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

update ledger tool to restore cost table from blockstore #18489

Conversation

tao-stones
Copy link
Contributor

Problem

Instruction executing costs are persisted in blockstore as CF cost_table, ledger-tool's compute-slot-cost command could get more accurate results if restore that table from blockstore.

Summary of Changes

  • initialize cost_model from blockstore::cost_table before computing slot cost. No negative impact if cost-table is empty or doesn't exist in blockstore.

Fixes #

@tao-stones tao-stones requested a review from steviez July 7, 2021 16:06
@codecov
Copy link

codecov bot commented Jul 7, 2021

Codecov Report

Merging #18489 (12bf898) into master (5dd399d) will increase coverage by 0.0%.
The diff coverage is 91.3%.

@@           Coverage Diff           @@
##           master   #18489   +/-   ##
=======================================
  Coverage    82.5%    82.5%           
=======================================
  Files         436      436           
  Lines      121864   121868    +4     
=======================================
+ Hits       100628   100652   +24     
+ Misses      21236    21216   -20     

Copy link
Contributor

@steviez steviez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to do this; however, this code looks almost identical to this function:
https://github.com/solana-labs/solana/blob/master/core/src/validator.rs#L901-L925

Maybe this function could be moved to cost_model.rs, and then could be used by both validator and ledger-tool to avoid duplicate implementations?

CostUpdateService::update_cost_model() is similar as well, but given that it takes in an ExecuteTimings and not a Vec<(Pubkey, u64)> slice, may not be doable to have this one piggyback on a common implementation

…sted and shared between validator and ledger-tool
@tao-stones
Copy link
Contributor Author

Makes sense to do this; however, this code looks almost identical to this function:
https://github.com/solana-labs/solana/blob/master/core/src/validator.rs#L901-L925

Maybe this function could be moved to cost_model.rs, and then could be used by both validator and ledger-tool to avoid duplicate implementations?

Good point, I actually copied the code from validator to ledger-tool. Should have it shared at first place :).
Pushed changes.

@tao-stones
Copy link
Contributor Author

CostUpdateService::update_cost_model() is similar as well, but given that it takes in an ExecuteTimings and not a Vec<(Pubkey, u64)> slice, may not be doable to have this one piggyback on a common implementation

I'll leave costUpdateService alone for now since it is on the other end of process - it updates cost table with executeTiming. Also it is possible it will be updating other info such as account lock time etc.

@tao-stones tao-stones requested a review from steviez July 7, 2021 19:51
steviez
steviez previously approved these changes Jul 7, 2021
Copy link
Contributor

@steviez steviez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the (non-functional) suggestions on the new test; otherwise, ship it!

@tao-stones tao-stones added the automerge Merge this Pull Request automatically once CI passes label Jul 7, 2021
@mergify mergify bot dismissed steviez’s stale review July 7, 2021 21:06

Pull request has been modified.

@mergify mergify bot removed the automerge Merge this Pull Request automatically once CI passes label Jul 7, 2021
@mergify
Copy link
Contributor

mergify bot commented Jul 7, 2021

automerge label removed due to a CI failure

@tao-stones tao-stones added the automerge Merge this Pull Request automatically once CI passes label Jul 7, 2021
@mergify mergify bot removed the automerge Merge this Pull Request automatically once CI passes label Jul 8, 2021
@mergify
Copy link
Contributor

mergify bot commented Jul 8, 2021

automerge label removed due to a CI failure

@tao-stones tao-stones merged commit b6dff12 into solana-labs:master Jul 8, 2021
mergify bot pushed a commit that referenced this pull request Jul 15, 2021
* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test

(cherry picked from commit b6dff12)

# Conflicts:
#	core/src/cost_model.rs
#	core/src/validator.rs
#	ledger-tool/src/main.rs
@tao-stones tao-stones removed the v1.7 label Jul 15, 2021
tao-stones added a commit to tao-stones/solana that referenced this pull request Jul 16, 2021
…#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test
tao-stones added a commit to tao-stones/solana that referenced this pull request Jul 16, 2021
…#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test
tao-stones added a commit to tao-stones/solana that referenced this pull request Jul 17, 2021
…#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test
tao-stones added a commit to tao-stones/solana that referenced this pull request Sep 24, 2021
…#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test
tao-stones added a commit to tao-stones/solana that referenced this pull request Sep 29, 2021
…#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test
tao-stones added a commit to tao-stones/solana that referenced this pull request Sep 30, 2021
…#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test
tao-stones added a commit to tao-stones/solana that referenced this pull request Oct 6, 2021
…#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test
tao-stones added a commit that referenced this pull request Oct 6, 2021
* Cost Model to limit transactions which are not parallelizeable (#16694)

* * Add following to banking_stage:
  1. CostModel as immutable ref shared between threads, to provide estimated cost for transactions.
  2. CostTracker which is shared between threads, tracks transaction costs for each block.

* replace hard coded program ID with id() calls

* Add Account Access Cost as part of TransactionCost. Account Access cost are weighted differently between read and write, signed and non-signed.

* Establish instruction_execution_cost_table, add function to update or insert instruction cost, unit tested. It is read-only for now; it allows Replay to insert realtime instruction execution costs to the table.

* add test for cost_tracker atomically try_add operation, serves as safety guard for future changes

* check cost against local copy of cost_tracker, return transactions that would exceed limit as unprocessed transaction to be buffered; only apply bank processed transactions cost to tracker;

* bencher to new banking_stage with max cost limit to allow cost model being hit consistently during bench iterations

* replay stage feed back program cost (#17731)

* replay stage feeds back realtime per-program execution cost to cost model;

* program cost execution table is initialized into empty table, no longer populated with hardcoded numbers;

* changed cost unit to microsecond, using value collected from mainnet;

* add ExecuteCostTable with fixed capacity for security concern, when its limit is reached, programs with old age AND less occurrence will be pushed out to make room for new programs.

* investigate system performance test degradation  (#17919)

* Add stats and counter around cost model ops, mainly:
- calculate transaction cost
- check transaction can fit in a block
- update block cost tracker after transactions are added to block
- replay_stage to update/insert execution cost to table

* Change mutex on cost_tracker to RwLock

* removed cloning cost_tracker for local use, as the metrics show clone is very expensive.

* acquire and hold locks for block of TXs, instead of acquire and release per transaction;

* remove redundant would_fit check from cost_tracker update execution path

* refactor cost checking with less frequent lock acquiring

* avoid many Transaction_cost heap allocation when calculate cost, which
is in the hot path - executed per transaction.

* create hashmap with new_capacity to reduce runtime heap realloc.

* code review changes: categorize stats, replace explicit drop calls, concisely initiate to default

* address potential deadlock by acquiring locks one at time

* Persist cost table to blockstore (#18123)

* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to compaction excluding list alone side with TransactionStatusIndex in one place: `excludes_from_compaction()`

* Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
* Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory
* Only try to persist to blockstore when cost_table is changed.
* Restore cost table during validator startup

* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.

* log warning when channel send fails (#18391)

* Aggregate cost_model into cost_tracker (#18374)

* * aggregate cost_model into cost_tracker, decouple it from banking_stage to prevent accidental deadlock. * Simplified code, removed unused functions

* review fixes

* update ledger tool to restore cost table from blockstore (#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test

* manually fix merge conflicts

* Per-program id timings (#17554)

* more manual fixing

* solve a merge conflict

* featurize cost model

* more merge fix

* cost model uses compute_unit to replace microsecond as cost unit
(#18934)

* Reject blocks for costs above the max block cost (#18994)

* Update block max cost limit to fix performance regession (#19276)

* replace function with const var for better readability (#19285)

* Add few more metrics data points (#19624)

* periodically report sigverify_stage stats (#19674)

* manual merge

* cost model nits (#18528)

* Accumulate consumed units (#18714)

* tx wide compute budget (#18631)

* more manual merge

* ignore zerorize drop security

* - update const cost values with data collected by #19627
- update cost calculation to closely proposed fee schedule #16984

* add transaction cost histogram metrics (#20350)

* rebase to 1.7.15

* add tx count and thread id to stats (#20451)
each stat reports and resets when slot changes

* remove cost_model feature_set

* ignore vote transactions from cost model

Co-authored-by: sakridge <sakridge@gmail.com>
Co-authored-by: Jeff Biseda <jbiseda@gmail.com>
Co-authored-by: Jack May <jack@solana.com>
t-nelson pushed a commit that referenced this pull request Oct 6, 2021
* Cost Model to limit transactions which are not parallelizeable (#16694)

* * Add following to banking_stage:
  1. CostModel as immutable ref shared between threads, to provide estimated cost for transactions.
  2. CostTracker which is shared between threads, tracks transaction costs for each block.

* replace hard coded program ID with id() calls

* Add Account Access Cost as part of TransactionCost. Account Access cost are weighted differently between read and write, signed and non-signed.

* Establish instruction_execution_cost_table, add function to update or insert instruction cost, unit tested. It is read-only for now; it allows Replay to insert realtime instruction execution costs to the table.

* add test for cost_tracker atomically try_add operation, serves as safety guard for future changes

* check cost against local copy of cost_tracker, return transactions that would exceed limit as unprocessed transaction to be buffered; only apply bank processed transactions cost to tracker;

* bencher to new banking_stage with max cost limit to allow cost model being hit consistently during bench iterations

* replay stage feed back program cost (#17731)

* replay stage feeds back realtime per-program execution cost to cost model;

* program cost execution table is initialized into empty table, no longer populated with hardcoded numbers;

* changed cost unit to microsecond, using value collected from mainnet;

* add ExecuteCostTable with fixed capacity for security concern, when its limit is reached, programs with old age AND less occurrence will be pushed out to make room for new programs.

* investigate system performance test degradation  (#17919)

* Add stats and counter around cost model ops, mainly:
- calculate transaction cost
- check transaction can fit in a block
- update block cost tracker after transactions are added to block
- replay_stage to update/insert execution cost to table

* Change mutex on cost_tracker to RwLock

* removed cloning cost_tracker for local use, as the metrics show clone is very expensive.

* acquire and hold locks for block of TXs, instead of acquire and release per transaction;

* remove redundant would_fit check from cost_tracker update execution path

* refactor cost checking with less frequent lock acquiring

* avoid many Transaction_cost heap allocation when calculate cost, which
is in the hot path - executed per transaction.

* create hashmap with new_capacity to reduce runtime heap realloc.

* code review changes: categorize stats, replace explicit drop calls, concisely initiate to default

* address potential deadlock by acquiring locks one at time

* Persist cost table to blockstore (#18123)

* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to compaction excluding list alone side with TransactionStatusIndex in one place: `excludes_from_compaction()`

* Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
* Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory
* Only try to persist to blockstore when cost_table is changed.
* Restore cost table during validator startup

* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.

* log warning when channel send fails (#18391)

* Aggregate cost_model into cost_tracker (#18374)

* * aggregate cost_model into cost_tracker, decouple it from banking_stage to prevent accidental deadlock. * Simplified code, removed unused functions

* review fixes

* update ledger tool to restore cost table from blockstore (#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test

* manually fix merge conflicts

* Per-program id timings (#17554)

* more manual fixing

* solve a merge conflict

* featurize cost model

* more merge fix

* cost model uses compute_unit to replace microsecond as cost unit
(#18934)

* Reject blocks for costs above the max block cost (#18994)

* Update block max cost limit to fix performance regession (#19276)

* replace function with const var for better readability (#19285)

* Add few more metrics data points (#19624)

* periodically report sigverify_stage stats (#19674)

* manual merge

* cost model nits (#18528)

* Accumulate consumed units (#18714)

* tx wide compute budget (#18631)

* more manual merge

* ignore zerorize drop security

* - update const cost values with data collected by #19627
- update cost calculation to closely proposed fee schedule #16984

* add transaction cost histogram metrics (#20350)

* rebase to 1.7.15

* add tx count and thread id to stats (#20451)
each stat reports and resets when slot changes

* remove cost_model feature_set

* ignore vote transactions from cost model

Co-authored-by: sakridge <sakridge@gmail.com>
Co-authored-by: Jeff Biseda <jbiseda@gmail.com>
Co-authored-by: Jack May <jack@solana.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants