
Reduce MAPPING_BASE_COST from 10_000 to 1_500 #2564

Merged: 18 commits, Nov 13, 2024

Conversation

evanmarshall
Contributor

Lower GET, GET_OR_USE, & CONTAINS finalize fees

The goal of this PR is to lower fees safely, such that the change encourages more chain activity without impacting the safety of the network.
Specifically, the PR only changes a single value: MAPPING_BASE_COST, from 10_000 microcredits to 500 microcredits.

Previous work

The most comprehensive work IMO is this analysis: https://github.com/iamalwaysuncomfortable/analysis/blob/main/aleo_finalize_opcode_benchmark_and_pricing_analysis.ipynb as part of this PR: #2281

Some key observations taken from this analysis:

  • SET & GET are the two most commonly used opcodes
  • Most simple operations run in 1 to 5 µs and are priced at 500 microcredits
  • SET & GET have the same base cost of 10_000 microcredits, with different per-byte additional costs of 100 and 10 microcredits respectively
  • The base assumption for the fee model is that 1 second of single-threaded runtime should cost 100 credits

New Benchmarks Approach

I could not find the actual benchmarks for SET/GET in the linked previous work, but all of the other opcodes are included.
So I created these benchmarks for SET/GET. I decided not to include the benchmarks as part of this PR, but they can be reused when revisiting the fee model and as other DB optimizations land.

These new benchmarks use RocksDB and specifically benchmark finalize_store.update_key_value and finalize_store.get_value_speculative.
Although these calls are the core of the SET and GET operations, there is other overhead that is not included in this benchmark; that overhead should be roughly equivalent between SET and GET.
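
For context, a minimal Criterion-style sketch of this kind of benchmark, using the rocksdb crate directly as a stand-in for the finalize store (the real benchmarks call finalize_store.update_key_value and finalize_store.get_value_speculative; the crates, paths, and setup below are assumptions, not the code used to produce the numbers in this PR):

// Sketch only: benchmarks raw RocksDB put/get as a proxy for the finalize-store calls.
use criterion::{criterion_group, criterion_main, Criterion};
use rocksdb::DB;

fn bench_set_get(c: &mut Criterion) {
    let db = DB::open_default("/tmp/mapping-bench").expect("open rocksdb");
    // Pre-populate with unique keys so every get hits a previously written value.
    for i in 0u64..10_000 {
        db.put(i.to_le_bytes(), i.to_le_bytes()).expect("put");
    }
    let mut i = 10_000u64;
    c.bench_function("set_u64", |b| {
        b.iter(|| {
            i += 1; // unique key per iteration, as in the PR's benchmarks
            db.put(i.to_le_bytes(), i.to_le_bytes()).expect("put");
        })
    });
    let mut j = 0u64;
    c.bench_function("get_u64", |b| {
        b.iter(|| {
            j = (j + 1) % 10_000; // always hits an existing key
            db.get(j.to_le_bytes()).expect("get")
        })
    });
}

criterion_group!(benches, bench_set_get);
criterion_main!(benches);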

There are probably significant differences between running benchmarks and running on a live node under heavy load, but some care has been taken to keep the benchmarks realistic, including using unique keys for the gets and sets and using RocksDB instead of the test finalize store.

Results

Although priced similarly, GET & SET are extremely different in actual runtime.

For primitive types like a U64, a SET costs 11_200 and a GET costs 10_120 (a base cost of 10_000 plus the per-byte cost; serialized, a U64 is 12 bytes).
The SET runtime for a U64 is 170 µs and the GET runtime is 4.3 µs.
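
For reference, here is a minimal sketch (illustrative Rust, not the actual snarkVM cost code) of how those numbers fall out of the current model:

// Illustrative only: the constants mirror the values quoted above.
const MAPPING_BASE_COST: u64 = 10_000; // microcredits, shared by SET and GET today
const SET_PER_BYTE_COST: u64 = 100;    // microcredits per byte
const GET_PER_BYTE_COST: u64 = 10;     // microcredits per byte

fn finalize_fee(base: u64, per_byte: u64, bytes: u64) -> u64 {
    base + per_byte * bytes
}

fn main() {
    let u64_bytes = 12; // a serialized U64 value is 12 bytes
    assert_eq!(finalize_fee(MAPPING_BASE_COST, SET_PER_BYTE_COST, u64_bytes), 11_200);
    assert_eq!(finalize_fee(MAPPING_BASE_COST, GET_PER_BYTE_COST, u64_bytes), 10_120);
}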

Benchmark Summary

  • Type: small_struct (422 bytes)

    • Set Time: ~440.30 µs
    • Get Time: ~79.75 µs
  • Type: large_struct (6456 bytes)

    • Set Time: ~4690 µs
    • Get Time: ~135.55 µs
  • Type: array_1x1 (43 bytes)

    • Set Time: ~185.21 µs
    • Get Time: ~4.21 µs
  • Type: array_32x32 (1407 bytes)

    • Set Time: ~1.23 ms
    • Get Time: ~17.41 µs
  • Type: U8

    • Set Time: ~165.74 µs
    • Get Time: ~4.27 µs
  • Type: U16

    • Set Time: ~169.60 µs
    • Get Time: ~4.48 µs
  • Type: U32

    • Set Time: ~154.07 µs
    • Get Time: ~4.45 µs
  • Type: U64

    • Set Time: ~168.84 µs
    • Get Time: ~4.34 µs
  • Type: U128

    • Set Time: ~170.26 µs
    • Get Time: ~4.49 µs

Conclusions

GET is currently priced similarly to SET when, in many cases, there is a 40x difference in runtime.
We should lower the base price of GET to 500 to bring it more in line with other primitive operations.

The big outlier is structs, which perhaps deserve their own pricing model, but I wanted to keep this change as simple as possible.

[Screenshots: GET/SET benchmark results]

Safety

This change leads to a more consistent, though still imperfect, pricing model, but we should still consider the potential impact.

As a validator, we run our node on a 32-core, 64 GB memory machine. Our CPU utilization for the node over the last 7 days has stayed between 5% and 10% despite relatively heavy load. It seems that at current network activity levels, we are unlikely to risk overloading machines.

[Screenshot: validator CPU utilization over the last 7 days]

If there are other concerns I'm unaware of, please share.

Impact

For many functions that use public state, we can expect to reduce the transaction fee by 35% to 60%. As the team behind the Leo Wallet, the most frequent complaint we hear is about the high fees.

To pick a couple of programs I'm familiar with:

  • token_registry.aleo/transfer_public: 140_173 -> 92_673
  • pondo_protocol.aleo/deposit_public_as_signer: 477_484 -> 211_484

Since higher fees are still accepted, migration for downstream programs and applications isn't required, only highly recommended.

It will be difficult to raise fees in the future, so if we expect the runtime of GET to become more expensive over time, we should be more cautious about how much we lower the GET base fee.

@vicsn
Collaborator

vicsn commented Oct 25, 2024

As this would be the first PR which changes which transactions (and their fees) are deemed valid, we'll have to add some "migration" logic. Would defining it as follows suffice? The new pricing can apply from this height onwards. CC @raychu86 @d0cd

diff --git a/console/network/src/canary_v0.rs b/console/network/src/canary_v0.rs
index f0a973acd..5c3770789 100644
--- a/console/network/src/canary_v0.rs
+++ b/console/network/src/canary_v0.rs
@@ -147,6 +147,8 @@ impl Network for CanaryV0 {
     const INCLUSION_FUNCTION_NAME: &'static str = MainnetV0::INCLUSION_FUNCTION_NAME;
     /// The maximum number of certificates in a batch.
     const MAX_CERTIFICATES: u16 = 100;
+    /// The block height from which new consensus rules apply.
+    const CONSENSUS_V2_HEIGHT: u32 = 1_000; // TODO: adjust based on canary height.
     /// The network name.
     const NAME: &'static str = "Aleo Canary (v0)";
 
diff --git a/console/network/src/lib.rs b/console/network/src/lib.rs
index b0694d43a..483157e15 100644
--- a/console/network/src/lib.rs
+++ b/console/network/src/lib.rs
@@ -203,6 +203,8 @@ pub trait Network:
 
     /// The maximum number of certificates in a batch.
     const MAX_CERTIFICATES: u16;
+    /// The block height from which new consensus rules apply.
+    const CONSENSUS_V2_HEIGHT: u32;
 
     /// The maximum number of bytes in a transaction.
     // Note: This value must **not** be decreased as it would invalidate existing transactions.
diff --git a/console/network/src/mainnet_v0.rs b/console/network/src/mainnet_v0.rs
index 6aadaa3f3..1aaaecc9e 100644
--- a/console/network/src/mainnet_v0.rs
+++ b/console/network/src/mainnet_v0.rs
@@ -158,6 +158,8 @@ impl Network for MainnetV0 {
     const INCLUSION_FUNCTION_NAME: &'static str = snarkvm_parameters::mainnet::NETWORK_INCLUSION_FUNCTION_NAME;
     /// The maximum number of certificates in a batch.
     const MAX_CERTIFICATES: u16 = 16;
+    /// The block height from which new consensus rules apply.
+    const CONSENSUS_V2_HEIGHT: u32 = 3_000_000; // TODO: adjust based on mainnet height.
     /// The network name.
     const NAME: &'static str = "Aleo Mainnet (v0)";
 
diff --git a/console/network/src/testnet_v0.rs b/console/network/src/testnet_v0.rs
index fd793055a..0f42c7c90 100644
--- a/console/network/src/testnet_v0.rs
+++ b/console/network/src/testnet_v0.rs
@@ -147,6 +147,8 @@ impl Network for TestnetV0 {
     const INCLUSION_FUNCTION_NAME: &'static str = MainnetV0::INCLUSION_FUNCTION_NAME;
     /// The maximum number of certificates in a batch.
     const MAX_CERTIFICATES: u16 = 100;
+    /// The block height from which new consensus rules apply.
+    const CONSENSUS_V2_HEIGHT: u32 = 1_000; // TODO:  adjust based on testnet height.
     /// The network name.
     const NAME: &'static str = "Aleo Testnet (v0)";

@evanmarshall
Contributor Author

As this would be the first PR which changes which transactions (and their fees) are deemed valid

Do we have to add a migration? All previously valid transactions will still be valid.

@vicsn
Collaborator

vicsn commented Oct 25, 2024

As this would be the first PR which changes which transactions (and their fees) are deemed valid

Do we have to add a migration? All previously valid transactions will still be valid.

Good point. I still think it would be good for:

  • Defense in depth. I'm wondering whether this could make rejected transactions be considered valid (I don't know the potential reasons for rejection by heart), and I want to make reviewers' lives easy by not having to think about this.
  • Avoiding a synchronous restart. If only some validators have the upgrade, it could lead to a fork.

@raychu86
Contributor

Good point. I still think it would be good for:

  • Defense in depth. I'm wondering whether this could make rejected transactions be considered valid (I don't know the potential reasons for rejection by heart), and I want to make reviewers' lives easy by not having to think about this.
  • Avoiding a synchronous restart. If only some validators have the upgrade, it could lead to a fork.
  1. Technically there is no requirement to add additional migration logic for pricing reductions, because, previously, txes that have too few fees either never make it to DAG serialization or are aborted by honest validators. (Note that this is not the case for pricing increases, because those could invalidate accepted txes.) However, because mainnet is live and we cannot expect a coordinated restart of all nodes, we need to consider adding logic s.t. nodes automatically roll into the new change once it passes a future block X that we specify.

  2. Any consensus change (which this is) should increment the snarkOS message versions, so nodes on older versions do not connect to nodes on the new version. This is our defense against forking; however, it's a bit more nuanced.

TLDR: If we do a coordinated restart, then we can do without extra migration logic. Otherwise, migration logic may be necessary to keep the network rolling

@raychu86
Contributor

In general, since I assume there will be a larger effort to reduce fees across other parts of finalize/storage, those changes should be implemented together to avoid multiple migration efforts. Additionally, I would advise that a fee reduction discussion take place as an ARC so there is more weigh-in on pain points (and to discuss the migration considerations as well).

@evanmarshall
Contributor Author

TLDR: If we do a coordinated restart, then we can do without extra migration logic. Otherwise, migration logic may be necessary to keep the network rolling

In any case, we should plan on a synchronous upgrade as well to minimize any chance of a chain halt because during an async upgrade, it's likely neither version will have quorum. If we plan on a synchronous upgrade, do you think the migration is necessary? I'm happy to add it but I'm not sure if it's preferred to keep that branching code around forever.

In general, since I assume there will be a larger effort to reduce fees overall other parts of finalize/storage they should be implemented together to avoid multiple migration efforts. Additionally, I would advise a fee reduction discussion take place as an ARC so there is more weigh in regarding pain points (and to discuss the migration considerations as well).

I absolutely think we should examine every part of fees in general but I strongly suggest we implement this change immediately while we consider other larger changes. I don't think it requires an ARC as none of the mechanisms, logic, or assumptions are changing. The only part that's changing is fixing the inconsistency of pricing GET/GET_OR_USE/CONTAINS.

@vicsn
Collaborator

vicsn commented Oct 28, 2024

I agree with Evan that, given this change does not change assumptions and is urgent, it is unlikely to warrant an ARC. Future, bigger fee adjustment proposals likely will require broader discussion. But it's up to the Foundation to draw the line. CC @zkxuerb

I agree with Ray that there should be a gateway EVENT version bump. Want to add this in @evanmarshall? This also removes the need for migration logic.

A router MESSAGE version bump would force a coordinated restart for all the clients in the ecosystem if they want to continue functioning, which sounds very difficult to coordinate. Other approaches would be:

  1. Use migration logic, which gives clients a larger window to update.
  2. Don't use migration logic and don't do a MESSAGE version bump. This increases the risk that old clients will temporarily relay transactions with incorrect fees (which is unfortunate but acceptable), or that they fork because of syncing with incorrect blocks (I have not analysed this in depth yet).

@d0cd
Collaborator

d0cd commented Oct 31, 2024

Do we have data (either in this PR or prior) that helps us understand:

  1. How does the performance of get vary as the size of the data type increases?
  2. For gets of data types of varying sizes, how does the performance vary as the number of entries in the mapping increases?

I believe the current benchmarks are on small-sized gets, which runs the risk of missing a DoS vector at larger sizes.

@iamalwaysuncomfortable
Contributor

@evanmarshall @d0cd @vicsn @raychu86 @zkxuerb

Here is a set of benchmarks of multiple types run roughly February of this year using this branch of SnarkVM.

According to these results, existing entries + data size had a slight effect on GET time, but by at most 10k nanoseconds. This should be re-run independently to verify that this is the case with the latest branch of SnarkVM.

opcode type bytes rt_nanosecs rt_secs #_entries
contains field 32 16684.0833 0.000016684083   100000
contains u64 8 19199.46656 0.000019199467   500000
contains array,field,length,2,depth,0 64 21344.0327 0.000021344033   1000000
contains array,field,length,8,depth,1 2048 20746.66533 0.000020746665   100000
contains array,field,length,8,depth,0 256 18655.01273 0.000018655013   1
contains group 64 19559.88021 0.000019559880   1000000
contains array,field,length,2,depth,1 128 19795.68546 0.000019795685   100000
contains array,field,length,2,depth,2 256 18790.78975 0.000018790790   1
contains array,field,length,4,depth,1 512 19318.90495 0.000019318905   50000
contains array,field,length,16,depth,1 8192 18548.67723 0.000018548677   100000
contains array,field,length,16,depth,0 512 21658.67046 0.000021658670   1000000
contains array,field,length,4,depth,2 2048 19744.94803 0.000019744948   100000
contains array,field,length,4,depth,0 128 19628.12796 0.000019628128   500000
contains array,field,length,8,depth,2 16384 19507.82929 0.000019507829   100000
contains array,field,length,16,depth,2 131072 17602.99657 0.000017602997   1
get array,field,length,2,depth,0 64 17632.54062 0.000017632541   1
get array,field,length,16,depth,1 8192 19408.99476 0.000019408995   10000
get field 32 17205.72254 0.000017205723   10000
get array,field,length,8,depth,2 16384 18760.94211 0.000018760942   1
get array,field,length,4,depth,0 128 18106.38733 0.000018106387   100000
get array,field,length,8,depth,1 2048 19963.12907 0.000019963129   50000
get array,field,length,2,depth,2 256 20535.13759 0.000020535138   500000
get array,field,length,16,depth,0 512 22155.52666 0.000022155527   1000000
get array,field,length,4,depth,1 512 23084.59778 0.000023084598   1000000
get array,field,length,4,depth,2 2048 21478.06992 0.000021478070   500000
get array,field,length,8,depth,0 256 18784.7789 0.000018784779   100000
get array,field,length,2,depth,1 128 18501.47539 0.000018501475   1
get group 64 18667.25578 0.000018667256   1
get u64 8 19378.57399 0.000019378574   500000
get array,field,length,16,depth,2 131072 18154.70924 0.000018154709   1
get.or.use array,field,length,4,depth,1 512 24943.32693 0.000024943327   50000
get.or.use array,field,length,2,depth,1 128 21589.8717 0.000021589872   1000000
get.or.use array,field,length,2,depth,0 64 18829.94789 0.000018829948   200000
get.or.use u64 8 18567.80415 0.000018567804   10000
get.or.use array,field,length,4,depth,2 2048 25969.41762 0.000025969418   1000000
get.or.use field 32 19384.96555 0.000019384966   500000
get.or.use array,field,length,4,depth,0 128 18831.95341 0.000018831953   200000
get.or.use group 64 20138.47141 0.000020138471   100000
get.or.use array,field,length,8,depth,0 256 19632.41775 0.000019632418   200000
get.or.use array,field,length,16,depth,0 512 19811.54192 0.000019811542   200000
get.or.use array,field,length,8,depth,1 2048 23602.38492 0.000023602385   1000
get.or.use array,field,length,2,depth,2 256 22148.29278 0.000022148293   200000
get.or.use array,field,length,8,depth,2 16384 55572.94982 0.000055572950   50000
get.or.use array,field,length,16,depth,1 8192 34934.46167 0.000034934462   200000
get.or.use array,field,length,16,depth,2 131072 179140.3925 0.000179140392   1

@evanmarshall
Contributor Author

I ran a subset of the benchmarks provided by @iamalwaysuncomfortable on an updated snarkVM branch: https://github.com/demox-labs/snarkVM/tree/benches/updated-mappings

These benchmarks were run on an Apple M2 Max. I noticed a pretty big difference (±50%) in time depending on what I had open/running on my computer. Do we know whether the original measurements were taken on a specific AWS EC2 instance, or otherwise what machine and conditions they were run under?

When I ran the benchmarks and nothing else, I got:

get_field_1_entries_rocksdb
                        time:   [3.2964 µs 3.3324 µs 3.3700 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benching set_field_1_entries_rocksdb
set_field_1_entries_rocksdb
                        time:   [142.65 µs 145.04 µs 147.84 µs]
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) high mild
  1 (10.00%) high severe

Benching get_field_1000000_entries_rocksdb
get_field_1000000_entries_rocksdb
                        time:   [4.5343 µs 4.5907 µs 4.6217 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild

Benching set_field_1000000_entries_rocksdb
set_field_1000000_entries_rocksdb
                        time:   [146.09 µs 148.43 µs 150.67 µs]

Benching get_u64_1_entries_rocksdb
get_u64_1_entries_rocksdb
                        time:   [4.7081 µs 4.7722 µs 4.8537 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benching set_u64_1_entries_rocksdb
set_u64_1_entries_rocksdb
                        time:   [128.64 µs 130.35 µs 134.57 µs]
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) high severe

Benching get_u64_1000000_entries_rocksdb
get_u64_1000000_entries_rocksdb
                        time:   [5.6874 µs 5.7532 µs 5.8719 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benching set_u64_1000000_entries_rocksdb
set_u64_1000000_entries_rocksdb
                        time:   [132.95 µs 136.37 µs 140.74 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benching get_field_array_depth_2_length_8_1_entries_rocksdb
get_field_array_depth_2_length_8_1_entries_rocksdb
                        time:   [5.0197 µs 5.8679 µs 6.7498 µs]
Found 3 outliers among 10 measurements (30.00%)
  1 (10.00%) low mild
  2 (20.00%) high severe

Benching set_field_array_depth_2_length_8_1_entries_rocksdb
set_field_array_depth_2_length_8_1_entries_rocksdb
                        time:   [12.715 ms 12.787 ms 12.855 ms]

Some Observations:

  • How full the mapping is does matter, but not by a lot.
  • SET grows more expensive per byte faster than the 100 vs. 10 microcredit per-byte costs currently in the code imply.
  • If we use lto=true in the snarkVM release profile, we can get significant performance improvements.

@d0cd
Collaborator

d0cd commented Nov 6, 2024

Looking at the benchmarks for get, it seems like a random index is used.
Wouldn't this result in a DB access that is likely to miss, meaning that we're not measuring the full cost of an access?
Furthermore, it looks like we are only benching large value types; we should also bench large key types.

@evanmarshall
Copy link
Contributor Author

The benchmarks I created in the original comment ensured that each get hit a value that was previously set.

I'll add some benchmarks with different key sizes though. I think that's a great idea to test.

@iamalwaysuncomfortable
Contributor

iamalwaysuncomfortable commented Nov 8, 2024

@evanmarshall @raychu86 @d0cd @vicsn

Here are benchmarks of values with large keys. Analysis to follow.

Mappings Benched

mapping m_field_array_depth_2_length_4:
    key as [[[field; 4u32]; 4u32]; 4u32].public;
    value as [[[field; 4u32]; 4u32]; 4u32].public;

mapping m_field_array_depth_2_length_8:
    key as [[[field; 8u32]; 8u32]; 8u32].public;
    value as [[[field; 8u32]; 8u32]; 8u32].public;

Setup Notes

  1. RocksDB mappings were benched
  2. All lookups used keys randomly selected from the set of keys that were inserted into the RocksDB mappings.
  3. Inserting 1_000_000 entries into m_field_array_depth_2_length_8 serially took a full 13 hours

Benchmark Code

benches/enhanced-mappings

Results

Mappings with Large Keys & Values

Mapping Key Mapping Value Opcode Number of RocksDB Entries Avg µs OS Chip RAM
[[[field; 4u32]; 4u32]; 4u32] [[[field; 4u32]; 4u32]; 4u32] contains 1 39.714 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 4u32]; 4u32]; 4u32] [[[field; 4u32]; 4u32]; 4u32] get 1 50.166 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 4u32]; 4u32]; 4u32] [[[field; 4u32]; 4u32]; 4u32] get.or.use 1 59.001 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 4u32]; 4u32]; 4u32] [[[field; 4u32]; 4u32]; 4u32] contains 1_000_000 95.158 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 4u32]; 4u32]; 4u32] [[[field; 4u32]; 4u32]; 4u32] get 1_000_000 112.49 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 4u32]; 4u32]; 4u32] [[[field; 4u32]; 4u32]; 4u32] get.or.use 1_000_000 129.89 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 4u32]; 8u32]; 8u32] [[[field; 8u32]; 8u32]; 4u32] contains 1 220.94 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 8u32]; 8u32]; 8u32] [[[field; 8u32]; 8u32]; 4u32] get 1 269.33 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 4u32]; 8u32]; 8u32] [[[field; 8u32]; 8u32]; 4u32] get.or.use 1 318.92 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 4u32]; 8u32]; 8u32] [[[field; 8u32]; 8u32]; 4u32] contains 1_000_000 474.99 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 8u32]; 8u32]; 8u32] [[[field; 8u32]; 8u32]; 4u32] get 1_000_000 645.68 OSX Sonoma 14.5 M1 MAX 64 GB
[[[field; 4u32]; 8u32]; 8u32] [[[field; 8u32]; 8u32]; 4u32] get.or.use 1_000_000 673.76 OSX Sonoma 14.5 M1 MAX 64 GB

Atomic Aleo Instructions (field, u64)

Mapping Key Mapping Value Opcode Number of RocksDB Entries Avg µs OS Chip RAM
field field contains 1 6.6172 µs OSX Sonoma 14.5 M1 MAX 64 GB
field field contains 1_000_000 18.352 µs OSX Sonoma 14.5 M1 MAX 64 GB
field field get 1 6.7656 µs OSX Sonoma 14.5 M1 MAX 64 GB
field field get 1_000_000 20.799 µs OSX Sonoma 14.5 M1 MAX 64 GB
field field get.or_use 1 7.6780 µs OSX Sonoma 14.5 M1 MAX 64 GB
field field get.or_use 1_000_000 19.735 µs OSX Sonoma 14.5 M1 MAX 64 GB
u64 u64 contains 1 8.8390 µs OSX Sonoma 14.5 M1 MAX 64 GB
u64 u64 contains 1_000_000 16.045 µs OSX Sonoma 14.5 M1 MAX 64 GB
u64 u64 get 1 9.4391 µs OSX Sonoma 14.5 M1 MAX 64 GB
u64 u64 get 1_000_000 14.766 µs OSX Sonoma 14.5 M1 MAX 64 GB
u64 u64 get.or_use 1 13.608 µs OSX Sonoma 14.5 M1 MAX 64 GB
u64 u64 get.or_use 1_000_000 16.849 µs OSX Sonoma 14.5 M1 MAX 64 GB

Raw Results

Large Key & Value Types

[[[field; 4u32]; 4u32]; 4u32] - 1 entry

contains_field_array_depth_2_length_4_1_entries_rocksdb
                        time:   [33.617 µs 39.714 µs 48.348 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe

Benching get_field_array_depth_2_length_4_1_entries_rocksdb
get_field_array_depth_2_length_4_1_entries_rocksdb
                        time:   [49.137 µs 50.166 µs 51.037 µs]

Benching get.or_use_field_array_depth_2_length_4_1_entries_rocksdb
get.or_use_field_array_depth_2_length_4_1_entries_rocksdb
                        time:   [55.262 µs 59.001 µs 63.156 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

[[[field; 4u32]; 4u32]; 4u32] - 1_000_000 entries

Benching contains_field_array_depth_2_length_4_1000000_entries_rocksdb
contains_field_array_depth_2_length_4_1000000_entries_rocksdb
                        time:   [89.188 µs 95.158 µs 101.90 µs]

Benching get_field_array_depth_2_length_4_1000000_entries_rocksdb
get_field_array_depth_2_length_4_1000000_entries_rocksdb
                        time:   [105.67 µs 112.49 µs 124.89 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe

Benching get.or_use_field_array_depth_2_length_4_1000000_entries_rocksdb
get.or_use_field_array_depth_2_length_4_1000000_entries_rocksdb
                        time:   [117.60 µs 129.89 µs 141.77 µs]

[[[field; 8u32]; 8u32]; 8u32] - 1 entry

contains_field_array_depth_2_length_8_1_entries_rocksdb
                        time:   [205.45 µs 220.94 µs 236.34 µs]
                        change: [-13.076% -5.8931% +1.6257%] (p = 0.17 > 0.05)
                        No change in performance detected.
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low severe
  1 (10.00%) high severe

Benching get_field_array_depth_2_length_8_1_entries_rocksdb
get_field_array_depth_2_length_8_1_entries_rocksdb
                        time:   [258.91 µs 269.33 µs 286.08 µs]
                        change: [-26.703% -23.411% -19.214%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benching get.or_use_field_array_depth_2_length_8_1_entries_rocksdb
get.or_use_field_array_depth_2_length_8_1_entries_rocksdb
                        time:   [292.21 µs 318.92 µs 355.19 µs]
                        change: [+31.223% +44.052% +58.086%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) high mild
  1 (10.00%) high severe

[[[field; 8u32]; 8u32]; 8u32] - 1_000_000 entries

contains_field_array_depth_2_length_8_1000000_entries_rocksdb
                        time:   [411.35 µs 474.99 µs 570.07 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benching get_field_array_depth_2_length_8_1000000_entries_rocksdb
get_field_array_depth_2_length_8_1000000_entries_rocksdb
                        time:   [548.96 µs 645.68 µs 712.91 µs]

Benching get.or_use_field_array_depth_2_length_8_1000000_entries_rocksdb
get.or_use_field_array_depth_2_length_8_1000000_entries_rocksdb
                        time:   [622.53 µs 673.76 µs 721.37 µs]

Atomic Key & Value Types

field 1 - entry

contains_field_1_entries_rocksdb
                        time:   [5.9539 µs 6.6172 µs 7.3990 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benching get_field_1_entries_rocksdb
get_field_1_entries_rocksdb
                        time:   [6.4490 µs 6.7656 µs 7.0782 µs]

Benching get.or_use_field_1_entries_rocksdb
get.or_use_field_1_entries_rocksdb
                        time:   [6.8831 µs 7.6780 µs 9.3108 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe

field - 1_000_000 entries

contains_field_1000000_entries_rocksdb
                        time:   [15.130 µs 18.352 µs 21.536 µs]

Benching get_field_1000000_entries_rocksdb
get_field_1000000_entries_rocksdb
                        time:   [16.891 µs 20.799 µs 25.687 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benching get.or_use_field_1000000_entries_rocksdb
get.or_use_field_1000000_entries_rocksdb
                        time:   [15.632 µs 19.735 µs 25.047 µs]

u64 - 1 entry

Mapping Name: m_u64, Mapping Type: u64, Mapping Length: 0
Inserting key #0 into m_u64...
Benching contains_u64_1_entries_rocksdb
contains_u64_1_entries_rocksdb
                        time:   [7.9271 µs 8.8390 µs 9.7301 µs]

Benching get_u64_1_entries_rocksdb
get_u64_1_entries_rocksdb
                        time:   [7.8072 µs 9.4391 µs 11.672 µs]

Benching get.or_use_u64_1_entries_rocksdb
get.or_use_u64_1_entries_rocksdb
                        time:   [10.476 µs 13.608 µs 16.395 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

u64 - 1_000_000 entries

contains_u64_1000000_entries_rocksdb
                        time:   [14.524 µs 16.045 µs 17.262 µs]

Benching get_u64_1000000_entries_rocksdb
get_u64_1000000_entries_rocksdb
                        time:   [12.859 µs 14.766 µs 17.214 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benching get.or_use_u64_1000000_entries_rocksdb
get.or_use_u64_1000000_entries_rocksdb
                        time:   [13.454 µs 16.849 µs 23.615 µs]
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) high mild
  1 (10.00%) high severe

@iamalwaysuncomfortable
Contributor

As requested by @d0cd, benchmarks with large keys were performed for get, get.or_use, and contains.

Analysis of the data

The worst case seen was for this mapping (keys and values of 2048 bytes), where access times were in the range of 645.68 µs, or 0.64568 ms.

mapping m_field_array_depth_2_length_8:
    key as [[[field; 8u32]; 8u32]; 8u32].public;
    value as [[[field; 8u32]; 8u32]; 8u32].public;

Pricing at the standard of 100 credits per 1 second of runtime, this would lead to a price of 64_568 microcredits for this call, if the above benchmarks are to be believed.

Initial Reasoning about the Results

Doing a napkin correlation with the results posted above, it seems that large key sizes ARE a significant contributor to GET speeds, and contribute more to access speed than large values do. Reasoning about this, knowing that LSM-trees are the data structure behind RocksDB storage, it's likely that large keys create serious write and read amplification compared to much smaller key sizes.

Impacts

So what does this mean for pricing? We already price per byte of the key, so key size is accounted for in the pricing: contains, get, and get.or_use are priced with the formula MAPPING_BASE_COST + KEY_BYTES * MAPPING_PER_BYTE_COST.

Thus, @evanmarshall @fulltimemike, several explicit expectation tests with large key sizes should be added to ENSURE this is being accounted for in the pricing.
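
A sketch of what such an expectation test could look like (the helper and constants below are hypothetical stand-ins; it assumes GET's per-byte cost of 10 microcredits and a post-migration base of 1_500):

// Hypothetical expectation test: the per-byte term should dominate for large keys.
const MAPPING_BASE_COST: u64 = 1_500;  // microcredits (assumed post-migration value)
const MAPPING_PER_BYTE_COST: u64 = 10; // microcredits per byte of key

fn get_cost(key_bytes: u64) -> u64 {
    MAPPING_BASE_COST + key_bytes * MAPPING_PER_BYTE_COST
}

#[test]
fn large_keys_are_charged_per_byte() {
    // A 2_048-byte key (the worst case discussed above) should pay for its size.
    assert_eq!(get_cost(2_048), 21_980);
    // A small 8-byte key (u64) pays close to the base cost.
    assert_eq!(get_cost(8), 1_580);
}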

However, the question here is the BASE cost, and it seems that for smaller keys the baseline runtime is ~0.000017 seconds. Multiplying by 100 credits per second of runtime and converting to microcredits (i.e. 0.000017 s * 100 credits/s * 1_000_000 microcredits/credit) gives 1_700. Thus it seems to me that the base price might reasonably be anywhere from 1_000 to 2_000 microcredits.

The other major question is what will happen once we lower the price. It's likely that GET usage will go up, and so our worry won't be how long each individual call takes, it'll be the fact that there are many, many more GETs on the network. To account for this, perhaps the compromise is indeed a price in the range of 1_000 to 2_000.

Ultimately this is a tradeoff analysis between inviting people to use the chain for interesting applications and safety, and so I'd invite everyone else involved in this discussion to comment on this tradeoff.

@d0cd
Collaborator

d0cd commented Nov 8, 2024

@iamalwaysuncomfortable thanks for adding the additional data points!
I would agree that the base cost should be around 1_000 to 2_000.
The reason is that gets and contains are roughly 2-3x as slow when the mapping is "full" (roughly 1M entries). This assumes we believe that 1M entries is a reasonable representation of a production-sized mapping.

@raychu86
Copy link
Contributor

raychu86 commented Nov 8, 2024

Great analysis @iamalwaysuncomfortable!

In general, I would have liked a larger effort for repricing overall costs via an ARC rather than one-off PRs; especially since it's clear that accurate pricing is more of a moving target based on current network assumptions.

If there is a strong push to land this PR in the next release, I would advocate for a more conservative change as suggested by @iamalwaysuncomfortable and @d0cd. Pricing down in the future is much easier to do than pricing up.

@evanmarshall
Contributor Author

We can do a more conservative approach to reducing the fee.

I do think we should agree on a standard machine for running benchmarks. I'd suggest a c7a.8xlarge from AWS (32 core, 64GB memory) as it slightly undershoots the snarkOS client recommendations.

Running the benchmarks on this machine returns much lower numbers:

Mapping Name: m_field, Mapping Type: field, Mapping Length: 0
Inserting key #0 into m_field...
Benching contains_field_1_entries_rocksdb
contains_field_1_entries_rocksdb
                        time:   [3.4270 µs 3.4316 µs 3.4360 µs]

Benching get_field_1_entries_rocksdb
get_field_1_entries_rocksdb
                        time:   [3.6822 µs 3.6882 µs 3.6956 µs]

Benching get.or_use_field_1_entries_rocksdb
get.or_use_field_1_entries_rocksdb
                        time:   [3.8189 µs 3.8258 µs 3.8350 µs]
Benching contains_field_1000000_entries_rocksdb
contains_field_1000000_entries_rocksdb
                        time:   [13.748 µs 13.763 µs 13.787 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe

Benching get_field_1000000_entries_rocksdb
get_field_1000000_entries_rocksdb
                        time:   [14.028 µs 14.058 µs 14.087 µs]

Benching get.or_use_field_1000000_entries_rocksdb
get.or_use_field_1000000_entries_rocksdb
                        time:   [14.312 µs 14.329 µs 14.343 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Mapping Name: m_u64, Mapping Type: u64, Mapping Length: 0
Benching contains_u64_1_entries_rocksdb
contains_u64_1_entries_rocksdb
                        time:   [4.5086 µs 4.5123 µs 4.5142 µs]

Benching get_u64_1_entries_rocksdb
get_u64_1_entries_rocksdb
                        time:   [4.7126 µs 4.7198 µs 4.7296 µs]

Benching get.or_use_u64_1_entries_rocksdb
get.or_use_u64_1_entries_rocksdb
                        time:   [4.8637 µs 4.8681 µs 4.8720 µs]
contains_u64_1000000_entries_rocksdb
                        time:   [11.210 µs 11.219 µs 11.234 µs]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benching get_u64_1000000_entries_rocksdb
get_u64_1000000_entries_rocksdb
                        time:   [11.369 µs 11.387 µs 11.413 µs]

Benching get.or_use_u64_1000000_entries_rocksdb
get.or_use_u64_1000000_entries_rocksdb
                        time:   [11.831 µs 11.842 µs 11.859 µs]

@vicsn
Collaborator

Awesome, thank you for adding the tests!

Some CI is failing, e.g. due to clippy.
Others are failing as expected due to CI resource issues; if you think some of those may be impacted, you can run the tests locally.

(Inline review threads on synthesizer/src/vm/execute.rs and synthesizer/process/src/cost.rs, now resolved.)
@raychu86
Contributor

Have any devnets been run with the --test feature + tx generation?

We need to simulate the migration for fee requirements such that (see the sketch after this list):

  1. The transactions with the original fee amounts are valid before and after the V2 consensus block height.
  2. The transactions with the reduced fee amounts are valid only after the V2 consensus block height.
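
A rough sketch of that property in test form, using a hypothetical height-gated cost function like the one sketched earlier in the thread (not actual devnet tooling):

// Hypothetical: the same height-gated base cost sketched earlier in the thread.
fn mapping_base_cost(block_height: u32, consensus_v2_height: u32) -> u64 {
    if block_height >= consensus_v2_height { 1_500 } else { 10_000 }
}

#[test]
fn fee_requirements_across_consensus_v2_height() {
    let v2_height = 10u32;
    let original_fee = 10_000u64; // fee paid under the old pricing
    let reduced_fee = 1_500u64;   // fee paid under the new pricing

    // 1. Transactions paying the original fee are valid before and after the V2 height.
    assert!(original_fee >= mapping_base_cost(v2_height - 1, v2_height));
    assert!(original_fee >= mapping_base_cost(v2_height + 1, v2_height));

    // 2. Transactions paying only the reduced fee are valid only after the V2 height.
    assert!(reduced_fee < mapping_base_cost(v2_height - 1, v2_height));
    assert!(reduced_fee >= mapping_base_cost(v2_height + 1, v2_height));
}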

@iamalwaysuncomfortable
Contributor

In general, I would have liked a larger effort for repricing overall costs via an ARC rather than one-off PRs; especially since it's clear that accurate pricing is more of a moving target based on current network assumptions.

I echo this sentiment. I think another ARC needs to be introduced for a standard migration path as well. My fear about versioned constants is that they're ultimately a bit of a hardcoded migration path and therefore technical debt. However, we don't have a better answer right now, so this is what we proceed with.

I believe the current one-off price migration is worth including in testnet and canary testing since it's a core opcode.

We should start preparing some work on a longer-term migration path (and potentially work on true Ratifications?) that standardizes where nodes and other consumers of SnarkVM check for migration information and what constant sets should be provided.

Something like a check_migration_paths function that checks an enum of potential change areas and change timing (via block height), and handles the response to those.

Secondly, I do think the next immediate priority AFTER this migration is to start examining price structures generally. We do want to make the chain accessible for builders to build applications with, but significant analysis of what it means for pricing to be accessible, and what pricing model would encourage this while discouraging chain abuse, is key.

@zosorock added the enhancement (New feature or request) and v1.1.4 (Canary release v1.1.4) labels and removed the v1.1.4 (Canary release v1.1.4) label Nov 12, 2024
@raychu86
Contributor

raychu86 commented Nov 12, 2024

Have any devnets been run with the --test feature + tx generation?

I ran a quick Devnet with the following instructions:

  1. Use this PR's imports in snarkOS git = "https://github.com/demox-labs/snarkVM.git" rev = "a5847df"
  2. Add "test" to snarkVM features.
  3. Run cargo install --path .
  4. Run devnet with ./devnet.sh on testnet
  5. Generate and send transactions.
  6. Observe sent transactions and the validator generated transactions to monitor fees.

Transactions generated after block 10 did seem to transition to the lower fees correctly. However, this should be done again (and confirmed in the PR comments) after the requested changes have been made.

@evanmarshall
Contributor Author

evanmarshall commented Nov 13, 2024

I've addressed all PR feedback. I'm running a devnet now and I'll update this comment when everything looks good.

Edit:

I've tested on a devnet. I ran the tests as suggested by @raychu86. Additionally, I deployed a program with nested calls (the nested_call.aleo program from the execute.rs tests) and saw the fees lower as expected after the consensus V2 height.

@zkxuerb
Contributor

LGTM

@vicsn
Collaborator

Sorry for the late substantial new review; I was sick the past few days.

The new changes look good to me. Sharing some of my learnings for other reviewers:

  • Just for context, this recently merged PR stores deployments to disk. Upon loading a deployment (and after a restart), it constructs fresh new Stacks with the updated finalize fee. credits.aleo is kept in memory at all times.
  • Validators will have consistent views on fees:
    • validator on V1 height without this PR: believes the old fees are correct.
    • validator on V1 height with this PR: believes the old fees are correct by using execution_cost_v1 and skipping the cached value in Stack.
    • honest validators on V2 height are assumed to have this PR and believe the new fees are correct by using execution_cost_v2 and the updated value in Stack.
  • For future migrations we should also consider whether to evict the partially_verified_transaction cache. For this PR it's not relevant, because we don't cache fee verification and, moreover, this PR is a strict loosening of the validation criteria.
  • I will prioritize improving the unit testing framework by Q1 2025 at the latest. It is currently very slow because it regenerates so much duplicate data, and it's also hard to see at which "level" of abstraction adjustments have to be tested exactly.

@zosorock merged commit 79402b5 into ProvableHQ:staging Nov 13, 2024
84 checks passed
@zosorock changed the title from "Reduce MAPPING_BASE_COST from 10_000 to 500" to "Reduce MAPPING_BASE_COST from 10_000 to 1_500" on Nov 13, 2024