near-vm: use a recycling pool of shared code memories instead of a in-memory cache of loaded artifacts #9244
Conversation
As I was preparing this PR, I remembered there is another kind of memory that would benefit from the same sort of treatment – the wasm linear memory. These are allocated for each instance, but are also a little more tricky in that they have strict zeroing requirements, and possibly other changes around trapping on OOB accesses would be good to make too.
Tricky code, and in parts of the code base I never really looked at...
I've done a review now and left some questions. But I'd be more comfortable if @Ekleog-NEAR would also take a look.
LGTM! Though there are quite a few comments to handle, so I’ll have another look once the discussions calm down :)
@@ -2,170 +2,254 @@
// Attributions: https://github.com/wasmerio/wasmer/blob/master/ATTRIBUTIONS.md

//! Memory management for executable code.
use near_vm_compiler::{CustomSectionRef, FunctionBodyRef};
use near_vm_vm::{Mmap, VMFunctionBody};
As you say, it’d be nice, later, to handle wasm memory in the same way, as it’ll allow us to remove ~300 LoC of custom mmap implementation :)
LGTM! I think the only comment that needs addressing before landing is the one on alignment computation, though it’d also be nice to make sure the kernel doesn’t just mmap overcommitted memory, by writing one byte every page_size bytes of the code memories upon creation :)
super::LimitedMemoryPool::new(1, 10 * rustix::param::page_size()).unwrap()
});
let pool =
    self.pool.unwrap_or_else(|| panic!("Universal::code_memory_pool was not set up!"));
Why not .expect("...")?
It doesn’t read right! The text would need to be negated, but that then doesn’t make sense to the person seeing this message.
executable_size: usize,
/// Addresses `0..executable_end` contain executable memory.
///
/// This is always page-aligned.
Nit: the data section must start page-aligned; we don’t have any code that makes executable_end itself be page-aligned.
// plausibly be. This is not necessarily true – there may be WebAssembly
// instructions that expand by more than 4 times in terms of instruction size after
// a conversion to x86_64. In that case a re-allocation will occur and executing
// that particular function call will be slower. Not to mention there isn't a
// strong guarantee on the upper bound of the memory that the contract runtime may
// require.
LimitedMemoryPool::new(32, 16 * 1024 * 1024).unwrap_or_else(|e| {
LimitedMemoryPool::new(8, 64 * 1024 * 1024).unwrap_or_else(|e| {
Especially with 64MiB mmaps, it might make sense to poke each page of each mapping to make sure the kernel didn’t overcommit them. It’d make initial startup a tad slower, but we’d avoid the variance of pagefaulting and needing to reallocate
Faulting in maps like this is somewhat of a bogus thing to do. The kernel can freely swap out pages from memory even after they have been faulted in, so in memory-starved situations the same cliff still exists. The work-around to this behaviour is to use mlock; however, using that opens its own can of worms – systems aren’t typically set up for a huge amount of locked memory. For instance, on my system 8MiB is the maximum you can mlock without further (rlimit) changes.
Good point, I had thought of the kernel just not allocating the memory, not of the kernel swapping it out.
.fold(0, |acc, data| round_up(acc + data.bytes.len(), DATA_SECTION_ALIGNMENT.into()));
let total_len = 0;
let total_len = function_bodies.iter().fold(total_len, |acc, func| {
    round_up(acc + function_allocation_size(*func), ARCH_FUNCTION_ALIGNMENT.into())
I don’t think this is correct. Here you’re first adding the allocation and then rounding up to ARCH, while ARCH is the alignment for the allocation itself and not for the next allocation. In practice it’ll work because we only have a single alignment type before aligning to page and then switching to another alignment type, but I think the proper way of doing this (more future-proof should we add other alignments later on) would be:
round_up(acc, ARCH_FUNCTION_ALIGNMENT.into()) + function_allocation_size(*func)
Hm, so this really depends on what one considers the padding bytes to belong to.
In assembly due to how labels work the padding bytes will usually appear as-if they belong to the preceding function:
func1:
    add
    div
    ret
    ud2 ; padding
    ud2 ; padding
func2:
    ...
Which sort-of matches the current code a little better. But I don’t have a strong opinion either way.
My issue with the initial writing is: if we ever were to have first items with an alignment of 8 and then items with an alignment of 16, then the alignment for the first item of the second type would be wrong, because we align after func1 with the alignment of func1 and not of func2, to reuse your example.
Creating new memory maps is expensive, and so is changing their permissions. Doing this every time a function is called adds up, and we don't really _need_ to do it: a fairly basic pool of reusable resources suffices (so long as they are fairly meticulously managed).
Taking out `source_pool` before a failure means that the memory won't be returned to the pool even though it still may be usable?
This possibly mitigates the RocksDB slowness of loading contracts from storage after the in-memory cache fronting RocksDB was removed in #9244. As part of the 1.35 release[^1] it turned out that this had caused a significant bimodality in chunk application times – mostly attributed to the storage layer. This, we suspected, had started causing increased validator kick-out numbers. The original LRU cache fronting RocksDB stored 128 units of loaded contracts; at a rough estimate of 4MiB of machine code per contract, this comes out to 512MiB. Of course, the two caches are still very different – this one stores uncompressed blocks containing compiled machine code, while the previous cache stored contracts loaded into memory. I'm confident that the overhead of loading the contract code into memory is not particularly notable at this point, though, so the difference is something we can live with. [^1]: https://near.zulipchat.com/#narrow/stream/297873-pagoda.2Fnode/topic/release.201.2E36/near/379169243
…d of a in-memory cache of loaded artifacts (near#9244)" (near#10788) This reverts commit bb02713.
This gives us the potential to manage our memory consumption and worst-case function call expense in a much more controlled way.
This may, however, show up as a decrease in continuous param estimator outputs for function call costs, depending on how and what workload it tests.