Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wasm] use sbrk instead of emscripten mmap or malloc when loading bytes into heap #100106

Merged
merged 4 commits into from
Apr 9, 2024

Conversation

kg
Copy link
Member

@kg kg commented Mar 21, 2024

Emscripten's implementations of mmap and malloc have a lot of overhead, particularly time spent calling memset to zero brand new pages it got by growing the heap. During startup we lose a lot of time to this, because we have to reserve space for things like the sgen card tables and the contents of all our dll files (they live in the heap for metadata decoding).

This PR tries using sbrk instead in the hot paths for improved startup, and based on my measurements, it makes most of the time spent in memset go away. It's theoretically a breaking change though since someone could have been relying on the ability to _free the result of mono_wasm_load_bytes_into_heap'd preloaded assets, and our code might be relying on the ability to release pages to the "OS" in wasm?

EDIT: PR updated to not replace all our uses of mmap, and only replace load_bytes_into_heap for now - the rest of them will take more work to replace.

@kg kg added the arch-wasm WebAssembly architecture label Mar 21, 2024
Copy link
Contributor

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

@kg
Copy link
Member Author

kg commented Mar 22, 2024

/azp run runtime

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@kg
Copy link
Member Author

kg commented Mar 22, 2024

As suspected it looks like mono_vfree does need to work on wasm, even though there's not really much point in handing pages back (the heap can't ever shrink and there's no wasm equivalent of MADV_DONTNEED).

@kg kg changed the title [wasm] use sbrk instead of emscripten mmap where possible [wasm] use sbrk instead of emscripten mmap or malloc when loading bytes into heap Mar 22, 2024
@kg kg marked this pull request as ready for review March 22, 2024 22:59
@kg kg requested review from lewing and pavelsavara as code owners March 22, 2024 22:59
@SingleAccretion
Copy link
Contributor

SingleAccretion commented Mar 25, 2024

mmap and malloc have a lot of overhead, particularly time spent calling memset to zero brand new pages it got by growing the heap

It is expected that the mmap emulation needs to zero pages, but why does malloc do this? It sounds like a bug. malloc doesn't need to return zeroed memory.

@kg
Copy link
Member Author

kg commented Mar 25, 2024

mmap and malloc have a lot of overhead, particularly time spent calling memset to zero brand new pages it got by growing the heap

It is expected that the mmap emulation needs to zero pages, but why does malloc do this? It sounds like a bug. malloc doesn't need to return zeroed memory.

During startup malloc == mmap

EDIT: And looking at another profile, dlcalloc also calls memset, and we use it for our malloc0 (this is another case where it's technically right to memset, but unnecessary if the pages were just created by sbrk)

@kg kg requested review from lambdageek and steveisok as code owners March 27, 2024 01:00
@kg
Copy link
Member Author

kg commented Mar 27, 2024

Experimenting with reconfiguring dlmalloc on wasm to use sbrk instead of emscripten mmap/munmap, since they're fake.

@kg
Copy link
Member Author

kg commented Mar 27, 2024

Since we determined that we shouldn't be using dlmalloc at all, removing the dlmalloc change. It did work, though.

kg added 3 commits March 27, 2024 10:32
…avoid the overhead of emscripten's mmap

HACK: Use sbrk instead of mmap to allocate wasm pages, to skip memset. Disable vfree.
Fix build
Revert mmap experiment
@pavelsavara
Copy link
Member

I can see --zero-filled-memory in wasm-opt args on some of my CI lane, maybe that makes difference ?

Wasm.Console.Bench.Sample -> /__w/1/s/src/mono/sample/wasm/browser-bench/Console/bin/browser-wasm/Wasm.Console.Bench.Sample.dll
  Compiling native assets with /usr/local/emscripten/emsdk/upstream/emscripten/emcc with -Oz. This may take a while ...
  [1/4] corebindings.c -> corebindings.o [took 0.32s]
  [2/4] pinvoke.c -> pinvoke.o [took 0.33s]
  [3/4] driver.c -> driver.o [took 0.34s]
  [4/4] runtime.c -> runtime.o [took 0.36s]
  Linking for initial memory $(WasmInitialHeapSize)=33554432 bytes. Set this msbuild property to change the value.
  Linking with emcc with -O2. This may take a while ...
   "/usr/local/emscripten/emsdk/node/14.18.2_64bit/bin/node" /usr/local/emscripten/emsdk/upstream/emscripten/src/compiler.js /tmp/tmpiu3cpv7n.json --symbols-only
   "/usr/local/emscripten/emsdk/upstream/bin/wasm-ld" -o /__w/1/s/artifacts/obj/mono/Wasm.Console.Bench.Sample/browser.wasm.Release/browser-wasm/wasm/for-build/dotnet.native.wasm /__w/1/s/artifacts/obj/mono/Wasm.Console.Bench.Sample/browser.wasm.Release/browser-wasm/wasm/for-build/pinvoke.o /__w/1/s/artifacts/obj/mono/Wasm.Console.Bench.Sample/browser.wasm.Release/browser-wasm/wasm/for-build/driver.o /__w/1/s/artifacts/obj/mono/Wasm.Console.Bench.Sample/browser.wasm.Release/browser-wasm/wasm/for-build/corebindings.o /__w/1/s/artifacts/obj/mono/Wasm.Console.Bench.Sample/browser.wasm.Release/browser-wasm/wasm/for-build/runtime.o /__w/1/s/artifacts/bin/microsoft.netcore.app.runtime.browser-wasm/Release/runtimes/browser-wasm/native/libicudata.a /__w/1/s/artifacts/bin/microsoft.netcore.app.runtime.browser-wasm/Release/runtimes/browser-wasm/native/libicui18n.a /__w/1/s/artifacts/bin/microsoft.netcore.app.runtime.browser-wasm/Release/runtimes/browser-wasm/native/libicuuc.a /__w/1/s/artifacts/bin/microsoft.netcore.ap...
   "/usr/local/emscripten/emsdk/node/14.18.2_64bit/bin/node" /usr/local/emscripten/emsdk/upstream/emscripten/src/compiler.js /tmp/tmp1vo_lkp4.json
   "/usr/local/emscripten/emsdk/upstream/bin/wasm-opt" --post-emscripten -O2 --low-memory-unused --zero-filled-memory --pass-arg=directize-initial-contents-immutable --strip-producers /__w/1/s/artifacts/obj/mono/Wasm.Console.Bench.Sample/browser.wasm.Release/browser-wasm/wasm/for-build/dotnet.native.wasm -o /__w/1/s/artifacts/obj/mono/
...

@kg kg merged commit 001d60a into dotnet:main Apr 9, 2024
31 of 36 checks passed
@pavelsavara
Copy link
Member

I'm bit sad we didn't see this to pass
image

@kg
Copy link
Member Author

kg commented Apr 9, 2024

I'm bit sad we didn't see this to pass image

It passed a few times before, so I was pretty comfortable not re-running it

matouskozak pushed a commit to matouskozak/runtime that referenced this pull request Apr 30, 2024
…es into heap (dotnet#100106)

Use sbrk to allocate persistent space for large load_bytes_into_heap calls to avoid the startup overhead of emscripten's malloc and memset for the new space
@github-actions github-actions bot locked and limited conversation to collaborators May 10, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-wasm WebAssembly architecture area-VM-meta-mono
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants