fix: Workaround for mmap crash under Emscripten
#20418
Merged
When running Polars in Pyodide, using `read_csv()` (and possibly other functions that read from a file) causes a crash during `mmap`: a huge chunk of memory is allocated because the file's metadata reports an incorrect file length, and that length is passed to `mmap`.

This PR implements a workaround that applies only when running under Emscripten: the file length is derived by seeking to the end of the file, and the `mmap` size is set explicitly in the options before mapping.

With this, things work under Pyodide again.
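
For readers unfamiliar with the technique, here is a minimal sketch of the same idea using only the Python standard library. It is not the code from this PR, which changes Polars' Rust internals; the function names `map_file_with_explicit_length` and `demo` are hypothetical, and the sketch only illustrates deriving the length by seeking and passing it explicitly to the mapping call instead of trusting the file's metadata.

```python
import mmap
import os
import tempfile


def map_file_with_explicit_length(path):
    """Read a file via mmap, sizing the mapping by seeking rather than by stat metadata.

    Hypothetical helper for illustration only; not part of Polars.
    """
    with open(path, "rb") as f:
        # Seeking to the end returns the new absolute offset, i.e. the file length.
        size = f.seek(0, os.SEEK_END)
        f.seek(0)
        if size == 0:
            # Empty files are returned directly instead of being mapped.
            return b""
        # Pass the length explicitly instead of letting it be inferred.
        with mmap.mmap(f.fileno(), length=size, access=mmap.ACCESS_READ) as m:
            return m[:]  # copy the mapped bytes out before the mapping closes


def demo():
    # Write a small CSV to a temporary file, map it, then clean up.
    with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
        tmp.write(b"a,b\n1,2\n")
        path = tmp.name
    try:
        return map_file_with_explicit_length(path)
    finally:
        os.remove(path)
```

Calling `demo()` returns the bytes that were written to the temporary file. On a platform where the metadata-reported size is unreliable, the seek-derived `size` is what keeps the mapping from being over-allocated.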
Emscripten is currently working on reimplementing the virtual filesystem from scratch (WasmFS). Once that work has been merged and hits Pyodide, it's likely that we can remove this workaround.
cc @ritchie46