
pvf: ensure enough stack space #5712

Merged: 3 commits merged into master from pep-pvf-stk-ovf on Jun 24, 2022

Conversation

pepyakin (Contributor):

This PR fixes a potential stack overflow.

Exploiting this is not trivial. We impose a limit of 65536 logical items on the wasm value stack. If we assume that each value was spilled on the stack (which is not trivial to achieve), that would be only 512 KiB [1]. If this were exploited, the process hosting the execution worker would be aborted, and the PVF host anticipates that [2].

Because exploitation is not trivial, I have not included a test. To test this locally, I had to lower the limits to actually trigger the condition.

Footnotes

  1. which makes me think that this number could be bumped.

  2. although it can still be problematic if core dumps are enabled.
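
For a rough sketch of both the arithmetic and the shape of the fix, assuming each spilled wasm value occupies 8 bytes of native stack (all names and the headroom constant below are illustrative, not the PR's actual code):

```rust
use std::thread;

// Illustrative numbers: the wasm value stack is capped at 65536 logical
// items; if every item is spilled as 8 bytes, that is 512 KiB of stack.
const MAX_WASM_VALUES: usize = 65_536;
const BYTES_PER_VALUE: usize = 8; // assumption for this sketch
const WASM_STACK_BYTES: usize = MAX_WASM_VALUES * BYTES_PER_VALUE; // 512 KiB

// Hypothetical extra native headroom on top of the wasm value stack.
const NATIVE_HEADROOM_BYTES: usize = 2 * 1024 * 1024;

fn execute_with_enough_stack<F: FnOnce() + Send + 'static>(f: F) {
    // Run the execution on a thread with an explicit stack size, so a
    // small default stack on the calling thread cannot overflow.
    thread::Builder::new()
        .stack_size(WASM_STACK_BYTES + NATIVE_HEADROOM_BYTES)
        .spawn(f)
        .expect("failed to spawn execution thread")
        .join()
        .expect("execution thread panicked");
}

fn main() {
    execute_with_enough_stack(|| println!("executed with enough stack"));
}
```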

pepyakin added the labels A0-please_review (Pull request needs code review.), B0-silent (Changes should not be mentioned in any release notes), C1-low (PR touches the given topic and has a low impact on builders.), A1-needsburnin, and D3-trivial 🧸 (PR contains trivial changes in a runtime directory that do not require an audit.) on Jun 21, 2022
pepyakin requested review from koute and eskimor on June 21, 2022 18:18
Resolved review threads: node/core/pvf/Cargo.toml (two threads), node/core/pvf/src/error.rs, node/core/pvf/src/executor_intf.rs
Comment on lines 151 to 153
// The simplest way to fix that is to spawn a new thread with the desired stack limit. One may
// think that it may be too expensive, but I disagree. The cost of creating a new thread is
// measured in microseconds, negligible compared to other costs.
koute (Contributor) commented on Jun 22, 2022:

This could be made essentially free if we'd just keep the threads around instead of spawning them from scratch every time, e.g. by constructing a rayon::ThreadPool and using its scope to spawn the task (so the code wouldn't even have to change significantly; see the sketch after this comment). But as you said, it might not matter considering the other costs.

Also, it's good to remember that spawning a new thread has indirect costs besides the direct cost of creating it. For example, assuming the full 256MB of stack space could be used from within WASM, spawning a new thread from scratch every time means those 256MB of stack space would have to be mmap'd (and zeroed) from scratch every time. There are also a few other cases where it affects things indirectly (e.g. by triggering process-wide kernel locks whose effect you can only see if there's something else running in the process concurrently). So the general assertion that "creating new threads is cheap" might be somewhat misleading depending on the situation.
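
A minimal sketch of that suggestion, assuming a dedicated rayon pool whose workers are built with the enlarged stack (the pool size and stack size here are illustrative):

```rust
use rayon::ThreadPoolBuilder;

fn main() {
    // Build the pool once; its worker threads are created with the large
    // stack up front and then reused across executions.
    let pool = ThreadPoolBuilder::new()
        .num_threads(2)
        .stack_size(256 * 1024 * 1024) // 256 MiB per worker (illustrative)
        .build()
        .expect("failed to build execution thread pool");

    // `install` runs the closure on a pool worker and returns its result,
    // so the calling code barely changes shape.
    let answer = pool.install(|| {
        // ... run the PVF execution here ...
        2 + 2
    });
    assert_eq!(answer, 4);
}
```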

pepyakin (Contributor, Author):

I'll answer here, @eskimor and @ordian, to kill two birds with one stone, and also provide more context for @koute:

How I see it working: yes, the stack would be allocated with mmap. However, that allocates only a mapping; the pages are not backed by anything until they are first accessed, and the zeroing also happens only then. So it's not precisely "from scratch every time". Moreover, since we are talking about Linux, glibc has a caching layer that recycles stacks of the same size, AFAIK. Between recycles, I expect madvise(MADV_DONTNEED) to be called, so the memory is marked as free and can be reused as needed, with the zeroing happening on the first access afterwards.

However, we can assume that in the worst case all of the stack will be dirty.

In the "spawn-and-forget" approach, this will release the memory between the executions.

In the thread pool approach, the stacks are not returned and not cleared. That means that if at least one execution dirtied 256 MiB of stack then it will be reserved for the rest of the lifetime of that thread. That also implies the stack won't have to be zeroed which improves performance and security-wise it's OK. Besides that, to my knowledge, stacks are mmaped without any special flags like MAP_LOCKED which would allow it to be swapped out. Thanks to that, if there was memory pressure, I would expect the unused parts will gradually be swapped out. On the other hand, accessing those pages leads to restoring from swap and that can be a hit.
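
To illustrate the lazy backing described above, a minimal standalone sketch using the libc crate (Linux-specific; not code from this PR):

```rust
// Demonstrates that an anonymous mmap only reserves address space:
// physical frames are allocated (and zeroed) on the first touch of a page.
use libc::{mmap, munmap, MAP_ANONYMOUS, MAP_FAILED, MAP_PRIVATE, PROT_READ, PROT_WRITE};
use std::ptr;

fn main() {
    const LEN: usize = 256 * 1024 * 1024; // 256 MiB, like the stack ceiling
    unsafe {
        // This only creates a mapping; resident memory barely changes here.
        let p = mmap(
            ptr::null_mut(),
            LEN,
            PROT_READ | PROT_WRITE,
            MAP_ANONYMOUS | MAP_PRIVATE,
            -1,
            0,
        );
        assert_ne!(p, MAP_FAILED);
        // The first write to a page faults it in: only now does the kernel
        // back and zero that single page, growing resident memory by one page.
        *(p as *mut u8) = 1;
        munmap(p, LEN);
    }
}
```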

pepyakin (Contributor, Author):

Perhaps the ideal approach would be to use a thread pool and then madvise(MADV_DONTNEED) the upper portion of the stack. That would let us not bother with stack clearing during the execution of benign PVFs [1], while in the degenerate cases we would only have to zero pages instead of swapping them in and out.

Footnotes

  1. Remember, we are reserving so much stack space only to account for the cases where a malicious PVF manages to overstep our deterministic wasm stack instrumentation.
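
A minimal sketch of that idea using the libc crate (a hypothetical helper, not code from this PR):

```rust
use libc::{madvise, MADV_DONTNEED};
use std::ffi::c_void;

/// Releases the physical pages backing `len` bytes of stack at `addr`
/// while keeping the mapping itself intact. The next access to those
/// pages faults in fresh zeroed pages instead of restoring old contents
/// from swap.
///
/// Safety: `addr` must be page-aligned, and the range must not contain
/// any live stack data of the calling thread.
unsafe fn release_upper_stack(addr: *mut c_void, len: usize) {
    if madvise(addr, len, MADV_DONTNEED) != 0 {
        eprintln!("madvise failed: {}", std::io::Error::last_os_error());
    }
}
```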

koute (Contributor):

This limit should essentially never be hit in normal circumstances, and even if it is hit, with the default of only 2 threads (from what you've said in the other comment) it'd be only ~512MB in the worst case. I think that's totally fine as-is.

koute (Contributor) left a review:

I'd really like to see some numbers on whether this has any performance impact, but otherwise it looks good to me.

eskimor (Member) left a review:

Looks sensible. A burn-in on Versi would be good before merging, just to make sure no crazy amounts of physical memory are acquired or anything like that.

eskimor (Member) commented on Jun 22, 2022:

@tdimitrov can you take care of burning this in on a validator group on Versi please?

pepyakin (Contributor, Author):

FWIW, just pushed a change with a thread pool as requested by @koute.

tdimitrov (Contributor) commented on Jun 23, 2022:

> @tdimitrov can you take care of burning this in on a validator group on Versi please?

Sure! @pepyakin, anything specific to look for in the logs/metrics (besides the memory usage, of course)?

pepyakin (Contributor, Author):

Can't think of anything else besides that.

eskimor (Member) commented on Jun 23, 2022:

Yeah, if nodes don't crash and memory usage doesn't spike, everything should be fine.

tdimitrov (Contributor):

I don't see anything suspicious in terms of crashes/memory usage:

[screenshot: node memory-usage metrics from the Versi burn-in]

eskimor merged commit 88e75c6 into master on Jun 24, 2022
eskimor deleted the pep-pvf-stk-ovf branch on June 24, 2022 at 11:16
eskimor (Member) commented on Jun 24, 2022:

Thanks @tdimitrov !

ordian added a commit that referenced this pull request Jun 25, 2022
* master:
  Bump `wasmtime` to 0.38.0 and `zstd` to 0.11.2 (companion for substrate#11720) (#5707)
  pvf: ensure enough stack space (#5712)
  Bump generic-array from 0.12.3 to 0.12.4 in /bridges/fuzz/storage-proof (#5648)
  pvf: unignore `terminates_on_timeout` test (#5722)
ordian added a commit that referenced this pull request Jun 28, 2022
* master:
  zombienet: try to fix parachains upgrade test (#5724)
  Update dependencies (companion for substrate#11722) (#5731)
  Update metric name and doc (#5716)
  Bump reqwest from 0.11.10 to 0.11.11 (#5732)
  add release-engineering to CI files' reviewers (#5733)
  Bump parity-scale-codec from 3.1.2 to 3.1.5 (#5720)
  Add checklist item (#5715)
  Fix 5560: add support for a new `staking-miner info` command (#5577)
  Bump `wasmtime` to 0.38.0 and `zstd` to 0.11.2 (companion for substrate#11720) (#5707)
  pvf: ensure enough stack space (#5712)
  Bump generic-array from 0.12.3 to 0.12.4 in /bridges/fuzz/storage-proof (#5648)
  pvf: unignore `terminates_on_timeout` test (#5722)
  Bump proc-macro2 from 1.0.39 to 1.0.40 (#5719)
  pass $COMPANION_OVERRIDES to check_dependent_project (#5708)
  Bump thread_local from 1.1.0 to 1.1.4 in /bridges/fuzz/storage-proof (#5687)
  Bump quote from 1.0.18 to 1.0.19 (#5700)
  Rococo: add new pallet-beefy-mmr API (companion for substrate#11406) (#5516)
  Update metric before bailing out (#5706)
  Add publish docker staking-miner (#5710)
al3mart pushed a commit that referenced this pull request Jul 14, 2022
* pvf: ensure enough stack space

* fix typos

Co-authored-by: Andronik <write@reusable.software>

* Use rayon to cache the thread

Co-authored-by: Andronik <write@reusable.software>