-
Notifications
You must be signed in to change notification settings - Fork 798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PVF worker: consider fork instead of thread for better isolation #574
Comments
If the memory usage in the parent is low, I believe fork should end up being equally cheap then threads. The only difference should be COW semantics on memory pages, but if there was very little (mutable) memory used by the parent then this should not really matter. |
@jpserrat If you are still keen on contributing, this would be a great way to make a big impact. It shouldn't be an overly demanding task, while still being interesting. |
@mrcnski I'm going to start working on this now. |
Here there is a section called "Why not fork?". But I didn't really understand the concerns and am ending for the day. |
My concern about the thread stuff pep mentioned was, what if we go back to multi-threaded execution at some point. A while back we found that multi-threaded had significantly better performance (at the cost of determinism): We also have the memory metrics and cpu time metrics running in separate threads. Those would now have to be on the forked process. But, I think with a separate process we can just have the parent call |
Great points. Regarding your concern, it has a high chance of happening, so I will make this implementation with this in mind, ensuring it's easy to switch if necessary. What do you think? Just to make sure that I got it all right. This change is going to be on the worker executor entrypoint right? When we spawn a new thread |
Here's the problem I see with the threads:
Yep, and also in the prepare worker entrypoint. :) But we can focus on execute right now to keep things simple. |
Sorry for things being up in the air while we figure this out. 😅 I am currently wondering if the threads can be removed:
(I appreciate your help with threads in the past. I just hadn't realized that they are not isolated and therefore not secure.) Unfortunately @s0me0ne-unkn0wn is on vacation and I'm waiting for his input. :) |
No problem, I'm looking at the code to get a better understanding during this time. :) |
Good question! Yeah, |
Now I get it, thanks! |
Ah, you mean how do we stop the child process when it times out? Good point, we still need a thread on the parent side which works in much the same way as |
Sorry for being late to the party. I've read through the discussion but haven't got a strong opinion yet. Here are some thoughts. First of all, threads are not evil by themselves. We were concerned about inter-job isolation, and that's why job processes are better than job threads, but threads running inside a job (and cleaned up with a job) are not something bad by definition. We could still use them if need be. Further, you consider the situation when the malicious code breaks out and falsifies the CPU/memory measurement done by some thread running inside a job. That's a valid point for execution (I still believe it's not so valid for preparation, and anyway, it's solved by the limiting global allocator that doesn't require threads). Yes, that may happen. But is that different from a situation when the code breaks out and falsifies validation results themselves, resulting in the inclusion of an invalid parachain block? IIUC, the latter carries a much higher risk than just reporting invalid CPU/RAM measurements, and it has nothing to do with threads at all. I mean, we shouldn't get rid of threads just to get rid of threads. Observing the resource consumption from the outside is a good idea (we still need to figure out if those measurements would be deterministic enough). We just don't have to push with eliminating threads without a clear need. |
Thanks @s0me0ne-unkn0wn, good points! About the measurements from the child, keeping the attack surface as low as possible is a good idea. If attackers falsify the validation result, that might be detectable if the attack fails on other machines and a dispute is raised - but falsified measurements would not be detectable. So I still say the child should do only the job it's spawned to do, just to avoid providing any additional attack surface however small it may be. About threads, you may be right, I was just concerned about pep's points linked above in "Why not fork". Thinking about it more, it may be preferable that no threads are present while
For CPU time it shouldn't affect determinism, since we are already relying on the kernel to get this stat - doing it from the parent instead should change nothing. For memory, what we have now is a polling-based method which is already not very deterministic at all. Using strace or or your global allocator approach would both be better in this regard. |
I don't think it's a big concern:
|
It does answer my question, thanks. Just one more point about this.
About the concern with thread and fork, as mentioned by @mrcnski the CPU and memory threads will start after the fork, do we need to worry about this? If there are other points of concern with this we can do as @s0me0ne-unkn0wn suggested with |
Oh, right... I hate Linux. 😂 We could check
I don't think But wait... now that we are forking, we can have the forked process set its own CPU time limit right before it runs any untrusted code. I believe it can be done with Great point @jpserrat ⭐️
With the above solution we won't need the CPU time thread. So execute jobs won't need any threads. For prepare jobs, the memory thread can be put on the side of the forked child. We decided that an attacker manipulating this value is not a concern in our threat model. Eventually it might be replaced with a global allocator tracker, but that remains an ongoing area of research. So for now, let's have a memory thread on the forked child and a thread that performs the prepare job, but no CPU time thread. |
In general case, we cannot assume that, but maybe we could just mimic its behavior? The parent process must have all the needed permissions iiuc. It's somewhere over there: https://elixir.bootlin.com/linux/v4.14/source/fs/proc/array.c#L390
That's true, but following your worst-case logic, the untrusted code can break out and reset that limit with |
That seems really hacky. 😂
Not so! Assuming the untrusted code can't escalate its privilege, which is much harder than just running arbitrary code:
|
Okay, I have an idea of how to do that in not so hacky way, but it introduces some overhead. We could build some matrioshka doll.
I believe the only overhead here is the second Just an idea and should be evaluated. |
Interesting idea! I think it has the issue noted above, where we can't get the CPU time with I approve of thinking out-of-the-box and Russian-doll approaches are always a win in my book. But what's wrong with |
Probably nothing's wrong with it. Somehow, I was sure you need root privilege to set the hard limit, but manpage says you can lower the limit without privileges, so it should be okay. I just tend to trust our resource-limiting code more than OS's code 🙂 Because our code is right before our eyes in the first place. But I understand that we shouldn't invent too many wheels as well. |
Phew! @jpserrat I hope everything is clear now. BTW, I'd be more than happy to provide mentorship. Let me know if you would like to get on a call to discuss more, or do some pair programming. My email's on my GitHub profile. |
That would be great @mrcnski, thanks! |
Ah, I think the concerns are not valid because we are calling This means, the forked child job process can have threads if it wants. And I think it will be necessary to monitor CPU time again, because setrusage has a little caveat I didn't know about. We can still use getrusage(CHILDREN) to avoid trusting the time metric reported by the child job process. However, we can no longer rely on the process getting killed when it surpasses the time limit. This means malicious code can take up validator resources up until the (very long) lenient timeout is reached. It's unfortunate, but I don't think it's a security issue. We could leave a followup to investigate e.g. |
Right now each PVF job runs in its own thread, but doing a
fork
may be better:In the end the thread spawning would just be replaced by a fork() call. Reason this might be good: It is just better isolation: The process can now only mess with its own address space and can not possible affect execution of upcoming candidates for example.
Credit: @eskimor
The text was updated successfully, but these errors were encountered: