Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-118209: Add structured exception handling to mmap module #118213

Merged
merged 35 commits into from
May 10, 2024

Conversation

Dobatymo
Copy link
Contributor

@Dobatymo Dobatymo commented Apr 24, 2024

If DONT_USE_SEH is not defined, catch page errors and access violations when reading/writing the mmap pointer and convert it into a Python exception.

Methods done:

  • mmap_read_method
  • mmap_read_byte_method
  • mmap_read_line_method
  • mmap_write_method
  • mmap_write_byte_method
  • mmap_item
  • mmap_subscript
  • mmap_ass_item
  • mmap_ass_subscript
  • mmap_move_method
  • mmap_gfind (hopefully I didn't miss any resource handling here)

Other comments:

  • LsaNtStatusToWinError is equivalent to RtlNtStatusToDosError (see comment below). So I am using this since it's easier to call.
  • one other tiny bug is fixed in mmap_read_line_method. Previously the file pointer would still be advanced in case of a memory allocation error.

Issues with the PR:

  • other methods should be handled as well, but I want to wait on some feedback first
    • mmap_resize_method

I don't know how to fix:

  • mmap_buffer_getbuf: this creates a buffer object without actually reading from memory (yet). This will cause issues when the buffer is read in other parts of the code. The only way to fix this would be make all buffer methods SEH aware also... :(

See

Copy link

cpython-cla-bot bot commented Apr 24, 2024

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-app
Copy link

bedevere-app bot commented Apr 24, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link

bedevere-app bot commented Apr 24, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link

bedevere-app bot commented Apr 24, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Copy link
Member

@zooba zooba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR looks good so far, though as this is a draft I'll leave it with you for now.

One other possibility may be to protect PyBytes_FromString internally, which could cut off an entire class of crashes instead of just one. Thoughts?

@Dobatymo
Copy link
Contributor Author

One other possibility may be to protect PyBytes_FromString internally, which could cut off an entire class of crashes instead of just one. Thoughts?

That's a possibility too. However I don't know if there's a big performance impact if the exception handling is used there. PyBytes_FromString seems to be a quite commonly used function. On the other hand, the MapViewOfFile family are the only functions I can find which reference the EXCEPTION_IN_PAGE_ERROR exception.

@zooba
Copy link
Member

zooba commented Apr 24, 2024

I don't know if there's a big performance impact if the exception handling is used there.

Native exceptions are zero cost until an exception is actually thrown, so I wouldn't worry about that. The compiler generates a static table that the OS refers to in order to figure out which handler to invoke.

That said, a number of the other functions you mentioned won't go through that path, so there's probably going to be a need to add it directly into mmap, particularly for writing.

Maybe a helper function to do a protected memcpy is the way to go? That would let us entirely scope the exception handling and make sure we aren't mismanaging any state inside of it (which is the usual issue). It also means we can have a plain implementation on non-Windows (for now) and keep the code that uses it looking fairly consistent.

@Dobatymo
Copy link
Contributor Author

Dobatymo commented Apr 24, 2024

Maybe a helper function to do a protected memcpy is the way to go?

I think that's probably a good idea. However there are still some rogue pointer derefs (like the one-character optimization mentioned above) which can still throw.

EDIT: This is out-of-scope for this PR. But Linux probably has similar issues. I couldn't find good docs regarding the situtation (I am more familiar with Windows), but Linux will probably send some signal like SIGBUS/SIGSEV in this scenario. Are you supposed to handle them when mmapping?

@zooba
Copy link
Member

zooba commented Apr 24, 2024

Linux probably has similar issues. I couldn't find good docs regarding the situtation (I am more familiar with Windows), but Linux will probably send some signal like SIGBUS/SIGSEV in this scenario

I'm in the same position as you, and agree that it's out of scope for this PR. But if we set things up so that it's easy to multiplex per-platform, we'll at least set up someone who does know to be able to do it.

What that means to me is that we want to contain __try/__except to helper functions that don't interact with Python objects at all (you can assume that the caller has "pinned" any memory that gets passed in) and have an error result that basically means "invalid pointer". So if safe_memcpy(...) or safe_read_char(char *dest, const char *src) return < 0 (typical for our APIs), then we can raise an error on all platforms, and eventually maybe they get implementations that are more complex than just *dest = *src; return 0;

@eryksun
Copy link
Contributor

eryksun commented Apr 24, 2024

Linux will probably send some signal like SIGBUS/SIGSEV in this scenario. Are you supposed to handle them when mmapping?

Yes, on POSIX SIGBUS and SIGSEGV are raised synchronously on the executing thread. I mentioned this on the linked issue. I think if this crash scenario is to be handled by the mmap type, it should be addressed on POSIX as well as Windows.

@Dobatymo
Copy link
Contributor Author

Dobatymo commented Apr 25, 2024

I used LsaNtStatusToWinError to convert the error code which should do they same as RtlNtStatusToDosError but easier to load. I created a function call safe_memcpy which returns 0 if it succeeds and -1 otherwise and restructured mmap_read_byte_method and mmap_read_method to use it.

It will be very difficult to handle all the functions in this way though. For example there would need to be additional functions like safe_memchr for mmap_read_line_method and for other functions I need to go even deeper to see if they need any cleanup or can simply be wrapped (like mmap_find_method->mmap_gfind->_PyBytes_Find->stringlib_find->???)

rename HandlePageException to filter_page_exception

Co-authored-by: Eryk Sun <eryksun@gmail.com>
@Dobatymo
Copy link
Contributor Author

By the way regarding the ntstatus to win32 error conversion
I ran

# pip install ctypes-windows-sdk tqdm
from cwinsdk.um.winternl import RtlNtStatusToDosError
from cwinsdk.um.ntsecapi import LsaNtStatusToWinError

from tqdm import trange

for i in trange(-(2**31), 2**31):
    a = RtlNtStatusToDosError(i)
    b = LsaNtStatusToWinError(i)
    if a != b:
        raise AssertionError(f"NTSTATUS={i}, RtlNtStatusToDosError={a}, LsaNtStatusToWinError={b}")

and it shows that the output of both functions is the same for the full range of ints. LsaNtStatusToWinError doesn't have to be manually loaded like RtlNtStatusToDosError though.

@eryksun
Copy link
Contributor

eryksun commented Apr 25, 2024

it shows that the output of both functions is the same for the full range of ints. LsaNtStatusToWinError doesn't have to be manually loaded like RtlNtStatusToDosError though.

I forgot about that one. It's not exactly surprising that it behaves the same. 😉

0:005> u advapi32!LsaNtStatusToWinError l1
ADVAPI32!LsaNtStatusToWinError:
00007ff9`32869740 48ff2569430700  jmp     qword ptr [ADVAPI32!_imp_RtlNtStatusToDosError (00007ff9`328ddab0)]
0:005> dq 00007ff9`328ddab0 l1
00007ff9`328ddab0  00007ff9`32e23140
0:005> ln 00007ff9`32e23140
(00007ff9`32e23140)   ntdll!RtlNtStatusToDosError   |  (00007ff9`32e232e0)   ntdll!RtlSetLastWin32Error
Exact matches:
    ntdll!RtlNtStatusToDosError (void)

Copy link
Member

@zooba zooba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how this is looking. Let us know when you'd like CI run and we can hit the button to get it going, and feel free to start adapting other cases in this file.

Dobatymo and others added 3 commits April 26, 2024 12:27
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
Co-authored-by: Eryk Sun <eryksun@gmail.com>
@zooba
Copy link
Member

zooba commented May 7, 2024

!buildbot .windows.

@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @zooba for commit 9737f22 🤖

The command will test the builders whose names match following regular expression: .*windows.*

The builders matched are:

  • ARM64 Windows Non-Debug PR
  • AMD64 Windows10 PR
  • AMD64 Windows Server 2022 NoGIL PR
  • AMD64 Windows11 Non-Debug PR
  • AMD64 Windows11 Bigmem PR
  • AMD64 Windows11 Refleaks PR
  • ARM64 Windows PR

@zooba
Copy link
Member

zooba commented May 7, 2024

I'm torn between just merging (to make beta 1) and being more thorough, but I think we should be more thorough. This can easily go in for beta 2, so the only urgency is to make sure we get good test coverage.

@zooba
Copy link
Member

zooba commented May 7, 2024

Looks like we may need to use or switch to __restrict instead of restrict based on the compiler version (or the level of C99 support) - restrict is not one of our listed requirements, and it seems easy enough to avoid:

#if defined(_MSC_VER) && (!defined(__STDC_VERSION__) || (__STDC_VERSION__+0) < 201112L)
#define restrict __restrict
#endif

(Possibly that condition has to be broken up, but I think it ought to work.)

@Dobatymo
Copy link
Contributor Author

Dobatymo commented May 7, 2024

We can also simply remove restrict. I only used it because it was part of the signature for memcpy here for C99 https://en.cppreference.com/w/c/string/byte/memcpy

@zooba
Copy link
Member

zooba commented May 7, 2024

We can also simply remove restrict.

Works for me.

We'll get this in for beta 2 at this stage.

@Dobatymo
Copy link
Contributor Author

Dobatymo commented May 8, 2024

@zooba @eryksun Thank you so much for all your help! This is my first PR for CPython and I had a very good experience.

@zooba zooba merged commit e85e8de into python:main May 10, 2024
36 checks passed
@zooba
Copy link
Member

zooba commented May 10, 2024

Congratulations!

@zooba zooba added the needs backport to 3.13 bugs and security fixes label May 10, 2024
@miss-islington-app
Copy link

Thanks @Dobatymo for the PR, and @zooba for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 10, 2024
…dule (pythonGH-118213)

(cherry picked from commit e85e8de)

Co-authored-by: Dobatymo <Dobatymo@users.noreply.github.com>
@bedevere-app
Copy link

bedevere-app bot commented May 10, 2024

GH-118887 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label May 10, 2024
zooba pushed a commit that referenced this pull request May 10, 2024
…H-118213)

(cherry picked from commit e85e8de)

Co-authored-by: Dobatymo <Dobatymo@users.noreply.github.com>
@Dobatymo Dobatymo deleted the mmap-seh branch June 15, 2024 05:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants