-
Notifications
You must be signed in to change notification settings - Fork 399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid force_reset to make patching thread-safe #213
Conversation
Before, force_reset was used twice when instantiating the original baseclasses. It was necessary to unpatch things, since the classes were subclasses of patched classes *and* because the subclass referred to the super class explicitly by name. That is, httplib has code like: class HTTPConnection: ... class HTTPSConnection(HTTPConnection): "This class allows communication via SSL." default_port = HTTPS_PORT def __init__(self, host, port=None, ...): HTTPConnection.__init__(self, host, port, ...) Creating a real HTTPSConnection instance is now impossible as long as httplib.HTTPConnection is patched. To get around this, force_reset was used to temporary unpatch things. However, this means modifying global state and is thus not thread-safe. Modifying the classes once when vcr is loaded is better since no further patching and unpatching needs to be done after that. This code in this commit uses byteplay, which is not Python 3 compatible.
Wow, this is pretty mindblowing. I have to admit I haven't thought much about making VCR.py threadsafe. Is this all that is needed for it to work? I'm obviously not too excited about adding another dependency / doing bytecode-level patching. The goal is noble though. |
Sigh. This is why you are supposed to use super.
That would be a dealbreaker for its inclusion in this library. We could do something sort of hacky where we just patch the function with an exact rewrite of the existing function except with real super calls instead of explicit superclass calls (instead of the force reset). Is this really the only reason we do a force reset here? I feel like I vaguely remember some other reason.
This is only threadsafety assuming that there is exactly one active cassette at any given moment, but i do believe it would give us threadsafety in thise case. |
Even better: they are old-style classes so they actually can't use super. |
@kevin1024 wrote:
Thanks, that's a much better response than I had hoped for :-) @IvanMalison wrote:
Yeah, I would call the patch a proof of concept at best. I plan to rewrite it as a standalone piece of code that uses the I think making vcr.py threadsafe would be a good and reasonable thing. After all, making parallel HTTP requests is the prime use case of packages like
I think it is the main reason. One
errors, albeit in the single-threaded case. |
Cool, I'd love to take a look at that when you are done
yeah agreed, I just want to make sure we do so in a way that doesn't break any existing features/compatibility |
In the end I simply went with a bit of code that is imported at the top of my tests: import byteplay
def _patch_name(func, old, new):
"""Replace references to old with new in func bytecode.
This effectively renames variables and attributes in the code for
the function passed. When the function used to access 'old', it
will now instead access 'new'.
"""
bytecode = byteplay.Code.from_code(func.func_code)
wanted_opcodes = [byteplay.LOAD_GLOBAL, byteplay.LOAD_ATTR]
for i, (opcode, arg) in enumerate(bytecode.code):
if opcode in wanted_opcodes and arg == old:
bytecode.code[i] = (opcode, new)
func.func_code = bytecode.to_code()
# Make the original HTTPSConnection class refer to the original
# HTTPConnection class, always. This avoids "unbound method __init__()
# must be called with VCRHTTPConnection<something>.yaml instance as
# first argument (got HTTPSConnection instance instead)" errors.
#
# Please see https://github.com/kevin1024/vcrpy/issues/212 for more
# background and discussion.
httplib._HTTPConnection = httplib.HTTPConnection
httplib._HTTPSConnection = httplib.HTTPSConnection
_patch_name(httplib._HTTPSConnection.__init__.im_func,
'HTTPConnection', '_HTTPConnection')
_patch_name(httplib._HTTPSConnection.connect.im_func,
'HTTPConnection', '_HTTPConnection') Since the code is safe to run always (it doesn't change the behavior when no patching is done), I decided that was the easiest way. The |
FYI there is now a python3 library for byteplay: |
Before,
force_reset
was used twice when instantiating the original baseclasses. It was necessary to unpatch things, since the classes were subclasses of patched classes and because the subclass referred to the super class explicitly by name.That is,
httplib
has code like:Creating a real
HTTPSConnection
instance is now impossible as long ashttplib.HTTPConnection
is patched.To get around this,
force_reset
was used to temporary unpatch things. However, this means modifying global state and is thus not thread-safe. Modifying the classes once when vcr is loaded is better since no further patching and unpatching needs to be done after that.This code in this commit uses byteplay, which is not Python 3 compatible. I think byteplay could be upsed with Python 3 compatibility, but I haven't looked much into it. The project also seems semi-abandoned with the last release done in 2010.
Also, I will agree that this kind of bytecode level patching is at best controversial :-) However, since I only use it to fix a reference to a super-class, I think it could be acceptable here.
Fixes #212 and makes the test program there run stable.