Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-35810: Incref heap-allocated types in PyObject_Init #11661

Merged
merged 10 commits into from
Mar 27, 2019

Conversation

eduardo-elizondo
Copy link
Contributor

@eduardo-elizondo eduardo-elizondo commented Jan 23, 2019

https://bugs.python.org/issue35810

Correctly incref an intance's type. Currently, if a heap-allocated type creates an instance through PyObject_{,GC}_New{Var}, the type won't incref. This adds a change to pull the incref to the core PyObject_Init{INIT} function to correctly incref heap-allocated types.

This now means that heap-allocated types that add a custom tp_dealloc, should decref the instance types - just like the default tp_dealloc does.

Currently there are 10 heap-allocated types in CPython:

  • PyCursesPanel_Type in _curses_panel.c
  • sslerror_type in _ssl.c
  • Example_Type in _testmultiphase.c
  • Str_Type in _testmultiphase.c
  • Tkapp_Type in _tkinter.c
  • Tktt_Type in _tkinter.c
  • PyTclObject_Type in _tkinter.c
  • Xxo_Type in xxlimited.c
  • Str_Type in xxlimited.c
  • Null_Type in xxlimited.c
  • Struct Sequences in structseq.c

Out of those only the following 5 types allocate instances through PyObject_{GC}_New{Var}:

  • PyCursesPanel_Type. Action: Added a decref in its tp_dealloc
  • Tkapp_Type. Action: Removed the manual incref after allocation.
  • Tktt_Type. Action: Removed the manual incref after allocation.
  • PyTclObject_Type. Action: Removed the manual incref after allocation.
  • Xxo_Type. No Action: It inherits the default tp_dealloc which already decrefs the type.
  • Struct Sequences. Action: Removed decref the type in the deallocation function.

@eduardo-elizondo
Copy link
Contributor Author

Previous reference: #10854 (comment)

cc @vstinner

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but this change looks the obviously break the backward compatibility on all heap-type used in the wild. I don't think that it's a good idea to break the backward compatibility like that. Or maybe I misunderstood something.

cc @encukou

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@vstinner
Copy link
Member

@eduardo-elizondo
Copy link
Contributor Author

eduardo-elizondo commented Jan 25, 2019

I have made the requested changes; please review again

Hi @vstinner, is there any particular scenario that you are concerned about?

This is indeed backwards compatible. The single case where behavior is not preserved causes a type to become immortal. This only arises under the following set of conditions: 1) A user manually jams NULL into the tp_new slot of a heap-allocated type. 2) The user manually increfs the type after object instantiation, and 3) The tp_dealloc slot decrefs the type. Outside of that, behavior is preserved. This is a very rare set of circumstances which most likely do not exist in the wild.

Anyways, let me know what you think! If you'd like to, I'll be happy to follow-up with a more thorough explanation on the correctness of this change.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@vstinner: please review the changes made to this pull request.

@eduardo-elizondo
Copy link
Contributor Author

cc @encukou

@encukou
Copy link
Member

encukou commented Feb 5, 2019

Please add an entry in Doc/whatsnew/3.8.rst (Build and C API Changes). This is something we'll definitely want extension authors to know about.

I do believe this is a good change to make, but we need to be very careful with it.

I ran some refleak tests (-R :) on Linux and found leaks in posix.stat_result. Reproducer:

>>> import os, sys, gc
>>> sys.getrefcount(os.stat_result)
149
>>> os.stat('/')
...
>>> gc.collect()
62
>>> sys.getrefcount(os.stat_result)
150

I assume this happens with all PyStructSequence types.
If you have time to test PyStructSequence C API, that would be great (but it's a bit out of scope of this issue/PR).

It's good that the refleak tests found a failure. I'm less happy about the fact that we'll need to run them to find OS-specific failures. Not sure if our CI will cover enough ground here. (@vstinner, do you know?)

I'm almost out of time today but keeping this on my agenda.

@eduardo-elizondo
Copy link
Contributor Author

eduardo-elizondo commented Feb 6, 2019

I have made the requested changes; please review again

Woops, I was originally only checking cpython/Modules for heap-allocated types! I forgot that I recently modified cpython/Objects/structseq.c to also use PyType_FromSpec. To answer your question - no, this only happens with StructSequence Types that are allocated through PyStructSequence_NewType. Probably posixmodule is the only one using this updated C-API. Also, the the refleak test that failed was the one that I added back then (#9665) - so it's already tested.

Anyways, the fix was quick and easy - It's actually covered under the 4 scenarios that discussed here: #10854 (comment). Refleak is happy now 👍

whatsnew/3.8.rst has been updated as well.


On another note, @encukou if you want to, on another PR, I can add more documentation on moving types from PyType_Ready to PyType_FromSpec. There are different scenarios that might lead to leaks and/or crashes. I've already documented some of the scenarios in the comment that I just pointed you to but maybe it should be moved to somewhere more discoverable. I think this will greatly reduce the debugging time for extension authors. Just let me know.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@vstinner: please review the changes made to this pull request.

@encukou
Copy link
Member

encukou commented Feb 6, 2019

On another note, @encukou if you want to, on another PR, I can add more documentation on moving types from PyType_Ready to PyType_FromSpec. There are different scenarios that might lead to leaks and/or crashes. I've already documented some of the scenarios in the comment that I just pointed you to but maybe it should be moved to somewhere more discoverable. I think this will greatly reduce the debugging time for extension authors. Just let me know.

That sounds great! Two things I'd expect from a HOWTO for the documentation:

  • It should say why heap types are better, and in which cases, not just that everyone should switch
  • Let's not document bugs as features: if there's something wrong in the current implementation, track it in issues instead :)

But even notes about the current state would be useful.

I will get to this next week again. Thank you very much for the effort, and I hope my limited time is not too discouraging.

@eduardo-elizondo
Copy link
Contributor Author

eduardo-elizondo commented Feb 11, 2019

That sounds great! Two things I'd expect from a HOWTO for the documentation:

Perfect! I'll get something nice written up this week!

I hope my limited time is not too discouraging.

Not at all! Really appreciate the reviews 😄

Looking forward to see what you think about moving this forward!

@vstinner
Copy link
Member

This change is going to break basically all C extensions which implement types. I would prefer to see a discussion on python-dev to make sure that we want to break the backward compatibility on purpose.

The strict minimum would be to test a few C extensions to check how many are broken by your PR. Examples: numpy, pandas, Cython, Pillow, lxml, etc.

@vstinner
Copy link
Member

@scoder: Would you mind to review this change?

@vstinner
Copy link
Member

I'm not strictly against "fixing" the C API. But any backward incompatible change must be *very carefully prepared and organized: communicate with maintainers of major C extensions, communicate properly on the change, maybe provide tooling to detect bugs and/or do the migration.

@eduardo-elizondo
Copy link
Contributor Author

It seems that all the issues have already been addressed as well as adding more documentation for porting into 3.8.

Let me know if there are any other pending changes.

cc @vstinner

@eduardo-elizondo
Copy link
Contributor Author

I have made the requested changes; please review again

I seems that everyone is in agreement about this change in https://bugs.python.org/issue35810. Let's try to push this change forward! cc @vstinner @encukou

@bedevere-bot
Copy link

Thanks for making the requested changes!

@vstinner: please review the changes made to this pull request.

@eduardo-elizondo
Copy link
Contributor Author

eduardo-elizondo commented Feb 26, 2019

Turns out that using the builtin github merge tool is a pain 😞 . Fixing

@encukou
Copy link
Member

encukou commented Mar 7, 2019

Hi!
I hope you don't mind pushing to the PR directly. I reworded the release notes documentation to hopefully make them clearer – it's IMO more efficient that way than giving you a list of nitpicks to sort out.
Let me know what you think.

@eduardo-elizondo
Copy link
Contributor Author

eduardo-elizondo commented Mar 7, 2019

@encukou looks great, it reads much better now. Thank you! 👍

By the way, it looks like this is ready to be merged!

@encukou
Copy link
Member

encukou commented Mar 14, 2019

  • discussion on python-dev: happened, discussion was directed to the issue.
  • test a few C extensions to check how many are broken:
    • numpy: a few unrelated tests fail (even with CPython master):
      • TestPickling.test_highest_available_pickle_protocol (numpy assumes 3.8 has PEP 574; we are not there yet)
      • TestRandomDist.test_binomial (deprecation warning treated as error)
      • TestRandomDist.test_hypergeometric (deprecation warning treated as error)
      • TestRandomDist.test_scalar_exception_propagation (related to hypergeometric)
    • pandas: tests coredump for me (even with CPython master) :(
    • Cython: build of dependencies fails on 3.8 PyThreadState changes (even with CPython master)
    • Pillow: tests pass
    • lxml: tests pass
  • communicate with maintainers of major C extensions: As it turns out the major C extensions do not use heap-allocated types (yet). AFAICS:
    • Numpy doesn't.
    • Pandas doesn't
    • Cython doesn't (but Stefan is on board here).
    • Pillow doesn't
    • lxml doesn't
    • PySide does (and Christian is aware).
      Anything more?
  • communicate properly on the change: I think the What's New entries are now adequate.
  • maybe provide tooling to detect bugs and/or do the migration - while tooling to detect refcount bugs would definitely be great, this sounds like overkill to me.
    • Extension authors should really be running refleak checks now. They mostly aren't, and that's partly on us for not providing the tools, but I think that's out of scope for this issue.
    • The migration is not difficult (see the Wat's New entry), and rare (heap types aren't widely used, and the brave early adopters who do use them seem prepared for some ironing out of issues).

@vstinner, do you have any more reservations?

@encukou encukou requested a review from vstinner March 14, 2019 11:12
@vstinner vstinner dismissed their stale review March 14, 2019 16:57

Sorry, I don't have the bandwidth to review/follow this PR. I just remove my previous review.

@scoder
Copy link
Contributor

scoder commented Mar 14, 2019

Cython: build of dependencies fails on 3.8 PyThreadState changes (even with CPython master)

Note sure what you mean here. Cython doesn't have any dependencies. Its test suite uses some libraries when they are installed (line profiler, numpy, coverage, jupyter), but without them it simply disables a couple of tests and that's it.

@scoder
Copy link
Contributor

scoder commented Mar 15, 2019

I just noticed that I forgot to express a big "Thank You!" to @encukou for digging up the references above and providing an excellent ground for estimating the risks.

FWIW, I ran Cython's test suite with this branch and it didn't show any issues. This doesn't guarantee that there won't be issues in third party Cython code, but at least (as expected) Cython itself doesn't seem to produce any problems here, which makes it very unlikely that many Cython implemented packages will suffer from this change. People sometimes do hackish C stuff in their Cython code, but probably not in this particular corner.

foo_struct *foo = PyObject_GC_New(foo_struct, (PyTypeObject *) type);
if (foo == NULL)
return NULL;
#if PY_VERSION_HEX < 0x03080000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Cython, we normally use the exact alpha/beta version where the change was introduced (to support testing during the CPython release cycle), but we obviously don't know that yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example is not really written for projects with Cython's level of involvement. Most projects will only test against a nightly build or the latest alpha/beta (if even that). The few others should know to adjust the example. We don't explicitly tell you to rename foo_new, either :)

Py_TYPE(op) = tp;
_Py_NewReference((PyObject *)op);
Py_SIZE(op) = size;
PyObject_Init((PyObject *)op, tp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this just call _PyObject_INIT() ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, actually, shouldn't PyObject_Init() just call _PyObject_INIT() ? Duplicating that code across two files seems unnecessary, maybe even a bit error prone.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyObject_Init is defined in the same file, so the compiler should have everything it needs to inline as it sees fit.

As for code duplication, that's a valid point, but I'd call that out of scope for this PR. Let's keep orthogonal issues revertable separately.

@eduardo-elizondo
Copy link
Contributor Author

Would it make sense to request a review on python-dev? Otherwise, @encukou how would you feel about reviewing the PR?

@encukou
Copy link
Member

encukou commented Mar 27, 2019

Sorry for the delay! I missed the notification about Victor dismissing his review :(

I am available today to watch the buildbots, so I'll merge.

@encukou encukou merged commit 364f0b0 into python:master Mar 27, 2019
miss-islington pushed a commit that referenced this pull request Nov 5, 2019
After #9665, this moves the remaining types in posixmodule to be heap-allocated to make it compatible with PEP384 as well as modifying all the type accessors to fully make the type opaque.

The original PR that got messed up a rebase: #10854. All the issues in that commit have now been addressed since #11661 got committed.

This change also removes any state from the data segment and onto the module state itself.


https://bugs.python.org/issue35381



Automerge-Triggered-By: @encukou
jacobneiltaylor pushed a commit to jacobneiltaylor/cpython that referenced this pull request Dec 5, 2019
After python#9665, this moves the remaining types in posixmodule to be heap-allocated to make it compatible with PEP384 as well as modifying all the type accessors to fully make the type opaque.

The original PR that got messed up a rebase: python#10854. All the issues in that commit have now been addressed since python#11661 got committed.

This change also removes any state from the data segment and onto the module state itself.


https://bugs.python.org/issue35381



Automerge-Triggered-By: @encukou
shihai1991 pushed a commit to shihai1991/cpython that referenced this pull request Jan 31, 2020
After python#9665, this moves the remaining types in posixmodule to be heap-allocated to make it compatible with PEP384 as well as modifying all the type accessors to fully make the type opaque.

The original PR that got messed up a rebase: python#10854. All the issues in that commit have now been addressed since python#11661 got committed.

This change also removes any state from the data segment and onto the module state itself.


https://bugs.python.org/issue35381



Automerge-Triggered-By: @encukou
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants