Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable recursive classattribute #982

Closed
kngwyu opened this issue Jun 19, 2020 · 9 comments
Closed

Enable recursive classattribute #982

kngwyu opened this issue Jun 19, 2020 · 9 comments
Assignees

Comments

@kngwyu
Copy link
Member

kngwyu commented Jun 19, 2020

We can use Self as a classattribute, like this:

#[pyclass]
struct Foo {}
#[pymethods] 
impl Foo {
    #[classattr] 
    const FOO: Foo = Foo {};
}

.
However, by #975, this is temporally prohibited due to a soundness problem, and thus this code panics.
We should be able to enable this by changing the strategy for initializing tp_dict as suggested by @scalexm in the comment.

@scalexm
Copy link
Contributor

scalexm commented Jun 19, 2020

I’ll take this one.

@kngwyu
Copy link
Member Author

kngwyu commented Jun 19, 2020

Thanks, I assigned you.

@scalexm
Copy link
Contributor

scalexm commented Jun 21, 2020

So I did a bit of archeology. As @davidhewitt noted, the tp_dict documentation features contradictory information about modifying the tp_dict after PyType_Ready has been called.

A git blame on the warning text shows that it's more recent than the text saying that extra attributes can be added to the tp_dict. It points to the following issue: #12719. There's not much information on that issue, but it points to another one, of which the former issue seems to be a duplicate: #1878.

On the latter issue, it is explained that adding attributes to the tp_dict can interact negatively with the method cache, although not in a memory unsafe way, but rather by exhibiting inconsistent results when accessing class attributes. It seems indeed that both #1878 and #12719 were initially about inconsistent reading of class attributes modified through the tp_dict.

In #1878, it's explained that one can disable this interaction with the method cache by unsetting the Py_TPFLAGS_HAVE_VERSION_TAG flag on the type object. Reading the _PyType_Lookup implementation, along with that comment, I understand that Py_TPFLAGS_HAVE_VERSION_TAG simply enables/disables the method cache altogether.

After reading the implementation, I believe that there is another method: PyType_Modified invalidates the method cache, and according to the documentation, its purpose is exactly that of modifying class attributes through the tp_dict.

We have an example in cpython where the tp_dict is changed: in the sys module, an attribute is deleted right after PyType_Ready is called: https://github.com/python/cpython/blob/6989af0bc7ea1e9a1acea16794e6f723d7b44110/Python/sysmodule.c#L2745-L2757

The implementation doesn't bother with Py_TPFLAGS_HAVE_VERSION_TAG or PyType_Modified because no one had the time to populate the method cache yet (the GIL is held), so there is no need to invalidate it.

In our case, we're going to fill the tp_dict by calling into user code, which may try to access the class attributes (and the tp_dict may appear empty since we're in the middle of filling it). So in our case, I believe that we should either disable Py_TPFLAGS_HAVE_VERSION_TAG, but this will also disable the method cache, or simply call PyType_Modified right after we've filled the tp_dict and are still holding the GIL.

I have a working implementation which I'll post on monday (I wrote the code at work and don't have access to it right now).

@davidhewitt
Copy link
Member

Thanks for the the really thorough investigation here!

The implementation doesn't bother with Py_TPFLAGS_HAVE_VERSION_TAG or PyType_Modified because no one had the time to populate the method cache yet (the GIL is held), so there is no need to invalidate it.

Can't recursive class attributes potentially do things like create objects of the type and call methods on them? I might be tempted to call PyType_Modified just to be sure.

@kngwyu
Copy link
Member Author

kngwyu commented Jun 22, 2020

Wow, thank you for the detailed survey!

Though I still don't understand the context completely, it looks that we don't have to use PyType_Modified since it does nothing when Py_TPFLAGS_VALID_VERSION_TAG is not set. See https://github.com/python/cpython/blob/3.8/Objects/typeobject.c#L268.

I addition, thanks to ripgrep, I found another instance where tp_dict is modified after PyType_Ready is called: https://github.com/python/cpython/blob/3.8/Modules/pyexpat.c#L1654.
It looks like this function also assumes that Py_TPFLAGS_VALID_VERSION_TAG is not set 🤔

So I guess it's not problematic to modify tp_dict because now we don't use Py_TPFLAGS_VALID_VERSION_TAG, though I haven't investigated what this flag is.

@kngwyu
Copy link
Member Author

kngwyu commented Jun 22, 2020

So I guess it's not problematic to modify tp_dict because now we don't use Py_TPFLAGS_VALID_VERSION_TAG, though I haven't investigated what this flag is.

Ah sorry, this is wrong. Py_TPFLAGS_VALID_VERSION_TAG is set by _PyType_Lookup.
But I still think we don't need to disable Py_TPFLAGS_HAVE_VERSION_TAG when we don't call _PyType_Lookup after PyType_Ready, before modifying tp_dict.
So...

In our case, we're going to fill the tp_dict by calling into user code, which may try to access the class attributes

How about calling the user code before calling PyType_Ready?

@scalexm
Copy link
Contributor

scalexm commented Jun 22, 2020

Can't recursive class attributes potentially do things like create objects of the type and call methods on them? I might be tempted to call PyType_Modified just to be sure.

Yes, that’s my thought too.

I addition, thanks to ripgrep, I found another instance where tp_dict is modified after PyType_Ready is called: https://github.com/python/cpython/blob/3.8/Modules/pyexpat.c#L1654.
It looks like this function also assumes that Py_TPFLAGS_VALID_VERSION_TAG is not set 🤔

I guess this is the same reasoning as with the sys module: the tp_dict is changed right after PyType_Ready has been called, and the GIL is held the whole time, so nobody was able to fill the method cache yet hence it doesn’t need to be invalidated.

How about calling the user code before calling PyType_Ready?

I think we agreed that it was potentially unsound to start using the type object from user code before PyType_Ready has been called (many things are left uninitialized, for example the tp_dict itself), and we’d need to use it to do the allocation of the class in the case of a class attribute recursively using the class itself.

@kngwyu
Copy link
Member Author

kngwyu commented Jun 22, 2020

I think we agreed that it was potentially unsound to start using the type object from user code before PyType_Ready has been called

Ah, sorry, I actually forgot that we're tackling the 'recursive' case 😓 Then I think PyType_Modified works well.

@kngwyu
Copy link
Member Author

kngwyu commented Jun 30, 2020

Closed via #995

@kngwyu kngwyu closed this as completed Jun 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants