Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Derive Macro for IntoPyDict #3350

Closed
wants to merge 57 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
57 commits
Select commit Hold shift + click to select a range
f64aa86
better
Jul 28, 2023
c35b598
better
Jul 28, 2023
077e470
better
Jul 28, 2023
c87345d
finished non generic macro
Jul 28, 2023
533992d
macro exapnsion works
Jul 28, 2023
b77e9e1
cleanup
Jul 28, 2023
2362538
Merge branch 'PyO3:main' into main
vidur2 Jul 28, 2023
b86bf14
more stuff
Jul 28, 2023
66a3442
tests better?
Jul 28, 2023
4056990
committing files
Jul 28, 2023
c4bf9db
pre merge
Jul 28, 2023
774864e
Merge remote-tracking branch 'unforked/main'
Jul 28, 2023
b9215d4
added to newsfragments directory
Jul 28, 2023
3d8e257
initial
Jul 28, 2023
fd1d6f4
updated trait
Jul 28, 2023
40c7636
added formatting
Jul 28, 2023
6e5601b
better?
Jul 28, 2023
e0264c3
formatted
Jul 28, 2023
509942e
changed crate structure
Jul 28, 2023
25f28ca
formatting
Jul 28, 2023
70468f3
better?
Jul 28, 2023
a047c3d
better?
Jul 28, 2023
afb6f12
trying to fix ci
Jul 28, 2023
67891b6
more ci fixes
Jul 28, 2023
31be92d
fixed tmp issues
Jul 28, 2023
e6f18ba
better?
Jul 28, 2023
b26e64d
removed clippy error
Jul 28, 2023
7e08475
changed to pass msvri test
Jul 28, 2023
33f09a3
works
Jul 28, 2023
695a6ad
better?
Jul 28, 2023
04dead8
reformat
Jul 28, 2023
4f99a11
changed to match msrv
Jul 28, 2023
ed271fb
reformat
Jul 28, 2023
7d3c2f5
fixed closure
Jul 28, 2023
6ff9870
reformatted
Jul 28, 2023
0ef5f7c
removed unnessescary return
Jul 28, 2023
0641561
added more testing coverage
Jul 28, 2023
ef0d86b
fixed issue
Jul 28, 2023
1c88606
removed bloat
Jul 28, 2023
3065ba6
fixed clippy issue
Jul 29, 2023
14f8b01
put final body into tokens
Jul 29, 2023
8a933b7
better
Jul 29, 2023
be9a6a6
error handling for unallowed types
Jul 29, 2023
81bd87d
more extensive error handling
Jul 29, 2023
31b4e7f
fixed double generic parsing issue
Jul 29, 2023
08366df
edited error message to be clearer
Jul 29, 2023
d014295
added more error handling
Jul 29, 2023
dd594b8
added compile test
Jul 29, 2023
3f35396
added test suite for invalid
Jul 30, 2023
896f483
fixed invalidation tests
Jul 30, 2023
8385b1b
reorg of testing suit
Jul 30, 2023
f89ba6a
added support for py name option
Jul 30, 2023
1415231
less convoluted
Jul 30, 2023
ee19677
map code better
Jul 30, 2023
2adfbee
Merge branch 'main' of https://github.com/PyO3/pyo3
Jul 30, 2023
20c90c4
removed duplicate test compile fail
Jul 31, 2023
d8bc842
removed need to imoprt pyo3::types::IntoPyDict outside of macro if no…
Jul 31, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions newsfragments/3350.added.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Added derive macro for ```IntoPyDict```
366 changes: 366 additions & 0 deletions pyo3-macros-backend/src/intopydict.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,366 @@
use quote::{quote, TokenStreamExt};
use std::{collections::HashMap, ops::AddAssign};

use proc_macro2::{Span, TokenStream};
use syn::{
parse::{Parse, ParseStream},
DeriveInput, Error, Generics,
};

const COL_NAMES: [&str; 8] = [
"BTreeSet",
"BinaryHeap",
"Vec",
"HashSet",
"LinkedList",
"VecDeque",
"BTreeMap",
"HashMap",
];
Comment on lines +10 to +19
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel quite strongly that instead of special-casing this macro on names of types the implementation should emit code based on traits. In this case, if we go with the proposal of deep IntoPyDict, we should use autoref-specialization which prefers IntoPyDict if implemented and otherwise falls back on IntoPy<PyObject>.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this apply to collections as well (ie Vecs, Sets etc)? These would get stored as a PyObject instead of a PyList?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And how do you recommend I use autoref specialization? I don't have much experience with it. Is there a crate which makes it easier?

Copy link
Member

@adamreichold adamreichold Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A separate crate usually doesn't help because abstracting the pattern away is often more complex than just applying it.

In this case, the general technique is described at https://github.com/dtolnay/case-studies/blob/master/autoref-specialization/README.md and you can find an example of its application in PyO3's existing macros by looking for usages of the type PyClassImplCollector defined at

pub struct PyClassImplCollector<T>(PhantomData<T>);

and for example its relation to the trait PyClassNewTextSignature defined at

pub trait PyClassNewTextSignature<T> {

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I add the specialization feature as a crate attribute?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No; the specialization feature is incomplete and unsound. Autoref specialization is a technique which works on stable Rust to get a specialization-like result for some limited cases.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be some issues with reference moving when I try to do autoref specialization. Sorry it took me a while for this, I was finishing up an internship


#[derive(Debug, Clone)]
enum Pyo3Type {
Primitive,
NonPrimitive,
CollectionSing(Box<crate::intopydict::Pyo3Type>),
// Map(
// Box<crate::intopydict::Pyo3Type>,
// Box<crate::intopydict::Pyo3Type>,
// ),
}

#[derive(Debug, Clone)]
pub struct Pyo3DictField {
name: String,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be an Ident or &Ident, not String.

attr_type: Pyo3Type,
attr_name: Option<String>,
}

impl Pyo3DictField {
pub fn new(name: String, type_: &str, span: Span, attr_name: Option<String>) -> Self {
Self {
name,
attr_type: Self::check_primitive(type_, span),
attr_name,
}
}

fn check_primitive(attr_type: &str, span: Span) -> Pyo3Type {
for collection in COL_NAMES {
if attr_type.starts_with(collection) {
let attr_type = attr_type.replace('>', "");
let attr_list: Vec<&str> = attr_type.split('<').collect();
let out = Self::handle_collection(&attr_list, span);

return out.unwrap();
}
}

match attr_type {
"i8" | "i16" | "i32" | "i64" | "i128" | "isize" | "u8" | "u16" | "u32" | "u64"
| "u128" | "usize" | "f32" | "f64" | "char" | "bool" | "&str" | "String" => {
Pyo3Type::Primitive
}
_ => Pyo3Type::NonPrimitive,
}
}

fn handle_collection(attr_type: &[&str], span: Span) -> syn::Result<Pyo3Type> {
match attr_type[0] {
"BTreeSet" | "BinaryHeap" | "Vec" | "HashSet" | "LinkedList" | "VecDeque" => {
Ok(Pyo3Type::CollectionSing(Box::new(
Self::handle_collection(&attr_type[1..], span).unwrap(),
)))
}
"BTreeMap" | "HashMap" => {
Err(Error::new(span, "Derive currently doesn't support map types. Please use a custom implementation for structs using a map type like HashMap or BTreeMap"))
}
"i8" | "i16" | "i32" | "i64" | "i128" | "isize" | "u8" | "u16" | "u32" | "u64"
| "u128" | "usize" | "f32" | "f64" | "char" | "bool" | "&str" | "String" => {
Ok(Pyo3Type::Primitive)
}
_ => Ok(Pyo3Type::NonPrimitive),
}
}
}

impl Parse for Pyo3Collection {
fn parse(input: ParseStream<'_>) -> syn::Result<Self> {
let tok_stream: TokenStream = input.parse()?;
let mut binding = tok_stream
.to_string()
.as_str()
.replace(|c| c == ' ' || c == '{' || c == '}' || c == '\n', "");

if !binding.contains(':') {
return Ok(Pyo3Collection(Vec::new()));
}

if binding.as_bytes()[binding.len() - 1] as char != ',' {
binding.push(',');
}

let name_map = split_struct(binding);

let mut field_collection: Vec<Pyo3DictField> = Vec::new();

for (field_name, (field_val, dict_key)) in &name_map {
field_collection.push(Pyo3DictField::new(
field_name.to_string(),
field_val,
input.span(),
dict_key.clone(),
))
}

Ok(Pyo3Collection(field_collection))
}
}

fn split_struct(binding: String) -> HashMap<String, (String, Option<String>)> {
let mut stack: Vec<char> = Vec::new();
let mut start = 0;
let binding = binding.replace('\n', "");
let mut name_map: HashMap<String, (String, Option<String>)> = HashMap::new();

for (i, char_val) in binding.chars().enumerate() {
if char_val == ',' && stack.is_empty() {
if binding[start..i].starts_with('#') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might want to take a look at take_pyo3_options to see how the rest of the crate parses this.

let new_name = get_new_name(binding.clone(), start, i);
let var_string = &binding[start..i].split(']').collect::<Vec<&str>>()[1];
let info_parsed = var_string.split(':').collect::<Vec<&str>>();
name_map.insert(
info_parsed[0].to_string(),
(info_parsed[1].to_string(), Some(new_name)),
);
} else {
let info_parsed = binding[start..i].split(':').collect::<Vec<&str>>();
name_map.insert(
info_parsed[0].to_string(),
(info_parsed[1].to_string(), None),
);
}
start = i + 1;
} else if i == binding.len() - 1 {
let info_parsed = binding[start..].split(':').collect::<Vec<&str>>();
name_map.insert(
info_parsed[0].to_string(),
(info_parsed[1].to_string(), None),
);
}

if char_val == '<' || char_val == '(' {
stack.push(char_val);
}

if char_val == '>' || char_val == ')' {
stack.pop();
}
}

// if !name_map.is_empty() {
// let mut last = tok_split.last().unwrap().clone();
// for i in stack {
// last.push(i)
// }
// let len = tok_split.len();
// tok_split[len - 1] = last;
// }

name_map
}

fn get_new_name(binding: String, start: usize, i: usize) -> String {
let fragments: Vec<&str> = binding[start..i].split("name=").collect();
let mut quote_count = 0;
let mut start = 0;
for (j, char_val_inner) in fragments[1].chars().enumerate() {
if char_val_inner == '"' {
quote_count += 1;

if quote_count == 1 {
start = j + 1;
}
}

if quote_count == 2 {
return fragments[1][start..j].to_string();
}
}

String::new()
}

#[derive(Debug, Clone)]
pub struct Pyo3Collection(pub Vec<Pyo3DictField>);

impl AddAssign for Pyo3Collection {
fn add_assign(&mut self, rhs: Self) {
self.0.extend(rhs.0);
}
}

pub fn build_derive_into_pydict(dict_fields: Pyo3Collection) -> TokenStream {
let mut body = quote! {
let mut pydict = pyo3::types::PyDict::new(py);
};

for field in &dict_fields.0 {
let ident: &String;
if let Some(ref val) = field.attr_name {
ident = val;
} else {
ident = &field.name;
}
let ident_tok: TokenStream = field.name.parse().unwrap();
if !ident.is_empty() {
match_tok(field, &mut body, ident, ident_tok);
}
}
body.append_all(quote! {
return pydict;
});

body
}

fn match_tok(
field: &Pyo3DictField,
body: &mut TokenStream,
ident: &String,
ident_tok: TokenStream,
) {
match field.attr_type {
Pyo3Type::Primitive => {
body.append_all(quote! {
pydict.set_item(#ident, self.#ident_tok).expect("Bad element in set_item");
});
}
Pyo3Type::NonPrimitive => {
body.append_all(quote! {
pydict.set_item(#ident, self.#ident_tok.into_py_dict(py)).expect("Bad element in set_item");
});
}
Pyo3Type::CollectionSing(ref collection) => {
let non_class_ident = ident.replace('.', "_");
body.append_all(handle_single_collection_code_gen(
collection,
&format!("self.{}", ident_tok),
&non_class_ident,
0,
));

let ls_name: TokenStream = format!("pylist0{}", ident).parse().unwrap();
body.append_all(quote! {
pydict.set_item(#ident, #ls_name).expect("Bad element in set_item");
});
} // Pyo3Type::Map(ref key, ref val) => {
// if let Pyo3Type::NonPrimitive = key.as_ref() {
// panic!("Key must be a primitive type to be derived into a dict. If you want to use non primitive as a dict key, use a custom implementation");
// }

// match val.as_ref() {
// Pyo3Type::Primitive => todo!(),
// Pyo3Type::NonPrimitive => todo!(),
// Pyo3Type::CollectionSing(_) => todo!(),
// Pyo3Type::Map(_, _) => todo!(),
// }
// }
};
}

fn handle_single_collection_code_gen(
py_type: &Pyo3Type,
ident: &str,
non_class_ident: &str,
counter: usize,
) -> TokenStream {
let curr_pylist: TokenStream = format!("pylist{}{}", counter, non_class_ident)
.parse()
.unwrap();
let next_pylist: TokenStream = format!("pylist{}{}", counter + 1, non_class_ident)
.parse()
.unwrap();
let ident_tok: TokenStream = ident.parse().unwrap();
match py_type {
Pyo3Type::Primitive => {
quote! {
let mut #curr_pylist = pyo3::types::PyList::empty(py);
for i in #ident_tok.into_iter() {
#curr_pylist.append(i).expect("Bad element in set_item");
};
}
}
Pyo3Type::NonPrimitive => {
quote! {
let mut #curr_pylist = pyo3::types::PyList::empty(py);
for i in #ident_tok.into_iter() {
#curr_pylist.append(i.into_py_dict(py)).expect("Bad element in set_item");
};
}
}
Pyo3Type::CollectionSing(coll) => {
let body =
handle_single_collection_code_gen(coll.as_ref(), "i", non_class_ident, counter + 1);
quote! {
let mut #curr_pylist = pyo3::types::PyList::empty(py);
for i in #ident_tok.into_iter(){
#body
#curr_pylist.append(#next_pylist).expect("Bad element in set_item");
};
}
}
}
}

pub fn parse_generics(generics: &Generics) -> String {
if !generics.params.is_empty() {
let mut generics_parsed = "<".to_string();

for param in &generics.params {
match param {
syn::GenericParam::Lifetime(lt) => {
generics_parsed += ("'".to_string() + &lt.lifetime.ident.to_string()).as_str()
}
syn::GenericParam::Type(generic_type) => {
generics_parsed += generic_type.ident.to_string().as_str()
}
syn::GenericParam::Const(const_type) => {
generics_parsed +=
("const".to_string() + const_type.ident.to_string().as_str()).as_str()
}
}

generics_parsed += ",";
}

generics_parsed = generics_parsed[0..generics_parsed.len() - 1].to_string();
generics_parsed += ">";
generics_parsed
} else {
String::new()
}
}
Comment on lines +316 to +343
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Ill keep that in mind


pub fn check_type(input: &DeriveInput) -> syn::Result<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should not exist. Instead match on the DeriveInput and forward to a function that handles structs specifically. Something like...

pub fn handle_deriveinput(input: &DeriveInput) -> syn::Result<???> {
    match input.data {
        syn::Data::Struct(strukt @ syn::Datastruct { fields: syn::Fields::Named(_), ..}) => {
                handle_struct(strukt)
        }
        _ => Err(syn::Error::new(
                    // span,
                    "IntoPyDict derive is only possible for structs with named fields",
                ))
    }
}
 
pub fn handle_struct(input: &syn::DataStruct) -> syn::Result<???>{
    // ...
}

Instead of having a bunch of "check" functions, parse, don't validate. Did you spot the bug/oversight in your code that this code does not have? What about structs without fields? 🙃

match input.data {
syn::Data::Struct(ref info) => {
if let syn::Fields::Unnamed(_) = info.fields {
return Err(syn::Error::new(
info.struct_token.span,
"No support for tuple structs currently. Please write your own implementation for the struct.",
));
}

Ok(())
}
syn::Data::Enum(ref info) => Err(syn::Error::new(
info.brace_token.span.close(),
"No support for enums currently. Please write your own implementation for the enum.",
)),
syn::Data::Union(ref info) => Err(syn::Error::new(
info.union_token.span,
"No support for unions currently. Please write your own implementation for the union.",
)),
}
}
2 changes: 2 additions & 0 deletions pyo3-macros-backend/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ mod utils;
mod attributes;
mod deprecations;
mod frompyobject;
mod intopydict;
mod konst;
mod method;
mod module;
Expand All @@ -22,6 +23,7 @@ mod pymethod;
mod quotes;

pub use frompyobject::build_derive_from_pyobject;
pub use intopydict::{build_derive_into_pydict, check_type, parse_generics, Pyo3Collection};
pub use module::{process_functions_in_module, pymodule_impl, PyModuleOptions};
pub use pyclass::{build_py_class, build_py_enum, PyClassArgs};
pub use pyfunction::{build_py_function, PyFunctionOptions};
Expand Down
Loading