-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] feat: Introduce Truthy
and Falsy
to allow more precise typing
#13665
Changes from all commits
5226225
7e4176a
29c6ab4
4a33cca
8bc2928
2028c0e
54ac97f
4ce78b9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
@@ -245,6 +245,13 @@ pub enum Type<'db> { | |||||||||
Unbound, | ||||||||||
/// The None object -- TODO remove this in favor of Instance(types.NoneType) | ||||||||||
None, | ||||||||||
/// Type for narrowing a type to the subset of that type that is truthy (bool(x) == True). | ||||||||||
/// Mainly used in intersections: `X & Truthy`. Example: for `list` it's all lists of `len > 0` | ||||||||||
Truthy, | ||||||||||
/// Type for narrowing a type to the subset of that type that is falsy (bool(x) == False). | ||||||||||
/// Mainly used in intersections: `X & Falsy`. Example: for `int` it's the singleton | ||||||||||
/// `IntLiteral[0]`. | ||||||||||
Comment on lines
+251
to
+253
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
Falsy, | ||||||||||
/// Temporary type for symbols that can't be inferred yet because of missing implementations. | ||||||||||
/// Behaves equivalently to `Any`. | ||||||||||
/// | ||||||||||
|
@@ -393,6 +400,72 @@ impl<'db> Type<'db> { | |||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
/// Return the `Truthy` subset of this type (intersection with Truthy). | ||||||||||
#[must_use] | ||||||||||
pub fn truthy(&self, db: &'db dyn Db) -> Type<'db> { | ||||||||||
IntersectionBuilder::build_truthy(db, *self) | ||||||||||
} | ||||||||||
|
||||||||||
/// Return the `Falsy` subset of this type (intersection with Falsy). | ||||||||||
#[must_use] | ||||||||||
pub fn falsy(&self, db: &'db dyn Db) -> Type<'db> { | ||||||||||
self.falsy_set(db) | ||||||||||
.unwrap_or_else(|| IntersectionBuilder::build_falsy(db, *self)) | ||||||||||
Comment on lines
+412
to
+413
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do we do this short-circuit through As commented below, I think that the To be honest, I probably wouldn't include these helper methods at all, unless it's really clear that these will be very common operations and it's important that we make them as ergonomic as possible. I'd rather just go with a general |
||||||||||
} | ||||||||||
|
||||||||||
/// Returns, if it can be expressed, the set of values that are falsy in this type. This is | ||||||||||
/// defined for some builtin types (e.g. int, str, ...) | ||||||||||
#[must_use] | ||||||||||
pub fn falsy_set(&self, db: &'db dyn Db) -> Option<Type<'db>> { | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This method is really implementing "simplification of intersection with Falsy", which is logic I would expect to find in |
||||||||||
match self { | ||||||||||
Self::Instance(class) => match class.known(db) { | ||||||||||
// For the following types, we know the logic for falsiness and can represent | ||||||||||
// specific instances of that type, so we can give back a precise subset. | ||||||||||
Some(KnownClass::Bool) => Some(Self::BooleanLiteral(false)), | ||||||||||
Some(KnownClass::Int) => Some(Self::IntLiteral(0)), | ||||||||||
Some(KnownClass::Bytes) => Some(Self::BytesLiteral(BytesLiteralType::empty(db))), | ||||||||||
Some(KnownClass::Str) => Some(Self::StringLiteral(StringLiteralType::empty(db))), | ||||||||||
Some(KnownClass::Tuple) => Some(Self::Tuple(TupleType::empty(db))), | ||||||||||
Comment on lines
+425
to
+428
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unfortunately, because all four of these types (int, bytes, str, tuple) are subclassable, and the subclass is free to override |
||||||||||
_ => match self.bool(db) { | ||||||||||
// When a `__bool__` signature is `AlwaysFalse`, it means that the falsy set is | ||||||||||
// the whole type (that we had as input), and the truthy set is empty. | ||||||||||
Truthiness::AlwaysFalse => Some(*self), | ||||||||||
Truthiness::AlwaysTrue => Some(Type::Never), | ||||||||||
Truthiness::Ambiguous => None, | ||||||||||
}, | ||||||||||
}, | ||||||||||
Type::LiteralString => Some(Self::StringLiteral(StringLiteralType::empty(db))), | ||||||||||
_ => match self.bool(db) { | ||||||||||
Truthiness::AlwaysFalse => Some(*self), | ||||||||||
Truthiness::AlwaysTrue => Some(Type::Never), | ||||||||||
Truthiness::Ambiguous => None, | ||||||||||
}, | ||||||||||
Comment on lines
+429
to
+442
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this duplication can be avoided by having a two-layer structure, where an inner method (or an initial match statement within one method) handles only the specific known cases, and if it returns |
||||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
/// Returns, if it can be expressed, the set of values that are truthy in this type. This is | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The "can be expressed" formulation, here and above, has an implicit "without an explicit intersection type" caveat: all intersections with truthy/falsy "can be expressed" as an intersection. This is another reason why I think these methods should rather be internal implementation details of |
||||||||||
/// rarely defined, with the exception of the `builtins.bool` type. | ||||||||||
#[must_use] | ||||||||||
pub fn truthy_set(&self, db: &'db dyn Db) -> Option<Type<'db>> { | ||||||||||
match self { | ||||||||||
Type::Instance(class) => match class.known(db) { | ||||||||||
Some(KnownClass::Bool) => Some(Type::BooleanLiteral(true)), | ||||||||||
_ => match self.bool(db) { | ||||||||||
// When a `__bool__` signature is `AlwaysTrue`, it means that the truthy set is | ||||||||||
// the whole type (that we had as input), and the falsy set is empty. | ||||||||||
Truthiness::AlwaysTrue => Some(*self), | ||||||||||
Truthiness::AlwaysFalse => Some(Type::Never), | ||||||||||
Truthiness::Ambiguous => None, | ||||||||||
}, | ||||||||||
}, | ||||||||||
_ => match self.bool(db) { | ||||||||||
Truthiness::AlwaysFalse => Some(*self), | ||||||||||
Truthiness::AlwaysTrue => Some(Type::Never), | ||||||||||
Truthiness::Ambiguous => None, | ||||||||||
}, | ||||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
/// Return true if the type is a class or a union of classes. | ||||||||||
pub fn is_class(&self, db: &'db dyn Db) -> bool { | ||||||||||
match self { | ||||||||||
|
@@ -486,6 +559,8 @@ impl<'db> Type<'db> { | |||||||||
} | ||||||||||
Type::Unknown => Type::Unknown, | ||||||||||
Type::Unbound => Type::Unbound, | ||||||||||
Type::Truthy => Type::Unknown, | ||||||||||
Type::Falsy => Type::Unknown, | ||||||||||
Comment on lines
+562
to
+563
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These should be |
||||||||||
Type::None => { | ||||||||||
// TODO: attribute lookup on None type | ||||||||||
Type::Todo | ||||||||||
|
@@ -542,14 +617,17 @@ impl<'db> Type<'db> { | |||||||||
Type::Any | Type::Todo | Type::Never | Type::Unknown | Type::Unbound => { | ||||||||||
Truthiness::Ambiguous | ||||||||||
} | ||||||||||
Type::None => Truthiness::AlwaysFalse, | ||||||||||
Type::Function(_) => Truthiness::AlwaysTrue, | ||||||||||
Type::Module(_) => Truthiness::AlwaysTrue, | ||||||||||
Type::None | Type::Falsy => Truthiness::AlwaysFalse, | ||||||||||
Type::Function(_) | Type::Module(_) | Type::Truthy => Truthiness::AlwaysTrue, | ||||||||||
Type::Class(_) => { | ||||||||||
// TODO: lookup `__bool__` and `__len__` methods on the class's metaclass | ||||||||||
// More info in https://docs.python.org/3/library/stdtypes.html#truth-value-testing | ||||||||||
Truthiness::Ambiguous | ||||||||||
} | ||||||||||
// Temporary special case for `FunctionType` until we handle generic instances | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Using the word "generic" like this is ambiguous, sounds like it could mean generic types.
Suggested change
|
||||||||||
Type::Instance(class) if class.is_known(db, KnownClass::FunctionType) => { | ||||||||||
Truthiness::AlwaysTrue | ||||||||||
} | ||||||||||
Comment on lines
+627
to
+630
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Was this necessary to unblock something else in this PR? If so, that's fine. If not, why should we add a temporary special case for this? Maybe we just leave it out and wait for the general Type::Instance handling? |
||||||||||
Type::Instance(_) => { | ||||||||||
// TODO: lookup `__bool__` and `__len__` methods on the instance's class | ||||||||||
// More info in https://docs.python.org/3/library/stdtypes.html#truth-value-testing | ||||||||||
|
@@ -570,9 +648,23 @@ impl<'db> Type<'db> { | |||||||||
} | ||||||||||
first_element_truthiness | ||||||||||
} | ||||||||||
Type::Intersection(_) => { | ||||||||||
// TODO | ||||||||||
Truthiness::Ambiguous | ||||||||||
Type::Intersection(intersection) => { | ||||||||||
// The truthiness of the intersection is the intersection of the truthiness of its | ||||||||||
// elements: | ||||||||||
// - `Ambiguous` & `Truthy` == `Truthy` | ||||||||||
// - `Ambiguous` & `Falsy` == `Falsy` | ||||||||||
// - `Truthy` & `Falsy` == `Never` -- this should be impossible to build | ||||||||||
intersection | ||||||||||
// Negative elements (what this intersection should not be) do not have an | ||||||||||
// influence on the truthiness of the intersection. | ||||||||||
// Or if they should, this should be simplified at build time. | ||||||||||
.positive(db) | ||||||||||
.iter() | ||||||||||
.map(|ty| ty.bool(db)) | ||||||||||
// Stop at the first `Truthy`or `Falsy` since an intersection containing both | ||||||||||
// should simplify to the empty intersection at build time. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
.find(|truthiness| !truthiness.is_ambiguous()) | ||||||||||
Comment on lines
+664
to
+666
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This comment is more narrow than what the code actually does; What this means is that this logic is only correct if we do eagerly check truthiness of every type added to an intersection and resolve to Never if we find both an always-truthy type and an always-falsy type in the intersection. I think it's correct to do this, I'm a bit worried about the performance impact. But I think we should probably try doing it and see if it actually becomes a performance bottleneck. |
||||||||||
.unwrap_or(Truthiness::Ambiguous) | ||||||||||
} | ||||||||||
Type::IntLiteral(num) => Truthiness::from(*num != 0), | ||||||||||
Type::BooleanLiteral(bool) => Truthiness::from(*bool), | ||||||||||
|
@@ -718,10 +810,12 @@ impl<'db> Type<'db> { | |||||||||
Type::Todo => Type::Todo, | ||||||||||
Type::Unknown => Type::Unknown, | ||||||||||
Type::Unbound => Type::Unknown, | ||||||||||
Type::Truthy | Type::Falsy => Type::Unknown, | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should also result in a diagnostic; we could group it under the below TODO comment for that, for now |
||||||||||
Type::Never => Type::Never, | ||||||||||
Type::Class(class) => Type::Instance(*class), | ||||||||||
Type::Union(union) => union.map(db, |element| element.to_instance(db)), | ||||||||||
// TODO: we can probably do better here: --Alex | ||||||||||
// TODO: case of `type[X] & Truthy` or `type[X] & Falsy` should be straightforward | ||||||||||
Comment on lines
817
to
+818
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think the added comment here is necessary, nor was the pre-existing comment, now that we have
Suggested change
|
||||||||||
Type::Intersection(_) => Type::Todo, | ||||||||||
// TODO: calling `.to_instance()` on any of these should result in a diagnostic, | ||||||||||
// since they already indicate that the object is an instance of some kind: | ||||||||||
|
@@ -745,6 +839,7 @@ impl<'db> Type<'db> { | |||||||||
match self { | ||||||||||
Type::Unbound => Type::Unbound, | ||||||||||
Type::Never => Type::Never, | ||||||||||
Type::Truthy | Type::Falsy => Type::Unknown, | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should be |
||||||||||
Type::Instance(class) => Type::Class(*class), | ||||||||||
Type::Union(union) => union.map(db, |ty| ty.to_meta_type(db)), | ||||||||||
Type::BooleanLiteral(_) => KnownClass::Bool.to_class(db), | ||||||||||
|
@@ -1452,24 +1547,88 @@ pub struct IntersectionType<'db> { | |||||||||
negative: FxOrderSet<Type<'db>>, | ||||||||||
} | ||||||||||
|
||||||||||
impl<'db> IntersectionType<'db> { | ||||||||||
/// If this is an intersection of `X & Truthy` (generalized to `(X & Y) & Truthy`), returns the | ||||||||||
/// left side of the intersection (the type that intersects with `Truthy`). | ||||||||||
fn truthy_of(self, db: &'db dyn Db) -> Option<Type<'db>> { | ||||||||||
if self.positive(db).contains(&Type::Truthy) { | ||||||||||
let builder = self | ||||||||||
.positive(db) | ||||||||||
.iter() | ||||||||||
.filter(|ty| *ty != &Type::Truthy) | ||||||||||
.fold(IntersectionBuilder::new(db), |builder, ty| { | ||||||||||
builder.add_positive(*ty) | ||||||||||
}); | ||||||||||
Some( | ||||||||||
self.negative(db) | ||||||||||
.iter() | ||||||||||
.fold(builder, |builder, ty| builder.add_negative(*ty)) | ||||||||||
.build(), | ||||||||||
) | ||||||||||
} else { | ||||||||||
None | ||||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
/// If this is an intersection of `X & Falsy` (generalized to `(X & Y) & Falsy`), returns the | ||||||||||
/// left side of the intersection (the type that intersects with `Falsy`). | ||||||||||
fn falsy_of(self, db: &'db dyn Db) -> Option<Type<'db>> { | ||||||||||
if self.positive(db).contains(&Type::Falsy) { | ||||||||||
let builder = self | ||||||||||
.positive(db) | ||||||||||
.iter() | ||||||||||
.filter(|ty| *ty != &Type::Falsy) | ||||||||||
.fold(IntersectionBuilder::new(db), |builder, ty| { | ||||||||||
builder.add_positive(*ty) | ||||||||||
}); | ||||||||||
Some( | ||||||||||
self.negative(db) | ||||||||||
.iter() | ||||||||||
.fold(builder, |builder, ty| builder.add_negative(*ty)) | ||||||||||
.build(), | ||||||||||
) | ||||||||||
} else { | ||||||||||
None | ||||||||||
} | ||||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
Comment on lines
+1550
to
+1595
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think if we generalize our approach in the intersection builder, we shouldn't need these methods; I think the approach using these methods is more special-cased than what we ideally want. |
||||||||||
#[salsa::interned] | ||||||||||
pub struct StringLiteralType<'db> { | ||||||||||
#[return_ref] | ||||||||||
value: Box<str>, | ||||||||||
} | ||||||||||
|
||||||||||
impl<'db> StringLiteralType<'db> { | ||||||||||
pub fn empty(db: &'db dyn Db) -> Self { | ||||||||||
Self::new(db, Box::default()) | ||||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
#[salsa::interned] | ||||||||||
pub struct BytesLiteralType<'db> { | ||||||||||
#[return_ref] | ||||||||||
value: Box<[u8]>, | ||||||||||
} | ||||||||||
|
||||||||||
impl<'db> BytesLiteralType<'db> { | ||||||||||
pub fn empty(db: &'db dyn Db) -> Self { | ||||||||||
Self::new(db, Box::default()) | ||||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
#[salsa::interned] | ||||||||||
pub struct TupleType<'db> { | ||||||||||
#[return_ref] | ||||||||||
elements: Box<[Type<'db>]>, | ||||||||||
} | ||||||||||
|
||||||||||
impl<'db> TupleType<'db> { | ||||||||||
pub fn empty(db: &'db dyn Db) -> Self { | ||||||||||
Self::new(db, Box::default()) | ||||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
#[cfg(test)] | ||||||||||
mod tests { | ||||||||||
use super::{ | ||||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like for these doc comments to mostly stick to a factual description of what the type represents. Details of how it is used and examples of its use (especially in this case, as we figure out this feature over time) are likely to change, and too much detail here is just likely to go out of date.