Rust Structure and Implementation "Embedding" Brainstorm #2431
Comments
Some relevant info:
What I don't recall any preexisting discussion about is whether or not "embedding" could or should exist alongside a complete virtual struct solution. |
Hi Ixrec.
Yep. That seems to be the case, more or less. Neither of us is too familiar with Go, so thanks for pointing that out. While embedding data into structs this way isn't terribly different than simply adding a field and delegating to the field, in my opinion there is an important distinction to be made: a field that's embedded from another struct would be accessed by As such, I think these two concepts (struct "embedding" and implementation "delegation") are actually very relevant to each other in the context of this particular code reuse schema. For example: trait Consume {
    fn eat(&self);
    fn drink();
}
impl Consume for Animal {
    fn eat(&self) {
        println!("I'm eating some {}!", self.favorite_food);
    }
    fn drink() {
        println!("I'm drinking now!");
    }
}
impl Consume for Dog {
    ..Animal
} In this simple example, Dog would be delegating/inheriting/embedding all implementations from Animal. As a result, the In this case, I would argue that embedding
Great, we'll check it out and join the ongoing conversation over there. Again, thanks for the heads up.
Without a doubt, the "delegating/embedding" syntax that we're talking about here wasn't suggested as a full or comprehensive implementation of traditional inheritance with virtual methods, implicit upcasting, struct layout guarantees, etc. Instead, we're talking about a pretty clean syntax for basic code reuse that Rust is lacking as of now. You'd be able to easily share implementation among structures that share common traits, without complex syntax or bad coding practices, but it's not full C++/Java-style OOP, nor is it meant to be. One of the things that I think we can all agree makes Rust great is that it has been designed in a way that helps us to write good code and avoid many kinds of bugs. Well, I think we can also all agree that copy-pasting code, which is sloppy and error-prone, is really not in line with that core philosophy! At any rate, if Rust is looking for a traditional and comprehensive solution to OOP/inheritance then I agree that may demand a much more complex solution. But when you look at this through the lens of quick-and-easy code reuse, I think this idea and syntax still have merit, especially since they could potentially co-exist with a "true" inheritance implementation. |
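For comparison, here is roughly what the same reuse looks like in today's Rust, without any embedding syntax: `Dog` holds an `Animal` as an ordinary field and forwards each `Consume` method by hand. This is only a sketch. The `animal` and `name` fields, the sample values, and the `String` return types (used instead of `println!` so the results can be checked) are assumptions made for illustration.

```rust
trait Consume {
    fn eat(&self) -> String;
    fn drink() -> String;
}

struct Animal {
    favorite_food: &'static str,
}

impl Consume for Animal {
    fn eat(&self) -> String {
        format!("I'm eating some {}!", self.favorite_food)
    }
    fn drink() -> String {
        String::from("I'm drinking now!")
    }
}

// Composition: Dog owns an Animal instead of embedding its fields.
struct Dog {
    name: &'static str,
    animal: Animal, // hypothetical field name
}

// Every method is forwarded by hand; this is the boilerplate the
// proposed syntax is meant to remove.
impl Consume for Dog {
    fn eat(&self) -> String {
        self.animal.eat()
    }
    fn drink() -> String {
        <Animal as Consume>::drink()
    }
}

fn main() {
    let dog = Dog { name: "Rex", animal: Animal { favorite_food: "kibble" } };
    println!("{}: {}", dog.name, dog.eat());
    println!("{}", Dog::drink());
}
```

The forwarding is mechanical, but it grows with every method in the trait, which is the cost the proposal is trying to avoid.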
I just want to add that I've changed the title of this RFC to use the term "embedding" instead of "inheritance" to help put the focus on code reuse instead of OOP. |
If we were to do some sort of type embedding, I would prefer some syntax a bit more loud than reusing FRU syntax. Perhaps: struct Foo {
x: usize,
use Bar,
} There are also issues around visibility that need to be worked out, for example:
|
Hello Centril. The trait Consume {
    fn eat(&self);
    fn drink();
}
struct Animal {
    favorite_food: &'static str,
}
impl Consume for Animal {
    fn eat(&self) {
        println!("I'm eating some {}!", self.favorite_food);
    }
    fn drink() {
        println!("I'm drinking now!");
    }
}
//================= Basic Embedding
struct Dog {
    name: &'static str,
    use Animal, // Dog 'embeds' any fields that aren't already defined in this scope from Animal
}
impl Consume for Dog {
    use Animal; // Dog 'embeds' Animal's entire Consume trait implementation.
}
//================= "Override" Embedding
impl Consume for Dog {
    fn drink() { // Dog implements its own drink()
        println!("Woof! I'm drinking out of a dog bowl!");
    }
    use Animal; // But then 'embeds' any remaining Consume trait functions from Animal.
}
//================= Multiple Embedding
impl Consume for SnakeDog {
    fn eat(&self) use Snake; // SnakeDog eats like a Snake!
    fn drink() use Dog; // But drinks like a Dog! Imagine that!
} That makes sense, is really quick, and it also reads very clearly, in my opinion! As for structure embedding with In the case of generics, for example, I think it'd be visually cleaner if structs which embed a generic struct have to be explicitly generic themselves. For example: struct Gen<T> {
    x: T,
}
struct Data {
    //...//,
    use Gen<T>,
} // Bad, Unclear, Compile error!
struct Data<T> {
    //...//,
    use Gen<T>,
} //Readable! OK! The compiler simply tries to copy-paste the Gen fields into Data, but in order for |
Here's a description of embedding that I wrote in the delegation thread #2393. I think it probably makes sense to paste it here too:
|
Note that this is essentially at least a subset of inheritance, even if you phrase it in a very roundabout way. In particular, it brings up the many issues that come with "traditional" OO inheritance which considers fields too. For example, it gives rise to the well-known (and dreaded) diamond problem. It also looks painfully non-orthogonal or non-uniform in Rust's rich type system: as opposed to the existing trait inheritance, this embedding would only work with structs, and not with other types such as enums or primitives. Rust had more OO-like Finally, one more very specific piece of criticism:
This looks very scary. If embedding ever gets implemented, this situation should definitely provoke a compiler error. Silently ignoring duplicate fields in the very definition of a type is dangerous, it can lead to extremely subtle and surprising errors. |
Good points. Somehow I'd missed that "impl embedding" does so much with so little syntax that it sneakily reintroduces all the usual gotchas and corner cases of traditional inheritance, like the diamond problem. It's only struct embedding that has the very simple, straightforward semantics that make it work so well in Go (and even that is partially because Go lacks complications like generics). Therefore, I'm convinced that, whether or not we should have struct embedding, it should be a separate feature from delegation. So we're probably at the point where, as with many other requests for sugar syntax, whoever wants to see this feature happen needs to produce some compelling examples of realistic code (not Cat/Dog/Animal toy examples) where this would be a significant improvement. |
I posted this on the forum: |
We originally used the term "inheritance", but that's also a term with a lot of assumptions and baggage behind it. As this doesn't do all of the things that many people expect traditional inheritance to do, I'm drifting further towards calling it "embedding" of data and implementations. At any rate, I don't mind whatever it's called.
Hmm. I'm not really seeing this. Sure, if "blanket" embedding of impl blocks were the only thing being discussed here then maybe that would be the case, but being able to use specific, cherry-picked embedding on a per-function basis seems to eliminate that problem by allowing the user to tell the compiler exactly which implementation they would like. Attempting to blanket embed from two implementations could simply be disallowed, resulting in a compiler error that suggests that the programmer use: // This compiles!
impl Consume for Dog {
    use Animal;
}
// This does not!
impl Consume for SnakeDog {
    use Snake;
    use Dog;
} // Something like "error: can only blanket embed from one implementation..."
// However this would compile and (unless I'm missing something) avoids the diamond problem.
impl Consume for SnakeDog {
    fn eat(&self) use Snake;
    fn drink() use Dog;
} // No ambiguity. Specific 'cherry-picked' embeds from multiple other implementations. (I'll probably be using the
I'll admit I haven't thought of this as being universal as of yet. Having said that, just off the top of my head I can't see any reason why something similar couldn't exist for enums. Outside of structs and trait implementations I haven't given the bigger picture much thought or made any attempts to generalize this idea further, so I'll just have to think about it in other contexts.
Can you be more specific about the type of errors that ignoring duplicate field embeds would bring about? As I see it, if a piece of data has both the same name and type, it is effectively the same piece of data. If one structure has In the event that the same name is used but the types are clashing, I think a compile error would be 100% appropriate. But, I'm not really seeing the potential harm involved in ignoring duplicate fields which share both name and type. A compiler warning, maybe, but a completely show-stopping error? I would need to know more about the potential errors to get behind this. At any rate, thanks for the comment H2CO3, and I hope you don't mistake my elaborate responses as defensiveness. We've labelled this as a "brainstorming" session and nothing more, so all input and criticisms are very welcome. I've been a compiler user but not a compiler developer, so I'll be the first to admit that I'm looking at this from an idealistic perspective that really demands a dose of pragmatic critique and skepticism. I don't know whether this is a good fit for Rust and the Rust community, but I'm glad to have a chance to work with people who are smarter than myself to flesh out this idea in order to see where it goes.
Ixrec, other than the diamond problem which I don't think truly applies here thanks to specific implementation cherry-picking (see my above example), what other gotchas and corner cases of traditional inheritance do you argue that this suffers from? I'm more than happy to try to dig into the specifics in order to work out the details - in fact, that's really why I'm here at all!
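As a point of reference, the cherry-picked form can be approximated in current Rust by forwarding each method explicitly. The sketch below is not the proposed feature: the `Snake` and `Dog` bodies and the `snake` field are invented for illustration, and methods return `String` so the behavior can be checked. Note that a method taking `&self` needs an actual `Snake` value to forward to, whereas the proposed `use Snake` would presumably run Snake's body against `SnakeDog`'s own fields.

```rust
trait Consume {
    fn eat(&self) -> String;
    fn drink() -> String;
}

struct Snake;
struct Dog;

impl Consume for Snake {
    fn eat(&self) -> String { String::from("Swallowing it whole.") }
    fn drink() -> String { String::from("Sipping quietly.") }
}

impl Consume for Dog {
    fn eat(&self) -> String { String::from("Chewing loudly.") }
    fn drink() -> String { String::from("Lapping from a bowl.") }
}

struct SnakeDog {
    snake: Snake, // hypothetical field; needed so eat() has a Snake to call
}

impl Consume for SnakeDog {
    // Each method names exactly one source implementation, so there is
    // no ambiguity about which one is used.
    fn eat(&self) -> String { self.snake.eat() }
    fn drink() -> String { <Dog as Consume>::drink() }
}

fn main() {
    let pet = SnakeDog { snake: Snake };
    println!("{}", pet.eat());
    println!("{}", SnakeDog::drink());
    println!("{}", Dog.eat());
    println!("{}", Snake::drink());
}
```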
For the record, I still disagree with this for the same reasons that I listed above. I see this as "compiler-assisted copy-pasting" that only does very slightly different things in different contexts. So, we'll probably have to agree to disagree on this one, especially since one of H2CO3's criticisms seems to be that this isn't general enough to Rust's other contexts (enums, primitive types, etc).
Again, not to be defensive, but we saw this "toy example" as a necessarily clear and concise example that models a very basic and intuitive relationship just like you'd see in almost any discussion of inheritance. It's definitely more concrete than I'll be happy to work on a more detailed and concrete example of this in the near future. But even this simple example has already raised a lot of important questions and confusion that have been (in my opinion) quite helpful. Anyway, I don't see this as too far afield from the types of inheritance hierarchies that a game developer or GUI developer might come across, no? Let me know what type of relationship you want to see modelled in this way and I'll try to do my best to make something more concrete. |
Me neither; my problem is not nomenclature, but semantics.
How do you embed the data of one
The interpretation of embedding in this context is not immediately obvious to say the least; I would go so far as thinking it would be necessarily surprising and/or illogical, because sum types by definition are not known at compile time to contain all of their data — that is the point in enums. So what can we do? Embed
Sure, the worst one that immediately strikes me as wrong in inheritance is that it violates encapsulation. Make a field or method so private that "subclasses" can't use it? It will be useless for inheritance. Make it "protected", so that only subclasses can access it? That provides a false sense of security, since at that point a subclass can just re-export it through a public method. As an unpleasant but related side effect, it also makes reasoning about visibility a nightmare both in the compiler, and, what's more important, for human consumers of the code too. Encapsulation and visibility are not the only problems with inheritance. The general issue is that any sort of subtyping can easily produce surprising results in practically any context, because as it turns out, people are not terribly good at keeping entire hierarchies in mind. If I say a value is of type T, then readers of the code (my future self included) will assume that it does exactly what type T itself does and nothing else, an assumption broken by subtyping, where "superclass" behavior must also be taken into account. In other words, inheritance prevents local reasoning and introduces the need for global reasoning, which is a huge burden on both tools and brains, and a regular source of hard-to-find bugs. (I've been using several traditional object-oriented languages, and I can tell you this is a very realistic problem in a large code base. Both I and many respected and skilled co-workers made really bad mistakes related to this nature of inheritance.)
I beg to differ. The two fields can still have different contexts and different meanings. The name and the type are — unfortunately — not everything. This "ignoring duplicates" feature sounds very much like the compiler second guessing the programmer based on some sort of heuristics which may even work in most cases, but would be completely broken if the assumption doesn't hold in just one single case. Magic like this I think has no place in a language that picked safety and correctness as its explicit goals. A very concrete example is: I'm currently writing a webservice and I'm using public key cryptography (through the excellent You might argue that I should just use newtypes to prevent this, however:
If we restrict the feature to inheriting methods, then what we get is a subset of the existing trait + default method system. If we also allow cherry-picking of individual methods, then we get a subset of the delegation RFC. I don't think we should add a subset of any feature twice to the language. In conclusion, while I do think at least some of these problems could somehow be worked around, they are so fundamental that it would never be possible to completely eliminate them without resorting to several special cases and ugly hacks (including non-orthogonal restrictions) in the design of the language, and thus for me, any OO-inheritance-like feature would essentially be a complete showstopper. |
Enums weren't really considered within the original scope of this discussion. Having said that, if they were, I think it could follow the same pattern of behavior as structs or implementations; to embed a struct (A) within another (B) would effectively copy-paste fields from A into B (perhaps excluding exact name:Type duplicates); to embed an implementation (X) into another (Y) would effectively copy-paste function definitions from X into Y; and, by that same pattern, to embed an enum (U) into another (V) would effectively copy-paste variants from U into V. Each one of these exists as nothing more than a piece of friendly syntax that discourages manually copy-pasted code. In other words: enum Foo {
    Var1,
    Var2(String),
}
enum Bar {
    use Foo;
    Var3,
}
// Could compile to something like:
enum Foo {
    Var1,
    Var2(String),
}
enum Bar {
    Var1,
    Var2(String),
    Var3,
} Different-yet-similar enum types with zero automatic conversion or compatibility between them, perhaps with sensible limitations, restrictions, or errors that occur to prevent name collisions. That's under the assumption that this type of embedding needs to be universal or apply to enums; I think it's possible, but it wasn't what I had in mind. Just like in the case of structs or impl blocks, the mantra here is "compiler-assisted copy-pasting of code": it doesn't make the same promises that traditional inheritance makes, it simply tells the compiler to help you generate similar structures. In some ways I personally don't think it's too different from Rust's generics, which really serve as a command to tell the compiler to statically generate a bunch of similar code on your behalf. You could just copy and paste a bunch of functions, changing the type each time, but generics help you avoid that error-prone and time-wasting behavior.
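Returning to the enum example: written out by hand in current Rust, the expanded form is two unrelated types, and any conversion between them has to be spelled out. The `From` impl below is an addition for illustration only and is not part of the proposal, which explicitly promises no automatic conversion.

```rust
#[derive(Debug, PartialEq)]
enum Foo {
    Var1,
    Var2(String),
}

// Hand-written equivalent of "enum Bar { use Foo; Var3 }".
#[derive(Debug, PartialEq)]
enum Bar {
    Var1,
    Var2(String),
    Var3,
}

// Foo and Bar are distinct types; a conversion exists only if someone
// writes it, and it must be updated whenever Foo gains a variant.
impl From<Foo> for Bar {
    fn from(foo: Foo) -> Self {
        match foo {
            Foo::Var1 => Bar::Var1,
            Foo::Var2(s) => Bar::Var2(s),
        }
    }
}

fn main() {
    let converted: Bar = Foo::Var2(String::from("hello")).into();
    println!("{:?}", converted);
    println!("{:?}", Bar::from(Foo::Var1));
    println!("{:?}", Bar::Var3);
}
```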
Bingo.
A few of the guesses seem wildly out of left field here. If variants have clashing names do we need some genius solution? No way. Rust will do what it does best: a compile error with a nice message telling you what the problem is and what needs to be done to fix it. In this case, one of the variants needs to be renamed because the same name cannot be used to describe two different variants. In my opinion, the compiler doesn't need to cleverly figure everything out or solve problems in the user's code, it just needs to enforce a set of rules, spitting out useful warnings and errors wherever necessary. If anything, I think I've been pretty straightforward about the concept here. It's not some massive, complex, or smart solution to every problem, it's struct->struct or impl->impl (perhaps even enum->enum) code reuse with a basic set of logic and rules for each one.
I don't see this as a correct interpretation of what's being discussed here. The visibility or permissions of a field are not changed in any way. There is no subclassing, there is no implicit casting between these structures, no true 'inheritance'. If some field of the same name:Type exists, it will simply not be embedded (in other words, it won't be copied and pasted by the compiler).
I believe you, and I know that inheritance has hidden pitfalls. But since we're talking about behavior now, we can talk about implementation embedding. Instead of using animal/dog/cat/etc, I'll just use letters this time: impl T for A {
    fn ding() {
        // Type A uses an entirely customized version of `ding()`.
    }
    fn foo() use X; // Type A uses X's entire and exact `foo()` implementation.
    fn bar() use Y; // Type A uses Y's entire and exact `bar()` implementation.
    use Z; // Type A uses Z's entire and exact implementation for any remaining functions of this trait, T.
} In other words, each function implementation is either reused from an existing implementation of that same trait OR it is a new implementation - embedding allows no in-between. If you're using code from an existing implementation, you are committing to doing exactly what that other implementation does. And if you're writing your own implementation, you're committing to doing exactly what is within the brackets. There's no way to call into the superclass' version half way through, there is no dynamic behavior or polymorphism, etc. Your implementation either does something new or copies something else's exact way of behaving.
We'll have to agree to disagree here. There's no "second guessing" or "magic" here: as programmers we are telling the compiler exactly what to do. It's my opinion that two data fields with the exact same name and type are effectively the same, just in the same way that if I ask someone to pass me an empty 4x4x4ft box with the word "tools" written on the side, it doesn't matter how many identical boxes exist in the world, I am simply asking for one thing that meets that name:Type specification. I do realize that this is a matter of opinion and highly subjective, so we'll just have to leave it at this: either the compiler treats two fields of identical name:Type as the same OR the compiler spits out an error and the user has to manually clean things up. It's merely a matter of taste.
In this example you don't want embedding or inheritance though - you want aggregation, no? I agree that embedding doesn't sound like the answer for this particular scenario, but just because you have a hammer doesn't make every problem a nail. If you have a structure with one PublicKey you simply call it public_key and use it like that. Makes sense. But if you ask the compiler to embed another structure, or to embed some implementation that uses I think embedding feels wrong here because it is wrong here, and a generic or an enum parameter is the right design. You may have already come across a good design.
How can you get a subset of what we have now by adding new functionality? I'm not sure I follow you there. But, more importantly, why be limited to a "this or default" paradigm, when we could allow embedding from any other existing implementation?
You could see it that way. But I see it instead as a much simpler solution to a much simpler problem, with, in my opinion, a bit nicer syntax. Embedding doesn't make the same promises or guarantees as delegation, nor does it have any real effect on the run-time behavior of a Rust program.
But which fundamental problems are those exactly? "Diamond problem?" doesn't really exist here. "Could this exist for enums too?" I've shown that it could. "Should identical fields be ignored or result in a compiler error?" That's a matter of opinion but would work either way, so it's certainly not a "fundamental problem". From my perspective, the most immediately obvious problem is that assumptions are being made here based on an existing understanding of inheritance in other languages. But this isn't "inheritance as it exists in other languages"; embedding is simply a compile-time tool for simple and effective code reuse within data structures, implementations, and (maybe) enumerations. As was said at the outset, if a full and traditional implementation of OO-style inheritance is what Rust needs, then this isn't it. |
Alright. So no subtyping at all, but pure syntactic sugar? I think that should be made clearer. References have been made to "inheritance" and Go's "embedding", both of which imply more or less subtyping, respectively. However, I still have to question the value in introducing special syntax for such a piece of functionality. What problem does this solve that a macro (either declarative or procedural) couldn't? Rust users suggesting new syntactic sugar features often forget that the point of macros is that you don't have to build every new piece of syntactic sugar into the language; instead, you can write your own! Everyone benefits from that: if you want this functionality, you don't have to go through the RFC process and wait until the feature eventually lands in the language, if it ever does. You don't have to make compromises as to how it works in order to cater to the entire community. You can just write your own macro implementing it for exactly the semantics you want. And those who don't need or want it wouldn't need to worry about the increased complexity in the language and the bugs it potentially hides.
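To give a rough idea of the macro route, here is a minimal declarative-macro sketch that stamps out structs sharing a common field and method. The macro name `animal_like`, the generated inherent `eat` method, and the sample values are all hypothetical; this covers only a small part of what the proposal describes.

```rust
// Generates a struct that always has `favorite_food`, plus whatever
// extra fields the caller lists, along with a shared `eat` method.
macro_rules! animal_like {
    ($name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        struct $name {
            favorite_food: &'static str,
            $($field: $ty,)*
        }

        impl $name {
            fn eat(&self) -> String {
                format!("I'm eating some {}!", self.favorite_food)
            }
        }
    };
}

animal_like!(Animal {});
animal_like!(Dog { name: &'static str });

fn main() {
    let animal = Animal { favorite_food: "grass" };
    let dog = Dog { favorite_food: "kibble", name: "Rex" };
    println!("{}", animal.eat());
    println!("{} says: {}", dog.name, dog.eat());
}
```

Because the shared field is written once inside the macro, a caller who lists `favorite_food` again gets an ordinary duplicate-field compile error rather than a silent merge.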
So… if what you are proposing is indeed only compiler-assisted copy-pasting, then I especially don't see why you wouldn't want duplicate fields to be an error. When you copy fields of struct
According to your proposal, the compiler would need to assume scenario 1. This means that genuine oversights resulting in scenario 2 would go undetected. (See, this problem is not even specific to subtyping at all.) Since all of this is happening at compile time, a "duplicate field" error could be trivially resolved by the programmer in no time using static knowledge of his/her own code: either delete a line or edit the name of a field. But the important part is that the programmer gets to make that decision.
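To make the concern about same-named fields concrete, here is a hypothetical case (the types, field names, and key values are invented for illustration and are not taken from the discussion) where two structs each declare a field with the same name and type but different meanings. Written with explicit composition in current Rust, both fields survive; under a rule that silently drops name-and-type duplicates, only one would.

```rust
struct Client {
    public_key: [u8; 4], // the client's own key
}

struct Server {
    public_key: [u8; 4], // the server's key: same name and type, different meaning
}

// Explicit composition keeps both keys addressable and distinct.
struct Session {
    client: Client,
    server: Server,
}

// Returns true when the two keys are different values, which is the
// information a silent merge of the two fields would destroy.
fn keys_differ(session: &Session) -> bool {
    session.client.public_key != session.server.public_key
}

fn main() {
    let session = Session {
        client: Client { public_key: [1, 2, 3, 4] },
        server: Server { public_key: [9, 9, 9, 9] },
    };
    println!("keys differ: {}", keys_differ(&session));
}
```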
Again, I do realize that, although the description wasn't very clear about what exactly this doesn't propose. |
Just to clarify, Go embedding is not inheritance, and does not behave like inheritance at all. For example: type T int
func (T) foo()
type U struct {
    T // embedded T
}
var u U
If I understand this proposal correctly, Rust could do something similar. No inheritance, just syntactic sugar for composition and method/field delegation. |
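A hand-written Rust approximation of the Go snippet might look like the sketch below. The field name `t`, the return value of `foo`, and the helper `takes_t` are assumptions added for illustration. The point is that `U` forwards to `T` but is never usable where a `T` is expected.

```rust
struct T(i32);

impl T {
    fn foo(&self) -> i32 {
        self.0
    }
}

// U contains a T; nothing about U makes it a subtype of T.
struct U {
    t: T, // hypothetical field name
}

impl U {
    // What the sugar would presumably generate: plain forwarding.
    fn foo(&self) -> i32 {
        self.t.foo()
    }
}

fn takes_t(value: &T) -> i32 {
    value.foo()
}

fn main() {
    let u = U { t: T(7) };
    println!("{}", u.foo());
    // The inner value must be named explicitly; passing `&u` here
    // would be a type error because U is not a T.
    println!("{}", takes_t(&u.t));
}
```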
The problem with proposals like this, is that people don't like it because "it feels like inheritance". Instead, everyone just implements |
Actually
We never |
@burdges pub struct Dog {
    ..Legs,
    ..Mouth
}
let dog = Dog::new();
dog.walk(); // dog.legs.walk()
dog.bark(); // dog.mouth.bark() The point of embedding is not to implement smart-pointers. That is literally what |
At that point, however, it would be trivial to add inherent accessor methods to the embedded types, like |
@H2CO3 I'm not saying that pattern is common, it is more common to have a single embedded field. I was simply showing how composition by embedding is very different than composition by |
I think "embedding" only confuses the issue, because whether you write Access and polymorphism matter though, meaning whether It's clear I'd expect derive_more could be expanded for say |
With |
We've a separate delegation discussion that largely avoids discussing embedding. If anything, embedding increases the strangeness cost of delegation, and thus makes ever approving a delegation RFC less likely.
|
I've been talking about code reuse in Rust with my brother ( @emmetoneillpdx ) and one of the ideas we considered was a form of "static inheritance" which basically amounts to a syntax for automatically pulling either data or functions (or both) from existing structs and trait implementations. The proposed syntax is roughly based on Rust's existing "Struct Update Syntax". Here's a simple pseudocode example:
Allowing for code reuse this way should not fundamentally change the way Rust behaves at runtime but still allows for behavior that is somewhat similar to inheritance without error-prone code duplication practices. Essentially, we're just asking the compiler to copy/paste any undefined fields or functions from any known (at compile time) struct or implementation.
We don't have anything more concrete than that right now, but this is an idea that came up over the course of a conversation and we both wanted to share it and see what people thought. I'd love to hear feedback and form a discussion around this idea or others like it. Also, if you know of any relevant RFCs where it might be appropriate to share this (or of anyone already working on a similar idea), please share.