Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ephemeral/Shadow navigation properties or other mechanism of automatic joins. #32658

Closed
voroninp opened this issue Dec 21, 2023 · 44 comments
Closed
Assignees
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported

Comments

@voroninp
Copy link

I need the convenience of navigation properties and want to use them for implicit joins and filtering without having them as CLR properties or fields.

Motivation

  • Aggregates should not contain referential links to each other.

  • When aggregate references another one with a key, it may be wise not to put FK constraint. (I know there's an open issue for this)

  • People abuse navigation properties very often. They add all navigations mimicking the database relations and then return the entity with the subset of includes:

    ctx.Set<A>().Include(a => a.B.C.D).ThenInclude(d => d.E.F.G).Include(a => a.B.C.D).ThenInclude(d => d.H.I.J);
    

    This becomes a nightmare when one must guess somewhere deep in business logic which properties were included and which were not. A business object should not be a dynamic graph which changes its shape based on some implicit context.
    I blame ORMs for creating the mindset when people start designing their application from DB:

    What tables do we need for the service?

    Arrrghhh!
    image


Let's speculate a bit, now.

Here are two things: Blog and Post. I hope you agree that they usually change independently in different transactions, so they are indeed two distinct aggregates. Hence, post can reference the blog only by id.

public class Post
{
    public int Id {get;set;}
    public int BlogId {get; set;}
}

public class Blog
{
   public int Id {get; set;}
}

However, for the read side of my application I may need to build a projection from both of them.
To achieve this, I either must use separate DbContext|Types (hello, table splitting with its own problems), or I must join aggregates manually:

ctx.Set<Blog>().Join(ctx.Set<Post>(), (l,r) => l.Id == r.BlogId, (l,r) => new { Blog = l, Post = r}).Where(...)

This quickly becomes ugly, if I need to join more than two aggregates to build a projection. Alternatively, I could switch to linq syntax, but explicit join is still required.

Now let's assume we are allowed to have shadow navigation properties. The following becomes legit:

ctx.Set<Blog>().Where(b => EF.Property<ICollection<Post>>(b, "Posts").Any(p => ...));

Well, hardly better - too verbose.

Let's hope EF 9 finally has the feature to easily define custom expressions for methods or properties, so this becomes possible:

public static class Navigations
{
    public static class OfBlog
    {
        public const string Posts = "Posts";
    }

    public static class OfPost
    {
        public const string Blog= "Blog";
    }
}

public static class NavigationExtensions
{
   //  And there should be some magic way to make EF recognize this method and replace expression with EF.Property<ICollection<Post>>(blog, Navigation.Posts)
   public static ICollection<Post> Posts(this Blog blog) => EF.Property<ICollection<Post>>(blog, Navigations.OfBlog.Posts);
   public static Blog Blog(this Post post) => EF.Property<Blog>(post, Navigations.OfPost.Blog);
}

Now I'll be able to call it like this:

ctx.Set<Blog>().Where(b => b.Posts().Any(p => ...));
cts.Set<Post>().Where(p => p.Blog().Posts().Any(p => ...)).

Perfect. When filtering, I have the navigations to easily access properties of related aggregates, but when returning the entities, no navigation properties!

@ajcvickers
Copy link
Contributor

@voroninp I've read through this and I think it's extremely unlikely that we will implement any of this.

@voroninp
Copy link
Author

@ajcvickers But do you at least agree that it's a valid concern/pain point?

@ajcvickers
Copy link
Contributor

@voroninp Not really, no.

@voroninp
Copy link
Author

@ajcvickers Ok, how would you work with the aggregates on the read-side of the application?

  • Define plain anemic DTOs with Navigation properties using table splitting feature?
  • Join aggregates manually?
  • ...

@roji
Copy link
Member

roji commented Jan 31, 2024

I'm struggling to understand your argument above. Modeling e.g. blogs and posts via types that reference each other is the standard, natural way to do things in .NET (regardless of databases or ORMs); if I want to pass around a list of blogs - including their posts - each blog must reference their post - or I have to start passing around an additional dictionary just to represent the mapping between blogs and posts. I don't know of any design principle that would mandate shredding an object graph like this, or any .NET developer who would prefer to actually work in this totally disconnected way.

Given that, EF generally strives to support modeling your database via regular .NET types, with the same design principles and relationships you'd use in .NET. Throwing all of that out just because of the occasional difficulty of knowing which entities were included which weren't simply doesn't make sense to me.

Ok, how would you work with the aggregates on the read-side of the application?

EF has its normal/standard way of doing things - that's navigation properties; I recognize that there can be issues with the concept in some scenarios, i.e. needing to know/check whether some navigation property was loaded or not. At the same time, the concept is incredibly useful for passing a graph around - a blog with all of its posts, as a single unit - in application business logic; this is something that we know very, very clearly from our users.

At the end of the day, I'm happy to have conceptual discussions about how EF works and how it could work; and I appreciate your attempts to sometimes push EF to the edge and use it in atypical, novel ways. But at a very concrete level, EF (like any other component) has its way of doing things; those obviously cannot accomodate the exact tastes and needs of every user out there, and so in some cases users have to adapt their coding style to how the product works; this is one of these cases. So you'll have to make do with either regular navigations, or with manual joining on foreign keys.

@roji
Copy link
Member

roji commented Jan 31, 2024

One more note; the only way your proposal can actually help with the problem your describing, is to force the developer to always load the posts at the point where they're needed - this is something that's very bad for performance. Assume, when thinking about this problem, that eagerly loading the blogs with their posts is vital for performance, e.g. to avoid doing an additional roundtrip for each and every dependent.

@voroninp
Copy link
Author

@roji

At the same time, the concept is incredibly useful for passing a graph around

That's absolutely true with one particularly important nuance: aggregate = fixed aggregate It's the aggregate of the objects which must change within the same transaction.

i.e. needing to know/check whether some navigation property was loaded or not.

If I return you such an aggregate with optional (nullable) one to one relationship, how would you know whether it was not included, or that optional dependency does not exist at all?

@voroninp
Copy link
Author

@roji

Assume, when thinking about this problem, that eagerly loading the blogs with their posts is vital for performance, e.g. to avoid doing an additional roundtrip for each and every dependent.

I doubt these two types of entities should be changed within a single transaction. They are two different aggregates.
Yet when I make a query for a read-model, I may need to do some aggregations (like number of posts or something similar).

@voroninp
Copy link
Author

I had some check to prevent access to Navigation properties withing aggregates:

/// <summary>
/// Attribute for indicating that navigation property to other aggregate(s) should be used exclusively for reads,
/// and is not intended for access by aggregate itself.
/// </summary>
/// <remarks>
/// It's a very bad practice to have referential navigations between aggregates.
/// But in cases when aggregate is used as a source of out-DTOs it can be convenient to have navigation properties
/// queried with EF's Include. In all other cases, when aggregate requires some data belonging to other aggregate,
/// it's better to create a separate domain DTO or value object for that particular piece of data and pass it as
/// method's parameter:<br/>
/// <c>aggregateA.DoSomethingAccepting(domainDTOBuiltFromAggregateB)</c><br/><br/>
/// 
/// It is also possible to pass the aggregate itself:<br/>
/// <c>aggregateA.DoSomethingAccepting(aggregateB)</c><br/>
/// This approach, however, increases coupling between aggregates. In some circumstances this may be advantageous
/// because it will indicate close connectedness of business concepts.
/// </remarks>
[AttributeUsage(AttributeTargets.Property, AllowMultiple = false)]
public sealed class ReadOnlyNavigationAttribute : Attribute
{
}

with checks in my repository:

static bool IsProhibitedNavigationToAnotherAggregate(INavigation navigation)
    =>
    // shadow properties cannot reference other aggregates.
    navigation.IsShadowProperty()
    // private fields without public properties cannot reference other aggregates
    || navigation.FieldInfo is not null && (navigation.PropertyInfo is null || navigation.PropertyInfo.GetMethod?.IsPublic == false)
    // Not nullable field or property not allowed.
    || navigation.FieldInfo is { } field && Nullability.Context.Create(field).ReadState != NullabilityState.Nullable
    || navigation.PropertyInfo is { } property && Nullability.Context.Create(property).ReadState != NullabilityState.Nullable
    // only public nullable property with appropriate attribute can reference another aggregate.
    || !(navigation.PropertyInfo?.GetMethod?.IsPublic == true
      && Attribute.IsDefined(navigation.PropertyInfo, typeof(ReadOnlyNavigationAttribute)));

And analyzers ensuring that these properties are never accessed by entities themselves.

IDK, maybe @vkhorikov could explain the idea better =)

@roji
Copy link
Member

roji commented Jan 31, 2024

If I return you such an aggregate with optional (nullable) one to one relationship, how would you know whether it was not included, or that optional dependency does not exist at all?

You make the fact that the navigation was included part of the contract for the method in question. In other words, you mandate that the graph passed to a particular method must have its e.g. posts populated, and not doing so is a programmer error.

I doubt these two types of entities should be changed within a single transaction.

I'm not talking about transactions - I just need to load blogs and posts efficiently in order to show them to the user. The concept of an aggregate does not forbid having references outside of the aggregate.

@roji
Copy link
Member

roji commented Jan 31, 2024

I'd really examine exactly what it is you understand by the concept of aggregate here... You seem to be making some assumptions - or working with a definition - that doesn't really correpond to what I understand by the concept.

@voroninp
Copy link
Author

The concept of an aggregate does not forbid having references outside of the aggregate.

It depends on what you mean by reference. Foreign Key is also a reference. But if you mean reference to the instance of other aggregate, then it's forbidden.

@roji
Copy link
Member

roji commented Feb 1, 2024

I mean a .NET reference (e.g. between Blog and Post), with conceptually represents the same thing as a foreign key in the database.

Can you point to a resource explaining why it would be forbidden to have a .NET reference between a Blog CLR type and a Post CLR type?

@voroninp
Copy link
Author

voroninp commented Feb 1, 2024

Vaughn Vernon's article

I'll mention it once again. There are very different scenarios:

  • DDD aggregates are intended for processing commands (write model).
  • Query processing is far more laxed, there's no change at all, so it's ok to query as many related entities as one needs.

People tend to use aggregates (if they even use reach model) both for commands and queries. EF does not provide a straightforward way to define both read and write models simultaneously. Table splitting may be helpful, but it does not prevent changes.

@roji
Copy link
Member

roji commented Feb 1, 2024

People tend to use aggregates (if they even use reach model) both for commands and queries. EF does not provide a straightforward way to define both read and write models simultaneously. Table splitting may be helpful, but it does not prevent changes.

Even if I accept the idea that there shouldn't be cross-aggregate references when doing writing; given that in EF you have just one model for both reading and writing - using the same CLR type and properties - not having navigation properties for the purposes of writing would mean no navigation properties for the purpose of reading, which is a non-starter. Or am I missing something?

That would lead in the direction of having a different type model - different CLR types - for reading and writing, just so that you can omit the navigations from the write side. I'm following this train of thought out of theoretical interest, but this obviously doesn't seem like something anyone would want. It would also mean that you can no longer query for some entities, modify them and then persist back to the database, since the read types and the write types are different; so you'd need to transform the queried read hierarchy (which has navigation - so that you can load related entities, pass them around), into a write hierarchy (where there are no navigations). This basically kills the UoW pattern, at least in the simple way it's currently done, and makes things inefficient, since hierarchies have to be constantly copied/transformed from read to write.

Is this really a direction you're saying we should consider? Or have I misunderstood things?

Specifically on the article you cite above:

So the temptation to modify multiple Aggregates in the same transaction could be squelched by avoiding the situation in the first place.

This seems to be the crux of the matter; this is an attempt to remove the possibility of modifying multiple aggregates in a single transaction, by removing the navigation between them. Several thoughts on this:

  1. In EF specifically, just remove the navigation doesn't ensure you don't modify multiple aggregates in a single transaction. Since EF tracks multiple changes in its UoW, the developer can easily load multiple aggregates (related or not), make changes to them, and then persist them in a single transaction via SaveChanges. I suspect that removing the navigation might make sense in a more ActiveRecord-type of ORM, where changes are more scope-limited to the object being manipulated (and so no navigation means the change is scoped). But that's not the EF case.
  2. Even so, I'm generally skeptical of extreme defensive mechanisms such as this, where an enormous amount of complexity is introduced (severeing connections, introducing a whole service to resolve identity references instead) just to prevent the developer from making a mistake (committing two aggregates in a single transaction). The added complexity has a significant price, which IMHO just isn't worth it in the general case.

Aggregates with inferred object references are thus automatically smaller because references are never eagerly loaded. The model can perform better because instances require less time to load and take less memory. Using less memory has positive implications for both memory allocation overhead and garbage collection.

At least in EF, references are already never eagerly loaded; you need to explicitly use Include to d that. So it doesn't make sense to me to remove navigation properties just to avoid eager loading - the article possibly has other data layers in mind, or something else.

More generally, applying general DDD/theoretical concepts to EF isn't a simple or straightforward thing; EF is a specific ORM with its features and ways of doing things, and not everything is immediately relevant. I'd be wary of just reading DDD articles and trying to apply them verbatim with EF without careful consideration.

@voroninp
Copy link
Author

voroninp commented Feb 1, 2024

not having navigation properties for the purposes of writing would mean no navigation properties for the purpose of reading

Yes, the feature request is exactly to have navigations without having them as properties :-)

That would lead in the direction of having a different type model - different CLR types - for reading and writing

Ideally yes, I'd love to have that opportunity.

It would also mean that you can no longer query for some entities, modify them and then persist back to the database, since the read types and the write types are different

Read types - is what exposed to the UI. In other words, there are requests which ask for data, and there are some requests which change data. Two separate flows.
Read models are for the former, write-models are for the latter. You query the write model, you apply changes, you persist the changes. The commands usually contain IDs of the aggregates they want to affect. One rarely needs to perform multiple joins except when assembling a concrete aggregate.

For the read-models it's different, you just fetch it, but never change or persist.

I mean read and write not relative to EF.

the developer can easily load multiple aggregates (related or not), make changes to them, and then persist them in a single transaction via SaveChanges

You are right, and I had to introduce a check for this. ;-)

At least in EF, references are already never eagerly loaded; you need to explicitly use Include to do that.

For aggregates I use AutoInclude or owned entities.

@roji
Copy link
Member

roji commented Feb 1, 2024

That would lead in the direction of having a different type model - different CLR types - for reading and writing

Ideally yes, I'd love to have that opportunity.

Can you imagine what it would look like to actually work with EF in such a world of two models? Can you post an example snippet of a typical unit-of-work, where a graph of some blogs and their posts are loaded (in a single query), modified, and then persisted back? I can't imagine this being anything that anybody actually wants to do - just in order to be defensive against accidentally modifying two aggregates in a transaction.

At least in EF, references are already never eagerly loaded; you need to explicitly use Include to do that.

For aggregates I use AutoInclude or owned entities

Sure, but my point was that in EF specifically it makes little sense to remove non-owned navigations just to prevent them from being eagerly loaded, since eager loading is explicit anyway (via Include). The way I'm understanding the article, it's recommending removing the navigation for situations where having the navigations implies that they're automatically loaded; but this isn't the case in EF - you need to explicitly opt into loading navigations, which means there's no performance advantage in removing the navigations.

@voroninp
Copy link
Author

voroninp commented Feb 1, 2024

Can you post an example snippet of a typical unit-of-work, where a graph of some blogs and their posts are loaded (in a single query), modified, and then persisted back?

This is probably my English. But what you offer is exactly what I need to avoid.
You either change blog or post.

// Command
public async Task UpdatePost(int postId, UpdatePostRequest request)
{
    var post = await _repo.GetPost(postId);
    post.WithTitle(request.Title).WithContents(request.Contents);
    await _repo.Save(post);
}

// Query. I have to join explicitly, if I prohibitin having nav properties on aggregates.
public async Task<BlogOverview> GetBlogOverview(int blogId) 
{
     var query = 
         from blog in _dbContext.Blogs.Where(b => b.Id == blogId)
         join post in _dbContext.Posts on blog.Id equals post.BlogId into posts
         select new BlogOverview
         {
             Id = blogId,
             Title = blog.Title,
             Author = blog.Author, 
             PostsCount = posts.Count(),
             LastPostDate  = posts.Max(p => p.CreationDate)
         };

      return await query.FirstOrDefaultAsync():
}

// If Read-models with nav properties existed.
public async Task<BlogOverview> GetBlogOverviewWithReadModels(int blogId) 
{
     var query = _dbContext.BlogsReadModel.Where(b => b.Id == blogId)
         select new BlogOverview
         {
             Id = blogId,
             Title = blog.Title,
             Author = blog.Author, 
             PostsCount = blog.Posts.Count(),
             LastPostDate  = blog.Posts.Max(p => p.CreationDate)
         };

      return await query.FirstOrDefaultAsync():
}

// If nav-props existed as extension methods
public async Task<BlogOverview> GetBlogOverview2(int blogId) 
{
     //  just use Aggregate type.
     var query = _dbContext.Blogs.Where(b => b.Id == blogId)
         select new BlogOverview
         {
             Id = blogId,
             Title = blog.Title,
             Author = blog.Author, 
             PostsCount = blog.Posts().Count(),
             LastPostDate  = blog.Posts().Max(p => p.CreationDate)
         };

      return await query.FirstOrDefaultAsync():
}

@voroninp
Copy link
Author

voroninp commented Feb 1, 2024

The way I'm understanding the article, it's recommending removing the navigation for situations where having the navigations implies that they're automatically loaded;

The idea is that you define a fixed graph (root entity and all includes) which contains just enough data to work with business invariants while processing the transaction. The challenge of designing an aggregate is defining the properly sized graph.
The larger the graph, the more concurrency you have, and the less performant is the fetch, obviously.
The smaller the graph, the higher is the chance you won't have all the required data in your hands when you are processing a trasaction.

@roji
Copy link
Member

roji commented Feb 1, 2024

Thanks for the code sample, that helps (as always). So you're not proposing a different model for reading (with navigations) vs. for writing (without navigations) as I wrote above - you're just proposing having a single model for reading/writing, and dropping navigations altogether, forcing yourself to use explicit Joins in queries everywhere.

First, in your GetBlogOverview you've chosen a scenario where you only need the count and max date of a blog's posts (which you then just represent in an anonymous type). What happens when you need to return each blog's actual post instances, because e.g. the UI needs to show all of them? How do you represent the relationships between the blogs and the posts given that there are no navigation properties, and how does consuming code (e.g. in the UI) access them?

@voroninp
Copy link
Author

voroninp commented Feb 1, 2024

What happens when you need to return each blog's actual post instances, because e.g. the UI needs to show all of them? How do you represent the relationships between the blogs and the posts given that there are no navigation properties, and how does consuming code (e.g. in the UI) access them?

First, it's important to understand what actual data UI needs. Is it just post tile, or post title plus first several lines of the post? Or maybe the number of new comments since last visit.
Fetching the whole aggregate just to return a title is an overkill.

If I could predefine read-model (kind of virtual view) I could ask db context for a set of these models:
dbContext.BlogsWithPosts.Where(b => blogId)

But if we have only aggregates which do not have explicit navigation properties, then it will be a simple projection:

// Here it's effectively a split query. Blog is a head, and its posts are queried separately.
public async Task<BlogWithPosts> GetBlogWithPosts(int blogId) 
{
     var blog = await _db.Contex.Blogs.FindAsync(blogId);
     if (blog is null)
     {
         return NotFound();
     }

     var posts = await _dbContext.Posts.Where(p => p.BlogId == blogId)
         .Select(p => new PostPreview
         {
             Title = blog.Title,
             Head = blog.Contents.Substing(0, 255)
         })
         .ToListAsync();
    
     // combine blog and posts and return BlogWithPosts
     return ...; 
}

@roji
Copy link
Member

roji commented Feb 1, 2024

Let's assume that the entire Blog and Post entity types are needed (e.g. bound to some UI form), so a PostPreview doesn't really make sense.

You're basically introducing wrappers all along the way - BlogWithPosts - to work around the fact that you've removed the reference from Blog to its Posts. Now try imagining what you'd need to do in order to use EF's change tracking to persist arbitrary changes in the graph back to the database. Since instead of having a simple Blog referencing its Posts - which EF knows about, tracks, and can perform automatic change tracking on, you'll now have to do quite a bit of extra manual work. For example, imagine that the user moves some Post from Blog1 to Blog2; when using navigations, EF detects that and does everything automatically - how much manual work will you have to do here in order to support that?

I'll skip ahead to where I'm trying to go... Do you really think that all the added complexity here - the extra wrappers, the manual joining via foreign keys, etc. etc - are worth the value of defensively "squelching the temptation to modify multiple aggregates", as quoted in the article? Or are we just following a recommendation which might make sense for some other ORM (not sure), but very, very significantly increases complexity here? I'd argue that the risk of bugs you're introduced with this convoluted pattern far outweigh the risk of an accidental transaction including two aggregates.

@voroninp
Copy link
Author

voroninp commented Feb 1, 2024

Situation complicates a lot if aggregate becomes more complex.
As aggregate is a gateway to all the entities it contains, they must be exposed as immutable snapshot.
It means, some includes or complex objects will be private. In theory the interface of aggregate can be reducible to a set of exposed methods. (if you have acces to the course https://app.pluralsight.com/library/courses/ddd-ef-core-preserving-encapsulation/table-of-contents). In this case reusing aggregate for read queries is a pain.

@roji
Copy link
Member

roji commented Feb 1, 2024

I'm sorry, I'm not seeing the relationship between the last comment and our previous discussion...

@voroninp
Copy link
Author

voroninp commented Feb 1, 2024

Now try imagining what you'd need to do in order to use EF's change tracking to persist arbitrary changes in the graph back to the database

You should not track read models. Only aggregates.

For example, imagine that the user moves some Post from Blog1 to Blog2; when using navigations, EF detects that and does everything automatically - how much manual work will you have to do here in order to support that?

It's a very good question. If we are speaking about DDD we are doing more than just CRUD.
First, you have to answer whether the logic of change belongs to either of these aggregates.
What are the validation rules for this action, what must happen then?

void Task MovePost(int postId int destBlogId)
{
    var post = await _repo.GetPost(postId);
    if (post is null)
    {
        return NotFound();
    }
    
    // This may be not just a change to `post.BlogId`, post may publish a domain message.
    post.MoveToBlog(destBlogId, ...some required domain services...);
    await _repo.Save(post);
}

It's even questionable whether you should have a FK constraint between Blog and Post. Business rules may require you to let blog be removed, letting posts stay (No restrict, and no Cascade delete). Posts just become orphaned. Post is an aggregate. We are ok, if it does not reference a blog.

Here we do not check whether blog with destBlogId exists. It can fail on SaveChanges, if we have FK constraint. It's another story how to process that error. If source and destination blogs are interested parties, domain event handler should fetch each of them and apply required actions. Or there might be a rule to notify someone about the change.

@voroninp
Copy link
Author

voroninp commented Feb 1, 2024

I'm sorry, I'm not seeing the relationship between the last comment and our previous discussion...

Imagine entity where all the mapping is applied to private fields.
You fetch the entity by Id, and then you change its state only by calling methods it exposes.

No imagine how hard it will be to use the entity for the queries. You'll be forced to use EF.Property everywhere.

@roji
Copy link
Member

roji commented Feb 1, 2024

It's even questionable whether you should have a FK constraint between Blog and Post. Business rules may require you to let blog be removed, letting posts stay (No restrict, and no Cascade delete). Posts just become orphaned. Post is an aggregate. We are ok, if it does not reference a blog.

That can just be an optional foreign key (optional navigation); the fact that a post can exist without a blog is not a reason to not have a foreign key/navigation.

In any case, this has been an interesting conversation, and I learned some things from it. I'm not very knowledgeable about DDD, but I'm generally skeptical of the patterns being recommended, and the extreme defensiveness; all that seems to add a very large amount of complexity - as well as depart from very natural, idiomatic ways of doing things in .NET - for very little actual return value.

But at the end of the day, the request here seems to be to allow mapping some parameter-les method call on an entity type to a navigation:

ctx.Set<Blog>().Where(b => b.Posts().Any(p => ...));

Here, b.Posts() would be recognized (because of some special configuration in the model) as actually corresponding to the navigation, as an alternative to writing the more usual b.Posts. Is that a fair summary of what you're asking for @voroninp?

@voroninp
Copy link
Author

voroninp commented Feb 1, 2024

Exactly

@roji
Copy link
Member

roji commented Feb 1, 2024

OK, I'll discuss with the team (though there's very little chance we'll actually be able to work on something like this any time soon).

For my own understanding, can you point me to a resource explaining why it's better for the navigation to be accessed via a method call Posts() as opposed to via a read-only property?

@voroninp
Copy link
Author

voroninp commented Feb 2, 2024

There's no such resource. The idea is that this method should be an extension method relevant only within the query.
For materialised entities it should always return null or maybe even throw.

@roji
Copy link
Member

roji commented Feb 2, 2024

If that's your goal, then why does it have to be an extension method? Why not a simple Posts navigation property which can be translated in an EF LINQ query but throws when invoked?

@voroninp
Copy link
Author

voroninp commented Feb 2, 2024

You mean with getter which always returns null/throws? Is not EF accessing navigation properties on SaveChanges?

@roji
Copy link
Member

roji commented Feb 2, 2024

We're discussing a scenario where there is no navigation at all, right? In other words, you could have a Posts property, but tell EF to ignore it (just like it would ignore a Posts() extension method, which is what you're proposing above). So assuming you don't want EF to actually have the navigation, why not just have an unmapped property there instead of an extension method?

To be clear, neither is supported right now - but I'm trying to understand why you're looking specifically for an extension method rather than a property.

@voroninp
Copy link
Author

voroninp commented Feb 2, 2024

Ok. let me explain one more time from the very beginning.

  1. I have an aggregate - a fixed composition of entities. The head of this graph is called an Aggergate Root.
  2. The point of having an aggregate is not just to have a set of DTOs which should be changed within the same transaction, but also to expose a rich business-related behavior to ensure data correctness (If domain is simple, there's no need to make things so complex.)
  3. Aggregate root works as a gateway to its state. Ideally this state should be encapsulated. If aggregate root exposes CLR properties they are usually read-only, and the values of those properties are immutable. Otherwise, aggregate's state could be changed externally and that's bad because aggregate root must guard business invariants. State should be changed via public methods of aggregate root. The main benefit of this setup is that core behavior is adjacent to the state. It's not scattered across numerous services. It's nothing more than a good old OOP.
    As you @roji, told, this implies an additional level of complexity which is not always warranted. Obviously, aggregates are required for the flows where changes occur, it's overkill to use this machinery for producing state for UI. UI may need the state asembled from dozens of entities belonging to different aggregates. If there's no need to change the state, there's simply no need to guard the invariants.
  4. Unfortunately, there's no easy way to define two distinct yet aligned with each other models for Command and Query flows. One can define two separate database contexts or use table splitting. Alternatively, there is an option to slightly compromise and use DbSets of aggregates for Query flows: aggregate1 join with aggregate2 ... join with aggregateN [filter|group] select some projection.
  5. We have two ways to do joins in EF: explicitly by ids or implicitly with navigation properties.
  6. The existence of such property on Aggregate is not desirable (discussed above). First, it's not needed to accomplish the tasks aggregate exists for. Otherwise, it would be a part of the aggregate. It's polluting the model; additional checks must be introduced to ensure this property is not used anywhere in Command processing flow. Believe me this will happen sooner or later ;-).
    However, navigation properties are extremely powerful and convenient for querying. Therefore, I'd like to keep the convenience but get rid of the CLR-property defined on the entity. That's why I suggested introducing navigation as an extension method. The extension method would simply be a convenience over EF.Property if shadow navigation properties were supported. In other words, we have navigation for Querying, but it won't be exploitable in Command processing flow.
    One more time: Navigation should exist for Query flow but be unavailable for Command flow.

If we assume that requested feature is a no go, then we will be fine with explicit joins:

from blog in context.Blogs
join post in context.Posts on blog.Id equals post.BlogId

Then there's one more improvement possible.
Imagine, there's another query where I need to join not only blogs and posts, but tags as well:

from blog in context.Blogs
join post in context.Posts on blog.Id equals post.BlogId
join postTag in context.PostTags on post.Id equals postTag.PostId
joint tag in context.Tags on postTag.TagId equals tag.Id

First two lines are exactly the same as in the first query. What if we could express them as a queryable set?

public sealed record BlogWithPost(Blog Blog, Post Post);

DbSet<BlogWithPost> BlogsWithPosts;

then query turns into

from bwp in context.BlogsWithPosts
join postTag in context.PostTags on bwp.Post.Id equals postTag.PostId
joint tag in context.Tags on postTag.TagId equals tag.Id

Slightly shorter, isn't it?
But we need a way to define such set:

modelBuilder.View<BlogWithPost>(
    ctx => ctx.Blogs.Join(
        ctx.Posts, 
        b = b.Id, 
        p => p.BlogId, 
        (b, p) => new BlogWithPost(b, p)));

If any of the features were implemented, we'd have an option for convenient querying without affecting the Command processing side. This is the ultimate goal.

@roji
Copy link
Member

roji commented Feb 2, 2024

@voroninp thanks for the detailed explanation. I think I understand what you're saying (though I may not think that the added complexity is actually warranted).

But my question is actually much simpler... I understand that you don't want navigation properties on your entity CLR types, so that they wouldn't be usable when doing writing/updating (so that you wouldn't accidentally include two aggregates in the same transaction/SaveChanges); but at the same time, you want to be able to use navigations when querying, more or less in the usual way. So there needs to be something that isn't an actual EF navigation, but which can still be used in queries like a navigation, i.e. to tell EF to join with a related entity in the query, etc. Hopefully this summarizes what you're looking for.

My only question is why that thing - which can be used in queries but not outside of queries - must be an extension method over the CLR type rather than a property (not auto-property!) on that CLR type. To be extra clear: even as a property, it would still not be usable outside of queries; in other words, the property would only have a getter (no setter), and that getter would throw. So you would be able to reference that property in queries in the usual way - your queries would basically look exactly like regular EF LINQ queries (no weird extension methods), but at the same time they would throw whenever you try to use them outside of queries.

Again - I'm not saying this is possible today; there still needs to be some mechanism with EF to configure the property as representing the navigation, even though it has no setter and its getter should never be called.

@voroninp
Copy link
Author

voroninp commented Feb 2, 2024

To be extra clear: even as a property, it would still not be usable outside of queries; in other words, the property would only have a getter (no setter), and that getter would throw. So you would be able to reference that property in queries in the usual way - your queries would basically look exactly like regular EF LINQ queries (no weird extension methods), but at the same time they would throw whenever you try to use them outside of queries.

Ah... got you now. This would be fine.

@roji
Copy link
Member

roji commented Feb 10, 2024

@voroninp we've discussed this as a team, and decided that while shadow navigations aren't a perfect answer here, it's good enough:

ctx.Set<Blog>().Where(b => EF.Property<ICollection<Post>>(b, "Posts").Any(p => ...));

EF.Property may add a bit of verboseness, but this is the pattern already recommended e.g. for keeping database ID properties out of CLR types.

We don't plan to do the (significant) extra work to allow defining a property/method that would be usable as a navigation in queries, but not otherwise.

@AndriySvyryd I've looked around and in #240, and haven't found an issue tracking shadow navigations - do you remember if we have one?

@roji roji removed the needs-design label Feb 10, 2024
@voroninp
Copy link
Author

voroninp commented Feb 10, 2024

This would be ok. But I hope custom transformations of expression trees will be supported in the future anyway.

@roji
Copy link
Member

roji commented Feb 10, 2024

@voroninp have you looked at the LINQ expression tree interceptor introduced in EF 7.0?

@voroninp
Copy link
Author

@roji Wow, I was not aware of it. Thanks!

@AndriySvyryd
Copy link
Member

@AndriySvyryd I've looked around and in #240, and haven't found an issue tracking shadow navigations - do you remember if we have one?

It was this one #3864

@ajcvickers ajcvickers added closed-no-further-action The issue is closed and no further action is planned. and removed type-enhancement labels Feb 21, 2024
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Feb 21, 2024
@voroninp
Copy link
Author

@roji , just if you are curious how people use EF in the context of DDD ;-)

https://youtu.be/N_eLotlcjXo

@voroninp
Copy link
Author

More ideas:

Usually, entity of type A has only one relation to entity of type B.
For these cases we theoretically could have the following version of include:

B b = default;
var aWithB = dbContext.Set<A>().Select(a => a.Include(b)).Where(a => b.Prop == "something").Select(a => new { a, b}).ToList();

@voroninp
Copy link
Author

This article also explains the concern https://ardalis.com/navigation-properties-between-aggregates-modules/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported
Projects
None yet
Development

No branches or pull requests

4 participants