Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.InvalidOperationException: Unable to translate a collection subquery in a projection... using Union or Concat #26703

Closed
marianosz opened this issue Nov 16, 2021 · 21 comments
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported

Comments

@marianosz
Copy link

Hello team

I have this piece of code that is working fine on EF Core 5, and previously on v3:

This is a property defined on the DbContext:

public IQueryable<Candidate> MyCandidates
{
    get
    {
        if (this.usersService.UserHasPermission("all:candidates"))
        {
            return this.Candidates;
        }

        var userId = this.usersService.GetUserId();

        var candidates = this.Candidates
            .Where(x =>
                x.CreatedByUserId == userId || x.OwnedByUserId == userId);

        var asignedCandidates =
            from cs in this.Candidates
            from an in cs.Assignations
            where an.AssignedUserId == userId &&
            (an.Status == AssignationStatus.Approved ||
            an.Status == AssignationStatus.Direct)
            select cs;

        return asignedCandidates
            .Union(candidates);
    }
}

Basically, if the current user has the "all:candidates" permission, he can query using the Candidates DbSet, if not, he can query the union of candidates created by him, plus the candidates assigned to him.

Then when I try to run the following query, if the user doesn't have the "all:candidates" permission, and uses the code with the Union() I have the following exception:

var userId = this.usersService.GetUserId();

var query = dbContext.MyCandidates
    .AsSplitQuery()
    .Include(x => x.Pins
        .Where(x => x.UserId == userId))
    .Include(x => x.MainRole)
    .Include(x => x.Flags)
        .ThenInclude(x => x.Flag)
    .Include(x => x.ManagerFlags)
        .ThenInclude(x => x.Flag)
    .Include(x => x.Technologies
        .Where(x => !x.NotInterested))
        .ThenInclude(x => x.Technology)
    .Include(x => x.Activities
        .Where(x => x.Visibility == ActivityVisibility.Public))
    .Include(x => x.Age)
    .AsNoTracking()
    .ToList();

The same exception occurs with Concat() instead of union, and is not happening when I remove the Union and return one query or another

2021-11-15 22:19:32.1843|ERROR|ExceptionHandlerMiddleware|Exception: System.InvalidOperationException: Unable to translate a collection subquery in a projection since either parent or the subquery doesn't project necessary information required to uniquely identify it and correctly generate results on the client side. This can happen when trying to correlate on keyless entity type. This can also happen for some cases of projection before 'Distinct' or some shapes of grouping key in case of 'GroupBy'. These should either contain all key properties of the entity that the operation is applied on, or only contain simple property access expressions.
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SelectExpression.ApplyProjection(Expression shaperExpression, ResultCardinality resultCardinality, QuerySplittingBehavior querySplittingBehavior)
   at Microsoft.EntityFrameworkCore.Query.Internal.SelectExpressionProjectionApplyingExpressionVisitor.VisitExtension(Expression extensionExpression)
   at System.Linq.Expressions.Expression.Accept(ExpressionVisitor visitor)
   at System.Linq.Expressions.ExpressionVisitor.Visit(Expression node)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryTranslationPostprocessor.Process(Expression query)
   at Microsoft.EntityFrameworkCore.Query.QueryCompilationContext.CreateQueryExecutor[TResult](Expression query)
   at Microsoft.EntityFrameworkCore.Storage.Database.CompileQuery[TResult](Expression query, Boolean async)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.CompileQueryCore[TResult](IDatabase database, Expression query, IModel model, Boolean async)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.<>c__DisplayClass12_0`1.<ExecuteAsync>b__0()
   at Microsoft.EntityFrameworkCore.Query.Internal.CompiledQueryCache.GetOrAddQuery[TResult](Object cacheKey, Func`1 compiler)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.ExecuteAsync[TResult](Expression query, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryProvider.ExecuteAsync[TResult](Expression expression, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryable`1.GetAsyncEnumerator(CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.ConfiguredCancelableAsyncEnumerable`1.GetAsyncEnumerator()
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.ToListAsync[TSource](IQueryable`1 source, CancellationToken cancellationToken)
   at x.HR.API.Controllers.CandidatesController.List(String filter, String sortBy, String sortDirection, Int32 page, Int32 pageSize) in /Users/x/src/x-hr-backend/x.HR.API/Controllers/CandidatesController.cs:line 102
   at lambda_method94(Closure , Object )
   at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.AwaitableObjectResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeActionMethodAsync>g__Awaited|12_0(ControllerActionInvoker invoker, ValueTask`1 actionResultValueTask)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeNextActionFilterAsync>g__Awaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeInnerFilterAsync()
--- End of stack trace from previous location ---
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeFilterPipelineAsync>g__Awaited|20_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
2
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
   at Microsoft.AspNetCore.Routing.EndpointMiddleware.<Invoke>g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)
   at Microsoft.AspNetCore.Localization.RequestLocalizationMiddleware.Invoke(HttpContext context)
   at x.HR.API.Infrastructure.Middlewares.Cache.NoCacheMiddleware.Invoke(HttpContext httpContext) in /Users/x/src/x-hr-backend/x.HR.API/Infrastructure/Middlewares/Cache/NoCacheMiddleware.cs:line 25
   at Microsoft.AspNetCore.Authorization.Policy.AuthorizationMiddlewareResultHandler.HandleAsync(RequestDelegate next, HttpContext context, AuthorizationPolicy policy, PolicyAuthorizationResult authorizeResult)
   at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.<Invoke>g__Awaited|6_0(ExceptionHandlerMiddleware middleware, HttpContext context, Task task)
Url: /api/candidates
Query String: ?page=0&pageSize=30
Method: GET

EF Core version: 6.0.0
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET 6.0
Operating system: MacOs
IDE: VS Code

@ajcvickers
Copy link
Member

@marianosz Please attach a small, runnable project or post a small, runnable code listing that reproduces what you are seeing so that we can investigate.

@marianosz
Copy link
Author

Ok, i will try to build something to reproduce...

@marianosz
Copy link
Author

@marianosz
Copy link
Author

@ajcvickers I did more research, looks like the issue is related to the Includes, I tested the query removing the Include for related collections, and is working.

@ajcvickers
Copy link
Member

@smitpatel Simplified query. Does not fail if Include is removed:

var asignedCandidates =
    from cs in context.Candidates
    join an in context.CandidateAssignations on cs.Id equals an.AssignedId
    select cs;

var result = asignedCandidates
    .Union(context.Candidates)
    .Include(x => x.Technologies)
    .ToList();

@ajcvickers
Copy link
Member

@marianosz This change is by-design for 6.0. This query now throws because, depending on the results, the query could easily return bad data in 5.0. As the message says, "This can also happen for some cases of projection before 'Distinct' or some shapes of grouping key in case of 'GroupBy'. These should either contain all key properties of the entity that the operation is applied on, or only contain simple property access expressions."

@smitpatel
Copy link
Member

We should add fwlink to the exception message and explain on the doc about set operations limitation when they can contain repeating elements.

@angelochiello
Copy link

@smitpatel Simplified query. Does not fail if Include is removed:

var asignedCandidates =
    from cs in context.Candidates
    join an in context.CandidateAssignations on cs.Id equals an.AssignedId
    select cs;

var result = asignedCandidates
    .Union(context.Candidates)
    .Include(x => x.Technologies)
    .ToList();

I have a similar but more complex query and it works only if "include" are just before executing (ToList()).

@smitpatel
Copy link
Member

You cannot include a collection navigation after performing a set operation like that since your set operation result can contain duplicate items.

@StevenRasmussen
Copy link

Just ran into this... So what is the correct way to try and perform a Union but also include related properties/collections? I tried to perform the Include on the queries before the Union but with the same result. Ie:

var result = asignedCandidates.Include(x => x.Technologies)
    .Union(context.Candidates.Include(x => x.Technologies))
    .ToList();

Still throws the error though... maybe this isn't possible 🤷‍♂️

@ajcvickers
Copy link
Member

@StevenRasmussen It depends how the queries are created. For example, this is fine:

var filteredBlogs = context.Blogs.Where(e => e.Id == 1);
var blogs = filteredBlogs.Union(context.Blogs).Include(e => e.Posts).ToList();

But this is not:

var filteredBlogs = context.Blogs.Where(e => e.Id == 1).Select(e => new Blog { Id = e.Id });
var blogs = filteredBlogs.Union(context.Blogs).Include(e => e.Posts).ToList();

Full repro code:

public class Blog
{
    public int Id { get; set; }
    public int SomeOtherId { get; set; }
    public string Name { get; set; }

    public List<Post> Posts { get; } = new();
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }

    public Blog Blog { get; set; }
}

public class SomeDbContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(Your.ConnectionString)
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();

    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }
}

public class Program
{
    public static void Main()
    {
        using (var context = new SomeDbContext())
        {
            context.Database.EnsureDeleted();
            context.Database.EnsureCreated();

            context.Add(new Blog {Posts = {new Post(), new Post()}});
            context.Add(new Blog {Posts = {new Post(), new Post(), new Post()}});
            context.SaveChanges();
        }

        using(var context = new SomeDbContext())
        {
            // var filteredBlogs = context.Blogs.Where(e => e.Id == 1).Select(e => new Blog { Id = e.Id });
            // var blogs = filteredBlogs.Union(context.Blogs).Include(e => e.Posts).ToList();

            var filteredBlogs = context.Blogs.Where(e => e.Id == 1);
            var blogs = filteredBlogs.Union(context.Blogs).Include(e => e.Posts).ToList();

            foreach (var blog in context.Blogs)
            {
                Console.WriteLine($"Blog {blog.Id} with {blog.Posts.Count} posts.");
            }
        }
    }
}

@StevenRasmussen
Copy link

@ajcvickers - Thanks for taking the time to respond... and for the demo project! You have pointed me in the right direction. It appears based on your project and from me updating mine that you can't really do a projection on the results of the union as part of the select statement if it includes any of the include properties/collections. What I ended up doing is simply selecting the actual entity before projecting to a new one. Thanks!

@StevenRasmussen
Copy link

@ajcvickers - Looks like I spoke too soon ☹️ . My situation is a bit more complicated. It appears that this breaks down if one of the queries has a join to another table, even if the resulting query only selects the same entity type. I have updated your example:

using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;

public class Blog
{
    public int Id { get; set; }
    public int SomeOtherId { get; set; }
    public string Name { get; set; }
    public List<Post> Posts { get; } = new();
}

public class SuperBlog : Blog
{
    public string SomeOtherProperty { get; set; }
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public int BlogId { get; set; }
    public Blog Blog { get; set; }
}

public class SomeDbContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(Your.ConnectionString)
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();

    public DbSet<Blog> Blogs { get; set; }
    public DbSet<SuperBlog> SuperBlogs { get; set; }
    public DbSet<Post> Posts { get; set; }
}

public class Program
{
    public static void Main()
    {
        using (var context = new SomeDbContext())
        {
            context.Database.EnsureDeleted();
            context.Database.EnsureCreated();

            context.Add(new Blog { Posts = { new Post(), new Post() } });
            context.Add(new Blog { Posts = { new Post(), new Post(), new Post() } });
            context.SaveChanges();
        }

        using (var context = new SomeDbContext())
        {
            var filteredBlogs = context.Blogs.Where(e => e.Id == 1);

            var superBlogs = from b in context.Blogs
                             join sb in context.SuperBlogs on b.Id equals sb.Id
                             select b;

            var blogsWithJoin = from b in context.Blogs
                                join p in context.Posts on b.Id equals p.BlogId
                                select b;

            try
            {
                // This doesn't work :(.  I thought it was because of the inheritance ob 'Blog' -> 'SuperBlog'
                var blogs = filteredBlogs.Union(superBlogs).Include(e => e.Posts).ToList();
            }
            catch (Exception ex) { }

            try
            {
                // This proves that if there is any 'join' then the query fails
                var blogs = filteredBlogs.Union(blogsWithJoin).Include(e => e.Posts).ToList();
            }
            catch (Exception ex) { }

            foreach (var blog in context.Blogs)
            {
                Console.WriteLine($"Blog {blog.Id} with {blog.Posts.Count} posts.");
            }
        }
    }
}

Should this be supported?

@angelochiello
Copy link

angelochiello commented Nov 25, 2021

Have you tried to remove the 2 "join" lines from superBlogs and blogsWithJoin? Just a thought looking at yoyr code.

@StevenRasmussen
Copy link

@angelochiello - Yes, I know that in this particular case you could remove the join clauses... it was more just to demonstrate the issue of using union's with join clauses. In reality, I'm performing some where clauses on the joins to filter out some of the data. Thanks for taking a look though.

@angelochiello
Copy link

angelochiello commented Nov 25, 2021

@StevenRasmussen In my case, wheres were OK. Problem was on the include part.
So I did something like;

  1. var query = _ctx.Entity.Where(x => x.Property == 1);
  2. // some other query;
  3. // some union
  4. query = query.Include(x => x.ReferencedEntity);
  5. var list = query.ToList();

If I put include on line 1, it doesen't work. on line 4, id does.
I think "join" are translated as include, am I wrong?

@StevenRasmussen
Copy link

So to be clear, what I’m looking for is the ability to use union with multiple queries that contain joins and also be able to use include statements on the final query. Like the OP, I am dynamically building a query based on permissions, that I would like to use as my core query when dealing with the entity type. I then further modify that core query with include statements and/or join with other tables depending on the shape of the data I want to return in any particular situation.

@meinsiedler
Copy link

We faced the same exception as the OP when using .Concat in combination with .Include. After we changed the .Concat call to .Union, the query worked as expected and generated the same SQL query as EF Core 5.0.12 did.

With EF 5.0.12 both, .Concat and .Union produced the same SQL query.

@smitpatel
Copy link
Member

In EF Core, in order to do collection include, it requires each record of the parent (on which the collection will be populated) uniquely identifiable. Since then we can fetch related records and fix up the navigations properly. This is by design and cannot be changed else it can cause duplicated records in the collection/improperly filled collection/collection not marked as loaded. The concept of uniquely identifiable is at row level which means we do track origin, and even if the parent entity is duplicated, if the row have unique identifier we still work with above principal. An example would be Orders.Select(o => o.Customer), since each customer can multiple orders, the result set will have duplicated customer entities but together with PK of Orders, each row in the result is still unique (OrderId, CustomerId together creates a composite unique key).

With above things in mind

  • Concat will never work since it doesn't remove duplicate records. Union works since it removes duplicates between both result sets.
  • When you do join it can give rise to duplicate results when the cardinality is more than one.
from c in Customers
join o in Orders on c.CustomerId equals o.CustomerId
select c

Above query will have duplicated customers in the result. By tracking origin, EF Core will also include OrderId in unique identifier.

But once you apply set operation with above query to something else which has different origin, there is no way to uniquely identify rows anymore. So we cannot do collection include. This is something which cannot be changed unless we come up with altogether different idea in how to do collection include.

  • If the set operation is not doing any paging afterwards then the only difference to server side is one round trip vs two. Split the query to do set operation on client side.
  • If set operation is required to be performed on server then don't apply collection include to them. Rather, load them explicitly after fetching the results of set operation from server.
  • Use FromSql* methods to construct the result of set operation (this put set operation on server side) and then do collection include. We assume that results coming out of FromSql* is going to be unique on the entity type it returns. You can do this only if the set operation you want to perform is not giving rise to duplicate entity results.

@StevenRasmussen
Copy link

@smitpatel - Thanks for the detailed explanation! I think I understand the issue now and how to move forward in my particular scenario. I appreciate the effort into providing such a well thought out response!

@ajcvickers ajcvickers added the closed-no-further-action The issue is closed and no further action is planned. label Dec 6, 2021
@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
@SoftCircuits
Copy link

SoftCircuits commented Oct 12, 2023

@ajcvickers Is there a more detailed explanation of this anywhere? How would it return invalid data if AsNoTracking() is used? In my case, I need a different Select() syntax for each of the two results I'm Union()ing together. I don't see how to make that work reliably if this is an issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported
Projects
None yet
Development

No branches or pull requests

7 participants