Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when doing unions where projections contain NULL values #3283

Closed
edgar-tamm-uptime opened this issue Sep 19, 2024 · 8 comments · Fixed by #3291
Closed

Error when doing unions where projections contain NULL values #3283

edgar-tamm-uptime opened this issue Sep 19, 2024 · 8 comments · Fixed by #3291
Assignees
Labels
bug Something isn't working
Milestone

Comments

@edgar-tamm-uptime
Copy link

Hello,

Stumbled upon a similar case as #2366, but the error still happens.

Package: "Npgsql.EntityFrameworkCore.PostgreSQL" version 8.0.4

Sample that reproduces error:
EFCoreUnionWithNullsError.zip

Hopefully using Aspire is fine as it was the easiest way to give a open & run the example reproduction (assuming you have Aspire Workflow & Docker).
The main project is the "App" project, this can be run without Aspire, but you will need a postgresql instance & pass in the connection string as the first argument.

When run it should give an error with the following message: UNION types text and bigint cannot be matched.

Hopefully the example is not too messy (and doesn't look too contrived as it basically is what is done in the actual app).

Additional note (as described in comments in 2366), I managed to work around the issue, by having custom DbFunctions which add type casts to each NULL value.

If you need more info I will try to provide it.

Thank you.

@edgar-tamm-uptime
Copy link
Author

A slightly more compact version:

var sourceQuery = dbContext.Entries;

var itemsQuery =
    from g in sourceQuery
        .GroupBy(x => x.Data2Id)
        .Select(x => new
        {
            Id = x.Key,
            Total = x.Sum(y => y.Total),
        })
     join d in dbContext.Entries2 on g.Id equals d.Id
     select new ResultDto
     {
         Id = null,
         Data2Id = g.Id,
         Data2Name = d.Name,
         Name = null,
         Date = null,
         Total = g.Total,
     };

// If this would be done last then there would be no error, but EF doesn't actually add casts
itemsQuery = itemsQuery.Concat(sourceQuery
    .GroupBy(x => true)
    .Select(x => new ResultDto
    {
        Id = null,
        Data2Id = null,
        Data2Name = "Grand Total",
        Name = null,
        Date = null,
        Total = x.Sum(y => y.Total),
    }));

itemsQuery = itemsQuery.Concat(sourceQuery.Select(x => new ResultDto
{
    Id = x.Id,
    Data2Id = x.Data2Id,
    Data2Name = null,
    Name = x.Name,
    Date = x.Date,
    Total = x.Total,
}));

var result = await itemsQuery.ToListAsync();

@roji
Copy link
Member

roji commented Sep 19, 2024

Thanks, here's a stripped down minimal repro as a console program (these are always best):

await using var context = new BlogContext();
await context.Database.EnsureDeletedAsync();
await context.Database.EnsureCreatedAsync();

var itemsQuery =
    (
        from g in context.Entries
            .GroupBy(x => x.Data2Id)
            .Select(x => new
            {
                Id = x.Key,
                Total = x.Sum(y => y.Total),
            })
        join d in context.Entries2 on g.Id equals d.Id
        select (long?)null
    )
    .Concat(context.Entries.Select(x => (long?)null))
    .Concat(context.Entries.Select(x => (long?)x.Id));

// OK:
// var itemsQuery = context.Entries.Select(x => (long?)null);
// itemsQuery = itemsQuery.Concat(context.Entries.Select(x => (long?)null));
// itemsQuery = itemsQuery.Concat(context.Entries.Select(x => (long?)x.Id));

var result = await itemsQuery.ToListAsync();

public class BlogContext : DbContext
{
    public DbSet<Data> Entries => Set<Data>();
    public DbSet<Data2> Entries2 => Set<Data2>();

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseNpgsql("Host=localhost;Username=test;Password=test")
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();
}

public class Data
{
    public long Id { get; set; }
    public long Data2Id { get; set; }
    public decimal Total { get; set; }
}

public class Data2
{
    public long Id { get; set; }
    public required string Name { get; set; }
}

@roji roji self-assigned this Sep 19, 2024
@roji roji added the bug Something isn't working label Sep 19, 2024
@roji roji added this to the 9.0.0 milestone Sep 19, 2024
@ChrisJollyAU
Copy link
Contributor

@roji I fixed the same thing in EFCore.Jet CirrusRedOrg/EntityFrameworkCore.Jet@08718b8

I just did the cast in the sql generator (can't cast NULL to any specific type - had to be a variant CVar(NULL))

I did check Sql Server and it is fine there so I don't believe it is an upstream problem but just an individual provider

Sql Server query

SELECT NULL AS [c]
      FROM (
          SELECT [e].[Data2Id] AS [Id]
          FROM [Entries] AS [e]
          GROUP BY [e].[Data2Id]
      ) AS [e1]
      INNER JOIN [Entries2] AS [e0] ON [e1].[Id] = [e0].[Id]
      UNION ALL
      SELECT NULL AS [c]
      FROM [Entries] AS [e2]
      UNION ALL
      SELECT [e3].[Id] AS [c]
      FROM [Entries] AS [e3]

EFCore.Jet query

SELECT CVar(NULL) AS `c`
      FROM (
          SELECT `e`.`Data2Id` AS `Id`
          FROM `Entries` AS `e`
          GROUP BY `e`.`Data2Id`
      ) AS `e1`
      INNER JOIN `Entries2` AS `e0` ON `e1`.`Id` = `e0`.`Id`
      UNION ALL
      SELECT CVar(NULL) AS `c`
      FROM `Entries` AS `e2`
      UNION ALL
      SELECT `e3`.`Id` AS `c`
      FROM `Entries` AS `e3`

@roji
Copy link
Member

roji commented Sep 21, 2024

@ChrisJollyAU Thanks...

I haven't yet looked into this issue in depth, but PG infers the type from the first set operation; there's a special post-processing visitor (code to inject and explicit cast into the projection of the first set operation. I'm suspecting something's wrong with that visitor, though I'll take a look and keep your comment in mind (I'll post my findings here)

@ChrisJollyAU
Copy link
Contributor

Yeah there is a slight problem in the visitor - specifically where it can change the state to AlreadyCompenstated

Debugging through it when it processes one of the inner SELECTS

SELECT e."Data2Id" AS "Id"
          FROM "Entries" AS e
          GROUP BY e."Data2Id"

After it passes through that projection, even though it hasn't done anything it still changes the state to AlreadyCompensated which then prevents anything else being handled.
A simple change to only update the state if we actually did something works
_state = parentState == State.InNestedSetOperation && changed ? State.AlreadyCompensated : parentState;

This produces a working SQL of

SELECT NULL::bigint AS c
      FROM (
          SELECT e."Data2Id" AS "Id"
          FROM "Entries" AS e
          GROUP BY e."Data2Id"
      ) AS e1
      INNER JOIN "Entries2" AS e0 ON e1."Id" = e0."Id"
      UNION ALL
      SELECT NULL AS c
      FROM "Entries" AS e2
      UNION ALL
      SELECT e3."Id" AS c
      FROM "Entries" AS e3

Is that the right strategy you want?

roji added a commit to roji/efcore.pg that referenced this issue Sep 21, 2024
@roji
Copy link
Member

roji commented Sep 21, 2024

@ChrisJollyAU I'm just working on this :)

Yes, you're right about the diagnosis - but I think the general approach/design of that visitor is fundamentally wrong...! I've submitted #3291 which redoes it in a much simpler way, and which I think aligns more correctly with the PostgreSQL rules. There are other various scenarios where the current approach doesn't do its job right, and the new, simpler design in #3291 should take care of them...

Can I ask you to give #3291 a review and tell me what you think?

@ChrisJollyAU
Copy link
Contributor

Looks fine. You are only doing the NULLs on the left part of the union rather than on both sides? By the looks of it postgresql only does need just the left side with a type to work

@roji
Copy link
Member

roji commented Sep 21, 2024

@ChrisJollyAU yep. The point of the new implementation is also to always do this, rather than the previous implementaiton, which detected nested set operations etc.

roji added a commit to roji/efcore.pg that referenced this issue Sep 21, 2024
@roji roji closed this as completed in a894bdc Sep 21, 2024
WhatzGames pushed a commit to WhatzGames/efcore.pg that referenced this issue Dec 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants