Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LINQ Any() statement generates invalid SQL #6710

Closed
caleblloyd opened this issue Oct 7, 2016 · 11 comments
Closed

LINQ Any() statement generates invalid SQL #6710

caleblloyd opened this issue Oct 7, 2016 · 11 comments
Assignees
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Milestone

Comments

@caleblloyd
Copy link

Steps to reproduce

Models:

public class Blog
{
    public int BlogId { get; set; }
    public string Title { get; set; }

    public List<Post> Posts { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }

    public int BlogId { get; set; }
    public Blog Blog { get; set; }
}

The issue

LINQ queries using .Any() are causing an incorrect SQL EXISTS query to be executed:

db.Blogs.Include(m => m.Posts).OrderByDescending(m => m.Id).Skip(5).Take(10).Any()
//translates to SQL:
SELECT CASE WHEN EXISTS ( SELECT 1 FROM `Blogs` AS `m`) THEN TRUE ELSE FALSE END LIMIT 10 OFFSET 5

I'd expect a correct EXISTS statement to look more like the query in the COUNT statement:

db.Blogs.Include(m => m.Posts).OrderByDescending(m => m.Id).Skip(5).Take(10).Count()
//translates to SQL:
SELECT COUNT(*)
FROM (
    SELECT `m`.`Id`, `m`.`Title`
    FROM `Blogs` AS `m`
    ORDER BY `m`.`Id` DESC
    LIMIT 10 OFFSET 5
) AS `t`

Therefore the correct EXISTS statement should be something like:

SELECT CASE WHEN EXISTS (
    SELECT 1
    FROM `Blogs` AS `m`
    ORDER BY `m`.`Id` DESC
    LIMIT 10 OFFSET 5
) THEN TRUE ELSE FALSE END

Further technical details

EF Core version: 1.0.1
Operating system: Ubuntu 16.04

I am using the Pomelo.EntityFramework.MySql driver (I am a contributor to the project). I believe this issue is present in the Microsoft.EntityFrameworkCore.Relational package though and may affect all drivers.

@smitpatel
Copy link
Contributor

Skip/Take should not be lifted from the subquery.

@caleblloyd
Copy link
Author

@smitpatel neither should OrderBy. The whole query needs to remain in-tact

@smitpatel
Copy link
Contributor

smitpatel commented Oct 7, 2016

@caleblloyd - Would the result be any different in absence/presence of OrderBy? Exists would return true as long as there are 6 records in Blogs table. Ordering really doesn't matter.
Queries:

EXISTS (
    SELECT 1
    FROM `Blogs` AS `m`
    ORDER BY `m`.`Id` DESC
    LIMIT 10 OFFSET 5
)

EXISTS (
    SELECT 1
    FROM `Blogs` AS `m`
    LIMIT 10 OFFSET 5
)

@caleblloyd
Copy link
Author

caleblloyd commented Oct 7, 2016

@smitpatel it doesn't matter in this case, but I imagine if this query were to end up nested into another query (but the ORDER BY stays on the outer query) it might. If this query will never be nested, then order by seems to not matter.

@caleblloyd
Copy link
Author

@smitpatel additionally, if a caller calls .Any() to check existence of a big record set, then calls .ToArray() if it does exist, there's a chance that the DBMS may be able to use a cached query.

By removing the ORDER BY in the EXISTS, the DBMS has no hope of caching the query.

@rowanmiller rowanmiller added this to the 1.1.0 milestone Oct 7, 2016
@maumar
Copy link
Contributor

maumar commented Oct 8, 2016

@caleblloyd we got a lot of feedback that people like smaller/more elegant queries. We should not be making them more complicated than necessary to satisfy fringe scenarios like accidentally being able to reuse same cache entry for two different queries.

@caleblloyd
Copy link
Author

@maumar why not submit the query in the form that the caller requests? If they hand EF an ORDER BY in an EXISTS it's probably for a reason.

The magical query reduction is what is causing bugs like these in the first place.

@yukozh
Copy link
Contributor

yukozh commented Oct 9, 2016

MyCAT(A distributed MySQL cluster middleware) does not return records in order, so I suppose this should be fixed.

@rowanmiller rowanmiller modified the milestones: 1.1.0, 1.2.0 Oct 17, 2016
@caleblloyd
Copy link
Author

caleblloyd commented Oct 20, 2016

@rowanmiller can this get a higher priority? Incorrect generated SQL seems like it should be fixed sooner than later

If the issue is programmer burden, I'm happy to take a shot at a PR

@rowanmiller
Copy link
Contributor

It's in the set of bugs allocated to be fixed in our next release (1.1 has pretty much locked now, so it will be 1.2).

@anpete could comment on how feasible this would be as a contribution

@anpete
Copy link
Contributor

anpete commented Oct 24, 2016

Sending back to triage to discuss where to put the fix.

@rowanmiller rowanmiller added this to the 1.1.0 milestone Oct 25, 2016
@smitpatel smitpatel added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Oct 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Projects
None yet
Development

No branches or pull requests

6 participants