-
Notifications
You must be signed in to change notification settings - Fork 28
Attribute shadowing with join() encourages possible data leak bugs #347
Comments
I think we should throw a duplicate select column exception for any duplicated column on the result... For the implementation I think that the simpler would be to change the current Also we could hide this exception when not working with strict mode on sql, instead we would select the first value of the array and use it, as it should probably be what you'll expect as the value... Update: My exception idea was rejected so I think that we won't handle this situation at all, which isn't bad since it's something the developer should be taking care after all... I guess we can leave this open anyway so ppl can comment how other ORM's are handling this and propose alternatives... Also it keeps all discussion in one place... |
That's like saying it's unnecessary to create memory-safe programming languages because developers should not make mistakes with memory management in C, so why bother about memory safety in the first place? It's the programmer's fault if half of the world in vulnerable because of some memory management mistake in OpenSSL after all! Or it's unnecessary to have static type checking; it's the programmer's mistake if the program has type errors! I could name hundreds of other examples where smart people have decided to change how stuff works because other people tend to make too many silly mistakes. From a pragmatic point of view, if people make silly mistakes with your framework all the time, then you have simply made wrong decisions somewhere. |
I would love to get some feedback on this issue. |
Can you please provide an actual example exploit that is possible because of this? All I'm seeing so far is the theory of a possible exploit, not a concrete example where one exists in a situation that would generally occur in an application. |
There is no exploit that works for all applications. My point is that the current behavior encourages code that is prone to potential data leaks and bugs. One might argue that this is strictly speaking not a security issue but I think it's very similar to SQL injections though very limited in scope. |
As I said, can you please provide an example exploit? As it is, I don't quite understand what the exploit allows for an attacker to do. Or even how an attacker would exploit it if you aren't taking SQL commands from them. Helping get this looked at is difficult without a concrete "X report allows for Y attack to occur." |
Further, I didn't ask for an example that would affect "all", of course that isn't feasible. I simply asked for a "generally applicable" example, as in one that isn't caused by a developer explicitly opening an injection attack vector by taking user input and putting it directly into a SQL query. |
There is no general applicable exploit. There is no exploit. This issue is about preventing BUGS that may or may not have security impact. I'd say it's very likely for bugs causes by this behavior to have security consequences (other people's data being shown to visitors). |
Can you provide a case in which this bug happens? I don't know how many times a code sample needs to be asked for. |
Very first sentence in this issue |
Yes, and that code doesn't make sense anyways. Your getting a collection of multiple users from each query. Then trying to scope orders by a single user. Which means that example is already failing in multiple places. It isn't clearly demonstrating the risk you're asserting exists. |
I am sorry; here's an even more complete version of the code: function getUsers() { ... }
foreach (getUsers()->get() as $someUser) {
$someUserOrders = Order::where("user_id", $someUser->id);
...
} Simpler example without a loop: function getUserById($id) {
return User::where("id", $id)->first();
} A few months after coding this the project requirements change and we now have user groups as well, and we change the query to all users having a primary group : function getUserById($id) {
return User::where("id", $id)->join("groups", "groups.id", "=", "users.primary_group_id")->first();
} Also we always had some code that fetches users' orders which worked like this: Order::where("user_id", getUserById(...)->id); The Note that this is a problem that cannot happen with raw SQL; it's a problem introduced by Eloquent, see first post in this issue thread. |
What about using I feel like doing a manual join in this kind of scenario is a quirky edge-case for most Laravel users. |
From a quick test: No, it only happens with Thing is we have quite some places where we can't use I guess loads of people must be using |
So, if it is only with |
$query->has('PrimaryGroup') |
NO. Eloquent confuses the object's IDs. Let me make a very very concrete example. var_dump(User::where('login', 'foo@bar.com')->first()->id);
int(269)
var_dump(User::where('login', 'foo@bar.com')->join('groups', 'primary_group_id', '=', 'groups.id')->first()->id);
int(261) // <-- wrong id! Queries made by Eloquent are correct:
But when constructing the model objects, Eloquent chooses THE WRONG ID COLUMN. (EDIT: Or, rather, it's ambiguous, and it choses one of them.) |
Eloquent should be using aliases for all tables for all queries so that these ambiguities go away. The simplest way would be to something like |
So, this isn't really a security problem. It is a possible data leak issue. There isn't anything a user can really do here to force this and abuse it. It's like, if all the stars align just right and the developer implements a feature X way then the wrong data could be retrieved. I'll try pinging Taylor to peek at this later today. But fixing this seems like it may be a slow churn and possibly a big 6.0 type fix since changing this could break existing applications quite heavily. Thank you for the full examples, they helped a ton in understanding the exact problem! |
🎉 Thanks for being patient enough and asking all the questions. I'll definitely have to work on my "how to file a good bug report" skills :-) |
Also, I might be interested in coming up with the first patch for this issue. |
As far as bug reporting you got one part really well which is describing the technical side. However, that can be confusing which is where also providing a direct, workable code sample comes in. Beyond that, have a section that clearly describes a recommended solution and try to provide some information on foreseen implementation problems, (BC breaks, developer confusion, etc.) so there is some meat to be discussed on that end. It also works well when you use headings to show the sections of the issue and |
Also, this does cause an active exploit against sites if they use the query builder to join based directly on content sent from a client. If a user catches on and the proper checks aren't in place for the requested data, they can cause a force dump of targeted data out of the system. This is once again, "assuming all the stars align" in probability. Therefore, not a critical priority to get addressed. Still something we should help guard against if possible. |
I just had a look at the Eloquent/query internals and oh boy, using unique tables aliases as I initially suggested for sure doesn't look like an easy undertaking. IIUC Eloquent constructs model instances simply by taking SQL result row arrays and setting them as model attributes. For example if your SQL query returns an array It's also horrible from a deprecation path point of view, as many have suggested in the discussion already. I guess lots of people rely on the fact that model instances may "receive" attributes from other tables, aggregates, raw queries, ..., and changing that could be painful even with a long deprecation window. Maybe something along the following?
Step 1 should be reasonable, maybe even step 2. Step 3 and 4 sound a bit crazy though... |
How am I to walkaround this bug? I want to retrieve a model, with a condition about it joined table. $movie = Movie::query()
->where('name', 'Fightclub')
->join('directors', 'director.id', '=', 'movies.director_id')
->where('directors.country', 'USA')
->first(); Now |
I'm not an Eloquent user, but what you explain is whereHas, not a join. The original issue only occurs if you write queries with joins manually, instead of letting Eloquent build the queries automatically. Eloquent never issues joins, afaik.
|
Will it also eager load director record? Because of the |
Yes, it will eager-load all directors for any movie that has at least one director with |
This is how SQL works. |
No, it isn't. This is perhaps how the Eloquent handles it, or how PDO handles it. But it's absolutely not how SQL works. This issue/idea/suggestion is about fixing/supporting it. A database query can easily produce a result with columns with identical names; Throwing an exception telling the developer that there are duplicate column names (and they are expected to be unique) is a better solution that just overwriting some values as is currently done. |
@taylorotwell No, this is not "how SQL works." The closest you'd get to something a query like
is the SQL statement
which fails in MySQL and PostgreSQL with an error that the Besides, what kind of argument is this? You've got a well-defined, real-world problem that multiple users have experienced, with possibly catastrophic outcomes. While the root cause may not be easy to fix, there's an obvious mitigation for the problems that can arise from it: Detect duplicate columns when doing an SQL query and throw a runtime error. None of the ORMs I have ever used have had this kind of problem. It came as a nasty surprise to me: I consider protecting your users from shooting themselves into their own feet is a large part of an ORM's responsibility. (Unless users explicitly request that are protections be lifted, i.e. going to down raw SQL.) How many more Laravel users are affected by this issue remains unclear; I expect only a very small subset of them to find this GitHub bug report. |
Totally agree. At a minimum, put it in the docs to warn devs that this situation can happen. Thanks all. |
Why are we not being explicit in the table columns names when using the query builder? This would avoid this problem completely. And then on result in Eloquent, you map the column names from the table name and put them into the properties appropriately. Specifying table names in the column selects would, ultimately, rest this case and still have SQL work the way it's designed. |
@tdondich Because no one knows what columns exists in the database. Most queries are a simple |
FYI Ruby on Rails uses inspection to retrieve the list of columns. IIRC it does this during application boot and caches it (google the term Schema Cache). |
Something similar is also available in Doctrine where you declare your properties on your model, and map them accordingly to database columns. This means that Doctrine knows of every database column and can alias them accordingly. Laravel, however, does not have this functionality. |
Of course it doesn't have it at this point, but that's a bad argument for not adding another feature that depends on it. The solution is simple: one could add the feature. |
Sure. I misinterpreted the situation. I believe the current heading of this issue was to introduce exceptions when there's duplicate column names in the result. There are also other issues that touches the subject of introducing schemas/model metadata for Laravel. But in the end, if we want something that large, why not switch to Doctrine which has that feature already? |
similar issue as laravel/framework#23548 and definitely not sql problem to avoid duplicate column conflict relation() is defined as
There is only one related model with id of 1 as returned by base sql, but somewhere while creating a related model the id is overwritten by intermediate model id |
Problem starts in |
I've stopped counting how often I've been bitten by this. Every join between tables of same-names columns without explicit selection of columns is prone to this problem. This is one of the things CakePHP2 "got right" [1]:
That's also why, even in only a single "Model" (if you can even call it so) result, is always "nested": $post = $Post->find('first');
// Equals this structure
$post = [
'Post' => [
'id',
…
]
] Under the hood:
This approach however is at odds with so many things how Eloquent works so pretty much unfeasible I think. Fun fact: I recently had a performance issues with complex results and joins and there wrote exactly such a converter:
It felt wrong, but I couldn't find another way except doing everything manual. There was also mentioned No solution to the problem, just providing some perspective. [1] "got right" as in: does not suffer from this particular problem, doens't mean it's a saint with other things |
I guess you can "resolve" this by adding a global query scope. Untested [Doesn't work]: class MyModel extends Model
{
/**
* A side-effect that is loaded when the model is 'booted', to only query the model its own columns.
*
* @return void We only did things in secret
*/
protected static function booted()
{
static::addGlobalScope('only_my_columns', function (Builder $builder) {
$builder->select($this->table . '.*'); // Edit: TODO: Wait a minute, you don't have access to $this in a static function.
});
}
} For sure this means you need to remember to add extra columns of other tables, if you need them in one your models. Edit: Also it doesn't work since there is no "Hey Siri, what's an ORM?" 😅 |
I have written a workaround for this when I initially reported the issue that forces you to always explicitly select all the columns when doing joins. Can't find the code right now though. Honestly, at this point I would just recommend to everyone to not use Eloquent in the first place. This issue has been largely ignored by the developers for over 4 years now, with lots of people suffering from exactly the symptoms I have proclaimed, and also the Eloquent developers themselves having to plug workarounds at other places in the code because of the issue. Eloquent developers continue to claim that "this is how SQL works" here and in other places as well, while clearly proved wrong. |
Yeah, naming all required columns is a best practice in SQL (relational algebra): https://stackoverflow.com/a/3639964/273668 Sadly Eloquent didn't start that explicit way. Maybe it could be added as a kind of 'Eloquent2 is the future' to the side. Preference to the columns of your Model its own table seems pretty sane default though 🤔. Maybe someday someone will create one of these fancy SQL to typesafe interface/classes tooling for PHP 🤷♂️. So you can actually know what comes out. |
Hi, I'm new to the Laravel community and didn't know about this thread so I have opened another issue where I have described something that surprised me totally and that is actually the topic of this thread. This is actually a possible security issue and I have explained it in my comment. For a short explanation, if use users and tenant_users tables to authenticate the user and check is it belongs to the tenant, you can end up with a properly authenticated user which id is overwritten with the id from tenant_users which by accident could belong to some other record from the users table which has higher privileges. When you after that use that user model to load relations it would load them by id from that other user... (there is an example of code in issue) |
From #34671 @driesvints:
I think this is a major step forward from @taylorotwell claiming
:-) Now we just need someone to solve the problem on the technical level and come up with a PR. 🥳 |
I didn't say it would be easy. I was just saying that if anyone finds a genuine solution to this problem that they're free to open up a PR. |
That's great! I really meant it. No irony or sarcasm involved. I think it is going to be really difficult to solve, and painful in terms of backwards compatibility, but definitely worth the trouble, considering that it will prevent lots of security issues and data leaks, and save lots of debugging-hours for Laravel users. |
I don't think that Laravel should mandatory change how it works. The biggest issue with this is that it silently overwrite value and if you are not aware of it, it raise some issues. I believe @jonashaag aready suggested how it should be fixed:
From my point of view as someone who is new to Laravel raising warn/error if duplicate/ambiguous column was found would be all what I need to know. It is my responsibility like developer to solve that issue and be more preceise. My issue was that I was assumed that it works like I have expected from my previous experience with similar tools and I wasn't warned that something is not good. I will be happy if I can help somehow with this. |
@HenkPoley I took your idea and was able to create a Trait that can be used in each model or if you have a base Model then you can add it there. I haven't tested it much but it seems to work for simple cases. I was able to get the table name by using trait OnlyMyColumnsTrait
{
public static function bootOnlyMyColumnsTrait()
{
static::addGlobalScope('only_my_columns', function (Builder $builder) {
$builder->select($builder->getModel()->getTable().'.*');
});
}
} |
@ejunker I like your idea. I have finally got some time to check how PDO works in this case and I was very much surprised to see that it actually work the same way as Eloquent. I'm surprised because all other SQL clients that I have used ever was handling this "correctly". |
Please see laravel/framework#16704, I believe the way duplicate column names (for example:
users.id
,groups.id
when you join the both tables) are currently handled by Eloquent A) can be a surprise to many users and B) is a potential security issue.Summary from the discussion linked above:
id
overriding each other in case you're doing a join being a potential security riskSELECT
ambiguous column names, but Eloquent has an implicitSELECT
clause that "randomly" chooses one of the columns and ignores the rest of themThe text was updated successfully, but these errors were encountered: