-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: check ambiguous column reference #12467
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -132,7 +132,12 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { | |
nested_names, | ||
) { | ||
match planner_result { | ||
PlannerResult::Planned(expr) => return Ok(expr), | ||
PlannerResult::Planned(expr) => { | ||
// sanity check on column | ||
schema | ||
.check_ambiguous_name(qualifier, field.name())?; | ||
return Ok(expr); | ||
} | ||
PlannerResult::Original(_args) => {} | ||
} | ||
} | ||
|
@@ -143,6 +148,8 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { | |
} | ||
// found matching field with no spare identifier(s) | ||
Some((field, qualifier, _nested_names)) => { | ||
// sanity check on column | ||
schema.check_ambiguous_name(qualifier, field.name())?; | ||
Ok(Expr::Column(Column::from((qualifier, field)))) | ||
} | ||
None => { | ||
|
@@ -186,6 +193,9 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { | |
let s = &ids[0..ids.len()]; | ||
// safe unwrap as s can never be empty or exceed the bounds | ||
let (relation, column_name) = form_identifier(s).unwrap(); | ||
// sanity check on column | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As the comment says, how about adding a // sanity check on column
schema.check_ambiguous_qualified_name(relation.as_ref(), column_name)?;
Ok(Expr::Column(Column::new(relation, column_name)))
I suspect that we should not allow two identical qualified fields within a schema, but this would violate PR 6091. Therefore, introducing a method like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That‘s good idea. |
||
schema | ||
.check_ambiguous_name(relation.as_ref(), column_name)?; | ||
Ok(Expr::Column(Column::new(relation, column_name))) | ||
} | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1209,3 +1209,20 @@ drop table t1; | |
|
||
statement ok | ||
drop table t2; | ||
|
||
# Test SQLancer issue: https://github.com/apache/datafusion/issues/12337 | ||
statement ok | ||
create table t1(v1 int) as values(100); | ||
|
||
## Query with Ambiguous column reference | ||
query error DataFusion error: Schema error: Ambiguous reference to unqualified field v1 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it should be an "Ambiguous reference to qualified field", but currently the SchemaError cannot describe it, so the current situation is acceptable to me. |
||
select count(*) | ||
from t1 | ||
right outer join t1 | ||
on t1.v1 > 0; | ||
|
||
query error DataFusion error: Schema error: Ambiguous reference to unqualified field v1 | ||
select t1.v1 from t1 join t1 using(v1) cross join (select struct('foo' as v1) as t1); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 |
||
|
||
statement ok | ||
drop table t1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the newly introduced functions
qualified_field_with_qualified_name
andqualified_fields_with_qualified_name
do more than what we need. They might be unnecessary.If I understand correctly, this function can be implemented as follows:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we no longer need to add these two functions🤔,
qualified_field_with_qualified_name
andqualified_fields_with_qualified_name
, along with the modifications toqualified_field_with_name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two functions will check the ambiguous column reference, it's necessary, or we could not identify the case below:
create table t1(v1 int);
select count(*)
from t1
right outer join t1
on t1.v1 > 0;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need add
check_ambiguous_name
at L132.The changes to
qualified_field_with_name
will cause the following query to produce an incorrect result.On the main branch, it gives
This is because it breaks search_dfschema, making it unable to find
t1.v1
, but instead found the nested columnt1
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The column reference 't1.v1' is ambiguous in query.
select t1.v1 from t1 join t1 using(v1) cross join (select struct('foo' as v1) as t1);
Fix it later and also add to test case.