Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KSQL 420 join criteria validation #447

Merged

Conversation

hjafarpour
Copy link
Contributor

No description provided.

@@ -236,8 +238,13 @@ protected Node visitJoin(final Join node, final AnalysisContext context) {
JoinOn joinOn = (JoinOn) (node.getCriteria().get());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line here will throw an exception if the criteria is not present

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a check for this.

}
if (keyInfo == null) {
throw new KsqlException(String.format("%s : Invalid join criteria %s ", comparisonExpression
.getLocation().get().toString(), comparisonExpression));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling get on an Opitional without checking if it is present. Might result in a NoSuchElementException

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you forget to push the commit? I don't see the check.

(DereferenceExpression) expression;
return leftDereferenceExpression.getFieldName();
String sourceAliasVal = dereferenceExpression.getBase().toString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this confusing. Can we at least add a method to DereferenceExpression called getSourceAlias()?
Ideally we can come up with an interface that both DereferenceExpression and QualifiedNameReference implement that enable us to get the keyFieldName. We can then get rid of the instanceof checks here and delegate to the impl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a bit difficult since these are two different nodes in our AST and SourceAlias is not always the lefthand side for these expressions. The interpretation of these expression also depends on the context.

metaStore);
final Query query = (Query) statements.get(0);
try {
final Analysis analysis = queryAnalyzer.analyize(query);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

analysis is unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the variable.

final Analysis analysis = queryAnalyzer.analyize(query);
} catch (KsqlException ex) {
assertThat(ex.getMessage().trim(), equalTo("Line: 1, Col: 46 : Invalid join criteria "
+ "(TEST1.COL1 = TEST2.COLL)"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anyway we can add something to say which column is invalid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more details.

"select * from test1 join test2 on test2.col2 = test1.col1;",
metaStore);
final Query query = (Query) statements.get(0);
final Analysis analysis = queryAnalyzer.analyize(query);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to check that the result has the correct structure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added join checks.

rightDataSource.getSchema());

String leftKeyFieldName = leftSide.getRight();
String rightKeyFieldName = rightSide.getRight();

if (comparisonExpression.getType() != ComparisonExpression.Type.EQUAL) {
throw new KsqlException("Join criteria is not supported.");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know its not part of the PR - but can the message be improved - i.e. Join criteria only supports [EQUALS]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, updated it!

try {
final Analysis analysis = queryAnalyzer.analyize(query);
} catch (KsqlException ex) {
assertThat(ex.getMessage().trim(), equalTo("Line: 1, Col: 46 : Invalid join criteria "

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cant see where the source message comes from - but more information would be useful - i.e. coll field does not exist.

Copy link
Contributor Author

@hjafarpour hjafarpour Nov 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more info to the error message.

@@ -251,20 +258,43 @@ protected Node visitJoin(final Join node, final AnalysisContext context) {
return null;
}

private String fetchKeyFieldName(Expression expression) {
private Pair<String, String> fetchKeyFieldName(ComparisonExpression comparisonExpression, String sourceAlias, Schema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow the logic here. We are trying to match the alias with both the left and right side of the expression. Isn't this too inclusive? In the calling context we are fetching both the left and right side separately. If the alias matches either, won't this validation pass erroneously?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to make sure that we don't impose an order on the comparison expression sides. For instance, left side can have the table and right side can have the stream key. We expect to see the key for both stream and key regardless of their place in the comparison expression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Perhaps it would be worth adding a comment about this. Something like 'This method picks the side of the expression which matches the given alias'.

An alternative would be to send the left and right side separately with both aliases, and allow either alias to be picked. Both are equivalent, but the suggested comment on the behavior would help, IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments providing more details on the method.

@@ -76,7 +76,7 @@ public void shouldLeftJoinOrderAndItems() throws Exception {

final String queryString = String.format(
"CREATE STREAM %s AS SELECT ORDERID, ITEMID, ORDERUNITS, DESCRIPTION FROM orders LEFT JOIN items " +
" on orders.ITEMID = item.ITEMID WHERE orders.ITEMID = 'ITEM_1' ;",
" on orders.ITEMID = items.ID WHERE orders.ITEMID = 'ITEM_1' ;",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an existing bug which now fails due to the newly raised KSQLException?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I fixed the test.

Copy link
Contributor

@dguy dguy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hjafarpour hjafarpour merged commit 19c6d41 into confluentinc:4.0.x Nov 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants