Describe extend functionality #511

Merged: bluemonk3y merged 23 commits into confluentinc:master from DescribeExtend on Dec 12, 2017

Conversation

@bluemonk3y (Author)

#470 - adds more useful information. The existing 'DESCRIBE' function is kept and is enhanced via 'DESCRIBE EXTENDED <stream, table, query>' to also show metrics, the underlying topic, the key field, etc.

@dguy (Contributor) left a comment

Overall, LGTM. Just wondering if we added any tests for the describe extended functionality? Didn't see any jump out at me.

@bluemonk3y (Author)

@dguy - no tests were added for the 'describe extend' functionality; it's not really doing much apart from pulling existing information and munging it into a POJO. Lots of plumbing.

I guess I could put one in the CLI test, as the extend functionality just enriches a POJO that is passed back to the client and printed to stdout. What do you think?
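As an illustration, a minimal sketch of what such a CLI-level test could look like (the class and the runCliStatement helper are hypothetical stand-ins, not the actual ksql test harness):

import static org.hamcrest.CoreMatchers.containsString;
import static org.junit.Assert.assertThat;

import org.junit.Test;

public class DescribeExtendedCliTest {

  @Test
  public void shouldPrintExtendedDescription() {
    final String output = runCliStatement("DESCRIBE EXTENDED pageviews;");
    // The enriched POJO is rendered to stdout, so asserting on a couple of
    // the extended fields (topic, key field) exercises the plumbing end to end.
    assertThat(output, containsString("Kafka topic"));
    assertThat(output, containsString("Key field"));
  }

  private String runCliStatement(final String statement) {
    // Placeholder: a real test would drive the ksql CLI and capture stdout;
    // a canned response keeps this sketch self-contained.
    return "Kafka topic : pageviews\nKey field : PAGEID\n";
  }
}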

@apurvam (Contributor) left a comment

LGTM, barring one comment.

exception);

MetricCollectors.recordError(record.topic());
return DeserializationHandlerResponse.CONTINUE;
@apurvam (Contributor) commented on the diff above:

Shouldn't this return value depend on how FAIL_ON_DESERIALIZATION is configured? Do we want to continue if FAIL_ON_DESERIALIZATION is false? Presumably the query is failed elsewhere in the latter case. If so, a comment here would help.

@bluemonk3y (Author) replied:

Thanks Apurva - if we return FAIL then all processing stops; this implementation is identical to the Kafka Streams one. This is the side channel where the user has the option to provide their own handler; that way a single record with a bad data format doesn't kill the stream processor.
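For context, a minimal sketch of the handler pattern under discussion, modeled on Kafka Streams' LogAndContinueExceptionHandler; the class name is illustrative, and the MetricCollectors import path is assumed (the recordError call itself is the hook from this PR's diff):

import java.util.Map;

import io.confluent.ksql.metrics.MetricCollectors;  // import path assumed
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.errors.DeserializationExceptionHandler;
import org.apache.kafka.streams.processor.ProcessorContext;

public class RecordErrorAndContinueHandler implements DeserializationExceptionHandler {

  @Override
  public DeserializationHandlerResponse handle(final ProcessorContext context,
                                               final ConsumerRecord<byte[], byte[]> record,
                                               final Exception exception) {
    // Count the bad record against its source topic so the error rate can be
    // surfaced by DESCRIBE EXTENDED, then keep the streams task alive.
    MetricCollectors.recordError(record.topic());
    return DeserializationHandlerResponse.CONTINUE;
  }

  @Override
  public void configure(final Map<String, ?> configs) {
    // no configuration needed for this sketch
  }
}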

private final List<FieldSchemaInfo> schema;
private final DataSource.DataSourceType type;
private final String type;
@hjafarpour (Contributor) commented on the diff above:

Why do we have to change this to string?

@bluemonk3y (Author) replied:

@hjafarpour because it was returning 'Query', which is not a valid value on the enum.
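To illustrate the mismatch, a small sketch; the enum constants are assumed from the ksql codebase of this era, and the wrapping class and field names are illustrative:

// The source-type enum only covers data sources; it has no QUERY constant,
// so a describe-extended result for a query cannot be represented by it.
enum DataSourceType { KTOPIC, KSTREAM, KTABLE }

class SourceDescription {
  // A plain String lets the same POJO carry "QUERY" as well as the enum names.
  private final String type;

  SourceDescription(final String type) {
    this.type = type;
  }

  String getType() {
    return type;
  }
}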

import io.confluent.ksql.parser.tree.ShowColumns;
import io.confluent.ksql.parser.tree.Statement;
import io.confluent.ksql.parser.tree.TerminateQuery;
import io.confluent.ksql.parser.tree.*;
Contributor

I think our stricter code style will require not using *.

Contributor

Good catch. We should not be using wildcard imports here or elsewhere. I am surprised that this passes the style check.

@bluemonk3y (Author) replied:

me 2
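For reference, the style-compliant form lists each tree node explicitly instead of using the wildcard; only the three imports visible in this diff are shown, and the real class presumably needs more:

import io.confluent.ksql.parser.tree.ShowColumns;
import io.confluent.ksql.parser.tree.Statement;
import io.confluent.ksql.parser.tree.TerminateQuery;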

@hjafarpour (Contributor)

I still think we should separate Streams/Tables, which represent data sources/sinks, from Queries, which represent computation.
If @miguno is OK with this iteration from a product point of view for this PR, let's merge it, but let's keep track of my suggestions for a future PR, which could include the following:

  • Describe Extended [stream/table name];
    This will present the detailed information on the stream or table. We wouldn't show any query information, but we would show all the queries that read from or write into this stream/table. Note that we assume an N-to-N relationship between queries and streams/tables, so there will be a set of queries associated with each stream/table. We can show the IDs of these queries in the result of this statement too.

  • Describe Query [query id];
    This will show all the details of a query, including the execution plan, the streams topology, and producer/consumer stats. Note that producer/consumer stats are for queries (i.e. streams jobs), not for data sources/sinks (i.e. streams/tables).

Finally, we would indicate that the shown data is only for one server/instance; to get a holistic view of the metrics, the user should gather metrics from all instances of a query.

@apurvam (Contributor) commented on Dec 12, 2017

There seems to be a bug with the ConsumerCollector: the ConsumerCollector.onConsume method takes an argument of type ConsumerRecords<K, V>, which can hold an entire batch of records.

The collector counts every invocation of onConsume as a single event. So even if 10000 records were consumed (and passed into the onConsume method), it would only count as 1, causing the metric to be wrong.
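For context, a minimal sketch of per-record counting against the ConsumerInterceptor contract that a collector like this typically plugs into; the class and counter below are illustrative, not ksql's actual ConsumerCollector:

import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.kafka.clients.consumer.ConsumerInterceptor;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class CountingConsumerInterceptor<K, V> implements ConsumerInterceptor<K, V> {

  private final AtomicLong consumed = new AtomicLong();

  @Override
  public ConsumerRecords<K, V> onConsume(final ConsumerRecords<K, V> records) {
    // One onConsume call delivers a whole poll batch; records.count() is the
    // total across all partitions, which may be thousands of events.
    consumed.addAndGet(records.count());
    return records;
  }

  @Override
  public void onCommit(final Map<TopicPartition, OffsetAndMetadata> offsets) { }

  @Override
  public void close() { }

  @Override
  public void configure(final Map<String, ?> configs) { }
}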

@apurvam (Contributor) commented on Dec 12, 2017

Sorry, I misread the code. The current consumer collector behavior seems correct.

@bluemonk3y (Author) commented on Dec 12, 2017

@hjafarpour - as discussed I will document this on the wiki and sync with @miguno

@dguy (Contributor) left a comment

LGTM after you have merged

@bluemonk3y (Author)

retest this please

@bluemonk3y (Author)

retest this please

@bluemonk3y bluemonk3y merged commit 6ca6220 into confluentinc:master Dec 12, 2017
@bluemonk3y bluemonk3y deleted the DescribeExtend branch January 5, 2018 09:39