Tag Search with Value Regex #87
Comments
Yes. This would be a lovely feature, but it's kind of difficult to implement for one of our data stores (Cassandra). AFAIK, you can't do prefix searches on Cassandra unless you use SASI indices. (We didn't have a very good experience when we used SASI indices previously for Jaeger; it was too slow compared with creating and maintaining our own index tables.) For Cassandra specifically, we'd need to create a separate SASI index for the tags table, which is doable, and then update the query service. If you were to use the Elasticsearch backend, prefix searches would be easy to do with very minor changes. The alternative is to have the query service provide possible tags to the UI when the user starts to filter based on tags. I think we can get away with this for both Cassandra and ES without the need for extra tables or indices, but I'm not sure how responsive this technique will be for the UI user.
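The suggestion alternative mentioned above could look roughly like the following on the UI side. This is only a sketch: `suggestTagValues` and its `knownValues` argument are hypothetical, and it assumes the query service can hand the UI a list of tag values it already knows about, which is not an existing Jaeger API.

```typescript
// Hypothetical helper: `knownValues` stands in for tag values that the query
// service would supply; nothing here is an existing Jaeger API.
function suggestTagValues(knownValues: string[], typed: string, max: number): string[] {
  const out: string[] = [];
  for (const value of knownValues) {
    // Keep values that start with what the user has typed so far, without duplicates.
    if (value.indexOf(typed) === 0 && out.indexOf(value) < 0) {
      out.push(value);
      if (out.length >= max) break; // cap the list so the dropdown stays short
    }
  }
  return out;
}

// The user has typed "/fo" into the value part of the tag filter.
const suggestions = suggestTagValues(["/foo", "/foo/bar", "/baz", "/foo"], "/fo", 5);
```

The user would then pick a fully qualified value from the list, so the storage layer only ever sees exact-match queries.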
As a user of the Jaeger UI, the experience should be abstracted from the individual storage driver that I'm using, so I'm not sure we'd want to touch the storage layer at all for this. This would also add complexity if other storage drivers were added. Here's a solution that might be the best of both worlds: only send tags to the query service that are fully qualified (not a regex). This will return a set of spans to the UI that are filtered but still require more filtering. Then the UI could easily filter the result set further based on the tags that had a regex value search. Example: Imagine I execute a Jaeger UI search and my tags field has the following: http.request.url.path:/foo*|http.response.status.code:200 The UI would ask the query service to retrieve all spans that have the tag The obvious downside to this is that we are receiving more results from the query service than we need. However, this approach would be a lot better than receiving all spans for a service and letting the UI filter on all the tags. I would also add that IMHO this user experience/feature is far more important than performance, although the above solution would greatly help with performance.
In addition, the above solution would only require a change to the UI and no other component, which would be a plus.
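A minimal sketch of the split described above, assuming that a `*` in a tag value marks it as a wildcard pattern and that tags are separated by `|` as in the example. All names here (`parseTagQuery`, `wildcardToRegExp`, `matchesPatterns`, the `Tags` shape) are hypothetical and not taken from the Jaeger codebase.

```typescript
// Hypothetical types and helpers; none of these names come from Jaeger itself.
type Tags = Record<string, string>;

interface TagQuery {
  exact: Tags;                      // fully qualified tags, safe to send to the query service
  patterns: Record<string, RegExp>; // wildcard tags, applied to the spans that come back
}

// Escape regex metacharacters (except "*"), then turn each "*" into ".*".
function wildcardToRegExp(value: string): RegExp {
  const escaped = value.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function parseTagQuery(input: string): TagQuery {
  const query: TagQuery = { exact: {}, patterns: {} };
  for (const part of input.split("|")) {
    const sep = part.indexOf(":");
    if (sep < 0) continue; // skip malformed entries
    const key = part.slice(0, sep).trim();
    const value = part.slice(sep + 1).trim();
    if (value.indexOf("*") >= 0) {
      query.patterns[key] = wildcardToRegExp(value);
    } else {
      query.exact[key] = value;
    }
  }
  return query;
}

// True when every pattern tag is present on the span and matches its value.
function matchesPatterns(tags: Tags, patterns: Record<string, RegExp>): boolean {
  return Object.keys(patterns).every(
    (key) => tags[key] !== undefined && patterns[key].test(tags[key])
  );
}

const tagQuery = parseTagQuery("http.request.url.path:/foo*|http.response.status.code:200");
// tagQuery.exact would be sent to the query service; these spans stand in for its response.
const returned: Tags[] = [
  { "http.request.url.path": "/foo/bar", "http.response.status.code": "200" },
  { "http.request.url.path": "/baz", "http.response.status.code": "200" },
];
const visible = returned.filter((tags) => matchesPatterns(tags, tagQuery.patterns));
```

Because the query string is split on `|`, this sketch cannot express a pattern that itself contains `|`; a real implementation would need a less ambiguous syntax.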
Sounds reasonable. What should we do if the user passes in only non-qualified tags? Are you opposed to having the query service take care of the filtering based on the non-qualified tags? I.e., have the storage query based on the qualified tags and then have the query service filter on the non-qualified tags. This should reduce the amount of data over the wire. I think in the short term we can do what you've suggested and do everything in the UI, but when we do support pagination, it might be easier to have some work done on the query side for efficiency reasons. Do you have any UI resources who could build a quick prototype of this? @yurishkuro @tiffon @vprithvi thoughts?
@black-adder You are correct, this could just as easily be done in the query service after it receives a response from the storage driver. I don't oppose that at all. In the short term, I do have resources that can quickly do this on the frontend. I will talk to them now. The change should be simple and consist of the following two items:
I am not in favor of building this kind of logic into the UI. Also, the approach has a fundamental flaw. Suppose we take the example of query
I agree that we'd run into issues with the
Sorry, not sure what you mean by "equally as challenging" - the query service is the only place to implement any sort of logic for the queries; the UI should never be doing any logic. But doing the resolution in the query service does not address the
Makes sense now. So if the query service observes that there is a tag that has a regex in it, it would need to retrieve more than the limit. This is where the challenge lies. On the UI, is the limit field meant to improve performance, or to shorten the list of spans that the user sees so it's easier to make sense of them? The reason I ask is that I'd be a proponent of removing the Of course, performance can still be achieved if a user of the UI narrows certain parameters like time and increases the number of qualified tags.
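One way to picture "retrieve more than the limit" is a loop that keeps fetching batches until enough spans survive the regex filter. The sketch below assumes the storage can return results in pages, which this thread notes is not supported yet; `collectMatches`, `FetchPage`, and the `maxPages` cap are hypothetical names for illustration only.

```typescript
// Hypothetical signature: returns up to `size` items starting at `offset`,
// already filtered by the fully qualified tags.
type FetchPage<T> = (offset: number, size: number) => T[];

function collectMatches<T>(
  fetchPage: FetchPage<T>,
  matches: (item: T) => boolean,
  limit: number,
  pageSize: number,
  maxPages: number
): T[] {
  const results: T[] = [];
  // Stop when the limit is reached or the page cap is hit, whichever comes first.
  for (let page = 0; page < maxPages && results.length < limit; page++) {
    const batch = fetchPage(page * pageSize, pageSize);
    for (const item of batch) {
      if (matches(item) && results.length < limit) {
        results.push(item);
      }
    }
    if (batch.length < pageSize) break; // a short batch means storage has no more results
  }
  return results;
}

// Stand-in data: plain path strings instead of real spans.
const paths = ["/foo/1", "/bar", "/foo/2", "/baz", "/foo/3", "/qux"];
const firstTwo = collectMatches<string>(
  (offset, size) => paths.slice(offset, offset + size),
  (path) => /^\/foo/.test(path),
  2,  // limit
  2,  // pageSize
  10  // maxPages
);
```

The `maxPages` cap bounds how much data a very selective pattern can pull from storage; without it, a pattern that matches nothing would scan every span that matches the qualified tags.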
Same here. If you've seen Ben Sigelman's demos of Lightstep (e.g. at Monitorama'17), their UI just starts showing you the results matching the query - there isn't even a Submit button iirc, the results are dynamically filtered. I don't know what kind of storage they are using internally that allows them to search so quickly.
That would be great functionality to work towards! Until then, what do we think about the following steps needed to accomplish the above solution:
That's a non-starter, if you search by some common tag like
@yurishkuro isn't this what you were a proponent of in your last comment?
If you mean this https://github.com/uber/jaeger-ui/issues/87#issuecomment-332898783, I was describing the user-facing functionality, which does not expose any notion of the LIMIT. I don't believe you can actually get rid of the limit completely in the storage implementation. If you do
How would this work, though, with Jaeger's goal of supporting multiple backends (MySQL, ES, Cassandra, etc.)? If we do find a really good storage driver that allows us to do iterative queries, how could we continue to provide a pluggable backend driver?
I personally don't think supporting multiple backends is an official goal of Jaeger - we do it out of necessity, as different teams may have different levels of experience with C v ES v MySQL, so some choice in the matter is healthy, but not to the point where we start supporting every imaginable storage backend. But if we insist that the high-level functionality is the lowest common denominator of those backends, I don't think it's going to be a good path. If we had a clear winner in performance and features, I would totally put most weight on such a backend, leaving the others to "best effort, you might lose some features".
I’d definitely agree with that approach. Would it make sense to ask which db will let us perform queries with regex values (like influx)?
I'd like to know the answer too. ElasticSearch would allow these queries, but its write performance is worse than Cassandra's, so it's a trade-off. Someone was trying to use InfluxDB for traces, but I don't know what came out of it (we have jaegertracing/jaeger#272).
This is a feature request
It is very common to have the http url path as a tag. Below is an example that I'll reference throughout this issue:
I have encountered numerous use cases at my organization where developers need to troubleshoot an issue and the only thing they really know is part of the request url path (e.g. /foo). Currently, when performing a tag search in the UI, one must know the entire value of the tag. Instead of having to know the entire tag value, it would greatly improve the user experience if you were able to search on a regex. Below is an example using the Jaeger query language: http.request.url.path:/foo*
Of course, I am using a specific example here (http request path), however, this feature would obviously support all tag values.
This would appear to be a feature that would be immensely helpful in numerous use cases. I am happy to begin work on this right away pending further discussion.
P.S.
Before filing this issue, it was a toss-up whether I should create it in this repository or the main repository. This functionality could be implemented in either the UI or the query service. It all depends on how the flow is structured. This reduces to the following question: when receiving spans from the query service, where are tags filtered? Does this happen at the React component level? Are the tags sent to the query service and filtered on the backend?