Tag Search with Value Regex #87

Open
frankgreco opened this issue Sep 27, 2017 · 19 comments

Comments

@frankgreco

This is a feature request

It is very common to have the HTTP URL path as a tag. Below is an example that I'll reference throughout this issue:

http.request.url.path = /foo/bar/car

I have encountered numerous use cases at my organization where developers need to troubleshoot an issue and the only thing they really know is part of the request URL path (e.g. /foo). Currently, when performing a tag search in the UI, one must know the entire value of the tag. Instead of requiring the entire tag value, it would greatly improve the user experience if you were able to search on a regex. Below is an example using the Jaeger query language:

http.request.url.path:/foo*

Of course, I am using a specific example here (http request path), however, this feature would obviously support all tag values.
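For illustration, here is a minimal sketch of what matching a tag value against such a pattern could look like. It assumes the `*` in `/foo*` is meant as a shell-style wildcard (in strict regex syntax, `/foo*` would mean `/fo` followed by zero or more `o` characters). The `tag_matches` helper and the dictionary-of-tags span representation are hypothetical, not part of Jaeger.

```python
from fnmatch import fnmatchcase


def tag_matches(tags, key, pattern):
    """Return True if `tags` has `key` and its value matches the wildcard `pattern`.

    `tags` is a plain dict of tag name -> string value (an assumed representation).
    """
    value = tags.get(key)
    return value is not None and fnmatchcase(value, pattern)


span_tags = {
    "http.request.url.path": "/foo/bar/car",
    "http.response.status.code": "200",
}

print(tag_matches(span_tags, "http.request.url.path", "/foo*"))
print(tag_matches(span_tags, "http.request.url.path", "/bar*"))
```

With `fnmatchcase`, `*` matches any run of characters, including `/`, so `/foo*` matches `/foo/bar/car` but `/bar*` does not.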

This would appear to be a feature that would be immensely helpful in numerous use cases. I am happy to begin work on this right away pending further discussion.

P.S.

Before opening this issue, it was a toss-up as to whether I should create it in this repository or the main repository. This functionality could be implemented in either the UI or in the query service. It all depends on how the flow is structured, which reduces to the following question: when spans are received from the query service, where are tags filtered? Does this happen at the React component level? Are the tags sent to the query service and filtered on the backend?

@frankgreco
Author

/cc @bradwilliams-nm @pfremm

@black-adder
Contributor

Are the tags sent to the query service and filtered on the backend

Yes. This would be a lovely feature but kinda difficult to implement for one of our data stores (Cassandra). AFAIK, you can't do prefix searches on Cassandra unless you use SASI indices. (We didn't have a very good experience when we used SASI indices previously for Jaeger; it was too slow compared with creating and maintaining our own index tables.) For Cassandra specifically, we'd need to create a separate SASI index for the tags table, which is doable, and then update the query service.

If you were to use the Elasticsearch backend, prefix searches would be easy to do with very minor changes.

The alternative is to have the query service provide possible tags to the UI when the user starts to filter based on tags. I think we can get away with this for both Cassandra and ES without the need for extra tables or indices, but I'm not sure how responsive this technique will be for the UI user.

@frankgreco
Author

@black-adder

As a user of the Jaeger UI, the experience should be abstracted from the individual storage driver that I'm using, so I'm not sure we'd want to touch the storage layer at all for this. Doing so would also add complexity if other storage drivers were added. Here's a solution that might be the best of both worlds:

Only send tags to the query service that are fully qualified (not a regex). This will return a set of spans to the UI that are filtered but still require more filtering. Then, the UI could easily filter the result set further based off of the tags that had a regex value search.

Example:

Imagine I execute a Jaeger UI search and my tags field has the following:

http.request.url.path:/foo*|http.response.status.code:200

The UI would ask the query service to retrieve all spans that have the tag http.response.status.code and a value of 200. Then, the UI would just filter that result set so that only spans that have the tag http.request.url.path and a value matching the regex /foo* are shown.
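The flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `|`-separated `key:value` syntax is taken from the example in this comment, a value is treated as a pattern when it contains a wildcard character, and `split_tag_query` and `filter_spans` are hypothetical helpers rather than existing Jaeger UI code.

```python
from fnmatch import fnmatchcase

# Characters that mark a value as a wildcard pattern (an assumption for this sketch).
WILDCARD_CHARS = set("*?[")


def split_tag_query(query):
    """Split 'k:v|k:v' into (exact, patterns) dicts."""
    exact, patterns = {}, {}
    for clause in query.split("|"):
        key, _, value = clause.partition(":")
        target = patterns if WILDCARD_CHARS & set(value) else exact
        target[key] = value
    return exact, patterns


def filter_spans(spans, patterns):
    """Keep only spans whose tags match every wildcard pattern."""
    return [
        span for span in spans
        if all(
            key in span["tags"] and fnmatchcase(span["tags"][key], pattern)
            for key, pattern in patterns.items()
        )
    ]


exact, patterns = split_tag_query(
    "http.request.url.path:/foo*|http.response.status.code:200"
)

# Stand-in for what the query service might return for the exact tags only.
returned = [
    {"tags": {"http.request.url.path": "/foo/bar/car", "http.response.status.code": "200"}},
    {"tags": {"http.request.url.path": "/health", "http.response.status.code": "200"}},
]
shown = filter_spans(returned, patterns)
```

Only `exact` would be sent to the query service; `patterns` is applied afterwards to whatever comes back.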

The obvious downside to this is that we are receiving more results from the query service than we need. However, this approach would be a lot better than receiving all spans for a service and letting the UI filter on all the tags.

I would also add that IMHO this user experience/feature is far more important than performance, although the above solution would greatly help with performance.

@frankgreco
Author

In addition, the above solution would only require a change to the UI and no other component which would be a plus.

@black-adder
Contributor

Sounds reasonable. What should we do if the user passes in only non-qualified tags?

Are you opposed to having the query service take care of the filtering based on the non-qualified tags? I.e., have the storage query based on the qualified tags and then have the query service filter on the non-qualified tags. This should reduce the amount of data over the wire. I think in the short term, we can do what you've suggested and do everything in the UI, but when we do support pagination, it might be easier to have some work done on the query side for efficiency reasons.

Do you have any UI resources who could build a quick prototype of this?

@yurishkuro @tiffon @vprithvi thoughts?

@frankgreco
Author

@black-adder You are correct, this could just as easily be done in the query service after it receives a response from the storage driver. I don't oppose that at all.

In the short term, I do have resources that can quickly do this on the frontend. I will talk to them now. The change should be simple and consist of the following two items:

  1. Modify the call to the query service to make sure that it doesn't include the tags that contain a regex.
  2. Filter the results when the result set is returned from the query service so that the regex tags are filtered on.

@yurishkuro
Member

I am not in favor of building this kind of logic into the UI.

Also, the approach has a fundamental flaw. Suppose we take the example of query http.request.url.path:/foo* AND http.response.status.code:200. Almost every trace for that service will match the second condition. But because of the LIMIT 20 the storage will return only 20 records with a high probability that none of them match the URL path. We already have that issue with Cassandra anyway even if the first clause was an exact tag match, but by moving this logic to the query service we'll create the same problem even for Elasticsearch backend, even though ES could've answered the wildcard query directly & correctly.
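A small simulation makes the flaw concrete. Everything here is synthetic and assumed for illustration: 1000 in-memory spans that all have status 200, of which only 10 have a path starting with `/foo`, and a `storage_query` stand-in that applies the limit itself, the way the storage implementations do.

```python
from fnmatch import fnmatchcase

# Synthetic data: every span has status 200; only every 100th has a /foo path.
spans = [
    {
        "id": i,
        "tags": {
            "http.response.status.code": "200",
            "http.request.url.path": "/foo/bar" if i % 100 == 99 else "/other",
        },
    }
    for i in range(1000)
]


def storage_query(exact, limit):
    """Stand-in for a storage backend that applies LIMIT before returning."""
    hits = [
        s for s in spans
        if all(s["tags"].get(k) == v for k, v in exact.items())
    ]
    return hits[:limit]


def path_matches(span):
    return fnmatchcase(span["tags"]["http.request.url.path"], "/foo*")


# Exact tag goes to storage with LIMIT 20; the wildcard is applied afterwards.
page = storage_query({"http.response.status.code": "200"}, limit=20)
matches_after_limit = [s for s in page if path_matches(s)]

# What a backend that evaluates the wildcard itself would have found.
total_matching = sum(1 for s in spans if path_matches(s))

print(len(matches_after_limit), "of", total_matching, "matching spans survive")
```

In this setup the first 20 rows contain no `/foo` paths, so post-filtering returns nothing even though matching spans exist in storage.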

@frankgreco
Author

I agree that we'd run into issues with the LIMIT 20. However, I'd need more clarification as to why this would be equally as challenging if implemented in the query service.

@yurishkuro
Member

sorry, not sure what you mean by "equally as challenging" - query service is the only place to implement any sort of logic for the queries, UI should never be doing any logic.

But doing the resolution in the query service does not address the LIMIT 20 problem in any way, because the limit is applied by the storage implementations, not the query service.

@frankgreco
Author

because the limit is applied by the storage implementations, not the query service.

Makes sense now.

So if the query service observes that there is a tag that has a regex in it, it would need to retrieve more than the limit. This is where the challenge lies.

On the UI, is the limit field meant to increase performance or to shorten the list of spans that the user sees so it's easier to make sense of them? The reason I ask is that I'd be a proponent of removing the LIMIT from the actual db query. Then the query service can further filter the results by the search tags that have a regex, and then limit the actual number of spans sent back to the user as defined in LIMIT. While this would hurt performance, it would achieve the desired behavior while keeping it abstracted from the query.

Of course, performance can still be achieved if a user of the UI narrows certain parameters like time and increasing the number of qualified tags.

@yurishkuro
Member

I'd be a proponent of removing the LIMIT from the actual db query.

Same here. If you've seen Ben Sigelman's demos of Lightstep (e.g. at Monitorama'17), their UI just starts showing you the results matching the query - there isn't even a Submit button iirc, the results are dynamically filtered. I don't know what kind of storage they are using internally that allows them to search so quickly.

@frankgreco
Author

their UI just starts showing you the results matching the query - there isn't even a Submit button iirc, the results are dynamically filtered.

That would be great functionality to work towards!

Until then, what do we think about the following steps needed to accomplish the above solution:

  1. Remove LIMIT from the DB queries.
  2. Query the DB on all tags that do not contain regex values
  3. Filter result set on tags that contain regex values
  4. Return LIMIT or fewer spans in the response.

@yurishkuro
Member

Remove LIMIT from the DB queries.

That's a non-starter. If you search by some common tag like span.kind=server you'll potentially get millions of matching records, and it's going to kill any storage.

@frankgreco
Author

@yurishkuro isn't this what you were a proponent of in your last comment?

@yurishkuro
Member

yurishkuro commented Sep 28, 2017

If you mean this https://github.com/uber/jaeger-ui/issues/87#issuecomment-332898783, I was describing the user-facing functionality, which does not expose any notion of the LIMIT. I don't believe you can actually get rid of the limit completely in the storage implementation. If you do select * from tags_index where tag="span.kind=server" and it matches 1 million records, even though the whole result set won't be shipped to the client, it would still need to be computed in Cassandra, which requires a lot of resources. The only way I see of implementing Lightstep's behavior is by using some iterative queries, possibly with increasing LIMITs (but in reality we need a different storage backend that is better suited for searching; Cassandra is simply not it).
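One possible reading of "iterative queries with increasing LIMITs" is sketched below. It is an assumption-laden illustration, not a proposal for the actual query service: `storage_query` is an in-memory stand-in, `search_with_growing_limit` and its `start_limit`/`max_limit` parameters are hypothetical, and doubling the limit is just one growth policy.

```python
from fnmatch import fnmatchcase

# Synthetic data: every span has status 200; only every 100th has a /foo path.
spans = [
    {
        "id": i,
        "tags": {
            "http.response.status.code": "200",
            "http.request.url.path": "/foo/bar" if i % 100 == 99 else "/other",
        },
    }
    for i in range(1000)
]


def storage_query(exact, limit):
    """Stand-in for a storage backend that applies LIMIT itself."""
    hits = [
        s for s in spans
        if all(s["tags"].get(k) == v for k, v in exact.items())
    ]
    return hits[:limit]


def search_with_growing_limit(exact, patterns, want, start_limit=20, max_limit=10_000):
    """Re-query with a doubled LIMIT until `want` spans match the patterns,
    the storage runs out of rows, or `max_limit` is reached."""
    limit = start_limit
    while True:
        page = storage_query(exact, limit)
        matches = [
            s for s in page
            if all(
                k in s["tags"] and fnmatchcase(s["tags"][k], p)
                for k, p in patterns.items()
            )
        ]
        exhausted = len(page) < limit  # storage returned everything it has
        if len(matches) >= want or exhausted or limit >= max_limit:
            return matches[:want]
        limit = min(limit * 2, max_limit)


found = search_with_growing_limit(
    {"http.response.status.code": "200"},
    {"http.request.url.path": "/foo*"},
    want=5,
)
```

Each round re-reads the rows already seen, so the hard cap still matters; a paging cursor, where the backend offers one, would avoid the repeated work.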

@frankgreco
Author

How would this work, though, with Jaeger's goal of supporting multiple backends (MySQL, ES, Cassandra, etc.)? If we do find a really good storage driver that allows us to do iterative queries, how could we continue to provide a pluggable backend driver?

@yurishkuro
Member

I personally don't think supporting multiple backends is an official goal of Jaeger - we do it out of necessity, as different teams may have different levels of experience with Cassandra vs. ES vs. MySQL, so some choice in the matter is healthy, but not to the point where we start supporting every imaginable storage backend. But if we insist that the high-level functionality is the lowest common denominator of those backends, I don't think it's going to be a good path. If we had a clear winner in performance and features, I would totally put most weight on such a backend, leaving the others to "best effort, you might lose some features".

@frankgreco
Author

I’d definitely agree with that approach. Would it make sense to ask which DB will let us perform queries with regex values (like Influx)?

@yurishkuro
Member

I'd like to know the answer too. Elasticsearch would allow these queries, but its write performance is worse than Cassandra's, so it's a trade-off. Someone was trying to use InfluxDB for traces, but I don't know what came out of it (we have jaegertracing/jaeger#272).


4 participants