Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async EQL POS (was: Make AsyncSearchIndexService reusable) #55119

Closed

Conversation

imotov
Copy link
Contributor

@imotov imotov commented Apr 13, 2020

@jimczi @javanna we need basically the same functionality in EQL. I took a stab
at refactoring some of the parts of async search to make the reusable in EQL.
Let me know what you think.

EQL will require very similar functionality to async search. This PR refactors
AsyncSearchIndexService to make it reusable for EQL.

Relates to #49638

EQL will require very similar functionality to async search. This PR refactors
AsyncSearchIndexService to make it reusable for EQL.

Relates to elastic#49638
@imotov imotov added :Search/Search Search-related issues that do not fall into other categories >refactoring :Analytics/EQL EQL querying labels Apr 13, 2020
@imotov imotov requested review from javanna and jimczi April 13, 2020 16:27
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (:Query Languages/EQL)

@imotov
Copy link
Contributor Author

imotov commented Apr 16, 2020

@javanna I added POC for the Transport and REST layers as you requested.

@javanna javanna self-assigned this Apr 20, 2020
Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look at this and left some comments. I think it makes sense to share especially the services between eql and async search given that the requirements when it comes to their async behaviour are the same.

I wonder if we go that route, if we should also share more around the transport actions, submit, delete and get, and rest actions, because most of the code is really the same.

I also wonder if this is going to be fine in the long run, for instance as we make async search more resilient, start saving partial results etc. I am not sure what the answer is to that, and I left a comment around partial results because at the moment they are not used for EQL.

On the other hand, I would try to look at the async impl of EQL from a different perspective, especially when it comes to the structure of the tasks and the use of a progress listener. While async search is the extension of search under a different license, EQL could just be an async API that can be made sync through a specific parameter (this would fit well with the goal of having a single endpoint). I don't think EQL needs a sync task and an async task that extends the sync one. That is outside the scope of this PR though and it should not affect how we share the services.


import org.elasticsearch.common.io.stream.Writeable;

public interface AsyncResponse extends Writeable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would look into making this an abstract class. I was under the impression that more logic can be shared around toXContent and the other fields that make sense in both scenarios, besides the actual response that differs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, we can reuse this single line toXContent but since the expirationTimeMillis is serialized as the last element of AsyncSearchResponse it will create very awkward situation with everything else. It will also limit reuse of the services. Right now AsyncResponse is a very clear abstraction - it's a expirable writable. Nothing else, nothing more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, maybe we could leave the transport serialization to the implementers but still share more? Why would sharing more code limit reuse of services?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would sharing more code limit reuse of services?

With this being an interface, we can easily store any response. If we make it an abstract class and we will not be able to store any response that doesn't inherit from ActionResponse directly. For example, if we ever want to make any ReplicaitonResponse, BroadcastResponse or BaseNodeResponse derived response async we will have issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that we should model this so broadly and solve problems that we may never encounter. Ideally, we would avoid having an interface with only two implementers that do exactly the same thing, that is the main reason why I would go for an asbtract class and share what makes sense today, unless we already envision things to be different soon.

import static org.elasticsearch.xpack.core.security.authc.AuthenticationField.AUTHENTICATION_KEY;

/**
* A service that exposes the CRUD operations for the async-search index.
* A service that exposes the CRUD operations for the async-task index.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the name of the index has not changed right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of the index is now passed as the first parameter of the AsyncTaskIndexService constructor, so it can be anything. AsyncSearch passes the original name of the index as the first parameter when it instantiates this class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen that, maybe update the comment to clarify that async-task is not the name of the index?

@@ -476,4 +476,8 @@ public TotalHits totalHits() {
return this.totalHits;
}
}

public EqlSearchResponse merge(EqlSearchResponse partial) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are partial results a think in the context of EQL? I see that the progress listener is never notified for partial results, I was wondering if that is to keep the POC small or if partial results don't apply.

Copy link
Contributor Author

@imotov imotov Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have logic in EQL that can notify about partial results yet. So this is a stub at the moment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok do you know what partial results will consist of in the context of EQL which is made of multiple search requests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I assume that they will be just as final EqlSearchResponse only smaller. That's why I stubbed it with the same response object. We might change if we will realize we need additional data there.

return AsyncEqlSearchResponse::new;
}

public static class Request extends ActionRequest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could/should the request here be shared? they seem exactly the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we can share this one.

*
* @see AsyncEqlSearchResponse
*/
public class SubmitAsyncEqlSearchRequest extends ActionRequest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, could/should we share the request?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed yesterday this one goes away once I merge async and non sync method together. Even if we didn't merge them, this method wraps EqlSearchRequest while SubmitAsyncSearchRequest wraps SearchRequest and has several additional attributes that this one doesn't. So I don't think we could share it at least easily.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I meant that we could share the stub like wait for completion and keep alive, but I guess we would have also in this case problems due to multiple inheritance that is not possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, looking into this now. Need to resolve fallout from the last night merge first though.

listener.accept(finalResponse);
}
completionListeners.clear();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if all this logic around handling wait for completion timeout should be shared. Not sure how complicated that would make things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could, but at we would have to extract some common parts of the completion listeners and to be honest I am not quite sure if we will have a lot of common logic when dust settles. I would prefer with sharing parts that can be easily abstracted (my original commit) and then implement more specific parts (POC parts here) and after EQL matures a bit we can take another pass and see if we can identify other areas for reuse.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your point of view. On the other hand, all the logic around waitForCompletion timeout is quite hard and it would be good to have it in one consolidated place. And I don't like being it duplicated. What different logic would you expect EQL to have when the dust settles? I was under the impression that waitForCompletionTimeout and the corresponding functionality will be the same, and I see this logic almost as part of the services that we would like to share.

@imotov
Copy link
Contributor Author

imotov commented Apr 21, 2020

@javanna thanks for the review! I addressed some of your comments, but I will take a look at the reuse of parts of the transport layer tomorrow. I got a bit sidetracked by all the merge conflicts and couldn't get to it today.

Copy link
Contributor Author

@imotov imotov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this with @javanna and we decided that it's enough of an exercise for POC. I am going to close this PR and start opening piecemeal PRs for separate components.

@imotov imotov closed this Apr 22, 2020
imotov added a commit to imotov/elasticsearch that referenced this pull request Apr 22, 2020
EQL will require very similar functionality to async search. This PR refactors
AsyncSearchIndexService to make it reusable for EQL.

Supersedes elastic#55119
Relates to elastic#49638
imotov added a commit that referenced this pull request Apr 23, 2020
EQL will require very similar functionality to async search. This PR refactors
AsyncSearchIndexService to make it reusable for EQL.

Supersedes #55119
Relates to #49638
imotov added a commit to imotov/elasticsearch that referenced this pull request Apr 23, 2020
EQL will require very similar functionality to async search. This PR refactors
AsyncSearchIndexService to make it reusable for EQL.

Supersedes elastic#55119
Relates to elastic#49638
imotov added a commit that referenced this pull request Apr 23, 2020
EQL will require very similar functionality to async search. This PR refactors
AsyncSearchIndexService to make it reusable for EQL.

Supersedes #55119
Relates to #49638
@imotov imotov changed the title Make AsyncSearchIndexService reusable Async EQL POS (was: Make AsyncSearchIndexService reusable) Apr 30, 2020
@imotov imotov deleted the make-async-index-service-reusable branch May 1, 2020 22:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/EQL EQL querying >refactoring :Search/Search Search-related issues that do not fall into other categories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants