Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

User Access and security support #14

Closed
ps48 opened this issue Apr 20, 2022 · 10 comments · Fixed by #41
Closed

User Access and security support #14

ps48 opened this issue Apr 20, 2022 · 10 comments · Fixed by #41
Assignees
Labels
enhancement New feature or request

Comments

@ps48
Copy link
Collaborator

ps48 commented Apr 20, 2022

Issue:
The Querqy plugin doesn't honor the user access controls provided by OpenSearch.

Access Control: https://opensearch.org/docs/latest/security-plugin/access-control/index/
More about FGAC: https://opensearch.org/blog/technical-posts/2020/10/fine-grained-access-control-for-search/

@ps48 ps48 self-assigned this Jun 23, 2022
@ps48
Copy link
Collaborator Author

ps48 commented Jun 23, 2022

Hi @JohannesDaniel and @renekrie, OpenSearch security plugin has issues in restricting the Querqy plugin search requests. This is mainly due to caching of RewriterFactoryAndLogging. I would like to understand why was this introduced in the codebase, Was there any particular issue with performance or accuracy?

@renekrie
Copy link
Contributor

The factories hold the processed rewriter configurations. Depending on the rewriter, these could be objects that were parsed from thousands of rules - something we cannot do per request.

What exactly is the security issue? If it is about using the specific cache implementation - I had considered replacing it by a simple map anyway. If it is about keeping state across requests - that's something we will need.

@ps48
Copy link
Collaborator Author

ps48 commented Jun 25, 2022

That makes sense, the more rules we have -> more is the processing time and hence we need cache.

The OpenSearch security plugin is responsible for role based access control. It checks authentication over all the REST APIs provided by OpenSearch.

In our scenario for the Querqy Plugin, the cache holds processed rewriter configuration across all users. Whenever a user makes a search request using Querqy, the user needs to have access to read the queried index(index containing data) and the Querqy index (index containing rules). The security plugin is able to do access check for the queried index out of the box, but if the Querqy index is already cached then it's not able to check authorization of user over the Querqy index.

I was thinking of 2 work-arounds for this:

  1. User based cache -> We have a cache for each user. This would make sure with each request a user based cache is built. If the user is making a request for the first time, the security plugin will allow/block the user, when the GET request is make to the Querqy index for the first time. Issue with this approach is, If user access is changed by the admin user and cache is not cleared, the same user will still have the access to uncleared cache.

  2. Mandatory access check -> Here, we use the same cache implementation, but we add a mandatory GET request (maybe without fetching the source of document) with every search request made with Querqy. This way the security plugin is able to check in real time if the user has access to the Querqy index. This costs a document hit for every request.

@renekrie
Copy link
Contributor

renekrie commented Jun 26, 2022

Hi @ps48,

What I don't get is why we have to check the access to the Querqy index. Can't we just say that if a user can use Querqy queries, s/he would implicitly have to have access to that index so that we assume it's granted any way?!

I think the first of your two options would be too slow and put a lot of load on the nodes (we'd have to load and parse rules for each user).

I'd also like to avoid the additional GET request in the second work-around. Is there a way to check that index access programatically, without sending anything over HTTP?

@renekrie
Copy link
Contributor

The more I think about it, the more I wonder whether this access control of the Querqy index is needed - it is more or less used like a Querqy internal data store. We are not exposing any data that a user would search . In a classical DB setup, the access rights needed would be that of a system user, not the rights of an end user.

@ps48 ps48 changed the title FGAC support User Access and security support Jun 27, 2022
@ps48
Copy link
Collaborator Author

ps48 commented Jun 28, 2022

I can check again, to see if there's a way to check a user's index access get the HTTP call. Coming to the second comment, we can do a system user way to implement this. Adding/Editing rules needs index write access to Querqy index. Meanwhile, All users can use search queries ( indirect read access Querqy Index) without any special access needed. I'll check deeper into both of these.

Also, thank you so much for discussing things here, this is very useful.

@renekrie
Copy link
Contributor

Hey @ps48,
Just to make sure: When we configure rewriters, we use an endpoint that is provided by the Querqy plugin, à la
PUT /_querqy/rewriter/common_rules (also see https://docs.querqy.org/querqy/rewriters.html). It would totally make sense to constrain that access (i.e. only allow an admin user to perform the configuration request). Under the hood, this endpoint validates the configuration and writes to the Querqy index (and invalidates the rewriter in the cache, as far as I remember).
This type of access control should be different from the one that involves as system user for reading from that index when a user runs a query using the querqy request builder/query type.

@renekrie
Copy link
Contributor

@ps48 Is there still an open TODO out of this issue?

@ps48
Copy link
Collaborator Author

ps48 commented Sep 14, 2022

yes I have an action item here, to validate this approach with OpenSearch Team. Once, done I'll close the issue.

@ps48
Copy link
Collaborator Author

ps48 commented Sep 18, 2022

Checked with the team, they suggested to keep the default behavior of the plugin for now. We can take improvements here from user feedback. For now, I'll update the developer document.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants