Adds privileges to kibana_system to support APM service maps #50051

Closed


ogupte
Contributor

@ogupte ogupte commented Dec 10, 2019

Addresses elastic/kibana#48996 by giving the kibana_system reserved role privileges to create and write to index apm-service-connections and read from apm-* indices.

@ogupte ogupte added blocker :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC v7.6.0 labels Dec 10, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-security (:Security/Authorization)

@kobelb
Contributor

kobelb commented Dec 10, 2019

Hey @ogupte, the kibana_system role currently only has privileges to the "internal dot-indices" and we generally require that end-users themselves have access to the "data indices".

If I understand correctly, apm-* has data ingested into it by the APM Server which uses a configurable index name. Therefore, end-users must have privileges to read from the apm-* data-index.

I'm not familiar with the details of the apm-service-connections index. Is this meant to be an "internal dot-index" which end-users shouldn't access?

@ogupte
Contributor Author

ogupte commented Dec 10, 2019

@kobelb To make this feature work, we use the task manager to schedule a task runner (it runs every minute) that queries APM transactions and spans (covered by apm-*). It then processes these documents to create a set of 'service connection' documents, which are indexed in apm-service-connections (a non-configurable index). Because these queries are done in the task runner, they run as the kibana user. Then, when we want to render a service map in the UI, the logged-in user queries apm-service-connections with the selected filters, and the data is rendered in the client.
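
The processing step described in this comment can be sketched roughly as follows. This is a minimal illustration and not the actual Kibana implementation: the span shape (`service`, `destination`), the output shape (`source`, `target`), and the function name are hypothetical, and plain dictionaries stand in for Elasticsearch documents.

```python
from typing import Dict, Iterable, List, Set, Tuple


def derive_service_connections(spans: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Collapse raw span documents into unique service-to-service edges.

    Hypothetical span shape: {"service": <caller>, "destination": <callee>}.
    """
    seen: Set[Tuple[str, str]] = set()
    connections: List[Dict[str, str]] = []
    for span in spans:
        edge = (span["service"], span["destination"])
        if edge in seen:
            continue  # each connection is kept only once
        seen.add(edge)
        connections.append({"source": edge[0], "target": edge[1]})
    return connections


# Made-up documents standing in for what a periodic task might read from apm-*.
spans = [
    {"service": "frontend", "destination": "checkout"},
    {"service": "checkout", "destination": "payments"},
    {"service": "frontend", "destination": "checkout"},
]
connections = derive_service_connections(spans)
```

In the design under discussion, the resulting documents would be what the task writes to apm-service-connections, which is why the identity used for that write matters.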

@kobelb
Contributor

kobelb commented Dec 11, 2019

Alerting uses the task manager to run jobs that internally use Elasticsearch API keys, so that the background jobs run as the end-user who scheduled them. This gives us quite a bit of flexibility regarding the data-indices that the background job can read from and index into, without requiring that kibana_system has these privileges. Additionally, this way we aren't constrained to only reading data from the apm-* indices and always outputting the results into apm-service-connections, since the APM Server lets you configure the output indices to something other than apm-*.

@tvernum tvernum added the v8.0.0 label Dec 11, 2019
@tvernum tvernum self-requested a review December 11, 2019 00:51
@ogupte
Contributor Author

ogupte commented Dec 11, 2019

In our implementation kibana schedules the job when it starts, there is no end-user who kicks it off. As for the apm-*, we would expect the user to create a custom role to accommodate non-default indices that match apm-server's configuration.

@kobelb
Contributor

kobelb commented Dec 11, 2019

> In our implementation kibana schedules the job when it starts, there is no end-user who kicks it off. As for the apm-*, we would expect the user to create a custom role to accommodate non-default indices that match apm-server's configuration.

Granting the kibana_system role access to data-indices breaks from precedent, so I'm hesitant to do so. However, I'd like to further understand the design of service maps. Are there design documents or a PR that I should be referring to? Automatically scheduling a task which reads data from apm-*, transforms it, and then indexes it into apm-service-connections has various security implications which I haven't fully thought through. During our discussions at EAH, there was mention of using DLS and FLS on top of the apm-* data-indices to allow users to view segmented portions of the APM data, and this potentially violates that approach.

@ogupte
Contributor Author

ogupte commented Dec 11, 2019

@kobelb take a look at the service maps PR for Kibana: elastic/kibana#50844

@dgieselaar
Member

@kobelb I'm also working on a background task that collects telemetry about the data profile of our customers, e.g. how many transactions or services are stored/retained. This will likely run into the same issue; there's no user that would kick this off either, and to expect the user to click a button so Elastic can collect telemetry doesn't sound useful. How would you recommend solving this?

@albertzaharovits
Contributor

I see that there is an apm_user reserved role with read access to apm-* indices.
@ogupte Can the apm-service-connections be renamed under the apm-* namespace?

I am also leaning towards only granting the apm_user role access to the APM data. I also think that, in general, there should not be background tasks that fire up when the node server starts; instead, they should be hooked into user actions. If, however, there is a component that works in the background, on a par with Kibana, then it should require the usual username/password configuration in a configuration file; this is also a form of requesting user consent for some feature to access their data.

@kobelb
Contributor

kobelb commented Dec 11, 2019

Apologies for the wall of text. I've included quite a bit of "history" and ancillary content here to make sure we're all on the same page.

There are two general categories of indices:

  • System-indices, which start with a .
  • Data-indices, which don't start with a .

System-indices are used by the various systems within the Elastic Stack to store data. These indices should never be accessed directly by end-users, and all indirect access to these indices should go through an API provided by an application within the Elastic Stack.

Data-indices store data which has been ingested into the Elastic Stack. These indices are accessed directly by end-users and can be accessed by the standard Elasticsearch document and search APIs.
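
As a minimal sketch of the naming convention described above (the function names are hypothetical; only the leading-dot rule comes from this comment):

```python
def is_system_index(index_name: str) -> bool:
    """Return True for names that follow the dot-prefix convention."""
    return index_name.startswith(".")


def index_category(index_name: str) -> str:
    """Classify an index name using only the leading-dot rule."""
    return "system-index" if is_system_index(index_name) else "data-index"


# Index names taken from this thread.
for name in (".kibana", ".security", "apm-service-connections"):
    print(name, "->", index_category(name))
```

Under this rule, apm-service-connections as named in the PR would count as a data-index, not a system-index.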

Privileges

End-users should not have Elasticsearch privileges to access system-indices. Instead, end-users should have either cluster privileges which allow them to use dedicated Elasticsearch REST APIs, for example the Security APIs for indirectly manipulating the .security indices; or the end-users should have application privileges, and use dedicated REST APIs within other applications in the Elastic Stack, for example Kibana's Saved objects APIs for indirectly manipulating the .kibana* indices. Applications themselves have privileges for system-indices, for example the kibana user has the kibana_system role which has privileges to the .kibana* indices.

End-users should have Elasticsearch privileges to access data-indices. This allows users to utilize DLS to restrict which documents a user can access, and FLS to restrict the fields a user can access.
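
For illustration, a role that combines DLS and FLS on the apm-* data-indices might look roughly like the sketch below. The layout follows the Elasticsearch role-definition format (`indices`, `names`, `privileges`, `query`, `field_security`); the service name and the granted field list are illustrative assumptions, not something proposed in this thread.

```python
import json

# Hypothetical role limiting a user to one service's APM documents and fields.
restricted_apm_reader = {
    "indices": [
        {
            "names": ["apm-*"],
            "privileges": ["read"],
            # DLS: only documents matching this query are visible.
            "query": json.dumps({"term": {"service.name": "checkout"}}),
            # FLS: only the granted fields are returned.
            "field_security": {
                "grant": ["@timestamp", "service.name", "transaction.*"]
            },
        }
    ]
}

# Serialized body, as it could be sent when creating or updating the role.
role_body = json.dumps(restricted_apm_reader, indent=2)
```

A background job that reads apm-* as kibana_system would not be subject to a role like this, which is the concern raised about DLS and FLS earlier in the conversation.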

Data ingestion and privileges

Historically, all data was ingested by applications like Logstash before being inserted into data-indices and end users of Kibana could just read the documents from the data-indices. This requires Logstash itself to have permission to write to the data indices, and it requires user intervention to configure the user and role that Logstash uses. End-users are then assigned roles which allow them to read from the data-indices that Logstash ingests data into.

A similar situation applies to APM Server, which ingests documents into data-indices. The user must create a user and role that the APM Server will use for writing documents into Elasticsearch. This requires user intervention, and the apm_system role doesn't automatically have privileges to read from or write to the apm-* indices.

Using Kibana to transform data

Kibana's list of responsibilities continues to grow, and Kibana has started to do its own data transformation. Examples of this are Alerting, SIEM's detection engine, Endpoint's alerts, and now APM's service maps. In these situations, the transformation, at a very high level, reads existing documents from data-indices, does some transformation, and then stores the results in Elasticsearch. During this transformation, we need to ensure that we don't break the existing privileges model.

Alerting does this by using Elasticsearch's API keys to run the transformation using the identity of the user who scheduled the alert. This ensures that the end-user themselves is able to read from the source data-indices, that their DLS/FLS is respected, and that they are authorized to write the results to Elasticsearch. This differs from the previous ingest workflow in that it doesn't require a specific user to be configured in a .yml file somewhere; instead, it obtains the user's consent when the alert itself is created. This approach also ensures that a user is unable to "escalate their privileges" to get access to data they previously wouldn't have access to.
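
The "run as the scheduling user" idea can be modelled with a toy authorization check. This sketch does not use real Elasticsearch API keys or any Alerting code: the grant table, user names, and function names are all made up, and it only shows why a job that carries the scheduler's identity cannot read more than that user could.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Toy privilege table (made-up users): principal -> index patterns it may read.
READ_GRANTS: Dict[str, List[str]] = {
    "alice": ["apm-*"],
    "bob": ["logs-*"],
}


def can_read(principal: str, index: str) -> bool:
    """Check the toy table for a read grant whose pattern matches the index."""
    return any(fnmatchcase(index, pattern) for pattern in READ_GRANTS.get(principal, []))


def run_scheduled_job(scheduled_by: str, source_index: str) -> str:
    """Run the job with the identity of the user who scheduled it."""
    if not can_read(scheduled_by, source_index):
        raise PermissionError(f"{scheduled_by} may not read {source_index}")
    return f"job ran as {scheduled_by} on {source_index}"
```

Here `run_scheduled_job("alice", "apm-7.5.0-span")` succeeds, while the same call for `"bob"` raises `PermissionError`, so scheduling a job grants no extra access.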

If a hard-and-fast requirement of APM's implementation of service maps is that the job is automatically run by Kibana on start-up, then Alerting's approach will not work. However, we still need explicit "user consent", and we can potentially use the approach that Albert brought up, where a username/password is provided in the kibana.yml. The drawback of this approach is that only a small fraction of users will be able to modify the kibana.yml file, which will likely hurt the usage of service maps.

@kobelb
Contributor

kobelb commented Dec 11, 2019

> @kobelb I'm also working on a background task that collects telemetry about the data profile of our customers, e.g. how many transactions or services are stored/retained. This will likely run into the same issue; there's no user that would kick this off either, and to expect the user to click a button so Elastic can collect telemetry doesn't sound useful. How would you recommend solving this?

Do you have any additional documentation regarding the "source data" that you'd like to derive this telemetry data from, and how it will be consumed?

@dgieselaar
Member

@kobelb I'm running some queries on the data indices (apm-*), and then storing the result in a saved object that is fetched when the telemetry plugin needs it. Background is here: elastic/kibana#50757

kibana_system to support APM service maps in kibana
@ogupte ogupte force-pushed the apm-48996-service-maps-index-privileges branch from 80f737d to 005d0e0 Compare January 3, 2020 19:53
@kobelb
Contributor

kobelb commented Jan 6, 2020

Hey @ogupte, this PR is still adding privileges to the kibana_system role to allow it to index documents into the apm-service-connections index. The concerns expressed in the "Using Kibana to transform data" section of #50051 (comment) still apply in this situation. Did any of the design around how this process will work change since our last discussion?

@ogupte
Contributor Author

ogupte commented Jan 6, 2020

Since our discussion, the design changed to address the security concern of potentially leaking user data. We did this by making the apm-service-connections index non-configurable and only giving kibana_system the create_index, view_index_metadata, and index privileges on it, so it is not even able to read from that index. All the reads are done directly by the user, who already has apm-* privileges by default.
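
A rough sketch of the write-only grant described in this comment follows. The three privilege names and the index name come from the comment; the dictionary layout and the `is_allowed` helper are assumptions for illustration. The check is a plain membership test, whereas real Elasticsearch privilege resolution also handles privileges that imply others.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Grant for the internal Kibana user, as described in the comment above.
SERVICE_CONNECTIONS_GRANT: Dict[str, List[str]] = {
    "names": ["apm-service-connections"],
    "privileges": ["create_index", "view_index_metadata", "index"],
}


def is_allowed(grant: Dict[str, List[str]], index: str, privilege: str) -> bool:
    """Simplified check: the privilege is listed and an index pattern matches."""
    return privilege in grant["privileges"] and any(
        fnmatchcase(index, pattern) for pattern in grant["names"]
    )


can_index = is_allowed(SERVICE_CONNECTIONS_GRANT, "apm-service-connections", "index")
can_read = is_allowed(SERVICE_CONNECTIONS_GRANT, "apm-service-connections", "read")
```

With this grant, indexing into apm-service-connections is permitted but reading from it is not, and nothing is granted on apm-*.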

@kobelb
Contributor

kobelb commented Jan 6, 2020

If I understand correctly, the end-user themselves has privileges to read from the apm-* indices. So, we have some process which uses the end-user to read the data from the apm-* indices and then uses the internal server user to insert documents into the apm-service-connections index?

@ogupte
Contributor Author

ogupte commented Jan 6, 2020

> If I understand correctly, the end-user themselves has privileges to read from the apm-* indices. So, we have some process which uses the end-user to read the data from the apm-* indices and then uses the internal server user to insert documents into the apm-service-connections index?

Yes, this is accurate.

The implementation to make it work like this adds quite a bit of complexity to an already complex task. In retrospect, we're considering abandoning the background task in favor of doing all queries at runtime without any persistence.

The platform just isn't able to support the data transformations we need in a way that doesn't also interrupt the user flow.

Closing for now.

@ogupte ogupte closed this Jan 6, 2020
7 participants