Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fleet] Add a human-readable API and documentation for configuring package policies #132263

Closed
Tracked by #123150 ...
joshdover opened this issue May 16, 2022 · 18 comments · Fixed by #139420
Closed
Tracked by #123150 ...

[Fleet] Add a human-readable API and documentation for configuring package policies #132263

joshdover opened this issue May 16, 2022 · 18 comments · Fixed by #139420
Assignees
Labels
enhancement New value added to drive a business result Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@joshdover
Copy link
Contributor

While improvements in #119739 were made to make consuming the package_policies API simpler, by automatically providing default values for all fields possible, there are still several issues that make consuming this API hard for end-users. For example, see this API call to create an Apache package policy:

POST http://elastic:changeme@localhost:5601/julia/api/fleet/package_policies
kbn-xsrf: kibana

{
  "name":"apache-5",
  "inputs": [
    {"type":"logfile","enabled":true},
    {"type":"apache/metrics","enabled":false}
  ],
  "package": {
    "name":"apache",
    "version":"0.3.3"
  }
}
See UI for Apache integration

A few issues standout:

  • How do these input types map the two data streams shown in the UI (“Collect logs from Apache instances” and “Collect metrics from Apache instances”)? Why are they different types?
    • This is not obvious to end users and should be considered an implementation detail of Fleet/Agent.
    • Could we provide human-readable fields on each input that are guaranteed not to change across package versions and then omit the type field. This would allow for an inputs array like:
        "inputs": [
          {"id":"apache_logs","enabled":true},
          {"id":"apache_metrics","enabled":false}
        ],
      
  • The available options on each data stream are not visible anywhere.
    • Could we add in-app documentation of the available options on each input? This could live in a dedicated tab within the Integration’s detail view. Ideally, this is also available in our public integration docs.
  • What is the purpose of the name? If I’m managing this from an IaC tool, like Terraform, what’s the value of providing a name?
    • Could we generate a default name if none is provided or not require this field at all?
@joshdover joshdover added enhancement New value added to drive a business result Team:Fleet Team label for Observability Data Collection Fleet team labels May 16, 2022
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@joshdover joshdover changed the title [Fleet] Add a human-readable API for configuring package policies [Fleet] Add a human-readable API and documentation for configuring package policies May 17, 2022
@simitt
Copy link
Contributor

simitt commented May 19, 2022

Adding another concrete example that I believe would be worth changing before making the API GA:
How to delete an agent-policy:

POST /agent_policies/delete

IMO it should be changed to

DELETE /agent_policies/:agentPolicyId

@RichiCoder1
Copy link

Is there any guidance on doing this day, or is it recommended to use the UI for now until further changes are made? For context, we're trying to automate applying similar-but-different package policies across multiple agents in multiple environments, and having to manually provision them today is a lot.

(Ideally we eventually get elastic/terraform-provider-elasticstack#89, but even a half step would be great)

@juliaElastic
Copy link
Contributor

@joshdover
To provide human readable input ids like apache_logs, do you think the id field should be added to all package definitions?
If so, this task should be raised for Ecosystem team.

@akshay-saraswat
Copy link

Product brief

@joshdover
Copy link
Contributor Author

joshdover commented Jul 27, 2022

To provide human readable input ids like apache_logs, do you think the id field should be added to all package definitions?
If so, this task should be raised for Ecosystem team.

@juliaElastic Yes, we should open a package-spec discussion about this. I think this will be important for the long-term. Otherwise, it's very easy for a package author to introduce a breaking change in their package, simply by re-ordering their inputs.

@nchaulet
Copy link
Member

nchaulet commented Aug 9, 2022

Trying to summarize the different level we will have to document for an integration you could have

  • n policy_template that contains
    • n inputs (currently identified by type) that can contains
      • variables, with type and optionally required
      • n streams that can contains
        • variables, with type and optionally required

@joshdover
Copy link
Contributor Author

joshdover commented Aug 22, 2022

In terms of introducing a simplified form of the package policy API, it seems like there is quite a bit of room to simplify the existing API without adding input IDs with an eye towards what that could look like too:

Existing API call

{
  "name": "system-5",
  "package": {
    "name": "system",
    "version": "1.19.1"
  },
  "inputs": [
    {
      "policy_template": "system",
      "type": "logfile",
      "enabled": true,
      "streams": [
        {
          "enabled": true,
          "data_stream": {
            "type": "logs",
            "dataset": "system.auth"
          },
          "vars": {
            "preserve_original_event": {
              "value": true
            },
            "paths": {
              "value": ["/var/log/auth.log*"]
            },
            "tags": {
              "value": ["test"]
            }
          }
        }
      ]
    },
  ]
}

Simplified streams section

There are some improvements we could make the streams portion of each input by:

  • Assuming there is only a single stream for each dataset in an input
  • Looking up the data stream type from the package
  • Supporting a simplified schema for the vars fields
{
  "name": "system-5",
  "package": {
    "name": "system",
    "version": "1.19.1"
  },
  "inputs": [
    {
      "policy_template": "system",
      "type": "logfile",
      "enabled": true,
      "streams": { 
        "system.auth": { // `data_stream.dataset` is used for the key
          // `data_stream.type` looked up from package
          // `enabled: true` is assumed
          // keys are vars, values are values (no nested `values` key necessary)
          "preserve_original_event": true,
          "paths": ["/var/log/auth.log*"],
          "tags": ["test"]
        }
      }
    }
  ]
}

Simplified inputs section (requires packages to add input IDs elastic/package-spec#385)

If we supported a human readable ID for each input, we'd be able to make further assumptions to simplify the inputs part of the API:

  • Assume that if an input ID is present in inputs then it's enabled. If it's not present it's disabled.
  • Lookup the input type from the package (if there's two data streams with the same dataset but different types, throw an error and require the user specify which type)
  • Avoid requiring the policy_template if there's only one template in a package (this is the most typical case)
  • Support specifying the package as a string. We only support a single version of a package being installed anyways, if the user doesn't specify an object for the package key we can assume they just mean whatever version is already installed or latest.
{
  "name": "system-5",
  "package": "system",
  "inputs": {
    "system_auth": { // input id is used as key (NEW: this doesn't exist yet in packages)
      // `enabled: true` is assumed
      // `policy_template` is not required if there is only 1 template
      "streams": { 
        "system.auth": { // `data_stream.dataset` is used for the key
          // `data_stream.type` looked up from package
          // `enabled: true` is assumed
          // keys are vars, values are values
          "preserve_original_event": true,
          "paths": ["/var/log/auth.log*"],
          "tags": ["test"]
        }
      }
    }
  }
}

@nchaulet curious if there's any complexity I'm missing here (I'm sure there is)

@nchaulet
Copy link
Member

@joshdover I like the idea of using object here instead of array make things a lot more readable, we may have some additional key at the same level as variables but it could probably work (we may have to restrict some variables name in the package spec)

Alsofor the input id I am wondering if we can generate it on Fleet size if it's not present in the package ${policy_template}_${input.type} should be unique.

I am going to dig a little more into that and see if there is some limitations here, and how we can bring that change.

@joshdover
Copy link
Contributor Author

Alsofor the input id I am wondering if we can generate it on Fleet size if it's not present in the package ${policy_template}_${input.type} should be unique.

One downside to this is that packages could introduce a breaking change to the package policy API if they change the input type. But I think it's a good stopgap solution if it's indeed true that there aren't duplicate inputs of the same type.

Another option could be to only support packages that contain input IDs on the new API during the migration period to add IDs to package inputs.

@nchaulet
Copy link
Member

@joshdover I have a working API with key instead of array and it's a lot easier to use, I am wondering if all the inputs and streams should be disabled by default or if we should use package default, (I like the idea of using default) this mean the user can create a working integration with just something like this and then he can start to customize inputs by adding disabled by default one or disabling the default one it does not need

{
        name: 'nginx-1',
        package: { "name": "nginx", "version": "latest" }
        namespace: 'default',
        policy_id: 'policy123',
        description: 'Test description'
}

@nchaulet
Copy link
Member

nchaulet commented Aug 23, 2022

@mtojek @jsoriano what do you think of Kibana temporarly introducing an input id (like policy_template + input type) while it's implemented in the package (#132263 (comment)) do you think of package where this will not be working?

@mtojek
Copy link
Contributor

mtojek commented Aug 23, 2022

Hi @nchaulet,

temporarly introducing

What do you mean by temporarily? I may not see the full picture.

do you think of package where this will not be working?

To be honest, with current package scale it's hard to predict which packages may potentially break. I'm wondering if it doesn't affect elastic-package (system tests), but if it's just about populating when the property is missing, it should be safe, right?

If you need 100% certainty, we could perform such an exercise (install every package).

@nchaulet
Copy link
Member

nchaulet commented Aug 23, 2022

Hi @mtojek going to give more context we want to introduce a new API to create package policy much simpler for user something like this #132263 (comment)

For this we need an id for each input, while this is adopted and implemented for each package elastic/package-spec#385 what I propose is we use ${policy_template}-${input.type} as an input id provided by the user when calling the Fleet API

a request to the API will be something like

// POST /package-policies
{
  "name": "nginx-test"
  "inputs: {
      "nginx-logfile":  { // policy template - input type
          "streams": {
             "nginx.error": { // stream dataset
                 tags: ['test', 'nginx-error'], // variables as key:value
             }
         }
      }
      //... input variables optionnal
  }
  //... package variables optionnal
}

@mtojek
Copy link
Contributor

mtojek commented Aug 23, 2022

I checked elastic-package internals and it looks like elastic-package is not aware of the input ID at all. I would assume that this property is transparent for the tool and packages. On the other hand, stream ID is being populated.

@jsoriano
Copy link
Member

After elastic/package-spec#385 is implemented, Fleet will still need to be able to handle current integrations that don't have this id.
If the API needs an id, I guess that it is ok if it generates one somehow as a best-effort.

@jen-huang
Copy link
Contributor

@nchaulet Could this be closed?

@nchaulet
Copy link
Member

Yes this should be closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New value added to drive a business result Team:Fleet Team label for Observability Data Collection Fleet team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants