[Fleet] Run agent policy schema in batches during fleet setup + add xpack.fleet.setup.agentPolicySchemaUpgradeBatchSize config #150688

Merged

hop-dev
Contributor

@hop-dev hop-dev commented Feb 9, 2023

Summary

Closes #150538

As part of the Fleet plugin setup, we check whether any agent policies have an out-of-date schema_version and upgrade them. We encountered an error when this upgrade ran against a large number of agent policies, because we attempted the upgrade in one large batch.

This pull request performs the schema upgrade in batches of 100 by default and also adds the config value xpack.fleet.setup.agentPolicySchemaUpgradeBatchSize to make the batch size configurable.

I have also added more debug logging to show progress, and reduced the response payload of one of our requests which was very large.
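
The batching described above could be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: `chunk`, `upgradeInBatches`, the `upgradeBatch` callback and the `log` callback are all hypothetical names, and only the default batch size of 100 comes from the description.

```typescript
// Hypothetical sketch: names below are illustrative, not taken from the Fleet codebase.

/** Split `items` into consecutive slices of at most `size` elements. */
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError(`batch size must be a positive integer, got ${size}`);
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

/**
 * Upgrade policies one batch at a time and report progress after each batch.
 * Batches run sequentially, so at most `batchSize` policies are in flight.
 * Resolves to the number of batches processed.
 */
async function upgradeInBatches(
  policyIds: readonly string[],
  upgradeBatch: (ids: string[]) => Promise<void>,
  batchSize: number = 100, // default batch size stated in the PR description
  log: (message: string) => void = () => {}
): Promise<number> {
  const batches = chunk(policyIds, batchSize);
  let upgraded = 0;
  for (const batch of batches) {
    await upgradeBatch(batch);
    upgraded += batch.length;
    log(`Upgraded ${upgraded}/${policyIds.length} agent policies`);
  }
  return batches.length;
}
```

Running the batches sequentially keeps the load on the backend bounded by the batch size, at the cost of a longer total run time.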

Dev testing

To test this you need an environment with lots of agent policies (> 2k) where schema_version is not set. To create an environment with a large number of agent policies, I added a new param to the agent creation script and ran:

cd x-pack/plugins/fleet
node scripts/create_agents --count 20  --kibana http://127.0.0.1:5601/mark --status online --delete --batches 3000 --concurrentBatches 100

This generates 3000 agent policies, each with 20 agents.

I then modified the agent policies so that they require an upgrade. As system_indices_superuser, run:

POST /.kibana/_update_by_query
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "type": "ingest-agent-policies"
          }
        }
      ]
    }
  },
  "script": {
    "source": "ctx._source['ingest-agent-policies'].remove('schema_version')",
    "lang": "painless"
  }
}

Restarting Kibana will then run the setup and upgrade the policies in batches.

@hop-dev hop-dev force-pushed the 150538-agent-policy-upgrade-resilience branch from aecd936 to 9cb8d3b Compare February 9, 2023 13:38
@hop-dev hop-dev added backport:all-open Backport to all branches that could still receive a release release_note:enhancement Team:Fleet Team label for Observability Data Collection Fleet team labels Feb 9, 2023
@hop-dev hop-dev changed the title 150538 agent policy upgrade resilience [Fleet] Run agent policy schema in batches during fleet setup + add xpack.fleet.setup.agentPolicySchemaUpgradeBatchSize config Feb 9, 2023
@hop-dev hop-dev marked this pull request as ready for review February 9, 2023 15:18
@hop-dev hop-dev requested a review from a team as a code owner February 9, 2023 15:18
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@@ -315,15 +315,22 @@ class AgentPolicyService {
soClient: SavedObjectsClientContract,
options: ListWithKuery & {
withPackagePolicies?: boolean;
fields?: string[];
Contributor Author

Added the ability to restrict which agent policy fields are returned, to reduce payload size.
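
To illustrate why restricting fields shrinks the payload, here is a small sketch. `pickFields` and the sample policy are hypothetical and exist only for this example. In the real code the `fields` option is presumably passed through to the underlying query rather than applied client-side like this.

```typescript
// Hypothetical sketch: `pickFields` and the sample policy are illustrative only.

type Doc = Record<string, unknown>;

/** Return a copy of `doc` containing only the requested top-level fields. */
function pickFields(doc: Doc, fields: readonly string[]): Doc {
  const picked: Doc = {};
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(doc, field)) {
      picked[field] = doc[field];
    }
  }
  return picked;
}

// A made-up agent policy with a bulky nested array.
const policy: Doc = {
  id: "policy-1",
  name: "Example policy",
  schema_version: "1.0.0",
  package_policies: Array.from({ length: 50 }, (_, i) => ({ id: `pp-${i}`, inputs: [] })),
};

// Keep only what a schema upgrade check would plausibly need.
const slim = pickFields(policy, ["id", "schema_version"]);
const fullSize = JSON.stringify(policy).length;
const slimSize = JSON.stringify(slim).length;
```

Here `slimSize` is much smaller than `fullSize` because the nested `package_policies` array is never serialized.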

@@ -25,6 +25,8 @@ const printUsage = () =>
[--kibana]: full url of kibana instance to create agents and policy in e.g http://localhost:5601/mybase, defaults to http://localhost:5601
Contributor Author

These are just changes to the test script to make it easier to create environments with lots of agent policies.

@@ -26,13 +27,23 @@ function getOutdatedAgentPoliciesBatch(soClient: SavedObjectsClientContract) {
// deploy outdated policies to .fleet-policies index
// bump outdated SOs schema_version
export async function upgradeAgentPolicySchemaVersion(soClient: SavedObjectsClientContract) {
Contributor

Great improvements!
It would be great to add an integration test with a small batch size.
I'm curious how long it takes to update a large set of agent policies in batches of 100. Hopefully not too long.

Contributor Author

@hop-dev hop-dev Feb 9, 2023

It does take a long time (e.g. 50s per 1000 agent policies in my local dev env), but it's such an expensive operation that I didn't dare set the default higher, at the risk of overwhelming Elasticsearch and Kibana. My reasoning was that I suspect the vast majority of users have fewer than 100 agent policies anyway.
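
As a rough back-of-the-envelope check of what that figure implies, the sketch below estimates batch count and duration. `estimateUpgrade` is a hypothetical helper. It assumes the time scales linearly with the number of policies and that the 50s per 1000 figure (measured in one local dev environment) holds regardless of batch size, neither of which has been verified.

```typescript
// Back-of-the-envelope estimate; assumes upgrade time scales linearly with policy count.

interface UpgradeEstimate {
  batches: number; // number of batches needed
  seconds: number; // estimated total duration
}

function estimateUpgrade(
  policyCount: number,
  batchSize: number = 100, // default from the PR description
  secondsPer1000: number = 50 // figure quoted for one local dev environment
): UpgradeEstimate {
  return {
    batches: Math.ceil(policyCount / batchSize),
    seconds: (policyCount / 1000) * secondsPer1000,
  };
}

// The 3000-policy test environment from the description:
const estimate = estimateUpgrade(3000);
// estimate.batches === 30, estimate.seconds === 150
```

Under these assumptions, the 3000-policy test environment would need 30 batches and about two and a half minutes.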

Member

@kpollich kpollich left a comment

LGTM outside of Julia's very accurate nitpick 👍

@hop-dev hop-dev enabled auto-merge (squash) February 9, 2023 16:10
@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@hop-dev hop-dev merged commit 6e06452 into elastic:main Feb 9, 2023
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Feb 9, 2023
…xpack.fleet.setup.agentPolicySchemaUpgradeBatchSize` config (elastic#150688)

(cherry picked from commit 6e06452)
@kibanamachine
Contributor

💔 Some backports could not be created

Status Branch Result
7.17 Backport failed because of merge conflicts
8.7

Note: Successful backport PRs will be merged automatically after passing CI.

Manual backport

To create the backport manually run:

node scripts/backport --pr 150688

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Feb 9, 2023
… add `xpack.fleet.setup.agentPolicySchemaUpgradeBatchSize` config (#150688) (#150750)

# Backport

This will backport the following commits from `main` to `8.7`:
- [[Fleet] Run agent policy schema in batches during fleet setup + add
`xpack.fleet.setup.agentPolicySchemaUpgradeBatchSize` config
(#150688)](#150688)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Mark Hopkin <mark.hopkin@elastic.co>
@hop-dev
Contributor Author

hop-dev commented Feb 9, 2023

💚 All backports created successfully

Status Branch Result
8.6

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

hop-dev added a commit to hop-dev/kibana that referenced this pull request Feb 9, 2023
…xpack.fleet.setup.agentPolicySchemaUpgradeBatchSize` config (elastic#150688)

(cherry picked from commit 6e06452)

# Conflicts:
#	x-pack/plugins/fleet/scripts/create_agents/create_agents.ts
@hop-dev hop-dev deleted the 150538-agent-policy-upgrade-resilience branch February 9, 2023 21:18
hop-dev added a commit that referenced this pull request Feb 9, 2023
… add `xpack.fleet.setup.agentPolicySchemaUpgradeBatchSize` config (#150688) (#150781)

# Backport

This will backport the following commits from `main` to `8.6`:
- [[Fleet] Run agent policy schema in batches during fleet setup + add
`xpack.fleet.setup.agentPolicySchemaUpgradeBatchSize` config
(#150688)](#150688)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

Labels
backport:all-open Backport to all branches that could still receive a release release_note:enhancement Team:Fleet Team label for Observability Data Collection Fleet team v8.6.2 v8.7.0 v8.8.0
Development

Successfully merging this pull request may close these issues.

[Fleet] Make agent policy schema upgrade resilient to high numbers of agent policies
6 participants