Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for reindexing APM indices #29845

Merged
merged 12 commits into from
Feb 6, 2019

Conversation

tylersmalley
Copy link
Contributor

@tylersmalley tylersmalley commented Feb 1, 2019

Starting in 7.0, APM will be aligning with the property names defined in the Elastic Common Schema. When APM users upgrade to 7.x, they will need to migrate those indices created prior to 7.0 for the data to be present in the APM app.

Blocked by the finalizing of mappings and the reindex script here: elastic/apm-integration-testing#277

Discussion:

When we re-index, we are appending -reindex-v7 to the current index name. With doing this, we will almost always run into conflicts with the APM template used to create the index in the first place.

By default, APM creates an index template named apm-%{[beat.version]} with an index pattern of apm-%{[beat.version]}-*. The default index names are as follows:

  • apm-%{[beat.version]}-sourcemap
  • apm-%{[beat.version]}-error-%{+yyyy.MM.dd}
  • apm-%{[beat.version]}-transaction-%{+yyyy.MM.dd}
  • apm-%{[beat.version]}-span-%{+yyyy.MM.dd}
  • apm-%{[beat.version]}-metric-%{+yyyy.MM.dd}
  • apm-%{[beat.version]}-onboarding-%{+yyyy.MM.dd}

Unfortunately, there is no way to ignore index templates on index creation or update an index pattern to not match indices ending in -reindexed-v7

There have been a few options which have been discussed which require input from the APM team. @elastic/apm-server @bleskes

  • Produce a warning with a list of possibly effecting templates prior to indexing, letting the user know they might create conflicts and they should remove them if they are no longer used or APM data for that version is no longer being ingested.
  • If the index creation fails, we can give the user additional information on what to do. This could include a link to documentation on identifying and removing the effecting templates. It's also possible that we could produce a list of index templates by iterating over /_template?filter_path=*.index_patterns to see if any patterns match.
  • Similar to how we stop/start ML/Watcher, we could remove the index templates and replace them once we are done with the re-index. This seems way too risky considering it might be a template they are using elsewhere which sets something like replicas. Additionally, they could still be ingesting data from the old APM Beat and removing the template will cause issues.
  • The APM server can prompt for removal during the 7.x setup script.

Please also review the language used in the following screenshots and let us know if it should be changed.

Documentation links to: https://www.elastic.co/guide/en/apm/server/master/breaking-changes.html
More about ECS links to: https://github.com/elastic/ecs

apm-list

When clicking on "Reindex" for an APM index, currently presented is a single warning which the user must check acknowledge the data will be transformed.
apm-flyout

Checklist

Use strikethroughs to remove checklist items you don't feel are applicable to this PR.

For maintainers

Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
@tylersmalley tylersmalley added blocker Team:Operations Team label for Operations Team v7.0.0 labels Feb 1, 2019
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-operations

@@ -30,12 +32,24 @@ export default function apmOss(kibana) {
indexPattern: Joi.string().default('apm-*'),

// ES Indices
sourcemapIndices: Joi.string().default('apm-*'),
Copy link
Contributor Author

@tylersmalley tylersmalley Feb 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was missing.

@elasticmachine

This comment has been minimized.

Tyler Smalley added 2 commits February 1, 2019 22:32
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
@elasticmachine

This comment has been minimized.

Tyler Smalley added 2 commits February 2, 2019 09:11
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
@elasticmachine

This comment has been minimized.

Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
@tylersmalley tylersmalley changed the title [WIP] Support for reindexing APM indices Support for reindexing APM indices Feb 3, 2019
@elasticmachine

This comment has been minimized.

Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@simitt
Copy link
Contributor

simitt commented Feb 4, 2019

Regarding template conflicts during reindexing: since we know there will most probably be conflicts while reindexing when using default settings, I suggest to either show the user upfront that they will need to clean up the templates or prompt for removal when the migration script is started. If feasible I'd prefer the options where we detect potential problematic templates and directly show them to the user, over pointing to an external documentation asking the user to query for the templates themselves.

Regarding linking to the documentation, there is also a general APM release notes section. This section should give a high level overview of breaking changes for APM in general, and then link to the more detailed server section. It should also contain explanation around switching to ECS. I'd rather link to these more general APM docs from Kibana.

APM index needs converted to ECS format

How about we put the detail information around field changes, explaining why we move to ECS and linking to it, into the APM docs and only put some more general information into the migration assistant, e.g. APM indices need conversion to 7.x format.

@bmorelli25 I'd appreciate your input regarding documentation linking and wording.

@bmorelli25
Copy link
Member

Regarding linking to the documentation, there is also a general APM release notes section. This section should give a high level overview of breaking changes for APM in general, and then link to the more detailed server section. It should also contain explanation around switching to ECS. I'd rather link to these more general APM docs from Kibana.

Yup, let's link to this release notes section from Kibana. From there, I'll handle the linking to other sections of the APM documentation with more details.

How about we put the detail information around field changes, explaining why we move to ECS and linking to it, into the APM docs and only put some more general information into the migration assistant, e.g. APM indices need conversion to 7.x format.

I mostly agree with this. We should have a sentence (for the people who are curious, but not that curious) and the bulk of information should be in the APM documentation. I also thinks it makes more sense to remove the "More about ECS" link from this page. That's something I can link to from the APM documentation. I think directing users to the APM documentation initially is more beneficial.

In other words. Instead of this:

This index will be converted to ECS format
Starting in version 7.0.0, APM data will be represented in the Elastic Common Schema. Historical APM data will not visible until it's reindexed.
Documentation
More about ECS

I'm thinking something more like this:

This index will be converted to the 7.x format
Starting in version 7.0.0, APM data will be represented in the Elastic Common Schema. Historical APM data will not visible until it's reindexed. More information on this change is available in the APM release notes.

@graphaelli
Copy link
Member

We're working through an issue with conflicting field mapping, summarized with this example:

PUT _template/ex-6.6.0
{
  "index_patterns": [
    "ex-6.6.0-*"
  ],
  "mappings": {
    "properties": {
      "docker": {
        "properties": {
          "container": {
            "properties": {
              "image": {
                "type": "keyword",
                "ignore_above": 1024
              }
            }
          }
        }
      },
      "container": {
        "properties": {
          "image": {
            "properties": {
              "name": {
                "type": "alias",
                "path": "docker.container.image"
              }
            }
          }
        }
      }
    }
  }
}
PUT ex-6.6.0-test
{
  "mappings": {
    "properties": {
      "container": {
        "properties": {
          "image": {
            "properties": {
              "name": {
                "ignore_above": 1024,
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}

Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>
@elasticmachine

This comment has been minimized.

@tylersmalley
Copy link
Contributor Author

Updated messaging and changed to be a warning instead of critical since critical denotes something which will prevent the cluster from starting.

apm-list

apm-flyout

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@tylersmalley
Copy link
Contributor Author

@graphaelli should we move forward with merging this PR and treat the ES error as a bug?

Copy link
Contributor

@joshdover joshdover left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, tested with APM data and was able to see the data in the APM app after reindexing. I'm out of the loop on the template conflict issue, but assuming there's a fix for that, the rest of this looks good.

@jalvz
Copy link
Contributor

jalvz commented Feb 5, 2019

I don't understand the conflict very well. The conflict is between 6.6 index mappings and 7.0 mappings, or between 7.0 mappings and 7.0-reindexed mappings? (If the former: why having a different version causes conflicts? If the later: doesn't make sense to me, what I am missing?)

And silly question: does it have to be necessarily a reindexed suffix?
Can indices be named like apm-reindexed-7.0-blablabla and change index pattern to apm-*-{beat.version}-* ?

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@tylersmalley
Copy link
Contributor Author

We decided to get around the index template conflicts by pre-pending to the index name, which did here: #30166

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants