Allow migrations to split documents into new document type #26602

Closed
mattapperson opened this issue Dec 4, 2018 · 9 comments
Labels: Feature:beats-cm, Feature:Saved Objects, Team:Beats, Team:Core (core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc.)

Comments

@mattapperson
Contributor

At the moment BeatsCM has a document type of tag that is shaped like this:

        "tag": {
          "properties": {
            "id": {
              "type": "keyword"
            },
            "color": {
              "type": "keyword"
            },
            "last_updated": {
              "type": "date"
            },
            "configuration_blocks": {
              "type": "nested",
              "properties": {
                "type": {
                  "type": "keyword"
                },
                "description": {
                  "type": "text"
                },
                "configs": {
                  "type": "nested",
                  "dynamic": true
                }
              }
            }
          }
        }

where configs is an array of objects. We would like to take that array of objects and, for each one, create its own document of a new config_block type, leaving the tag the data came from with an array of config_block IDs.
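A minimal sketch of the requested end result may help here. It assumes that each entry of configuration_blocks becomes one config_block document (the wording above could also be read as splitting each configs entry), and the names splitTag, config_block_ids, and the newId callback are hypothetical, not part of BeatsCM or Kibana.

```typescript
// Hypothetical shapes loosely following the mapping above; not the real BeatsCM types.
interface ConfigBlock {
  type: string;
  description: string;
  configs: Array<Record<string, unknown>>;
}

interface OldTag {
  id: string;
  color: string;
  last_updated: string;
  configuration_blocks: ConfigBlock[];
}

interface NewTag {
  id: string;
  color: string;
  last_updated: string;
  config_block_ids: string[]; // assumed field name for the array of IDs
}

interface ConfigBlockDoc extends ConfigBlock {
  id: string;
}

// Splits one tag into a slimmer tag plus one config_block document per block.
// `newId` is supplied by the caller; how IDs should be produced is left open here.
function splitTag(
  tag: OldTag,
  newId: () => string
): { tag: NewTag; blocks: ConfigBlockDoc[] } {
  const blocks = tag.configuration_blocks.map((block) => ({ ...block, id: newId() }));
  return {
    tag: {
      id: tag.id,
      color: tag.color,
      last_updated: tag.last_updated,
      config_block_ids: blocks.map((block) => block.id),
    },
    blocks,
  };
}
```

The tag keeps its own fields and only swaps the nested blocks for their IDs, so a reader of the tag can still find every block it used to contain.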

@elasticmachine
Contributor

Pinging @elastic/kibana-operations

@mattapperson
Contributor Author

Note: this issue is considered a potential blocker for BeatsCM going GA. It is also a blocker for @elastic/uptime.

@joshdover
Contributor

@mattapperson Could you clarify a bit with an example of what you'd like the end result to look like? That would help us understand what you need to accomplish. Also, are these saved objects in the .kibana index, or is there a separate index for BeatsCM data?

@tylersmalley added the Feature:Saved Objects and Team:Core labels and removed the Team:Operations label on Mar 26, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-platform (Team:Platform)

@rudolf
Contributor

rudolf commented Mar 29, 2020

The simplest implementation would be to change the migration transform function so that it can return an array of objects instead of a single object. Because an ID conflict is possible, the migration transform function won't be able to assign IDs to the new documents. This means the new sub-type documents can hold a reference to the parent, but not the other way around as in your example.

We will need to consider how we apply migrations to the newly split-out documents. If a later version of Kibana adds a migration for the sub-type, we need to apply this migration to newly split-out documents when Kibana starts up, but also when importing saved objects.
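As a rough illustration of that idea, the sketch below uses simplified stand-in types rather than Kibana's real saved object types. The transform signature, the type names, and splitTagTransform are all hypothetical. The split-out documents carry no id (Elasticsearch would assign one at index time), so only the child-to-parent reference can be written.

```typescript
// Simplified stand-ins for saved object documents; NOT Kibana's actual types.
interface Reference {
  name: string;
  type: string;
  id: string;
}

interface NewDoc {
  type: string;
  attributes: Record<string, unknown>;
  references: Reference[];
}

interface ExistingDoc extends NewDoc {
  id: string;
}

// Hypothetical transform signature: one document in, several documents out.
type SplittingTransform = (doc: ExistingDoc) => Array<ExistingDoc | NewDoc>;

const splitTagTransform: SplittingTransform = (doc) => {
  const { configuration_blocks, ...rest } = doc.attributes;
  const blocks: unknown[] = Array.isArray(configuration_blocks) ? configuration_blocks : [];

  // The parent keeps its ID and loses the nested blocks.
  const parent: ExistingDoc = { ...doc, attributes: rest };

  // Children have no ID yet, so they point at the parent instead of the reverse.
  const children: NewDoc[] = blocks.map((block) => ({
    type: 'config_block',
    attributes: block as Record<string, unknown>,
    references: [{ name: 'tag', type: doc.type, id: doc.id }],
  }));

  return [parent, ...children];
};
```

Finding the blocks that belong to a tag would then require a query on the children's references rather than a lookup by ID from the parent.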

Related: #34996

@pgayvallet
Contributor

> Because an ID conflict is possible, the migration transform function won't be able to assign IDs to the new documents

We introduced the SavedObjectMigrationContext type a few months ago. Couldn't we add an idGenerator of some kind to it, so that migration functions can generate 'new' conflict-safe IDs? That would work around this limitation and allow the object->newChild reference to be added.
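A sketch of what such a generator could enable, under stated assumptions: generateId is a hypothetical addition that does not exist on SavedObjectMigrationContext, and the Doc type and splitWithIds function are simplified stand-ins invented for illustration.

```typescript
interface Reference {
  name: string;
  type: string;
  id: string;
}

// Simplified stand-in for a saved object document; NOT Kibana's actual type.
interface Doc {
  id: string;
  type: string;
  attributes: Record<string, unknown>;
  references: Reference[];
}

// Hypothetical context shape: `generateId` is the proposed addition, not an existing API.
interface ContextWithIdGenerator {
  generateId: () => string;
}

function splitWithIds(doc: Doc, context: ContextWithIdGenerator): Doc[] {
  const { configuration_blocks, ...rest } = doc.attributes;
  const blocks: unknown[] = Array.isArray(configuration_blocks) ? configuration_blocks : [];

  // Each child receives its ID up front from the generator owned by core.
  const children: Doc[] = blocks.map((block) => ({
    id: context.generateId(),
    type: 'config_block',
    attributes: block as Record<string, unknown>,
    references: [],
  }));

  // Because the child IDs are known, the parent can now reference its children.
  const parent: Doc = {
    ...doc,
    attributes: rest,
    references: [
      ...doc.references,
      ...children.map((child, index) => ({
        name: `config_block_${index}`,
        type: child.type,
        id: child.id,
      })),
    ],
  };

  return [parent, ...children];
}
```

Passing the generator through the context keeps ID creation in one place, so a migration function never picks its own scheme.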

@rudolf
Contributor

rudolf commented Apr 29, 2020

Only Elasticsearch can guarantee uniqueness; any generated ID might conflict. But even though a conflict is theoretically possible, it's practically impossible. Since there won't be any data loss, we can ignore this case and simply throw a fatal error if it ever occurs.

@pgayvallet
Contributor

> Only Elasticsearch can guarantee uniqueness; any generated ID might conflict. But even though a conflict is theoretically possible, it's practically impossible

With a valid UUID generator, that's approximately the same risk as a Git commit hash conflict.

I still think that exposing the ID generator would be preferable, as that would keep this responsibility inside core.
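To put a rough number on that risk, the sketch below applies the standard birthday approximation, p ≈ n(n-1)/2 / 2^bits. It assumes version 4 UUIDs (122 random bits) and SHA-1 Git hashes (160 bits) with uniformly random values; the function name and the document count used in the usage note are illustrative only.

```typescript
// Birthday-bound approximation of the chance that at least two of `count`
// uniformly random identifiers of `bits` random bits collide.
// Only meaningful while the result is far below 1.
function approxCollisionProbability(count: number, bits: number): number {
  const pairs = (count * (count - 1)) / 2;
  return pairs / Math.pow(2, bits);
}

const UUID_V4_RANDOM_BITS = 122; // 128 bits minus 6 fixed version/variant bits
const SHA1_BITS = 160;

// Illustrative document count; not taken from any real deployment.
const uuidRisk = approxCollisionProbability(1_000_000, UUID_V4_RANDOM_BITS);
const sha1Risk = approxCollisionProbability(1_000_000, SHA1_BITS);
```

Under these assumptions both values are vanishingly small for a million identifiers, which is consistent with treating a conflict as a fatal error rather than handling it.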

@rudolf
Contributor

rudolf commented Nov 27, 2020

Closing this as we're not aware of any plugins requiring this functionality at the moment.
