This repository has been archived by the owner on Apr 5, 2021. It is now read-only.
This PR adds a workflow for applying delta updates to the documents of an existing ElasticSearch index, including root-level document fields (file fields specified in the `:only` array in the data.yaml).

This story comes from the need to update only the latest data in the index (nested fields and/or root-level fields) more frequently, which is currently problematic with several large CSV documents. In the past, any update to the data required a full re-index of all data files. This PR creates a `delta` rake task that lets you specify an update file and map it to one of the file entries in the original data.yaml. This way we can reuse the data.yaml config that was stored in ES and update specific fields and nested fields. The only limitation is that each run can update only one of the API's nested keys (you can run the rake command multiple times, specifying a different `:nest` key each time).

NOTE: One of the key API changes in the document builder is that a `config.files` file entry can now specify both an `:only` fields array AND a `:nest` fields array. Previously it was one or the other. As a side effect, this allows you to create your root-level document and append nested fields in the same pass of the file importer.
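
To make the combined `:only` / `:nest` behavior concrete, here is a rough sketch of how a single CSV row might be turned into a document with both root-level and nested fields in one pass, and how a delta document might then be merged into an existing one. This is an illustration only: `build_document`, `apply_delta`, and the `key:` / `contents:` shape of the nest entry are hypothetical names chosen for this example and are not taken from the document builder in this PR.

```ruby
# Hypothetical sketch; method names and the entry layout are assumptions,
# not the repository's actual document builder API.

# Build one document from a CSV row.
# only: column names kept at the root of the document.
# nest: optional hash with the nested key and the columns grouped under it.
def build_document(row, only: [], nest: nil)
  # Root-level fields: keep just the columns listed under :only.
  doc = row.select { |field, _value| only.include?(field) }

  # Nested fields: group the listed columns under the nest key.
  if nest
    doc[nest[:key]] = row.select { |field, _value| nest[:contents].include?(field) }
  end

  doc
end

# Merge a delta document into an existing one. Nested hashes are merged
# recursively so untouched nested fields survive; other values are replaced.
def apply_delta(existing, delta)
  existing.merge(delta) do |_key, old_value, new_value|
    if old_value.is_a?(Hash) && new_value.is_a?(Hash)
      apply_delta(old_value, new_value)
    else
      new_value
    end
  end
end

# Example data (made up for illustration).
row   = { "id" => "1", "name" => "Example", "earnings" => "100", "debt" => "5" }
entry = { only: %w[id name], nest: { key: "2013", contents: %w[earnings] } }

doc = build_document(row, **entry)

existing = { "id" => "1", "name" => "Old name", "2013" => { "earnings" => "90", "debt" => "5" } }
updated  = apply_delta(existing, doc)
```

In this sketch the delta replaces the root-level `name` and the nested `earnings` value while leaving the nested `debt` field untouched, which mirrors the idea of updating only specific fields rather than re-indexing everything.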