Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fleet] Fix duplicate data streams being shown in UI #89812

Merged
merged 9 commits into from
Feb 2, 2021

Conversation

jen-huang
Copy link
Contributor

@jen-huang jen-huang commented Jan 30, 2021

Summary

Resolves #87082. Duplicate data streams were being shown in UI once there are multiple backing indices per stream (backing indices are created due to rollover, either as a natural result of the ILM policy over time, or when packages are upgraded). This PR fixes that by:

  1. Querying against ES data stream APIs as source of truth for list of data streams, rather than querying backing indices and aggregating on _index field
  2. Retaining the query against backing indices to figure out the data stream dataset, namespace, and type fields
  3. Get package information based on package name stored in the data stream's _meta info, rather than coercing it from the dataset or index name
  4. Adding an API integration tests for the data stream list endpoint that includes a multiple backing indices scenario

Screenshots

Response showing multiple backing indices for some data streams:

image

Correct data streams show in Fleet, no duplication per backing index:

image

@jen-huang jen-huang added bug Fixes for quality problems that affect the customer experience release_note:fix v8.0.0 Feature:Fleet Fleet team's agent central management project Team:Fleet Team label for Observability Data Collection Fleet team v7.11.0 v7.12.0 labels Jan 30, 2021
@jen-huang jen-huang self-assigned this Jan 30, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Feature:Fleet)

@jen-huang jen-huang requested review from a team and removed request for a team January 30, 2021 00:58
const dataStreamsInfoByName = keyBy(dataStreamsInfo, 'name');

// Get data stream stats
const { data_streams: dataStreamStats } = (await callCluster('transport.request', {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we can do the callCluster calls in parallel instead of serially.

Copy link
Contributor Author

@jen-huang jen-huang Feb 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated! I tossed the packaged SO retrieval in there too

metric: ['store'],
});
// Get matching data streams
const { data_streams: dataStreamsInfo } = (await callCluster('transport.request', {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These seem better defined in a service than a http handler.


// Return final data streams objects sorted by last activity, decending
// After filtering out data streams that are missing dataset/namespace/type fields
body.data_streams = (await Promise.all(dataStreamPromises))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This handler has more logic/statements than most others. What can we move to the service layer?

I think we can move lines 53-177 into a service allowing something like

Suggested change
body.data_streams = (await Promise.all(dataStreamPromises))
body.data_streams = await service.method()

Perhaps it's more than that, but I think we should move as much of this code as possible into a service.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this code was written originally, it was an intentional decision to not create a data stream service yet and just leave the code in the handler because 1) there's only 1 data stream API (list), 2) no other place in the plugins needs access to this logic, and 3) no downstream plugins needs access either. I'd like to keep this simple for now until one of these situations change. Is this blocking for you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's not a blocker. Service or not, I think that code should be pulled into functions to keep the handler easier to read/test, but that can be done later

{
exists: {
field: 'data_stream.namespace',
// Query backing indices to extract data stream dataset, namespace, and type values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could find a better way here for doing this. Instead of checking all the indices for it we could check what the values are in the mapping of the write index? Or if we run the query, could we ensure it only touches one index (most recent one) and not all of them? But then, what if this index was rollover but has no data inside yet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if this index was rollover but has no data inside yet

At some point I wrote the index name to be the last backing index (write index), but changed it for this exact reason :) I couldn't think of a better way to get these values but am open to suggestions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would tend to trust ES here to be efficient in its aggregations, and leave it like this, until we know that it's a problem. Probing several indices until we find the newest one that has data is probably slower. Do we have a system with lots of rolled-over backing indices we could test with?

@jen-huang
Copy link
Contributor Author

@elasticmachine merge upstream

@jen-huang jen-huang requested a review from skh February 1, 2021 17:35
@jen-huang
Copy link
Contributor Author

@elasticmachine merge upstream

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
fleet 752.1KB 752.1KB +3.0B

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Contributor

@skh skh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested this locally with these steps:

  • set up Fleet & Agents
  • use the default policy, install & run an agent
  • verify data streams are created and data is coming in
  • roll over one data stream manually
  • on master, I can reproduce the bug: the rolled-over data stream appears twice in the list of data streams, and the number of data streams on the overview page increases
  • with this change, I can verify the bug is fixed: every data stream only appears once in the list. Also, the number of data streams on the overview page stays constant.

One comment inline about the aggregation to find dataset, namespace and type -- I think ES is in general more efficient than we can be by trying to be clever, so I'd leave it like this until we know it becomes a problem.

During a recent customer call it was pointed out that the data stream view is slow. It's probably worth investigating, but rather in a separate issue than in this PR.

@jen-huang
Copy link
Contributor Author

Thanks @skh for the thorough testing and review!

@jen-huang jen-huang merged commit 19effe2 into elastic:master Feb 2, 2021
@jen-huang jen-huang deleted the fix/dupe-data-streams branch February 2, 2021 18:11
jen-huang added a commit to jen-huang/kibana that referenced this pull request Feb 2, 2021
* Add API integration tests for data streams list, including one that is expected to fail due to reliance on number of backing indices

* Use ES data streams API as source of truth for list of data streams, and only query against backing indices afterwards

* Get package name from data stream meta info

* Increate retry timeout

* Move initial info requests inside Promise.all

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
# Conflicts:
#	x-pack/test/fleet_api_integration/apis/index.js
@jen-huang jen-huang removed the v7.10.3 label Feb 2, 2021
phillipb added a commit to phillipb/kibana that referenced this pull request Feb 2, 2021
…-ml-jobs

* 'master' of github.com:elastic/kibana: (254 commits)
  [Security Solution] [Detections] Remove allow_no_indices to prevent error being thrown in response of field capabilities (elastic#89927)
  Skip test for cloud (elastic#89450)
  [Fleet] Fix duplicate data streams being shown in UI (elastic#89812)
  Bump package dependencies (elastic#90034)
  [App Search] DRY helper for encoding/decoding routes that can have special characters in params (elastic#89811)
  TypeScript project references for Observability plugin (elastic#89320)
  [SearchSource] Combine sort and parent fields when serializing (elastic#89808)
  Made imports static (elastic#89935)
  [ml] migrate file_data_visualizer/import route to file_upload plugin (elastic#89640)
  [Discover] Adapt default column behavior (elastic#89826)
  Round start and end values (elastic#89030)
  Rename getProxyAgents to getCustomAgents (elastic#89813)
  [Form lib] UseField `onError` listener (elastic#89895)
  [APM] use latency sum instead of avg for impact (elastic#89990)
  migrate more core-owned plugins to tsproject ref (elastic#89975)
  [Logs UI] Load <LogStream> entries via async searches (elastic#86899)
  [APM] Abort browser requests when appropriate (elastic#89557)
  [Alerting] Allow user to select existing connector of same type when fixing broken connector (elastic#89062)
  [Data Table] Use shared CSV export mechanism (elastic#89702)
  chore(NA): improve logic check when installing Bazel tools (elastic#89634)
  ...
jen-huang added a commit that referenced this pull request Feb 2, 2021
* Add API integration tests for data streams list, including one that is expected to fail due to reliance on number of backing indices

* Use ES data streams API as source of truth for list of data streams, and only query against backing indices afterwards

* Get package name from data stream meta info

* Increate retry timeout

* Move initial info requests inside Promise.all

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
jen-huang added a commit that referenced this pull request Feb 2, 2021
* Add API integration tests for data streams list, including one that is expected to fail due to reliance on number of backing indices

* Use ES data streams API as source of truth for list of data streams, and only query against backing indices afterwards

* Get package name from data stream meta info

* Increate retry timeout

* Move initial info requests inside Promise.all

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
# Conflicts:
#	x-pack/test/fleet_api_integration/apis/index.js
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fixes for quality problems that affect the customer experience Feature:Fleet Fleet team's agent central management project release_note:fix Team:Fleet Team label for Observability Data Collection Fleet team v7.11.0 v7.12.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Fleet] Data streams page is showing hidden indices
6 participants