Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Discover] Improve performance of getFieldsToShow #144672

Merged
merged 5 commits into from
Nov 16, 2022

Conversation

kertal
Copy link
Member

@kertal kertal commented Nov 7, 2022

Summary

This PR significantly improves the performance of getFieldsToShow for very large data views (>=100k). This is done by reducing the number of iterations necessary to detect multi fields shat should not be displayed. The number of iterations caused significant delays when rendering the document table (document explorer and classic). With a given example dataset and a mapping containing 126k fields, the performance was significantly better:

Before - 34s of calculation time when rendering the table

Bildschirmfoto 2022-11-10 um 13 54 41

After - 1,6 s of calculation time when rendering the table

Bildschirmfoto 2022-11-10 um 13 55 56

Fixes #144673

@kertal kertal added Feature:Discover Discover Application Team:DataDiscovery Discover, search (e.g. data plugin and KQL), data views, saved searches. For ES|QL, use Team:ES|QL. labels Nov 7, 2022
if (mapped && subTypeMulti?.multi?.parent) {
childParentFieldsMap[mapped.name] = subTypeMulti.multi.parent;
childParentFieldsMap[key] = subTypeMulti.multi.parent;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these two are necessarily the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're getting the mapped already by key, and the mapping function uses getByName for get the field, so I assume map === key in this case?

@kertal
Copy link
Member Author

kertal commented Nov 10, 2022

@elasticmachine merge upstream

@kertal kertal marked this pull request as ready for review November 14, 2022 06:13
@kertal kertal requested a review from a team as a code owner November 14, 2022 06:13
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

@kertal
Copy link
Member Author

kertal commented Nov 14, 2022

@elasticmachine merge upstream

@kertal kertal requested a review from majagrubic November 14, 2022 08:25
@kertal
Copy link
Member Author

kertal commented Nov 15, 2022

@elasticmachine merge upstream

Copy link
Contributor

@majagrubic majagrubic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

const parent = childParentFieldsMap[key];
return showMultiFields || (parent && !fields.includes(parent));
const parent = childParentFieldsMap.get(key);
return parent && !fieldSet.has(parent);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could simplify even more as the following:

export const getFieldsToShow = (fields: string[], dataView: DataView, showMultiFields: boolean) => {
  if (showMultiFields) {
    return fields;
  }
  return fields.filter((fieldName) => {
    const mapped = dataView.fields.getByName(fieldName);
    const isMultiField = Boolean(mapped && getFieldSubtypeMulti(mapped.spec));
    return !isMultiField;
  });
};

It's the same way as we filter subfields from the sidebar.

Copy link
Member Author

@kertal kertal Nov 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great suggestion, it further simplifies the code, removes another iteration of the field list, aligns the code ... and works, applied the change

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@majagrubic this change seems to break a legacy table test

expect(findTestSubject(component, 'tableDocViewRow-city.raw').length).toBe(1);

It expects city.raw to be rendered, because city is not returned by the fields API .. but city is in the mapping. I simply can't remember why we were doing this and if this is an actual case to be considered?

@jughosta To be safe we could revert the last change for this PR, apply it in a follow up for the next minor, without backporting ... getting rid of legacy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jughosta wanna create the follow up PR with your suggestion, just for 8.7?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up PR #145698
I kept the parent-checking logic too.

@kertal kertal added v8.5.2 and removed v8.5.1 labels Nov 16, 2022
Copy link
Contributor

@jughosta jughosta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

@kibana-ci
Copy link
Collaborator

kibana-ci commented Nov 16, 2022

💔 Build Failed

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #8 / DocViewTable at Discover Doc with Fields API does not render multifield rows if showMultiFields flag is not set

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
discover 414.0KB 413.5KB -490.0B
Unknown metric groups

ESLint disabled in files

id before after diff
osquery 1 2 +1

ESLint disabled line counts

id before after diff
enterpriseSearch 19 21 +2
fleet 59 65 +6
osquery 108 113 +5
securitySolution 441 447 +6
total +19

Total ESLint disabled count

id before after diff
enterpriseSearch 20 22 +2
fleet 67 73 +6
osquery 109 115 +6
securitySolution 518 524 +6
total +20

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @kertal

@kertal kertal force-pushed the discover-fix-fields-to-show-performance branch from 3726410 to 3ab4d6a Compare November 16, 2022 12:00
@kertal kertal added the v8.7.0 label Nov 16, 2022
@kertal kertal merged commit 1ffb93d into elastic:main Nov 16, 2022
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Nov 16, 2022
## Summary

Significantly improves the performance of `getFieldsToShow` for very large data views (>=100k). This is done by reducing the number ofiterations necessary to detect multi fields shat should not be displayed.

(cherry picked from commit 1ffb93d)
@kibanamachine
Copy link
Contributor

💔 Some backports could not be created

Status Branch Result
7.17 Backport failed because of merge conflicts
8.5
8.7 The branch "8.7" does not exist

Note: Successful backport PRs will be merged automatically after passing CI.

Manual backport

To create the backport manually run:

node scripts/backport --pr 144672

Questions ?

Please refer to the Backport tool documentation

@kertal kertal removed the v8.7.0 label Nov 16, 2022
kibanamachine added a commit that referenced this pull request Nov 16, 2022
…45424)

# Backport

This will backport the following commits from `main` to `8.5`:
- [[Discover] Improve performance of getFieldsToShow
(#144672)](#144672)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Matthias
Wilhelm","email":"matthias.wilhelm@elastic.co"},"sourceCommit":{"committedDate":"2022-11-16T17:53:18Z","message":"[Discover]
Improve performance of getFieldsToShow (#144672)\n\n##
Summary\r\n\r\nSignificantly improves the performance of
`getFieldsToShow` for very large data views (>=100k). This is done by
reducing the number ofiterations necessary to detect multi fields shat
should not be
displayed.","sha":"1ffb93d687e899ec93b4e1735458e9e2863d6b39","branchLabelMapping":{"^v8.6.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Feature:Discover","release_note:fix","Team:DataDiscovery","v8.6.0","v7.17.8","v8.7.0","v8.5.2"],"number":144672,"url":"https://github.com/elastic/kibana/pull/144672","mergeCommit":{"message":"[Discover]
Improve performance of getFieldsToShow (#144672)\n\n##
Summary\r\n\r\nSignificantly improves the performance of
`getFieldsToShow` for very large data views (>=100k). This is done by
reducing the number ofiterations necessary to detect multi fields shat
should not be
displayed.","sha":"1ffb93d687e899ec93b4e1735458e9e2863d6b39"}},"sourceBranch":"main","suggestedTargetBranches":["7.17","8.7","8.5"],"targetPullRequestStates":[{"branch":"main","label":"v8.6.0","labelRegex":"^v8.6.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/144672","number":144672,"mergeCommit":{"message":"[Discover]
Improve performance of getFieldsToShow (#144672)\n\n##
Summary\r\n\r\nSignificantly improves the performance of
`getFieldsToShow` for very large data views (>=100k). This is done by
reducing the number ofiterations necessary to detect multi fields shat
should not be
displayed.","sha":"1ffb93d687e899ec93b4e1735458e9e2863d6b39"}},{"branch":"7.17","label":"v7.17.8","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"8.7","label":"v8.7.0","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"8.5","label":"v8.5.2","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Matthias Wilhelm <matthias.wilhelm@elastic.co>
jughosta added a commit that referenced this pull request Nov 22, 2022
…145698)

A follow up for #144672 (issue
#144673)

## Summary

This PR reduces number of iterations through fields array by simplifying
the logic and switching to `Map` and `.get()` instead of using
`.includes()` on a visible fields array for each rendered field. Also
`shouldShowFieldHandler` can be further improved in the future without
the need to refactor how it's being used.

Before:
<img width="1490" alt="Screenshot 2022-11-21 at 19 36 17"
src="https://user-images.githubusercontent.com/1415710/203134749-190df445-f0dc-4a66-8797-21fedc1345d2.png">
<img width="430" alt="Screenshot 2022-11-21 at 19 36 43"
src="https://user-images.githubusercontent.com/1415710/203134792-06b659cd-b4e0-4248-b1b2-63767528738b.png">
<img width="406" alt="Screenshot 2022-11-21 at 19 39 05"
src="https://user-images.githubusercontent.com/1415710/203134836-2881e4bf-a9be-4a12-bcd5-76074d3ae0d3.png">

After:
<img width="1493" alt="Screenshot 2022-11-21 at 19 42 48"
src="https://user-images.githubusercontent.com/1415710/203134864-9d686705-891f-4640-96e6-29e7af5ae9d9.png">
<img width="496" alt="Screenshot 2022-11-21 at 19 44 54"
src="https://user-images.githubusercontent.com/1415710/203135114-fda20375-fd3f-4ab4-b3e5-cdf7758ca3c8.png">
<img width="370" alt="Screenshot 2022-11-21 at 19 44 32"
src="https://user-images.githubusercontent.com/1415710/203135038-c4fd40b7-1c8a-43a3-b08c-c4c19c74743f.png">
@kertal
Copy link
Member Author

kertal commented Mar 14, 2024

💚 All backports created successfully

Status Branch Result
7.17

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kertal added a commit to kertal/kibana that referenced this pull request Mar 14, 2024
## Summary

Significantly improves the performance of `getFieldsToShow` for very large data views (>=100k). This is done by reducing the number ofiterations necessary to detect multi fields shat should not be displayed.

(cherry picked from commit 1ffb93d)

# Conflicts:
#	src/plugins/discover/public/application/helpers/get_fields_to_show.ts
kertal added a commit that referenced this pull request Mar 18, 2024
…178731)

# Backport

Backport of the following commits from `main` to `7.17`:
- [[Discover] Improve performance of getFieldsToShow (#144672)](#144672)

## Summary

Significantly improves the performance of `getFieldsToShow` for very large data views (>=100k). This is done by reducing the number of iterations necessary to detect multi fields shat should not be displayed.
@kertal kertal removed the v7.17.8 label Mar 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Discover Discover Application release_note:fix Team:DataDiscovery Discover, search (e.g. data plugin and KQL), data views, saved searches. For ES|QL, use Team:ES|QL. v7.17.19 v8.5.2 v8.6.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Discover] Bad performance rendering document table when mapping contains lots of fields
6 participants