Review attribute fetching to improve performance of processing of groups with many children #807

Closed
loichuder opened this issue Oct 6, 2021 · 2 comments · Fixed by #863
loichuder (Member) commented Oct 6, 2021

Problem

Opening a group with many children is slow.

This was observed with the JupyterProvider (see point 3 of silx-kit/jupyterlab-h5web#51), and H5GroveProvider and HsdsProvider should be affected in the same way.

Reason

When a group is processed, its children are processed as well (https://github.com/silx-kit/h5web/blob/main/packages/app/src/h5web/providers/jupyter/jupyter-api.ts#L109). This processing triggers a fetch of the attribute values of each child. This abundance of requests is the reason for the slowness.

Proposed improvement

We could instead fetch attribute values on demand and remove the fetching of attribute values from entity processing.

Attribute values are needed in the following scenarios:

  1. When inspecting an entity (via the Inspect button)
  2. When processing an NX entity (as a default attribute could point to an NXData that needs to be displayed). This scenario can be detected without additional requests, as the group metadata holds the metadata of its children, which includes the names of their attributes.
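
The two scenarios above could be sketched roughly as follows. This is only an illustration under stated assumptions, not h5web's actual code: the `AttrMeta` and `ChildMeta` shapes, the `NX_ATTR_NAMES` set, and the `needsAttrValues` and `createLazyAttrStore` helpers are all hypothetical names introduced here. The first function decides from metadata alone whether values are needed; the second wraps a fetcher so that values are requested only on first use.

```typescript
// Hypothetical metadata shapes; these are not h5web's actual types.
interface AttrMeta {
  name: string;
}

interface ChildMeta {
  name: string;
  attributes: AttrMeta[];
}

type AttrValues = Record<string, unknown>;
type FetchAttrValues = (path: string) => Promise<AttrValues>;

// Assumption: attribute names that flag an entity as needing NeXus processing.
const NX_ATTR_NAMES = new Set(["NX_class", "default", "signal"]);

// Decide from metadata alone (no request) whether attribute values are needed.
function needsAttrValues(child: ChildMeta, inspected: boolean): boolean {
  return inspected || child.attributes.some((a) => NX_ATTR_NAMES.has(a.name));
}

// Wrap a fetcher so that each entity's values are requested at most once,
// and only when something actually asks for them.
function createLazyAttrStore(fetchValues: FetchAttrValues) {
  const cache = new Map<string, Promise<AttrValues>>();

  return {
    get(path: string): Promise<AttrValues> {
      let pending = cache.get(path);
      if (!pending) {
        pending = fetchValues(path);
        cache.set(path, pending);
      }
      return pending;
    },
    // Number of distinct entities for which a request has been started.
    requestCount: (): number => cache.size,
  };
}
```

With this shape, processing a group with many children would start no attribute request at all unless a child is inspected or carries one of the flagged attribute names.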

loichuder changed the title from "Review attribute fetching to improve performances" to "Review attribute fetching to improve performance of processing of groups with many children" on Oct 6, 2021

axelboc (Contributor) commented Oct 26, 2021

All three providers support fetching the value of a single attribute (in theory):

In practice, H5Grove doesn't seem to respect the attr_keys query parameter and always returns all attributes.

Note also that H5Grove and Jupyter support fetching the values of all the attributes of an entity at once, which is great for performance when switching to the "Inspect" tab, for instance, but may prevent us from fetching attribute values in binary to solve the remaining NaN/Infinity use case #641.

axelboc (Contributor) commented Nov 25, 2021

In the end, we always fetch the values of all the attributes of an entity at once, even if we just need one. This doesn't seem to matter much with H5Grove, since it provides an endpoint for this, which is why I haven't gone further. It's more of a problem for HSDS, since the endpoint it provides only allows fetching the value of a single attribute, so we end up making consecutive requests to fetch all the values. I think this is fine for now -- we can optimise things later if the need arises.
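
The difference between the two situations described above can be sketched as follows. This is a simplified illustration, not the actual provider code: the `AttrApi` interface and the `fetchAttrValues` and `countRequests` functions are hypothetical, and the real endpoints are deliberately not modelled. A backend with a bulk endpoint needs one request per entity, while a backend that only serves single attributes needs one request per attribute, made consecutively.

```typescript
// Hypothetical capability description; not the actual provider classes in h5web.
interface AttrApi {
  // Present when the backend can return every attribute value in one response.
  fetchAll?: (path: string) => Promise<Record<string, unknown>>;
  // Fetches the value of a single attribute.
  fetchOne: (path: string, name: string) => Promise<unknown>;
}

// Fetch the values of all the attributes of an entity.
async function fetchAttrValues(
  api: AttrApi,
  path: string,
  names: string[]
): Promise<Record<string, unknown>> {
  if (api.fetchAll) {
    // One request, whatever the number of attributes.
    return api.fetchAll(path);
  }

  // Fallback: one request per attribute, made one after the other.
  const values: Record<string, unknown> = {};
  for (const name of names) {
    values[name] = await api.fetchOne(path, name);
  }
  return values;
}

// Number of requests the function above makes for a given entity.
function countRequests(api: AttrApi, names: string[]): number {
  return api.fetchAll ? 1 : names.length;
}
```

If the per-attribute fallback ever became a bottleneck, one possible optimisation would be to start the single-attribute requests concurrently rather than consecutively; whether that is worthwhile depends on the backend.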
