Allow user to store non-openPMD information #115

Open
RemiLehe opened this issue Dec 4, 2015 · 11 comments
Assignees
Labels
question revision change backwards-compatible, stylistic change (e.g. typos)
Milestone

Comments

@RemiLehe
Member

RemiLehe commented Dec 4, 2015

Today I had a request from a prospective user of openPMD: can the user store other datasets in the HDF5-openPMD files that are not meant to be read by the openPMD parser / viewer? (For instance, because they fall neither in the category mesh nor in the category particle.)

From my point of view, the best way to do this is to store this data outside of the basePath (in this case the parser will not try to read it).
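A minimal sketch of that rule, assuming the usual basePath template "/data/%T/" (with %T standing for the iteration number) and modelling the file as a plain list of dataset paths instead of calling an HDF5 library. The helper names `base_path_regex` and `split_by_base_path`, and the `/diagnostics/...` path, are hypothetical:

```python
import re


def base_path_regex(base_path="/data/%T/"):
    """Turn a basePath template into a regex; %T matches an iteration number."""
    # Escape the literal pieces and join them with a digit pattern for %T.
    literal_parts = [re.escape(part) for part in base_path.split("%T")]
    return re.compile("^" + r"\d+".join(literal_parts))


def split_by_base_path(paths, base_path="/data/%T/"):
    """Separate paths a parser would visit from paths it would never look at."""
    pattern = base_path_regex(base_path)
    inside = [p for p in paths if pattern.match(p)]
    outside = [p for p in paths if not pattern.match(p)]
    return inside, outside


paths = [
    "/data/100/meshes/E/x",
    "/data/100/particles/e/position/x",
    "/diagnostics/probe_signal",  # hypothetical user data outside basePath
]
inside, outside = split_by_base_path(paths)
print(outside)
```

Only the two paths under `/data/100/` would be handed to the parser; the third one is left alone because it does not start with the basePath.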

@ax3l: Does that make sense? If yes, I think it would be good to add it as a side-remark in the standard.

@RemiLehe RemiLehe added this to the 1.0.1: Typo and Wording Changes milestone Dec 4, 2015
@ax3l
Member

ax3l commented Dec 4, 2015 via email

@ax3l ax3l added question revision change backwards-compatible, stylistic change (e.g. typos) and removed enhancement labels Dec 5, 2015
@ax3l
Member

ax3l commented Dec 5, 2015

> Does that make sense? If yes, I think it would be good to add it as a side-remark in the standard.

That makes absolute sense and is a good point; we should make this clearer in the standard.
I marked it as a revision change (patch level), which means it can be added in, e.g., a 1.0.1 release (as you already marked!).

@RemiLehe
Member Author

RemiLehe commented Dec 7, 2015

Ok, great! I'll do a corresponding PR in the next few days.

@ax3l
Member

ax3l commented Dec 10, 2015

Another, additional/orthogonal approach would allow non-openPMD information inside basePath as well. If one wants to keep parsers from reading additional records, specifically

  • directories
  • data sets

(additional attributes do not harm the parsers), we could also provide a prefix that the parsers ignore. Let's say such names must start with "+".
But maybe that is messy.

This would allow experimenting with new additional information, e.g., irregular mesh geometries, inside basePath.
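As a rough sketch of how a parser could honor such a prefix: the "+" convention is only the proposal from this comment, not part of the standard, and `is_ignored` as well as the two "+"-prefixed example names are hypothetical.

```python
IGNORE_PREFIX = "+"  # proposed in this thread; not defined by the standard


def is_ignored(path, prefix=IGNORE_PREFIX):
    """True if any group or dataset name along the path starts with the prefix."""
    return any(name.startswith(prefix) for name in path.split("/") if name)


paths = [
    "/data/0/meshes/E/x",
    "/data/0/meshes/+irregular_geometry/nodes",  # hypothetical custom group
    "/data/0/particles/e/+debug_flags",          # hypothetical custom dataset
]
to_parse = [p for p in paths if not is_ignored(p)]
print(to_parse)
```

Checking every path component means that everything below a "+" group is skipped as well, so a custom group can contain arbitrary content.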

Currently, only attributes can be freely added at any place (well, it's recommended to name them comment). Groups and data sets are restricted inside those paths.

@ax3l ax3l self-assigned this Dec 10, 2015
ax3l added a commit that referenced this issue Dec 10, 2015
Close  #115: Allow user to store non-openPMD information
@ax3l ax3l removed this from the 1.0.1: Typo and Wording Changes milestone Nov 24, 2017
@ax3l ax3l added this to the 1.0: First Major Release milestone Nov 24, 2017
@ax3l
Member

ax3l commented Nov 24, 2017

implemented in 1.0.1

@DavidSagan
Collaborator

@ax3l

> Currently, only attributes can be freely added at any place (well, it's recommended to name them comment). Groups and data sets are restricted inside those paths.

I would propose that groups and data sets also be allowed to be added freely. I can imagine situations where, for example, a person wants to add per-particle data, and the restriction that this has to be put outside of the group that holds the particle data makes things very messy. Certainly we could mandate that such added groups or data sets be marked as extra, for example using a "+" prefix as you suggested. I think this is a fairly clean solution.

@franzpoeschel

franzpoeschel commented Sep 21, 2022

Alternative (and maybe more radical) suggestion:

  • Allow custom group hierarchies with custom datasets and custom attributes inside every iteration
  • Treat meshes and particles as keywords, reserved to openPMD (to be more precise: whatever is defined in meshesPath and particlesPath)
  • Inside these paths, the typical openPMD hierarchy applies, and all data should follow strictly the openPMD standard

The fundamental idea would be that an openPMD dataset is not merely (1) a classical openPMD hierarchy augmented by custom hierarchies (i.e. other stuff around it that the API ignores), but rather (2) a custom hierarchy with the classical openPMD structure embedded into it at any place.
Instead of ignoring custom hierarchies, openPMD could then benefit from and interact with them.

Example:
Mesh refinement currently works via the naming of the meshes. Alternatively, one could do:

/data/0/refined_mesh_levels/0/meshes/E
/data/0/refined_mesh_levels/0/meshes/B
/data/0/refined_mesh_levels/1/meshes/E
/data/0/refined_mesh_levels/1/meshes/B
/data/0/refined_mesh_levels/2/meshes/E
/data/0/refined_mesh_levels/2/meshes/B
+++++++ ––––––––––––––––––––– ++++++++
standard        custom        standard

/data/0/simulation_internal/some_checkpointing_info
+++++++ –––––––––––––––––––––––––––––––––––––––––––
standard                  custom

Codes such as PIConGPU can put their internal datasets (e.g. PIConGPU_id_provider) anywhere in that hierarchy, and they would be ignored instead of cluttering the openPMD dataset.

Ideally, if done correctly, this would mean that a single dataset can use several standards at the same time, such as mixing Nexus with openPMD.

Downside: No huge change for the standard, but a rather large change for implementations. Readers would need to be updated to find openPMD structures throughout the datasets.
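The standard/custom split from the diagram above can be sketched in a few lines. This assumes the basePath "/data/%T/" and treats the literal names meshes and particles as the reserved keywords; a real reader would take them from meshesPath and particlesPath. `split_embedded` is a hypothetical helper, not part of any openPMD API.

```python
RESERVED = ("meshes", "particles")  # stand-ins for meshesPath / particlesPath


def split_embedded(path, reserved=RESERVED):
    """Split '/data/<it>/<custom...>/<reserved>/<record...>' into three parts.

    Returns (iteration, custom, standard). 'standard' is None when no reserved
    keyword occurs, i.e. the whole remainder is custom data.
    """
    parts = path.strip("/").split("/")
    iteration, rest = parts[:2], parts[2:]
    for i, name in enumerate(rest):
        if name in reserved:
            return "/".join(iteration), "/".join(rest[:i]), "/".join(rest[i:])
    return "/".join(iteration), "/".join(rest), None


for p in (
    "/data/0/refined_mesh_levels/1/meshes/B",
    "/data/0/simulation_internal/some_checkpointing_info",
    "/data/0/meshes/E",
):
    print(split_embedded(p))
```

A reader following this idea would hand the `standard` part to the existing openPMD parsing logic and keep the `custom` part as context, for example as the refinement level.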

@ax3l
Member

ax3l commented Oct 12, 2022

That sounds useful and would be equivalent to relaxing meshes path from values like meshesPath="meshes/" to regexes such as meshesPath=".*meshes/" (or the hard-coding of this exact regex in the standard).
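To illustrate what a regex-valued meshesPath could mean for a reader, here is a small sketch. It assumes paths are given relative to the iteration group and that the group name directly after the matched prefix is the mesh record; `find_mesh_records` and the example paths are hypothetical.

```python
import re


def find_mesh_records(relative_paths, meshes_path=r".*meshes/"):
    """Map each matched meshes prefix to the sorted record names found below it."""
    pattern = re.compile("(?P<prefix>" + meshes_path + ")(?P<record>[^/]+)")
    found = {}
    for path in relative_paths:
        m = pattern.match(path)
        if m:
            found.setdefault(m.group("prefix"), set()).add(m.group("record"))
    return {prefix: sorted(records) for prefix, records in found.items()}


paths = [
    "meshes/E/x",
    "meshes/E/y",
    "refined_mesh_levels/1/meshes/B/x",
    "simulation_internal/some_checkpointing_info",
]
print(find_mesh_records(paths))
```

Note that a bare `.*meshes/` would also accept a group such as `not_meshes/`; anchoring the keyword at a path separator, e.g. `(.*/)?meshes/`, avoids that.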

I am not sure we can do without an "exclude this from parsing" marker, set via an attribute on custom groups/variables - without it we would keep things definitely fully separate besides sharing an iteration/snapshot id (if that works then that is fine).

@franzpoeschel

For the HELPMI project, I drew up some visualizations of the proposed addition.

openPMD currently:
[Figure: opmd_hierarchy]

Proposed extension:
[Figure: opmd_hierarchy_extended]

> That sounds useful and would be equivalent to relaxing meshes path from values like meshesPath="meshes/" to regexes such as meshesPath=".*meshes/" (or the hard-coding of this exact regex in the standard).

Using a regex is one of the options, yes. Another (more restricted) option would be a list of paths.
Even though it's redundant, I would even suggest a list of patterns, as that is a common workflow in file-management software.

> I am not sure if we will not need an "exclude this from parsing" nonetheless via an attribute on custom groups/variables - without it we would keep things definitely fully separate besides sharing an iteration/snapshot id (if that works then that is fine).

Exclude patterns are common enough in a lot of software (rsync, git ignore, backup software, …), so I'm fine with using that.
I don't understand what you mean by "without it we would keep things definitely fully separate besides sharing an iteration/snapshot id"?

@ax3l
Member

ax3l commented Apr 25, 2023

Sounds great. Designing as lists of patterns/paths is a good idea.

The last comment was simply: yes, I think we need an exclude pattern, too (as in rsync, git ignore, backup software, ...).
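A possible shape for such include and exclude lists, sketched with glob patterns from Python's standard `fnmatch` module. The pattern syntax, the function name `select_paths`, and the example paths are assumptions for illustration; the thread has not settled on any of them.

```python
from fnmatch import fnmatchcase


def select_paths(paths, include, exclude=()):
    """Keep paths matching any include pattern and no exclude pattern."""

    def matches(path, patterns):
        # Unlike shell globbing, '*' in fnmatch also matches '/'.
        return any(fnmatchcase(path, pattern) for pattern in patterns)

    return [p for p in paths if matches(p, include) and not matches(p, exclude)]


paths = [
    "meshes/E/x",
    "refined_mesh_levels/1/meshes/E/x",
    "picongpu_internal/meshes/scratch",     # hypothetical, should be skipped
    "picongpu_internal/idProvider/nextId",
]
selected = select_paths(
    paths,
    include=["meshes/*", "*/meshes/*"],
    exclude=["picongpu_internal/*"],
)
print(selected)
```

Excludes take precedence over includes here, which mirrors how ignore lists usually behave: a group can opt out of parsing even if its layout happens to look like openPMD.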

@franzpoeschel

Real-life WIP example from PIConGPU: checkpointing information is stored under picongpu_internal/. The RNGProvider is a field inside that group (normal openPMD markup), while idProvider contains two non-openPMD datasets.

  float     /data/1000/fields/B/x                                      {64, 64, 64}                                                                                                                                                          
  float     /data/1000/fields/B/y                                      {64, 64, 64}                                                                                                                                                          
  float     /data/1000/fields/B/z                                      {64, 64, 64}                                                                                                                                                          
  float     /data/1000/fields/Convolutional PML B/xy                   {1, 1, 198144}                                                                                                                                                        
  float     /data/1000/fields/Convolutional PML B/xz                   {1, 1, 198144}                                                                                                                                                        
  float     /data/1000/fields/Convolutional PML B/yx                   {1, 1, 198144}                                                                                                                                                        
  float     /data/1000/fields/Convolutional PML B/yz                   {1, 1, 198144}                                                                                                                                                        
  float     /data/1000/fields/Convolutional PML B/zx                   {1, 1, 198144}                                                                                                                                                        
  float     /data/1000/fields/Convolutional PML B/zy                   {1, 1, 198144}                                                                                                                                                        
  float     /data/1000/fields/Convolutional PML E/xy                   {1, 1, 198144}                                                                                                                                                        
  float     /data/1000/fields/Convolutional PML E/xz                   {1, 1, 198144}                                                                                                                                                        
  float     /data/1000/fields/Convolutional PML E/yx                   {1, 1, 198144}                                 
  float     /data/1000/fields/Convolutional PML E/yz                   {1, 1, 198144}                                 
  float     /data/1000/fields/Convolutional PML E/zx                   {1, 1, 198144}                                 
  float     /data/1000/fields/Convolutional PML E/zy                   {1, 1, 198144}                                 
  float     /data/1000/fields/E/x                                      {64, 64, 64}                                   
  float     /data/1000/fields/E/y                                      {64, 64, 64}                                   
  float     /data/1000/fields/E/z                                      {64, 64, 64}                                   
  float     /data/1000/particles/e/momentum/x                          {55401}                                        
  float     /data/1000/particles/e/momentum/y                          {55401}                                        
  float     /data/1000/particles/e/momentum/z                          {55401}                                        
  uint64_t  /data/1000/particles/e/particlePatches/extent/x            {1}                                            
  uint64_t  /data/1000/particles/e/particlePatches/extent/y            {1}                                            
  uint64_t  /data/1000/particles/e/particlePatches/extent/z            {1}                                            
  uint64_t  /data/1000/particles/e/particlePatches/numParticles        {1}                                            
  uint64_t  /data/1000/particles/e/particlePatches/numParticlesOffset  {1}                                            
  uint64_t  /data/1000/particles/e/particlePatches/offset/x            {1}                                            
  uint64_t  /data/1000/particles/e/particlePatches/offset/y            {1}                                            
  uint64_t  /data/1000/particles/e/particlePatches/offset/z            {1}                                            
  float     /data/1000/particles/e/position/x                          {55401}                                        
  float     /data/1000/particles/e/position/y                          {55401}                                        
  float     /data/1000/particles/e/position/z                          {55401}                                        
  int32_t   /data/1000/particles/e/positionOffset/x                    {55401}                                        
  int32_t   /data/1000/particles/e/positionOffset/y                    {55401}                                        
  int32_t   /data/1000/particles/e/positionOffset/z                    {55401}                                        
  float     /data/1000/particles/e/weighting                           {55401}                                        
  char      /data/1000/picongpu_internal/fields/RNGProvider3XorMin     {64, 64, 1536}                                 
  uint64_t  /data/1000/picongpu_internal/idProvider/nextId             {1, 1, 1}                                      
  uint64_t  /data/1000/picongpu_internal/idProvider/startId            {1, 1, 1}
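For readers who want to inspect such a listing programmatically, the following sketch parses `dtype  path  {shape}` lines (record names may contain spaces, as in "Convolutional PML B") and groups the datasets by the first group below the iteration. The function names are hypothetical, and only four lines of the listing are reproduced.

```python
import re
from collections import defaultdict

# dtype, then a path starting with '/', then the extent in braces.
LINE = re.compile(r"^\s*(?P<dtype>\S+)\s+(?P<path>/.*?)\s+\{(?P<shape>[^}]*)\}\s*$")


def parse_listing(text):
    """Return a list of (dtype, path, shape) tuples for every matching line."""
    records = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            shape = tuple(int(n) for n in m.group("shape").split(","))
            records.append((m.group("dtype"), m.group("path"), shape))
    return records


def group_by_top_level(records, depth=3):
    """Group paths by the component right below '/data/<iteration>/'."""
    groups = defaultdict(list)
    for _dtype, path, _shape in records:
        groups[path.split("/")[depth]].append(path)
    return dict(groups)


listing = """
  float     /data/1000/fields/Convolutional PML B/xy                   {1, 1, 198144}
  float     /data/1000/particles/e/weighting                           {55401}
  char      /data/1000/picongpu_internal/fields/RNGProvider3XorMin     {64, 64, 1536}
  uint64_t  /data/1000/picongpu_internal/idProvider/nextId             {1, 1, 1}
"""
records = parse_listing(listing)
groups = group_by_top_level(records)
print(sorted(groups))
```

In this output, `fields` and `particles` are the standard openPMD groups, while everything collected under `picongpu_internal` belongs to the custom hierarchy discussed above.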

Projects
Status: Proposed
Development

No branches or pull requests

4 participants