Dataset variants #494

Zaharid · 2019-07-01T15:29:18Z

We have recently been taking the approach of never modifying existing datasets (which I think is a good thing) and instead adding fixed versions of them under different names. For example, we have the so-called *_SF sets, where SF stands for "symmetrization fix".

I find this approach to also be problematic. For one, I appeared to be the only person at last week's meeting who knew what these sets are and that you should use them instead of the default ones. Also, all of that might be documented somewhere, but I couldn't find it (admittedly I only spent 30 seconds searching, but still).

I think a better approach would be to teach the code that datasets have variants (the variant used in 3.1, the variant that had a bug fixed, and so on), and then have ways to warn you if you are using a deprecated variant and to tell you what the default variants are.

This ties in with:

* The discussion on general metadata files.
* The discussion on defaults: "Datasets should know their default configuration" #226.
* The discussion on resources: "Resource system" #224.
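A minimal sketch of what "teaching the code about variants" might look like, assuming a small in-memory registry. The class names, the dataset name `EXAMPLE_DATASET`, and the variant labels `legacy` and `SF` are all hypothetical and are not taken from validphys or buildmaster.

```python
import warnings
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Variant:
    """One variant of a dataset (hypothetical structure)."""
    name: str
    description: str
    deprecated: bool = False


@dataclass
class DatasetVariants:
    """All known variants of a single dataset, plus the current default."""
    dataset: str
    default: str
    variants: dict = field(default_factory=dict)

    def add(self, variant):
        self.variants[variant.name] = variant

    def resolve(self, name=None):
        """Return the requested variant, or the default if none is given.

        Warns when the resolved variant is marked as deprecated.
        """
        chosen = self.default if name is None else name
        if chosen not in self.variants:
            known = ", ".join(sorted(self.variants))
            raise KeyError(f"{self.dataset}: unknown variant {chosen!r} (known: {known})")
        variant = self.variants[chosen]
        if variant.deprecated:
            warnings.warn(
                f"{self.dataset}: variant {chosen!r} is deprecated; "
                f"the current default is {self.default!r}",
                FutureWarning,
                stacklevel=2,
            )
        return variant


# Hypothetical example data: names and descriptions are placeholders.
registry = DatasetVariants(dataset="EXAMPLE_DATASET", default="SF")
registry.add(Variant("legacy", "variant as used in 3.1", deprecated=True))
registry.add(Variant("SF", "symmetrization fix"))
```

With this layout, `registry.resolve()` returns the default variant silently, while `registry.resolve("legacy")` still works but emits a warning, so older runcards keep running and the user is told a newer variant exists.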

enocera · 2019-07-01T15:41:28Z

Two comments.

* Even if you spent more than 30 secs, you wouldn't find any documentation - as documentation is restricted to a couple of lines in the old and new .cc files in buildmaster/filter. This is just to reinforce your statement "I find this approach to be problematic".
* Default variants will in principle depend on what you're doing. For instance, we consciously decided not to switch to the "new variants" for the theory uncertainty papers. My understanding is that the "new" variants will become "default" variants only when we will seriously start to work on NNPDF4.0.

Zaharid · 2019-07-01T15:47:59Z

* Default variants will in principle depend on what you're doing. For instance, we consciously decided not to switch to the "new variants" for the theory uncertainty papers. My understanding is that the "new" variants will become "default" variants only when we will seriously start to work on NNPDF4.0.

There is some discussion on this at #226. Since then I have reached the conclusion that we really want two kinds of files:

* The validphys runcard, where anything you leave out is filled in with defaults that make sense for the 90% use case. These defaults can change with time, and they can be overridden in the runcard if needed. You can be warned if you override them with unusual values.
* A "lock file" that is generated from the runcard, with all the defaults explicitly resolved. The lock file is a valid runcard itself and you can give it to validphys (or nnfit or whatever). That would always give the same result (in terms of physics), with bugs and all.
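The runcard and lock file split described above could be sketched roughly as follows. This is an illustration under stated assumptions, not how validphys does it: the keys `dataset_variant` and `cut_policy`, their values, and the function names are hypothetical, and JSON is used only to keep the sketch within the standard library.

```python
import copy
import json

# Hypothetical defaults for the common use case; these may change over time.
DEFAULTS = {"dataset_variant": "SF", "cut_policy": "internal"}

# Hypothetical later revision of the defaults (same keys, new values).
NEWER_DEFAULTS = {"dataset_variant": "SF_v2", "cut_policy": "fromfit"}


def resolve_runcard(runcard, defaults):
    """Return a lock-file dict: every default made explicit, user values kept."""
    locked = copy.deepcopy(defaults)
    locked.update(copy.deepcopy(runcard))  # explicit runcard entries win
    return locked


def overridden_keys(runcard, defaults):
    """Keys where the runcard departs from the defaults (candidates for a warning)."""
    return sorted(k for k in runcard if k in defaults and runcard[k] != defaults[k])


# A short runcard that overrides one default and leaves the other implicit.
runcard = {"pdf": "EXAMPLE_PDF", "dataset_variant": "legacy"}

lock = resolve_runcard(runcard, DEFAULTS)
lock_text = json.dumps(lock, indent=2, sort_keys=True)  # what would be written to disk
```

Because the lock file spells out every key, feeding it back through `resolve_runcard` with `NEWER_DEFAULTS` returns it unchanged, which is the "always the same result" property. The sketch only covers changes to the values of existing defaults; defaults added under new keys later would need separate handling.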

Zaharid · 2019-07-01T15:48:37Z

@wilsonmr I guess the above should be added as an objective for reportengine 1.0.

Zaharid · 2019-07-01T16:26:28Z

See #496.

Zaharid assigned enocera and wilsonmr Jul 1, 2019

Zaharid mentioned this issue Jul 1, 2019: Lock files #496 (Closed)

Zaharid mentioned this issue Jun 4, 2020: FKTables should come with a version #796 (Closed)

Zaharid mentioned this issue Jun 17, 2020: Documentation #656 (Closed, 33 tasks)

Zaharid mentioned this issue Mar 10, 2021: Warnings when using deprecated data sets? #1139 (Closed)

wilsonmr mentioned this issue Apr 12, 2021: Comparing datasets with variants on the same name #1196 (Closed)

Zaharid mentioned this issue Sep 23, 2021: Design Improved commondata fomat #1416 (Closed)

RoyStegeman linked a pull request Mar 4, 2024 that will close this issue: New CommonData Reader #1678 (Merged, 3 tasks)

scarlehoff closed this as completed in #1678 Mar 4, 2024
