Refactor Result Saving #841

joernweissenborn · 2021-09-30T15:34:46Z

This PR changes the way results are saved and enables loading and validating results.

Change summary

Added custom dataclass serialisation
Models are serializable now
Results can be loaded
Added save/load_result_file
Parameter History added to result
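
As an illustration of the save/load round trip the summary describes, here is a minimal sketch using only the standard library. It is not the code from this PR: `ToyResult`, `to_dict`, and `from_dict` are hypothetical names, and JSON stands in for whatever file format a plugin would actually use.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field, fields


@dataclass
class ToyResult:
    """Hypothetical stand-in for a result object (not glotaran's Result)."""

    number_of_iterations: int
    chi_square: float
    # One entry per optimization step, loosely mimicking a parameter history.
    parameter_history: list[dict[str, float]] = field(default_factory=list)


def to_dict(obj) -> dict:
    """Convert a dataclass instance into plain, serializable containers."""
    return asdict(obj)


def from_dict(cls, data: dict):
    """Rebuild a dataclass instance, rejecting keys the class does not define."""
    known = {f.name for f in fields(cls)}
    unknown = set(data) - known
    if unknown:
        raise ValueError(f"Unknown fields for {cls.__name__}: {sorted(unknown)}")
    return cls(**data)


result = ToyResult(3, 0.25, [{"k1": 1.0}, {"k1": 0.9}])
text = json.dumps(to_dict(result))  # "save"
loaded = from_dict(ToyResult, json.loads(text))  # "load" and validate
```

Rejecting unknown keys on load is one simple form of validation; the checks in the PR itself may differ.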

Checklist

✔️ Passing the tests (mandatory for all PR's)
👌 Closes issue (mandatory for ✨ feature and 🩹 bug fix PR's)
🧪 Adds new tests for the feature (mandatory for ✨ feature and 🩹 bug fix PR's)

Closes issues

closes #323
closes #317
closes #316
closes #767

github-actions · 2021-09-30T15:35:00Z

👈 Launch a binder notebook on branch joernweissenborn/pyglotaran/refactor/result_saving

github-actions · 2021-09-30T15:44:58Z

Benchmark is done. Check out the benchmark result page.
Benchmark differences below 5% might be due to CI noise.

Benchmark diff v0.4.1 vs. main

Parametrized benchmark signatures:

BenchmarkOptimize.time_optimize(index_dependent, grouped, weight)


All benchmarks:

       before           after         ratio
     [21ba272a]       [08beac41]
     <v0.4.1>                   
-        73.1±1ms       48.3±0.6ms     0.66  BenchmarkOptimize.time_optimize(False, False, False)
-         442±8ms        66.7±30ms     0.15  BenchmarkOptimize.time_optimize(False, False, True)
-        99.5±2ms         81.9±3ms     0.82  BenchmarkOptimize.time_optimize(False, True, False)
          102±2ms        93.1±20ms     0.91  BenchmarkOptimize.time_optimize(False, True, True)
         73.4±1ms         63.3±1ms    ~0.86  BenchmarkOptimize.time_optimize(True, False, False)
-         442±3ms        97.2±40ms     0.22  BenchmarkOptimize.time_optimize(True, False, True)
          101±2ms         99.8±3ms     0.99  BenchmarkOptimize.time_optimize(True, True, False)
          101±1ms         114±40ms    ~1.13  BenchmarkOptimize.time_optimize(True, True, True)
             178M             179M     1.01  IntegrationTwoDatasets.peakmem_create_result
             196M             197M     1.00  IntegrationTwoDatasets.peakmem_optimize
-         301±6ms          248±4ms     0.83  IntegrationTwoDatasets.time_create_result
       6.06±0.06s        2.08±0.1s    ~0.34  IntegrationTwoDatasets.time_optimize

Benchmark diff main vs. PR

Parametrized benchmark signatures:

BenchmarkOptimize.time_optimize(index_dependent, grouped, weight)


All benchmarks:

       before           after         ratio
     [7797bc9e]       [08beac41]
       45.3±0.4ms       48.3±0.6ms     1.07  BenchmarkOptimize.time_optimize(False, False, False)
         61.5±5ms        66.7±30ms     1.08  BenchmarkOptimize.time_optimize(False, False, True)
         80.9±1ms         81.9±3ms     1.01  BenchmarkOptimize.time_optimize(False, True, False)
         88.7±2ms        93.1±20ms     1.05  BenchmarkOptimize.time_optimize(False, True, True)
       62.5±0.7ms         63.3±1ms     1.01  BenchmarkOptimize.time_optimize(True, False, False)
        88.9±40ms        97.2±40ms     1.09  BenchmarkOptimize.time_optimize(True, False, True)
          100±2ms         99.8±3ms     0.99  BenchmarkOptimize.time_optimize(True, True, False)
         108±40ms         114±40ms     1.05  BenchmarkOptimize.time_optimize(True, True, True)
             183M             179M     0.98  IntegrationTwoDatasets.peakmem_create_result
             200M             197M     0.98  IntegrationTwoDatasets.peakmem_optimize
          240±4ms          248±4ms     1.03  IntegrationTwoDatasets.time_create_result
       2.05±0.09s        2.08±0.1s     1.01  IntegrationTwoDatasets.time_optimize

glotaran/project/result.py

glotaran/model/model.py

codecov · 2021-10-02T19:12:56Z

Codecov Report

Merging #841 (08beac4) into main (7797bc9) will decrease coverage by 0.3%.
The diff coverage is 83.4%.

@@           Coverage Diff           @@
##            main    #841     +/-   ##
=======================================
- Coverage   84.9%   84.5%   -0.4%     
=======================================
  Files         77      79      +2     
  Lines       4372    4522    +150     
  Branches     785     826     +41     
=======================================
+ Hits        3712    3824    +112     
- Misses       521     556     +35     
- Partials     139     142      +3
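
The summary line reports a decrease of 0.3% while the table shows -0.4%. A quick check from the hit and line counts above suggests both are consistent if percentages are truncated to one decimal rather than rounded. That truncation is an assumption about how the report is displayed, not something the report states; `coverage_percent` and `truncate` are hypothetical helper names.

```python
import math


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of hit lines over all tracked lines."""
    return 100.0 * hits / lines


def truncate(value: float, digits: int = 1) -> float:
    """Cut off (do not round) after the given number of decimals.

    Assumption: the report truncates its displayed percentages.
    """
    factor = 10**digits
    return math.floor(value * factor) / factor


# Hits and lines taken from the coverage diff above.
before = coverage_percent(3712, 4372)
after = coverage_percent(3824, 4522)

print(f"before: {before:.3f}% shown as {truncate(before)}%")
print(f"after:  {after:.3f}% shown as {truncate(after)}%")
# Difference of the exact values, then truncated (summary line).
print(f"drop from exact values: {truncate(before - after)}%")
# Difference of the already truncated values (table row).
print(f"drop from displayed values: {truncate(before) - truncate(after):.1f}%")
```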

Impacted Files	Coverage Δ
glotaran/plugin_system/data_io_registration.py	`100.0% <ø> (ø)`
glotaran/project/__init__.py	`100.0% <ø> (ø)`
glotaran/analysis/problem_ungrouped.py	`93.8% <50.0%> (ø)`
glotaran/analysis/problem.py	`90.3% <61.5%> (-0.7%)`	⬇️
glotaran/model/property.py	`85.5% <64.5%> (-7.9%)`	⬇️
glotaran/model/item.py	`93.4% <68.4%> (-2.5%)`	⬇️
glotaran/analysis/problem_grouped.py	`95.6% <75.0%> (-0.4%)`	⬇️
glotaran/model/model.py	`83.2% <75.7%> (-3.1%)`	⬇️
glotaran/project/result.py	`86.7% <76.6%> (-11.9%)`	⬇️
glotaran/parameter/parameter_history.py	`76.7% <76.7%> (ø)`
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7797bc9...08beac4. Read the comment docs.

glotaran/analysis/problem.py

s-weigand

Sorry for the long wait on my review.

In general, I don't like cluttering the data classes with the <attr>_file attributes, so in a follow-up PR we should refactor this into a more elegant solution.

E.g.:
1.) Loaders overwrite an origin_file attribute ({"path": <file_path>, "format_name": <format_name>}) on the objects (Model, ParameterGroup, ...), which we can set to a default value.
2.) We use a file_mapping attribute ({"model": {"path": <file_path>, "format_name": <format_name>}, ...}).

Since each object has exactly one loader plugin, we can make it a property of the object itself.
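
The two options could look roughly like this. This is only a sketch of the idea: `ToyModel`, `ToyResult`, and `load_toy_model` are hypothetical names, and the loader does not parse anything; it only records where the object came from.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyModel:
    """Hypothetical object that remembers which file it was loaded from (option 1)."""

    name: str
    # Default value used when the object was not created by a loader.
    origin_file: Optional[Dict[str, str]] = None


def load_toy_model(file_path: str, format_name: str) -> ToyModel:
    """Pretend loader: a real plugin would parse the file; this only records its origin."""
    model = ToyModel(name="toy")
    model.origin_file = {"path": file_path, "format_name": format_name}
    return model


@dataclass
class ToyResult:
    """Hypothetical container with one central file_mapping (option 2)."""

    model: ToyModel
    file_mapping: Dict[str, Dict[str, str]] = field(default_factory=dict)


model = load_toy_model("model.yml", "yml")
result = ToyResult(model=model, file_mapping={"model": dict(model.origin_file)})
```

Option 1 keeps the origin next to the object it describes, while option 2 collects all origins in one place on the container.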

glotaran/analysis/optimize.py

glotaran/builtin/io/yml/test/test_save_model.py

glotaran/parameter/parameter_group.py

glotaran/project/result.py

glotaran/project/scheme.py

sonarqubecloud · 2021-10-10T20:42:44Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
6 Code Smells

No Coverage information
0.0% Duplication

For some reason, when running pre-commit on pre-commit-ci, the exclude settings of interrogate are ignored. Compare: https://github.com/glotaran/pyglotaran/runs/3853222703 vs. https://results.pre-commit.ci/run/github/58401715/1633899505.QoeN82EHTouKusNC0fMSCg. But since we run it in our GitHub Actions workflow, it is safe to deactivate it for pre-commit-ci.

sonarqubecloud · 2021-10-12T16:22:53Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
6 Code Smells

No Coverage information
0.0% Duplication

sourcery-ai · 2021-10-12T16:22:53Z

Sourcery Code Quality Report

✅ Merging this PR will increase code quality in the affected files by 0.83%.

Quality metrics	Before	After	Change
Complexity	6.54 ⭐	6.43 ⭐	-0.11 👍
Method Length	49.31 ⭐	48.31 ⭐	-1.00 👍
Working memory	7.94 🙂	7.78 🙂	-0.16 👍
Quality	69.53% 🙂	70.36% 🙂	0.83% 👍

Other metrics	Before	After	Change
Lines	5929	6445	516

Changed files	Quality Before	Quality After	Quality Change
glotaran/analysis/optimize.py	45.91% 😞	46.86% 😞	0.95% 👍
glotaran/analysis/problem.py	80.36% ⭐	80.47% ⭐	0.11% 👍
glotaran/analysis/problem_grouped.py	57.20% 🙂	57.19% 🙂	-0.01% 👎
glotaran/analysis/problem_ungrouped.py	67.75% 🙂	67.75% 🙂	0.00%
glotaran/builtin/io/folder/folder_plugin.py	71.72% 🙂	55.31% 🙂	-16.41% 👎
glotaran/builtin/io/folder/test/test_folder_plugin.py	86.72% ⭐	85.18% ⭐	-1.54% 👎
glotaran/builtin/io/netCDF/netCDF.py	80.62% ⭐	93.78% ⭐	13.16% 👍
glotaran/builtin/io/yml/yml.py	52.74% 🙂	78.34% ⭐	25.60% 👍
glotaran/builtin/io/yml/test/test_save_result.py	88.48% ⭐	88.79% ⭐	0.31% 👍
glotaran/deprecation/modules/test/test_project_result.py	90.19% ⭐	94.31% ⭐	4.12% 👍
glotaran/examples/sequential.py	46.88% 😞	45.17% 😞	-1.71% 👎
glotaran/io/__init__.py	87.99% ⭐	87.99% ⭐	0.00%
glotaran/io/interface.py	97.71% ⭐	97.71% ⭐	0.00%
glotaran/model/dataset_model.py	81.03% ⭐	80.93% ⭐	-0.10% 👎
glotaran/model/item.py	55.63% 🙂	56.94% 🙂	1.31% 👍
glotaran/model/model.py	73.24% 🙂	73.26% 🙂	0.02% 👍
glotaran/model/property.py	43.98% 😞	48.70% 😞	4.72% 👍
glotaran/model/test/test_model.py	71.05% 🙂	70.89% 🙂	-0.16% 👎
glotaran/parameter/__init__.py	96.84% ⭐	%	%
glotaran/parameter/parameter.py	88.43% ⭐	88.34% ⭐	-0.09% 👎
glotaran/parameter/parameter_group.py	69.08% 🙂	69.30% 🙂	0.22% 👍
glotaran/plugin_system/data_io_registration.py	92.67% ⭐	92.43% ⭐	-0.24% 👎
glotaran/plugin_system/project_io_registration.py	89.03% ⭐	87.08% ⭐	-1.95% 👎
glotaran/project/__init__.py	%	%	%
glotaran/project/result.py	69.15% 🙂	76.37% ⭐	7.22% 👍
glotaran/project/scheme.py	90.25% ⭐	89.88% ⭐	-0.37% 👎
glotaran/project/test/test_result.py	80.10% ⭐	80.28% ⭐	0.18% 👍
glotaran/project/test/test_scheme.py	79.38% ⭐	80.82% ⭐	1.44% 👍

Here are some functions in these files that still need a tune-up:

File	Function	Complexity	Length	Working Memory	Quality	Recommendation
glotaran/model/item.py	_create_mprint_func	45 ⛔	275 ⛔	10 😞	25.00% ⛔	Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions
glotaran/model/property.py	ModelProperty.validate	44 ⛔	187 😞	13 😞	25.34% 😞	Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions
glotaran/parameter/parameter_group.py	ParameterGroup.from_dataframe	28 😞	267 ⛔	13 😞	26.41% 😞	Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions
glotaran/analysis/optimize.py	_create_result	14 🙂	245 ⛔	24 ⛔	28.14% 😞	Try splitting into smaller methods. Extract out complex expressions
glotaran/model/item.py	_create_mprint_func.mprint_item	35 ⛔	270 ⛔	10 😞	28.28% 😞	Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions

Legend and Explanation

The emojis denote the absolute quality of the code:

⭐ excellent
🙂 good
😞 poor
⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.

s-weigand

For me, this is now fine to merge; the open issues from the review are tracked in #855.

jsnel

Reviewed ok, taking into account some new minor issues (#855)

joernweissenborn requested review from jsnel, s-weigand and a team as code owners September 30, 2021 15:34

sourcery-ai bot mentioned this pull request Sep 30, 2021

Refactor Result Saving (Sourcery refactored) #842

Closed

s-weigand reviewed Sep 30, 2021

glotaran/project/result.py (outdated, resolved)

s-weigand reviewed Sep 30, 2021

glotaran/model/model.py (outdated, resolved)

s-weigand force-pushed the refactor/result_saving branch from 086cec9 to 78a6b5c Compare October 2, 2021 16:51

jsnel reviewed Oct 3, 2021

glotaran/analysis/problem.py (resolved)

joernweissenborn and others added 19 commits October 3, 2021 17:03

Added project to darglint, mypy and pydocstyle pre-commit checks

ac4b378

Added Model.as_dict and Model.get_parameters

2515c14

Signed-off-by: Jörn Weißenborn <joern.weissenborn@gmail.com>

Added project/dataclass, changed scheme, adapted yml io

77c1b3a

Changed result use project/dataclass

2f3490e

Added test for save model yml

b9151cf

Added load and save result file and implemented it with yml

9859ed5

Refactored folder plugin, moved SavingOptions.

5a6c9a1

Update glotaran/project/result.py

f949e86

Co-authored-by: Sebastian Weigand <s.weigand.phy@gmail.com>

Made variable names in model more consistent.

5b5eb68

Added parameter history class and added it to result.

2baedf8

🔧 Configure darglint to ignore protocol methods

98b77e0

Fixed Parameter doc.

7b2bb98

♻️ Refactored by Sourcery

56449f7

🧹 Partial revert of eb430c

b00f9ac

🧹 Partial revert of 761d4b

3019125

🧹 Restored original behavior of save_result folder plugin + added files

6b7064d

As a side effect it sets the file paths of saved files in the Result object.

🧹 Removed unused variable name in yml save_model looping over dict

b42b3d9

🩹 Fixed wrong typing usage of any builtin function

032d50c

🔧👌 Activated mypy for parameter subpackage and fixed typing issues

f7a3e73

jsnel added 2 commits October 5, 2021 20:55

👌 Add annotation to __str__ method

f180dca

♻️ Rename project.dataclasses to project.dataclass_helpers

9ff154d

s-weigand requested changes Oct 10, 2021

joernweissenborn and others added 4 commits October 10, 2021 21:18

Apply suggestions from code review

8b0986c

Co-authored-by: Sebastian Weigand <s.weigand.phy@gmail.com>

Update glotaran/analysis/optimize.py

68aaa4d

Co-authored-by: Sebastian Weigand <s.weigand.phy@gmail.com>

Update glotaran/parameter/parameter_group.py

fbbbf89

Co-authored-by: Sebastian Weigand <s.weigand.phy@gmail.com>

Update glotaran/parameter/parameter_group.py

67a2054

Co-authored-by: Sebastian Weigand <s.weigand.phy@gmail.com>

♻️ Refactor glotaran/builtin/io/yml/test/test_save_model.py to use tm…

533598c

…p_path instead of tmpdir

s-weigand force-pushed the refactor/result_saving branch from b9217b4 to 533598c Compare October 10, 2021 20:58

s-weigand force-pushed the refactor/result_saving branch from 26fb1a3 to 144b4d1 Compare October 10, 2021 22:07

🧹 Renamed 'test_dataclasses.py' to 'test_dataclass_helpers.py'

08beac4

This was referenced Oct 12, 2021

♻️ Rename dataclass_helper functions and arguments with more sensible names #853

Closed

🗑️ Properly deprecate ParameterGroup.to_csv #854

Closed

This was referenced Oct 12, 2021

✨♻️ Extend result reconstruction #856

Closed

🗑️ Deprecate changed scheme file specs #857

Closed

♻️ Refactor Result and Scheme loading to use 'file' fields #858

Closed

🧹 Follow up cleanups for #841 #855

Closed

s-weigand requested review from jsnel and s-weigand October 12, 2021 17:02

s-weigand approved these changes Oct 12, 2021

jsnel approved these changes Oct 12, 2021

jsnel merged commit d1e36a9 into glotaran:main Oct 12, 2021

s-weigand mentioned this pull request Oct 22, 2021

🗑️ Deprecate ParameterGroup.to_csv #879

Merged

3 tasks

This was referenced Apr 2, 2022

✨ Add Model.to_dict function. #766

Closed

Store model type and irf type as part of saved result object #320

Closed

s-weigand mentioned this pull request Jun 6, 2022

🚧 Decide on pyglotaran and pyglotaran-extras version coupling glotaran/pyglotaran-extras#84

Closed
