
Brushing over notebook #1470

Merged
PaulJonasJost merged 29 commits into develop from brush_over_notebooks on Oct 8, 2024

Conversation

PaulJonasJost (Collaborator):

Mainly:

  • Shortening many optimizations
  • Replacing Boehm with the conversion reaction model where possible
  • Making the thermodynamic example more robust

Hopefully closes #1460 and #1396

PaulJonasJost marked this pull request as draft on September 18, 2024, 10:37
codecov-commenter commented Sep 19, 2024:

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.89%. Comparing base (c5a46d4) to head (db1168a).

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1470      +/-   ##
===========================================
- Coverage    82.92%   82.89%   -0.03%     
===========================================
  Files          163      163              
  Lines        13786    13786              
===========================================
- Hits         11432    11428       -4     
- Misses        2354     2358       +4     


PaulJonasJost marked this pull request as ready for review on September 25, 2024, 15:25
Doresic (Contributor) left a comment:

Mostly looks good! Thanks.

I added a few comments of the same type, about reducing the number of optimization starts and the number of samples for sampling. I think it's good to reduce as much as we can so the notebooks run faster. But I would not reduce so far that the figures look less presentable or nice. If the histograms start being just random bars, there's not much use in showing them. So if you haven't already, check that the figures keep the same character -- especially where there's no random seed controlling the startpoints and how well the model is fit.

doc/example/amici.ipynb (resolved)
doc/example/amici.ipynb (outdated, resolved)
Comment on lines 916 to 1030
"n_starts = 20 # usually a value >= 100 should be used\n",
"n_starts = 10 # usually a value >= 100 should be used\n",
Doresic (Contributor):
I would just check that these 10 starts with the set random seed give a good model fit. It's kinda important to have fast and good sampling and profiling.

Also, is 10 enough for a parameter histogram and scatter plot that look like something nice?
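
(For illustration, a minimal sketch of a seeded, reduced-start run. The Rosenbrock objective is only a placeholder, not from the notebooks, and whether the global NumPy seed fully controls the startpoint sampling can depend on the pypesto version.)

import numpy as np
import pypesto
import pypesto.optimize as optimize

# Placeholder objective, just to make the sketch self-contained.
objective = pypesto.Objective(
    fun=lambda x: 100 * (x[1] - x[0] ** 2) ** 2 + (1 - x[0]) ** 2
)
problem = pypesto.Problem(objective=objective, lb=[-5, -5], ub=[5, 5])

np.random.seed(1)  # fix the randomly sampled startpoints across reruns
result = optimize.minimize(problem=problem, n_starts=10)
# Inspect result.optimize_result to check that the best objective values agree.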

PaulJonasJost (Collaborator, Author):
I think the problem here would be more that we find the optima veeery well, and hence the correlation plot looks hideous. Do you think we should switch back to Boehm? The histogram looks so-so.

Doresic (Contributor):
Yeah, if it makes the notebook, the figures, and the conclusions drawn from them make more sense, I would go for that rather than having it be a bit faster.

doc/example/conversion_reaction.ipynb (resolved)
doc/example/conversion_reaction.ipynb (resolved)
doc/example/custom_objective_function.ipynb (outdated, resolved)
Comment on lines 221 to 241
"outputs": [
{
"data": {
"text/plain": [
"[0.0,\n",
" 0.0,\n",
" 0.0,\n",
" 0.0,\n",
" 0.0,\n",
" 0.0,\n",
" 3.056285312137946e-36,\n",
" 3.76158192263132e-36,\n",
" 6.018531076210112e-36,\n",
" 9.780112998841433e-36]"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
Doresic (Contributor):
Remove all outputs from the notebook
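
(As an aside, a minimal sketch of one way to strip all outputs programmatically, assuming nbformat is installed; the notebook path is a placeholder.)

import nbformat

path = "notebook.ipynb"  # placeholder path
nb = nbformat.read(path, as_version=4)
for cell in nb.cells:
    if cell.cell_type == "code":
        cell.outputs = []  # drop stored outputs
        cell.execution_count = None
nbformat.write(nb, path)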

dilpath (Member) left a comment:
Fine for me, thanks!

> I would not reduce so low that the figures look less presentable or nice.

re: this balance between CI computation time and sufficient computation for nice figures, we could use an environment variable (e.g. CI and GITHUB_ACTIONS are always set to true during GitHub Actions CI) to determine the computation effort in each notebook. It could look like this at the top of each notebook:

import os

# Defaults for local runs.
n_starts = 100
n_samples = 10000

# Reduce the computational effort on CI (GitHub Actions sets CI=true).
if os.environ.get("CI") is not None:
    n_starts = 10
    n_samples = 1000

doc/example/amici.ipynb (resolved)
doc/example/getting_started.ipynb (outdated, resolved)
PaulJonasJost and others added 2 commits October 1, 2024 16:46
Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com>
Doresic (Contributor) left a comment:
Looks good! Thanks

Doresic commented Oct 8, 2024:

Ohh, I just remembered.
At the retreat we talked about visualization again, and Jan again mentioned how good it is to know the order in which one should look at the visualizations: First, it's important to see that the model actually fits the data; if it doesn't, the waterfall and parameter plots are not important. Only once the data is somewhat fit is it worth looking at the waterfall plot to check convergence, and at the parameter plot for a quick, less informative indication of identifiability and of whether the bounds are good enough (actually, changing the parameter bounds might be necessary to fit the data in the first place). Only then does it make sense to go into profiling and sampling -- to get more accurate insight into parameter identifiability.

It would be nice if this were implemented in the getting_started notebook as a rough guide to estimating models. This might take some restructuring of the notebook, so I'm also OK with doing it myself in a separate PR. I just wanted to mention it here so it's not forgotten :D
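
(For reference, a minimal sketch of that viewing order using pypesto's plotting helpers. The Rosenbrock problem is only a placeholder standing in for a real model, so there is no data-fit plot here; with real data that check would come first. Exact function availability may vary between pypesto versions.)

import pypesto
import pypesto.optimize as optimize
import pypesto.profile as profile
import pypesto.sample as sample
import pypesto.visualize as vis

# Placeholder problem (Rosenbrock), just to make the sketch runnable.
objective = pypesto.Objective(
    fun=lambda x: 100 * (x[1] - x[0] ** 2) ** 2 + (1 - x[0]) ** 2
)
problem = pypesto.Problem(objective=objective, lb=[-5, -5], ub=[5, 5])

# 0. Fit the model; with real data, first plot simulation vs. measurements.
result = optimize.minimize(problem=problem, n_starts=10)

# 1. Waterfall plot: check optimizer convergence.
vis.waterfall(result)
# 2. Parameter plot: quick look at identifiability and bound placement.
vis.parameters(result)
# 3. Only then: profiling and sampling for more accurate identifiability.
result = profile.parameter_profile(
    problem=problem, result=result, optimizer=optimize.ScipyOptimizer()
)
vis.profiles(result)
result = sample.sample(problem=problem, n_samples=1000, result=result)
vis.sampling_1d_marginals(result)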

PaulJonasJost merged commit b2cde8c into develop on Oct 8, 2024
18 checks passed
PaulJonasJost deleted the brush_over_notebooks branch on October 8, 2024, 15:45