Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add order statistics checks #72

Merged
merged 15 commits into from
Feb 12, 2021
Merged

Conversation

mj-will
Copy link
Contributor

@mj-will mj-will commented Jun 11, 2020

Add the nested sampling cross-checks using order statistics that are described in: https://arxiv.org/abs/2006.03371

The test checks if the distribution of indices of each newly inserted live point with respect to the existing live points is uniform using the likelihood as the ordering criteria and produces a KS-statistic and p-value.

@pep8speaks
Copy link

pep8speaks commented Jun 11, 2020

Hello @mj-will! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 61:80: E501 line too long (82 > 79 characters)
Line 72:80: E501 line too long (82 > 79 characters)
Line 258:23: E221 multiple spaces before operator
Line 331:56: E701 multiple statements on one line (colon)
Line 331:80: E501 line too long (132 > 79 characters)
Line 336:80: E501 line too long (80 > 79 characters)
Line 350:80: E501 line too long (92 > 79 characters)
Line 362:19: E128 continuation line under-indented for visual indent
Line 362:80: E501 line too long (98 > 79 characters)
Line 392:80: E501 line too long (89 > 79 characters)
Line 395:80: E501 line too long (87 > 79 characters)
Line 401:29: E231 missing whitespace after ','
Line 418:35: E231 missing whitespace after ','
Line 418:50: E231 missing whitespace after ','
Line 418:61: E231 missing whitespace after ','
Line 418:80: E501 line too long (127 > 79 characters)
Line 421:80: E501 line too long (104 > 79 characters)
Line 423:42: E225 missing whitespace around operator
Line 423:70: E225 missing whitespace around operator

Line 292:80: E501 line too long (81 > 79 characters)
Line 531:80: E501 line too long (117 > 79 characters)

Line 90:7: E221 multiple spaces before operator
Line 93:17: E127 continuation line over-indented for visual indent

Comment last updated at 2021-02-11 15:14:02 UTC

@mj-will
Copy link
Contributor Author

mj-will commented Jun 11, 2020

This is currently missing the Bonferroni correction for multiple testing that is used in the paper to combine the rolling p-values

@johnveitch
Copy link
Owner

@mj-will Is this ready to merge? I see that you were changing some internal arrays of samples - is that necessary?

@mj-will
Copy link
Contributor Author

mj-will commented Oct 13, 2020

@johnveitch, I'll need to take another look at this. I had to change things to avoid sorting the samples every time a new live point is inserted. It's correctly implemented in FlowProposal but there a small differences here because of the multiple threads.

@johnveitch
Copy link
Owner

@mj-will Could you take a look at the conflicts? They don't seem too difficult to solve and it'd be good to get this merged in.

@mj-will
Copy link
Contributor Author

mj-will commented Jan 20, 2021

@johnveitch sure thing. Other than the conflicts I think there was an issue with how it computed the indices when using more than 1 thread. I'll try to look into it over the next week.

@mj-will
Copy link
Contributor Author

mj-will commented Jan 21, 2021

There are few things left that I think need to be done before merging this:

  • Verify that the KS is correct for the distribution of live points and returns low p-values when distribution is not uniform
  • Agree on maximum number of bins for indices plot
  • Compare output from before and after changes
  • Choose how behaviour relates to verbosity

Let me know if anything else comes to mind.

@mj-will
Copy link
Contributor Author

mj-will commented Feb 11, 2021

@johnveitch I think this is ready to merge, excluding any changes you'd like made. I've compared results between the tests before and after these changes and I can't see any obvious issues but it's probably worth double-checking.

There are two points in my previous comment that could easily be changed:

  • the maximum number of bins for the indices histogram is 30
  • the plotting triggers with verbose >= 2 and the KS test is computed for all values of verbose

Let me know what you think.

@mj-will mj-will changed the title WIP: Add order statistics checks Add order statistics checks Feb 11, 2021
Copy link
Collaborator

@wdpozzo wdpozzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mj-will I had a look at your changes and they seem ok to me. I have a question:
how much does it help not having to sort the live points at each step?
Also, while i think this can be merged, I will bounce this to @johnveitch for the final approval.

Copy link
Owner

@johnveitch johnveitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@johnveitch johnveitch merged commit b9028a0 into johnveitch:master Feb 12, 2021
@mj-will
Copy link
Contributor Author

mj-will commented Feb 12, 2021

Hi @mj-will I had a look at your changes and they seem ok to me. I have a question:
how much does it help not having to sort the live points at each step?

Hi @wdpozzo, when running the scripts in tests the difference seems to be minimal. I think they're too short to really notice anything. I also compared results using rosenbrock.py and there appeared to be roughly a 10% improvement but there's also quite a lot of variability between runs, so it's hard to say for sure.

@wdpozzo
Copy link
Collaborator

wdpozzo commented Feb 12, 2021

Hi @mj-will, thank you. I am going to port your PR also to the rework branch, where the live points are managed differently and the increase in performance should become more apparent.

@wdpozzo
Copy link
Collaborator

wdpozzo commented Feb 12, 2021

@mj-will I incorporated your pull request in the rework branch and it seems to give a bit of improvement indeed. I think I got it right, but if you feel like, please check that branch out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants