Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transition to msprime 1.0 demography model #764

Merged
merged 1 commit into from
Mar 5, 2021

Conversation

jeromekelleher
Copy link
Member

This is an initial pass at converting the codebase to use the msprime 1.0 Demography model, rather than the msprime 0.x model as we currently do.

This will be a messy change, but I think it's something we can do incrementally, which will also move us closer to demes (which will be the future basis for demography).

The first pass here is to just remove the model comparison code we have here, which is using internal msprime APIs, and use msprime's APIs instead.

My proposal for the transition would be the following:

  1. Remove the population_configurations, demographic_events and migration_matrix replace with a demography object. (May get rid of the populations list as well, since this is redundant, but not in the first pass)
  2. Go through all the catalog files and call Demography.from_old_style on all the places we create a model object.

This much should work, and we should be able to merge.

Once that's working, we can then start tweaking the model definitions to make them more idiomatic/more demes-like. Ultimately, I'd like to make some demes-like Demographys for the stuff in the catalog, where we use population splits etc rather than mass migrations. We then compare them with the qc models using the is_equivalent() method, which is able to take into account populations being reused, via a mapping of equivalent populations epoch-by-epoch. That way, we should be able to keep the QC models as they are, but incrementally modernise the catalog models.

Upgrading from the msprime demes-like Demography model should be straightforward enough then at a later date.

What do you think @grahamgower?

@grahamgower
Copy link
Member

Yeah, that sounds like a good transition plan. Of course, the devil will be in the details. 😈

@jeromekelleher jeromekelleher changed the base branch from msprime-1.0 to main March 3, 2021 16:53
@jeromekelleher jeromekelleher force-pushed the msprime-1-demog branch 3 times, most recently from 49af25d to eb95ffe Compare March 4, 2021 14:25
@jeromekelleher
Copy link
Member Author

@grahamgower, it'd be good to get your eyes on this at this point.

I've got stuff mostly working - the CLI is working with the msprime engine, and I think we can work through all the rest of the test failures easily enough.

The basic idea is to basically get rid of all the code we have for modelling populations/demography etc and to push this (first) out to msprime and then later to Demes. If people have strayed off the simple "run a simulation" example code and written stuff that depends on the internal structure of the stdpopsim objects, then we'll break their code. But, mostly I think the update shouldn't affect people.

The next step here would be to make the minimal changes to get the QC and production models agreeing, via is_equivalent(), but I didn't want to get into the weeds here in case I'm on the wrong track entirely.

What do you think?

@grahamgower
Copy link
Member

There's quite a lot of breakage here in the DemographicModel class compared to the documented API (https://stdpopsim.readthedocs.io/en/stable/api.html#stdpopsim.DemographicModel). I think we can probably minimise this though, and include deprecation warnings? In particular for get_demography_debugger(). And we could provide @propertys for the demographic_events, population_configurations, and migration_matrix attributes.

@jeromekelleher
Copy link
Member Author

There's quite a lot of breakage here in the DemographicModel class compared to the documented API

I guess we can try to keep compatibility with this, but I wonder if we should. Most of the API documentation there was "premature documentation", it should never have been exposed as public in the first place. I think we only put the API documentation in here for models etc for stdpopsim developers - I certainly didn't have a long-term stable API in mind there. In general, we're going to be pretty hidebound if we try to keep full compatibility with all these details, and end up doing an awful lot of extra work, probably for nothing.

Do you think anyone is actually using the DemographicModel API?

@grahamgower
Copy link
Member

Do you think anyone is actually using the DemographicModel API?

That's hard to guess at. Maybe it's worth asking on slack?

@jeromekelleher
Copy link
Member Author

jeromekelleher commented Mar 4, 2021

@petrelharp @andrewkern This is worth taking a look at now. There's still some details to be worked out, but the basic idea is that we get rid of as much of our local code for defining demographic models as we can, and lean on the msprime tools. I've also updated to use msprime 1.0 APIs for the msprime engine, and updated from RecombinationMap to RateMap everywhere.

What do we think of the basic direction? I think we just have to view the old implementation of DemographicModel as an internal implementation detail that wasn't documented/intended for end users. We'll never get through the transition to demes otherwise.

@codecov
Copy link

codecov bot commented Mar 4, 2021

Codecov Report

Merging #764 (5a1ddc9) into main (190b39e) will decrease coverage by 0.06%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #764      +/-   ##
==========================================
- Coverage   99.54%   99.48%   -0.07%     
==========================================
  Files          36       36              
  Lines        2429     2337      -92     
  Branches      303      279      -24     
==========================================
- Hits         2418     2325      -93     
- Misses          5        6       +1     
  Partials        6        6              
Impacted Files Coverage Δ
stdpopsim/catalog/BosTau/__init__.py 100.00% <ø> (ø)
stdpopsim/genomes.py 100.00% <ø> (ø)
stdpopsim/cli.py 98.43% <100.00%> (ø)
stdpopsim/engines.py 100.00% <100.00%> (ø)
stdpopsim/genetic_maps.py 100.00% <100.00%> (ø)
stdpopsim/models.py 98.82% <100.00%> (-1.18%) ⬇️
stdpopsim/qc/AraTha.py 100.00% <100.00%> (ø)
stdpopsim/qc/DroMel.py 100.00% <100.00%> (ø)
stdpopsim/qc/HomSap.py 100.00% <100.00%> (ø)
stdpopsim/qc/PonAbe.py 100.00% <100.00%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 190b39e...5a1ddc9. Read the comment docs.

@petrelharp
Copy link
Contributor

This looks good! I'm not sure I have enough of the big picture to be more specific, though.

@jeromekelleher
Copy link
Member Author

jeromekelleher commented Mar 5, 2021

This is ready for review @grahamgower. I think most things went fairly smoothly - the plan for incrementally modernising the catalog should work, I think. The only thing I'm not sure about is non-sampling pops, which I've commented on elsewhere.

There'll be a batch of stuff to do after this, which I guess is best handled by opening issues. If you see things that are outstanding (like, say, updating the recapitation bit of the SLiM engine) which aren't quick fixes, can you open some issues to track please?

Wrt to the API breakage for the DemographicModel class, I suggest we (a) undocument everything except get_samples() and (b) if anyone complains about breakage we can add stuff back in as it seems necessary.

Developers adding new models should only need to use msprime APIs for now, and later Demes.

We should sort out #714 before the next release.

Copy link
Member

@grahamgower grahamgower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

tests/test_HomSap.py Outdated Show resolved Hide resolved
tests/test_HomSap.py Outdated Show resolved Hide resolved
This is a transitional update, which gets most features working with
msprime 1.0, without starting the process of modernising the catalog
models.

Also updates the DemographicModel class to remove instance variables
that were tied to the old representation and their documentation.

Closes popsim-consortium#748
Closes popsim-consortium#740
Closes popsim-consortium#535
@jeromekelleher
Copy link
Member Author

OK, I think this is ready to go. @grahamgower, I'll let you do the merging honours!

@grahamgower grahamgower merged commit 76bf3bc into popsim-consortium:main Mar 5, 2021
@jeromekelleher jeromekelleher deleted the msprime-1-demog branch March 6, 2021 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants