-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved access to AGASC supplement #56
Conversation
I know I'm the most pedantic of us, but do we need/want a way to identify which version of the supplement was used? And would that be just a date key or something? |
I guess the issue is that the AGASC supplement is not versioned and it is not possible to roll back to an arbitrary version even if you have a version / date. So right now we basically have "use the latest/greatest supplement", or "don't use the supplement at all for reproducibility". I hear you that this might be a nice-to-have, but it doesn't quite fit into the current data model we've established. Unless there is something I'm not considering. |
It occurred to me in the meantime that we could probably just put the whole file into a git repo and make a commit and tag at the point of each "release" (promotion). The version can be stored in the file in a HDF5 keyword. I'd be happy enough with that and the |
I think it is good, in principle, to be able to somehow keep track of the supplement version. I would not recommend adding the supplement itself to version control unless we change the format. Currently, the file is about 1.5 GB, and we intend to modify it once a week. If git keeps a copy of each version, that is 75 GB in a year. And this gets downloaded when you clone the repository. If we still want to keep the format and add it to git, one option could be to use |
I'm showing the |
Also, that repo would not be part of the |
Right, I don't have a huge justification for the additional work/tracking version, but I was disturbed at the thought of not being able to easily reproduce star selection. I had been thinking of either a "save last week's file" method or just tracking of date so one could regenerate a supplement as it would have existed at a date. The git repo of the file with tags for weekly releases sounds better. I note that I can't think of a reason to distribute the repo (unlike, say, the new thermal models version control with git repo as the sync'd data product). Which I mention not because it was suggested... just talking this out in the comments. |
ok, so it seems we will use a separate repository for versioning the supplement. If so, then we can use the commit SHA or the version returned by ska_helpers, but then the update scripts should include committing to the repository. |
I think I have addressed all the comments now, and also made the main documentation page a bit better. |
Description
This adds a new function
agasc.get_supplement_table()
which should be the primary external way to access the AGASC supplement. With this it is possible to get any of the 3 tables as either aTable
or a keyed dict.It also fixes a bug that showed up in functional testing with
proseco
, namely that theproseco_agasc_1p7.h5
file doesn't have aMAG_CATID
so the update code was failing due to a key error.The function is cached with a one-hour timeout, so a long-lived process will get any updates that occur in the meantime.
Testing
Testing used the current
agasc_supplement.h5
file from/proj/sot/ska/jgonzalez/miniconda3/envs/agasc/data/agasc
(if I got that path right from memory), except that I added one entry to theobs
table.Functional testing
Also used this with
proseco
via and sparkles via sot/proseco#345 andsparkles
via sot/sparkles#149.