-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
export v2: Add --metadata-columns option #1384
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1384 +/- ##
==========================================
+ Coverage 66.71% 66.96% +0.24%
==========================================
Files 69 69
Lines 7322 7341 +19
Branches 1799 1804 +5
==========================================
+ Hits 4885 4916 +31
+ Misses 2169 2154 -15
- Partials 268 271 +3 ☔ View full report in Codecov by Sentry. |
Nice one Jover - the overall aim of this work is something I've wanted for a long time. I haven't tested the code, just skimmed through.
Here do you mean visible in the modal when (shift-)clicking on a branch/tip? It'd be good to surface them as sidebar filtering options too, but that'll take an Auspice change I think. My understanding of the linked issue (and from discussions with others) is that datasets are adding unnecessary colorings in order to expose metadata as filtering options. |
Yes, I mean they will only be visible on the modal on click.
Yup, I think this will require changes in Auspice. My understanding is that currently filters must be defined in the Auspice JSON under |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functionality is great! Suggestions below are non-blocking (except changelog request).
6b1dbf6
to
4bf28ec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allows users to specify additional metadata columns to export as node attributes that are not specified as coloring options via the existing --color-by-metadata option or the Auspice config JSON. This allows the metadata columns to be visible in the tree in Auspice without polluting the color-by options.
This behaviour is already possible by specifying the (TSV metadata column names) in the auspice config's filters
array. There are side effects beyond those implemented in this PR, as each key in filters
will be shown as a (collapsed) section in the footer, but the (big) upside is that the metadata is searchable.
I want to change the general behaviour here a bit, but I think we are better to sketch the big direction out first before merging any PRs. I'm envisioning something like the following, but it's not been completely thought through:
- Auspice uses all
node_attrs
as sidebar filtering options (and maybe make branch labels searchable too, but that's complicated). - The
filters
array (in the dataset JSON) is reserved for the items listed in the footer. I see this functionality as less important now that we have sidebar filtering. - We have some way of exporting metadata (TSV) fields without them being colourings. Maybe that's this PR?
But maybe just doing (1) is enough? Thoughts?
augur/export_v2.py
Outdated
@@ -877,6 +887,9 @@ def register_parser(parent_subparsers): | |||
help="delimiters to accept when reading a metadata file. Only one delimiter will be inferred.") | |||
optional_inputs.add_argument('--metadata-id-columns', default=DEFAULT_ID_COLUMNS, nargs="+", | |||
help="names of possible metadata columns containing identifier information, ordered by priority. Only one ID column will be inferred.") | |||
optional_inputs.add_argument('--metadata-columns', nargs="+", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any functionality we support via command line arguments should be available via the auspice-config JSON as well, as the former is a subset of the latter's capabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this flag relates to the metadata input/export but not really Auspice control settings. Not sure this fits in the auspice-config JSON?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it as controlling what data is exported, similar to filters & colours. The (original) intention when we started adding command line arguments for options in the config JSON was for auspice-config to be all you need to control what data the JSON contains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, that makes sense. I'll move up the flag to be part of the config
args group too then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Auspice config schema update and changes to support the new property added in bdb2b2c
I think (1) is definitely needed. I think this will require some design on filtering for continuous value fields? On (2), I think we can even deprecate the footer filters. I definitely only use the sidebar filtering and have not looked at footer filters in a while. I think even if we implement (1), we still need some way to support (3). I think it would be nice to fully detangle exported metadata from coloring options. I would want to be more explicit about what metadata fields are exported instead of letting it stay a side effect of other Auspice configurations. |
Allows users to specify additional metadata columns to export as node attributes that are not specified as coloring options via the existing `--color-by-metadata` option or the Auspice config JSON. This allows the metadata columns to be visible in the tree in Auspice without polluting the color-by options.
Suggested by @victorlin in review.
4bf28ec
to
1b83e78
Compare
Rebased to fix merge conflicts in changelog. |
@jameshadfield since (1) was implemented in nextstrain/auspice#1743, do you still have feedback here? |
My interpretation of the deprecation here was to deprecate the auspice UI, but keep the augur export filters setting and use this as the way to add metadata that won't be a coloring. As in "what data would you like to include for filtering in auspice". But would you prefer to add (If we do add |
I'd prefer to add For me, |
Since the new `--metadata-columns` flag controls what data is available in the final Auspice JSON, it should have a matching option in the Auspice config JSON setting. This also moves the flag to the "DISPLAY CONFIGURATION" group to match the other Auspice config controlled settings. As per the existing pattern, the command line flag overrules what is set in the config file. Based on feedback from @jameshadfield #1384 (comment)
1b83e78
to
32c9740
Compare
I'd vote to please keep the footer filters, if possible. I still use them regularly and find them helpful for quickly picking from a longer list of options (e.g., year-month values for flu, clades, regions, etc.). It's generally been helpful to have a way to see in a single view what the filter fields are and the specific values within each field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't re-reviewed everything but the changes in bdb2b2c look good by inspection
@huddlej No worries, this PR doesn't touch the footer filters! Although this reminded me that the default footer filters will likely only be a subset of available filter fields since it currently does not include the additional metadata columns! Lines 539 to 542 in 4bf72f8
|
Merging this PR as-is, will update the footer filters separately (if desired). |
Fixing changelog update made in #1384. The PR was actually released as part of v24.2.0, not v 24.1.0.
Description of proposed changes
Allows users to specify additional metadata columns to export as
node attributes that are not specified as coloring options via
the existing
--color-by-metadata
option or the Auspice config JSON.This allows the metadata columns to be visible in the tree in Auspice
without polluting the color-by options.
Related issue(s)
Resolves #1383
Checklist