Annotate epiweek as color and filter option #703

Merged: huddlej merged 4 commits into master from epiweeks on Aug 19, 2021

Conversation

@huddlej (Contributor) commented Aug 5, 2021

Description of proposed changes

Although we plan to eventually support custom start and end date values through Auspice's filter interface, this commit adds some similar information that is also relevant to epidemiologists in the form of CDC-style epiweeks. We add a new script to calculate epiweeks using the Python library of the same name and produce a node data JSON representation of those values.
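
As a rough illustration of the calculation described above, here is a minimal sketch that derives CDC (MMWR) epiweeks from collection dates and wraps them in a node data JSON structure. It is not the script added in this PR: it uses only the standard library instead of the "epiweeks" package, and the function names, the input mapping, and the strain names are hypothetical. It assumes that weeks start on Sunday and that week 1 is the week containing January 4.

```python
import json
from datetime import date, timedelta


def cdc_week1_start(year):
    """Return the Sunday that starts CDC (MMWR) week 1 of `year`."""
    jan4 = date(year, 1, 4)
    # weekday() is Monday=0 ... Sunday=6, so this counts days since Sunday.
    return jan4 - timedelta(days=(jan4.weekday() + 1) % 7)


def cdc_epiweek(d):
    """Return the CDC epiweek of date `d` as a YYYYWW string."""
    # Dates near New Year can belong to the next or the previous epi year,
    # so try the candidate years from latest to earliest.
    for year in (d.year + 1, d.year, d.year - 1):
        start = cdc_week1_start(year)
        if d >= start:
            week = (d - start).days // 7 + 1
            return f"{year}{week:02d}"


def epiweek_node_data(dates_by_strain):
    """Build a node data structure from a {strain: "YYYY-MM-DD"} mapping.

    Dates that cannot be parsed (for example ambiguous dates such as
    "2021-08-XX") are skipped in this sketch.
    """
    nodes = {}
    for strain, value in dates_by_strain.items():
        try:
            collected = date.fromisoformat(value)
        except ValueError:
            continue
        nodes[strain] = {"epiweek": cdc_epiweek(collected)}
    return {"nodes": nodes}


if __name__ == "__main__":
    # Hypothetical strain names, for illustration only.
    example = {"strain_A": "2021-08-05", "strain_B": "2021-08-XX"}
    print(json.dumps(epiweek_node_data(example), indent=2))
```

Skipping ambiguous dates is a simplification made for this sketch; a real workflow may prefer to handle them differently.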

Note that while this commit updates the Conda environment file to represent the new dependency on the "epiweeks" package, we would also need to include this library in the Docker base image, if we end up merging this code.

We would also need to update all auspice configs to include "epiweek" as a color-by and filter. This commit updates only the default auspice config.

The resulting annotations (for the Nextstrain CI build) look like this:

[Screenshot of the resulting epiweek annotations]

I think we need to use the fancy new Auspice legends syntax for this type of annotation that is both numerical and not a floating point value.

Testing

  • CI

Release checklist

If this pull request introduces backward incompatible changes, complete the following steps for a new release of the workflow:

  • Determine the version number for the new release by incrementing the most recent release -> "v8".
  • Update docs/change_log.md in this pull request to document these changes and the new version number.
  • After merging, create a new GitHub release with the new version number as the tag and release title.

Additional context

Thank you to @BryanTegomoh for requesting this feature and @josephfauver for implementing the original version in his builds.

@huddlej huddlej marked this pull request as ready for review August 13, 2021 00:32

@jameshadfield (Member) commented Aug 16, 2021

Thanks @huddlej - this looks good. I expanded the `finalize` rule to dynamically create a categorical legend of epiweeks spanning the entire observed range. As a result, Auspice generates a colour scale that maps evenly to time, giving a Nextstrain 🌈 colour scale across the observed time range. This looks smooth for our main nCoV runs, and the smaller CI run is a good example of how the colours represent time, rather than just being sequential for the observed data.

[Two screenshots of the resulting epiweek colour scale]

notes

  • There are other ways to achieve this, e.g. using a continuous scale and displaying only certain legend entries. However, I wanted to leave open a path to using ISO epi weeks in the future, which can't be interpreted as numerical values.
  • We don't need to update the auspice_config.json files, as we are providing the epiweek data as a node-data JSON, which will be automatically exported regardless of which color-bys are in the config JSON. The only exceptions are if we want epiweeks to show up in the filtering in the page footer (they'll always be in the sidebar dropdown) or to have a nicely formatted title.
  • If there are more than 30 epiweeks in the observed timeframe we reuse colour hexes. I don't think this is problematic, but we could interpolate between them if that's important.
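
The approach in these notes can be sketched roughly as follows: list every CDC epiweek between the earliest and latest observed values, then assign each one a colour in order. This is an illustration under stated assumptions, not the code added to the `finalize` rule. The function names are hypothetical, reuse of colour hexes is assumed to be a simple cycle through the palette, and the output is assumed to be a list of [value, hex] pairs.

```python
from datetime import date, timedelta


def cdc_week1_start(year):
    """Return the Sunday that starts CDC (MMWR) week 1 of `year`."""
    jan4 = date(year, 1, 4)
    # weekday() is Monday=0 ... Sunday=6, so this counts days since Sunday.
    return jan4 - timedelta(days=(jan4.weekday() + 1) % 7)


def weeks_in_year(year):
    """Return the number of CDC epiweeks (52 or 53) in `year`."""
    return (cdc_week1_start(year + 1) - cdc_week1_start(year)).days // 7


def epiweek_range(first, last):
    """List every YYYYWW epiweek from `first` to `last`, inclusive."""
    year, week = int(first[:4]), int(first[4:])
    end = (int(last[:4]), int(last[4:]))
    weeks = []
    while (year, week) <= end:
        weeks.append(f"{year}{week:02d}")
        week += 1
        if week > weeks_in_year(year):
            year, week = year + 1, 1
    return weeks


def build_epiweek_scale(observed, palette):
    """Pair each epiweek in the observed range with a colour hex.

    Weeks with no samples still get an entry, so colours track time rather
    than the order of the observed values. When there are more weeks than
    colours, this sketch cycles through the palette again (an assumption).
    """
    weeks = epiweek_range(min(observed), max(observed))
    return [[week, palette[i % len(palette)]] for i, week in enumerate(weeks)]


if __name__ == "__main__":
    # A deliberately tiny, hypothetical palette to show colour reuse.
    palette = ["#4042C7", "#5DA8A3", "#E29D39"]
    for entry in build_epiweek_scale(["202052", "202102"], palette):
        print(entry)
```

Because the range is enumerated rather than taken from the data, a dataset with gaps still gets a legend entry for each missing week.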

@cassiawag (Collaborator) commented

I think this looks fantastic with @jameshadfield's edits! I also agree that reusing colors after 30 weeks is no problem. You could potentially choose even fewer colors (20 or so): as is, the colors are distinguishable, but barely, though I'm not sure how much this would be improved by 10 fewer colors.

@huddlej (Contributor Author) commented Aug 19, 2021

Thank you for the improved color scale, @jameshadfield and for the review, @cassiawag! The annotations look much better now.

I'm going to merge this and release it as a major version, since it does require users to update their software environment to include the epiweeks package.

huddlej and others added 4 commits August 19, 2021 14:59
Although we plan to eventually support custom start and end date values
through Auspice's filter interface, this commit adds some similar
information that is also relevant to epidemiologists in the form of
CDC-style epiweeks. We add a new script to calculate epiweeks using the
Python library of the same name and produce a node data JSON
representation of those values.

Note that while this commit updates the Conda environment file to
represent the new dependency on the "epiweeks" package, we would also
need to include this library in the Docker base image, if we end up
merging this code.

We would also need to update all auspice configs to include "epiweek" as
a color-by and filter. This commit updates only the default auspice
config.

This expands our `finalize` rule to dynamically generate a scale for
datasets which use a coloring for (CDC) epiweek. The resulting dataset
coloring will have an ordered legend spanning the observed time frame.

Since we're already forcing an update to the conda environment with
`--use-conda`, use the latest Augur and save some memory during augur
filter steps.

We require new software in the user's workflow environment, so we should
at least note this with a major release and provide instructions on how
to update their software.

@huddlej huddlej merged commit 92202ab into master Aug 19, 2021
@huddlej huddlej deleted the epiweeks branch August 19, 2021 22:04