Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine scope of Augur's Python Development API #1011

Open
victorlin opened this issue Jul 28, 2022 · 3 comments
Open

Refine scope of Augur's Python Development API #1011

victorlin opened this issue Jul 28, 2022 · 3 comments
Assignees
Labels
proposal Proposals that warrant further discussion

Comments

@victorlin
Copy link
Member

victorlin commented Jul 28, 2022

Context

The doc page on Python Development API converts the docstrings of each class/function.

There has also been some ongoing efforts to modularize large files in this project. Some examples:

I was just made aware of the unintended side effects of these changes on their respective doc pages:

Discussion

Note that in the change to augur.io, the docs are inaccurate because everything is still exposed at a higher level (i.e. doc generation could be improved).

However, I think the wide net casted by the publicizing of this API should be retracted a bit so we don't create so many "breaking changes" like this from internal refactoring.

This can be achieved with more underscore-prefixed names like augur.filter._calculate_total_sequences which indicates a "private" function (not actually private, mostly just a naming convention, but at least our docs parser seems to follow this convention since augur.filter._calculate_total_sequences is not shown on the doc page).

https://github.com/nextstrain/augur/labels/request%20for%20comments

  1. Who is are target audience with this public API?
  2. For each subcommand, what do we want to expose in this API?
@victorlin victorlin added the proposal Proposals that warrant further discussion label Jul 28, 2022
@victorlin victorlin self-assigned this Jul 28, 2022
@tsibley
Copy link
Member

tsibley commented Jul 28, 2022

To clarify, I'm not sure it's settled that the "Python Development API" section of docs corresponds to what we consider to be Augur's public Python API. What's in and out is a long-standing open question that we haven't really nailed down. I do agree that ultimately using the docs is a good way to list what's public and what's not, but I'm not sure that care has been taken yet.

Input from @huddlej in particular would be good as he's in the past had many thoughts/suggestions here from his own perspective of both developer and user.

@victorlin
Copy link
Member Author

victorlin commented Oct 20, 2022

We just discussed this over a call. Action items:

  1. Reword the Python Development API page to make it clear that these are intended for developers, not users importing Augur in their scripts.
  2. Create a new page titled Python Public API that generates docs for a list of functions intended to be imported and used by users. This list of functions should be defined by the docs project.
  3. Start with just io.read_metadata and io.read_sequences.
  4. Search for other external imports of Augur (e.g. GitHub search org:nextstrain read_node_data, GitHub code search) and decide what to do with each function individually. Options:
    1. Add it to the list of functions to be included in the Public API
    2. Move it to a better location (e.g. augur.utils.read_node_data should be augur.io.read_node_data) then add it to the Public API
    3. Note to not include it in the Public API. (I don't think this should be done unless there is a good reason, since there must already be a good reason for it being used in the first place.)

@tsibley
Copy link
Member

tsibley commented Oct 20, 2022

I used another GitHub search to look more widely and then quickly tallied up the results:

symbol count percent histogram
augur.__version__.__version__ 2 1.44 **
augur.align.read_sequences 2 1.44 **
augur.align.run 4 2.88 *****
augur.ancestral.collect_mutations_and_sequences 1 0.72 *
augur.clades.is_node_in_clade 4 2.88 *****
augur.clades.read_in_clade_definitions 4 2.88 *****
augur.dates.get_numerical_dates 1 0.72 *
augur.distance.read_distance_map 1 0.72 *
augur.export_v2.parse_node_data_and_metadata 1 0.72 *
augur.frequences.TreeKdeFrequencies 1 0.72 *
augur.frequency_estimators.TreeKdeFrequencies 2 1.44 **
augur.frequency_estimators.float_to_datestring 1 0.72 *
augur.frequency_estimators.logit_transform 1 0.72 *
augur.io.open_file 7 5.04 ********
augur.io.read_metadata 6 4.32 *******
augur.io.read_sequences 6 4.32 *******
augur.io.write_sequences 3 2.16 ***
augur.lbi.select_nodes_in_season 1 0.72 *
augur.parse.forbidden_characters 1 0.72 *
augur.titer_model.TiterCollection 1 0.72 *
augur.translate.safe_translate 5 3.60 ******
augur.translate.translate_feature 1 0.72 *
augur.utils.get_numerical_dates 4 2.88 *****
augur.utils.json_to_tree 3 2.16 ***
augur.utils.load_feature 1 0.72 *
augur.utils.load_features 7 5.04 ********
augur.utils.numeric_date 2 1.44 **
augur.utils.read_lat_longs 1 0.72 *
augur.utils.read_metadata 8 5.76 **********
augur.utils.read_node_data 6 4.32 *******
augur.utils.read_tree 2 1.44 **
augur.utils.write_json 9 6.47 ***********

Underlying data at https://gist.github.com/tsibley/d2266ca5f9d1fb9948195532795824d7.

This includes our own usage as well as external usage (both novel and in forks-of-our-usage), but excludes usage within Augur itself (naturally).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal Proposals that warrant further discussion
Projects
No open projects
Status: Backlog
Development

No branches or pull requests

2 participants