Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarification about ID type in the docs for Table.update_ids #879

Merged
merged 7 commits into from
Oct 31, 2022

Conversation

fedarko
Copy link
Contributor

@fedarko fedarko commented Apr 20, 2022

I'm not sure if non-string IDs were ever officially supported for BIOM tables, but some of the Empress test code (these lines, in particular) involved setting IDs to integers. It looks like this commit broke this behavior, and by extension this particular Empress test -- it triggered the following error:

ERROR: test_match_inputs_no_tips_in_table (python.test_tools.TestTools)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/empress/empress/tests/python/test_tools.py", line 143, in test_match_inputs_no_tips_in_table
    bad_table.update_ids({i: idx for idx, i in
  File "/usr/share/miniconda/envs/empress/lib/python3.9/site-packages/biom/table.py", line 1405, in update_ids
    str_dtype = 'U%d' % max([len(v) for v in id_map.values()])
  File "/usr/share/miniconda/envs/empress/lib/python3.9/site-packages/biom/table.py", line 1405, in <listcomp>
    str_dtype = 'U%d' % max([len(v) for v in id_map.values()])
TypeError: object of type 'int' has no len()

This PR documents explicitly that both keys and values for the id_map parameter of Table.update_ids() should be strings. I haven't added any validation code that checks this (I'm not sure if the added runtime incurred by such a check would be desirable, and this seems like it'd be a pretty weird thing for users to try to do normally...?), but hopefully this clarifies things for future devs who encounter this problem on updating to newer versions of biom-format.

This PR also includes two small grammar fixes.

Thanks!

Also fixed a grammar issue (lets --> let's).
@fedarko
Copy link
Contributor Author

fedarko commented Apr 20, 2022

It looks like the build is failing for this -- since this PR only contains docs fixes, these failures are likely due to dependencies shifting. The build seems to be failing when attempting to install anndata:

Collecting anndata
  Downloading anndata-0.7.8-py3-none-any.whl (91 kB)
Requirement already satisfied: numpy>=1.16.5 in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from anndata) (1.19.5)
Requirement already satisfied: packaging>=20 in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from anndata) (21.3)
Requirement already satisfied: h5py in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from anndata) (2.10.0)
Requirement already satisfied: scipy>1.4 in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from anndata) (1.5.3)
Requirement already satisfied: pandas>=1.1.1 in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from anndata) (1.1.5)
Collecting natsort
  Downloading natsort-8.1.0-py3-none-any.whl (37 kB)
Requirement already satisfied: importlib_metadata>=0.7; python_version < "3.8" in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from anndata) (3.10.1)
Collecting xlrd<2.0
  Downloading xlrd-1.2.0-py2.py3-none-any.whl (103 kB)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from packaging>=20->anndata) (3.0.8)
Requirement already satisfied: six in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from h5py->anndata) (1.16.0)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from pandas>=1.1.1->anndata) (2.8.2)
Requirement already satisfied: pytz>=2017.2 in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from pandas>=1.1.1->anndata) (2022.1)
Requirement already satisfied: typing-extensions>=3.6.4; python_version < "3.8" in /usr/share/miniconda/envs/env_name/lib/python3.6/site-packages (from importlib_metadata>=0.7; python_version < "3.8"->anndata) (4.2.0)
ERROR: Package 'typing-extensions' requires a different Python: 3.6.15 not in '>=3.7'
Error: Process completed with exit code 1.

... I'm not sure how to fix this! It seems like anndata (or at least its dependencies) might no longer be compatible with Python 3.6?

@wasade
Copy link
Member

wasade commented Apr 20, 2022

That is unusual, would it be possible to add "anndata" to the conda install in gh actions? It's present in the bioconda channel. I wonder if pip and conda are fighting

@fedarko
Copy link
Contributor Author

fedarko commented Apr 20, 2022

@wasade awesome, installing anndata via conda instead of via pip fixed the build! For now, the CI installs anndata from conda-forge instead of bioconda -- anndata is available on both of these channels (conda-forge, bioconda), but it looks like only the conda-forge version is actively maintained.

I also went ahead and updated the README a bit, similar to the changes made for the iow PR.

@peterjc
Copy link
Contributor

peterjc commented Oct 31, 2022

Confirming anndata was at 0.6.22.post1 when it was dropped from bioconda in bioconda/bioconda-recipes#17783 (bulk change in Oct 2019) so using conda-forge makes sense. That package lives at https://github.com/conda-forge/anndata-feedstock/

@wasade
Copy link
Member

wasade commented Oct 31, 2022

Thanks for the bump on this. Is this PR good to go? I was out all summer on leave and missed the follow up here

@fedarko
Copy link
Contributor Author

fedarko commented Oct 31, 2022

The PR should be good to go. Thanks!

@wasade wasade merged commit 5bf61d5 into biocore:master Oct 31, 2022
@wasade
Copy link
Member

wasade commented Oct 31, 2022

Thanks!

@fedarko
Copy link
Contributor Author

fedarko commented Oct 31, 2022

Just as a heads up, it looks like the Python 3.6 build began failing in recent months (although the Python 3.10 build is fine). As far as I can tell, this is due to anndata 0.8.0 bumping its minimum Python version to 3.7 (source)—biom doesn't set a pin on the anndata version used.

I guess the two obvious ways to handle this issue would be 1) dropping biom's support for Python 3.6 also, or 2) pinning its anndata dependency to be < 0.8.0.

@wasade
Copy link
Member

wasade commented Oct 31, 2022

I don't know the implications of pinning anndata. Would that be bad?

@fedarko
Copy link
Contributor Author

fedarko commented Oct 31, 2022

It seems to be an explicitly optional dependency that's just used in the to_anndata() function, so I don't think pinning anndata would be a big deal. But I guess it could cause problems for users down the line if, for example, later versions of anndata lose backwards compatibility with the outputs of anndata < 0.8.0.

Looking through the code some more, a better fix might just be removing the anndata installation from the Python 3.6 build—the test suite should automatically detect if it isn't installed and skip the relevant tests. If we do this (while presumably updating the docs somewhere to mention that using anndata >= 0.8.0 implies a minimum Python version of 3.7), then we get the best of both worlds: there's no anndata pin required, and we can still support Python 3.6 (albeit without anndata >= 0.8.0).

Ultimately not my call, though :)

@peterjc
Copy link
Contributor

peterjc commented Nov 1, 2022

Python 3.6 is EOL, so at some point you'll be dropping it anyway: https://peps.python.org/pep-0494/

@wasade wasade mentioned this pull request Nov 1, 2022
@wasade
Copy link
Member

wasade commented Nov 1, 2022

We can account for that on the next release, but I do not have a timeline at the moment for that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants