Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added ability to include nodes marked as obsolete #15

Merged
merged 6 commits into from
Sep 27, 2020

Conversation

torstees
Copy link
Contributor

Added parameter to the read_obo function to permit users to keep obsolete nodes in their graphs. Default is such that it shouldn't break any pre-existing client calls.

@torstees
Copy link
Contributor Author

I had to make this change in order to identify some legacy codes in some of our datasets.

Copy link
Owner

@dhimmel dhimmel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the contribution!

one small suggestion, otherwise looks good.

a test would be nice, but this is so simple that I'm pretty confident your implementation will work. So happy to merge without test.

obonet/read.py Outdated
Comment on lines 38 to 39
is_obsolete = ignore_obsolete and term.get('is_obsolete', 'false') == 'true'
if is_obsolete:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps keep is_obsolete = term.get('is_obsolete', 'false') == 'true' but change the next line to:

if ignore_obsolete and is_obsolete:

assert taxrank.node['TAXRANK:0000001']['name'] == 'phylum'
assert 'NCBITaxon:kingdom' in taxrank.node['TAXRANK:0000017']['xref']
# It looks like networkx has changed the name of the node variable to nodes -- EST 2020-09-25
assert taxrank.nodes['TAXRANK:0000001']['name'] == 'phylum'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like neworkx 2.4 removed G.nodes, which had been marked as deprecated in 2.1.

not sure why Travis CI builds are not showing up on GitHub for this pull request, but see the failures at https://travis-ci.org/github/dhimmel/obonet/builds/730420444. We support both networkx 1.x and 2.x.

You could add pytest.importorskip("networkx", minversion="2.0") to this test to skip it on 1.x builds. Removing 1.x support might be something we do soon anyways.

Copy link
Contributor Author

@torstees torstees Sep 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original (taxrank.node) fails with networkx 2.5:

>       assert taxrank.node['TAXRANK:0000001']['name'] == 'phylum'
E       AttributeError: 'MultiDiGraph' object has no attribute 'node'

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original (taxrank.node) fails with networkx 2.5:

yes. we should change to using .node, although this will cause failure for the networkx 1.x builds. Therefore, I was thinking we could just skip the taxrank tests if networkx < 2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taxrank.node fails for me when using networkx 2.5, but I left the failing test as it was originally.

Comment on lines 123 to 133
def test_ignore_obsolete_nodes():
hpo = obonet.read_obo("http://purl.obolibrary.org/obo/hp.obo")
nodes = hpo.nodes(data=True)
assert "HP:0005549" not in nodes

def test_presence_of_obsolete_nodes():
hpo = obonet.read_obo("http://purl.obolibrary.org/obo/hp.obo", ignore_obsolete=False)
nodes = hpo.nodes(data=True)
assert "HP:0005549" in nodes
node = nodes['HP:0005549']
assert node['is_obsolete'] == 'true'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice tests.

hp.obo looks to be 6.7M, so for the purpose of keeping these tests speedy, we might want to test on a simple ontology that we store locally. See examples in tests/data. This also protects us from future test failure if something changes at that URL.

One easy way might be to add the following to tests/data/brenda-subset.obo (source):

[Term]
id: BTO:0000311
name: culture filtrate
comment: Added as synonym to culture fluid.
is_obsolete: true

@@ -8,7 +8,6 @@

directory = os.path.dirname(os.path.abspath(__file__))


def test_read_taxrank_file():
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test function also uses .node and is therefore causing test failures on networkx 2.5. Would you be able to update .node to .nodes here and apply the pytest.importorskip?

Sorry to make you fix things unrelated to the feature enhancement. I can also make these changes on master if they are becoming too burdensome.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. Travis seems to be happy now.

@dhimmel dhimmel merged commit 0cf3b4a into dhimmel:master Sep 27, 2020
@dhimmel
Copy link
Owner

dhimmel commented Sep 27, 2020

Thanks @torstees for this great contribution!

I'm going to enable scheduled CI builds so the next contributor isn't stuck with fixing a bunch of problems that aren't their fault.

Will comment here when I've released a new version to PyPI with these changes.

@dhimmel
Copy link
Owner

dhimmel commented Sep 27, 2020

@torstees merged commit (0cf3b4a) is released in https://pypi.org/project/obonet/0.2.6/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants