Backing out of requesting alt_identifiers every time. #523

msdemlei · 2024-02-12T13:38:46Z

This is supposed to address bug #522. The code as it was up to here would have needed aggregation of alt_identifiers (which are n:1 over resources), or else we see duplicate capabilities.

But at least some registry operators prefer to not hit the rr.alt_identifier table by default as long as it's not clear who will actually look at these alternate identifiers.

But we maintain the alt identifiers in describe(); to do that, there's now get_alternate_identifiers method returning these.

The downside: describe() now does an uncached network query. Perhaps we want to at least hide failures from there? On the other hand, once we are here we can also call get_contact() here; should we?

codecov · 2024-02-12T13:40:37Z

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (3a24ac2) 80.33% compared to head (5609504) 79.71%.
Report is 2 commits behind head on main.

Files	Patch %	Lines
pyvo/registry/regtap.py	16.66%	5 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #523      +/-   ##
==========================================
- Coverage   80.33%   79.71%   -0.62%     
==========================================
  Files          52       52              
  Lines        6177     6178       +1     
==========================================
- Hits         4962     4925      -37     
- Misses       1215     1253      +38

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

bsipocz

Could you please add a changelog entry?

And unless @andamian or @tomdonaldson wants to have a look at this we can merge it.

This is supposed to address bug astropy#522. The code as it was up to here would have needed aggregation of alt_identifiers (which are n:1 over resources), or else we see duplicate capabilities. But at least some registry operators prefer to not hit the rr.alt_identifier table by default as long as it's not clear who will actually look at these alternate identifiers. But we maintain the alt identifiers in describe(); to do that, there's now get_alternate_identifiers method returning these. The downside: describe() now does an uncached network query. Perhaps we want to at least hide failures from there? On the other hand, once we are here we can also call get_contact() here; should we?

msdemlei · 2024-02-15T07:55:00Z

On Thu, Feb 15, 2024 at 06:32:56AM +0000, Brigitta Sipőcz wrote: @bsipocz commented on this pull request. Could you please add a changelog entry?

Ah, sorry. I had deluded myself into believing that hadn't been in released and then failed to see that even then, I'd have to do something about the changelog. Force-pushed a commit including the changelog.

ManonMarchand · 2024-02-15T08:37:56Z

LGTM, thanks Markus!

About adding get_contact in describe. From a user point of view, I'd use get_contact() only in case of issues/bugs with the service, it's not something I'd like to see all the time.

msdemlei · 2024-02-15T09:11:32Z

So... the CI doesn't like my indentation in

res = get_RegTAP_service().run_sync("""
    SELECT alt_identifier
    FROM rr.alt_identifier
    WHERE ivoid={}""".format(
        rtcons.make_sql_literal(self.ivoid)))

-- which, frankly, I consider severely odd. Yes, this might be part of
my personal preferences, but regardless of how else I try to indent
this, it only becomes uglier (and that includes making this an f-string:
I don't like complicated expressions in {}, and I think using
make_sql_literal as a serialiser would be terribly non-obvious).

So... can we perhaps haggle about dropping another pep8 check?

bsipocz · 2024-02-15T17:58:25Z

Ah, sorry. I had deluded myself into believing that hadn't been in
released and then failed to see that even then, I'd have to do
something about the changelog.

Oh, that may be my bad, I haven't remembered this.

bsipocz · 2024-02-15T17:59:13Z

So... the CI doesn't like my indentation in

Let me see what it does when removing the linebreak in format. If it still complains then I'll go ahead and add this check to the ignore list.

bsipocz · 2024-02-15T18:03:39Z

@msdemlei, how about this? I find this one more readable than your version with the nested indents, and apparently, the linter likes it too.
(I personally like f-strings a lot, but also understand if you prefer not to use one in this one).

        res = get_RegTAP_service().run_sync("""
            SELECT alt_identifier
            FROM rr.alt_identifier
            WHERE ivoid={}""".format(rtcons.make_sql_literal(self.ivoid)))

msdemlei · 2024-02-16T07:07:35Z

On Thu, Feb 15, 2024 at 10:03:52AM -0800, Brigitta Sipőcz wrote: @msdemlei, how about this? I find this one more readable than your version with the nested indents, and apparently, the linter likes it too. (I personally like f-strings a lot, but also understand if you prefer not to use one in this one). ``` res = get_RegTAP_service().run_sync(""" SELECT alt_identifier FROM rr.alt_identifier WHERE ivoid={}""".format(rtcons.make_sql_literal(self.ivoid))) ```

Readability necessarily is a matter of taste. Here, I frankly can't really see the rationale of the whitespace checker, and I'd slightly prefer if we could avoid mixing nontrivial SQL and Python on one line. But then... I think people can easily figure out what's going on in either version, and the lines are shorter than 80 characters (which I do care about; longer lines *really* spoil a lot of things in my workflows). So, I'm happy with this. Thanks!

bsipocz · 2024-02-16T07:09:56Z

longer lines really spoil a lot of things in my workflows

that is good to know about, I'll try to remember to be more strict in my codes for long lines.

pyvo/registry/regtap.py

bsipocz · 2024-02-16T07:12:50Z

Once we get this one and #416 in, I'm happy to tag a bugfix release. What do you think?

bsipocz · 2024-02-16T07:19:50Z

thank you @msdemlei!

msdemlei · 2024-02-16T16:05:36Z

On Thu, Feb 15, 2024 at 11:13:01PM -0800, Brigitta Sipőcz wrote: Once we get this one and #416 in, I'm happy to tag a bugfix release. What do you think?

I'd really be grateful. And I apologise again for not having realised that something is amiss before the release. This brings me back go sketching some pre-release exercise to try a representative sample of jupyter notebooks as considered in #304. If someone has a good example of where folks have a reasonable system where they run notebooks as integration tests, I might try my hands...

bsipocz · 2024-02-16T17:09:08Z

I do run notebooks in CI (we both have pyvo focusing navo notebooks as well as other IRSA notebooks), but arguably these don't cover much of pyvo's codebase (what they cover and uncover I tend to open issues for).

CDS also has very nice notebooks, and as more archives switch to use the pyvo backend in astroquery I expect a bit more coverage from there, too.

As for #304, I still think it's not our (==pyvo maintainers') job to test third-party notebooks as part of the release process, but I do think we can help promote CI practices (or oversee such CIs in our other capacities, like I do with the navo and irsa notebooks) and e.g. help setting up cron jobs that run the notebooks regularly, or before releases.

I was chattering about tagging the 1.5 release on the astropy slack, but I don't think it reaches the relevant people. Would e.g. announcing on the ivoa slack be preferable? We could do that ~a week before releasing, at least when doing a non-bugfix release.

ManonMarchand · 2024-02-16T17:10:37Z

We have a CI for notebooks here: https://github.com/cds-astro/tutorials
It uses pytest nbmake:

pytest --nbmake -n=auto .

The big flaw is that it only makes sure that they run, not that the output did not change. I'm exploring solutions for that.
It already got a few really good catches even like this.

bsipocz · 2024-02-16T17:12:47Z

we do use nbval all over the scientific python ecosystem. The outputs could be doctested, but yes, I agree it's rather painful and do have gaps in feature-space.

ManonMarchand · 2024-02-16T17:14:48Z

we do use nbval all over the scientific python ecosystem. The outputs could be doctested, but yes, I agree it's rather painful and do have gaps in feature-space.

Thanks! Don't know how I missed that, looks awesome.

bsipocz · 2024-02-16T17:20:55Z

We do have some plans to improve this part of the ecosystem, but it's not happening overnight (but we know eventually there would be a lot of users, at least for the small tutorial/snippet like notebooks (not for the long demo ones that require a lot of exotic dependencies and are expected to run for many hours to complete)

bsipocz · 2024-02-16T17:21:55Z

Thanks! Don't know how I missed that, looks awesome.

I'm happy to give a tour about it at the ivoa if it would be useful for more groups.

Backing out of requesting alt_identifiers every time.

bsipocz added this to the v1.5.1 milestone Feb 15, 2024

bsipocz added bug component: registry labels Feb 15, 2024

bsipocz reviewed Feb 15, 2024

View reviewed changes

msdemlei force-pushed the fix-alt-identifier-overselect branch from 2e8a828 to 095b734 Compare February 15, 2024 07:53

bsipocz approved these changes Feb 16, 2024

View reviewed changes

pyvo/registry/regtap.py Outdated Show resolved Hide resolved

MAINT: making linter happy.

5609504

bsipocz merged commit 6518d05 into astropy:main Feb 16, 2024
9 of 12 checks passed

bsipocz added a commit that referenced this pull request Feb 22, 2024

Merge pull request #523 from msdemlei/fix-alt-identifier-overselect

559de7c

Backing out of requesting alt_identifiers every time.

bsipocz added a commit that referenced this pull request Feb 22, 2024

Merge pull request #523 from msdemlei/fix-alt-identifier-overselect

7f791c0

Backing out of requesting alt_identifiers every time.

msdemlei mentioned this pull request Jun 26, 2024

Stale alt_identifier join damages RegistryResource-s with multiple alternative identifiers #560

Closed

lmichel mentioned this pull request Sep 4, 2024

Feature skycoord #591

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Backing out of requesting alt_identifiers every time. #523

Backing out of requesting alt_identifiers every time. #523

msdemlei commented Feb 12, 2024

codecov bot commented Feb 12, 2024 •

edited

Loading

bsipocz left a comment

msdemlei commented Feb 15, 2024 via email

ManonMarchand commented Feb 15, 2024

msdemlei commented Feb 15, 2024

bsipocz commented Feb 15, 2024

bsipocz commented Feb 15, 2024

bsipocz commented Feb 15, 2024

msdemlei commented Feb 16, 2024 via email

bsipocz commented Feb 16, 2024

bsipocz commented Feb 16, 2024

bsipocz commented Feb 16, 2024

msdemlei commented Feb 16, 2024 via email

bsipocz commented Feb 16, 2024

ManonMarchand commented Feb 16, 2024

bsipocz commented Feb 16, 2024

ManonMarchand commented Feb 16, 2024

bsipocz commented Feb 16, 2024

bsipocz commented Feb 16, 2024

Backing out of requesting alt_identifiers every time. #523

Backing out of requesting alt_identifiers every time. #523

Conversation

msdemlei commented Feb 12, 2024

codecov bot commented Feb 12, 2024 • edited Loading

Codecov Report

bsipocz left a comment

Choose a reason for hiding this comment

msdemlei commented Feb 15, 2024 via email

ManonMarchand commented Feb 15, 2024

msdemlei commented Feb 15, 2024

bsipocz commented Feb 15, 2024

bsipocz commented Feb 15, 2024

bsipocz commented Feb 15, 2024

msdemlei commented Feb 16, 2024 via email

bsipocz commented Feb 16, 2024

bsipocz commented Feb 16, 2024

bsipocz commented Feb 16, 2024

msdemlei commented Feb 16, 2024 via email

bsipocz commented Feb 16, 2024

ManonMarchand commented Feb 16, 2024

bsipocz commented Feb 16, 2024

ManonMarchand commented Feb 16, 2024

bsipocz commented Feb 16, 2024

bsipocz commented Feb 16, 2024

codecov bot commented Feb 12, 2024 •

edited

Loading