Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support columns key in get_agasc_cone and improve caching #183

Merged
merged 10 commits into from
Jul 15, 2024

Conversation

taldcroft
Copy link
Member

@taldcroft taldcroft commented Jul 5, 2024

Description

This is driven by the need to support AGASC 1.8 in kadi.observations.get_starcats(). The kadi.observations module defines a get_agasc_cone_fast that caches a limited set of proseco_agasc columns, but it won't work with AGASC 1.8.

This PR implements support for columns in get_agasc_cone with caching which works with 1.7 and 1.8. The plan is then to remove kadi.observations.get_agasc_cone_fast (or make it a thin wrapper around cached get_agasc_cone).

Other changes

  • Do not limit the cache size to 1 (last). It is up to the user to manage memory if multiple AGASCs or column combinations are cached.
  • Do not require COLOR1 column. If not present then update_color1_column() returns immediately. Previously it would crash.
  • Update dev/profile_cache.py to include a case with columns defined.
  • Update for ruff 0.5.1, remove black in python-linting, and make a few non-impacting format / lint-based changes.

Related PR's

Interface impacts

New keywords columns in get_agasc_cone.

Testing

Unit tests

  • Mac
(ska3) ➜  agasc git:(cone-by-columns-with-caching) git rev-parse HEAD                                                              
154e6e51820977ad1820533723d135009784bafd
(ska3) ➜  agasc git:(cone-by-columns-with-caching) pytest
======================================================== test session starts ========================================================
platform darwin -- Python 3.11.8, pytest-7.4.4, pluggy-1.4.0
rootdir: /Users/aldcroft/git
configfile: pytest.ini
plugins: timeout-2.2.0, anyio-4.3.0
collected 77 items                                                                                                                  

agasc/tests/test_agasc_1.py .......                                                                                           [  9%]
agasc/tests/test_agasc_2.py ..........sssss..........ss..................                                                     [ 67%]
agasc/tests/test_agasc_healpix.py ...........                                                                                 [ 81%]
agasc/tests/test_obs_status.py ..............                                                                                 [100%]

=================================================== 70 passed, 7 skipped in 8.20s ===================================================

Independent check of unit tests by Jean

  • Linux
jeanconn-fido> pytest
=================================================== test session starts ====================================================
platform linux -- Python 3.11.8, pytest-7.4.4, pluggy-1.4.0
rootdir: /proj/sot/ska/jeanproj/git
configfile: pytest.ini
plugins: anyio-4.3.0, timeout-2.2.0
collected 77 items                                                                                                         

agasc/tests/test_agasc_1.py .......                                                                                  [  9%]
agasc/tests/test_agasc_2.py .............................................                                            [ 67%]
agasc/tests/test_agasc_healpix.py ...........                                                                        [ 81%]
agasc/tests/test_obs_status.py ..............                                                                        [100%]

============================================== 77 passed in 222.93s (0:03:42) ==============================================
jeanconn-fido> git rev-parse HEAD
154e6e51820977ad1820533723d135009784bafd

Functional tests

(ska3) ➜  agasc git:(cone-by-columns-with-caching) env PYTHONPATH=$PWD python dev/profile_cache.py
........
agasc version: 4.20.1.dev10+g154e6e5
        filename         cache all_columns  dt 
                                            ms 
------------------------ ----- ----------- ----
    proseco_agasc_1p7.h5 False        True 6.74
    proseco_agasc_1p7.h5 False       False 9.20
    proseco_agasc_1p7.h5  True        True 2.23
    proseco_agasc_1p7.h5  True       False 2.02
proseco_agasc_1p8rc11.h5 False        True 4.52
proseco_agasc_1p8rc11.h5 False       False 4.39
proseco_agasc_1p8rc11.h5  True        True 1.20
proseco_agasc_1p8rc11.h5  True       False 0.75

@taldcroft taldcroft force-pushed the cone-by-columns-with-caching branch from 3c6f2ea to 4bbbb1f Compare July 7, 2024 10:03
@taldcroft taldcroft marked this pull request as ready for review July 7, 2024 10:19
@taldcroft taldcroft changed the title WIP: Support columns key in get_agasc_cone and improve caching Support columns key in get_agasc_cone and improve caching Jul 7, 2024
ra, dec, rad_pm, agasc_file=agasc_file, columns=columns_query, cache=cache
)

if matlab_pm_bug:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this bug, I don't know if it should be applied before or after the filtering. But hopefully it didn't matter in your initial testing in the original kadi implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice the order doesn't matter since we have enough margin on the cone edge.


# Minimal columns to compute PM-corrected positions and do filtering.
if columns and COLUMNS_REQUIRED - set(columns):
raise ValueError(f"columns must include all of {COLUMNS_REQUIRED}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this seems fine and is solid explicit coding style, I'm just not sure if it would be a lighter touch to just add any needed columns to columns.

Copy link
Contributor

@javierggt javierggt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the changes and they make sense to me. I also ran the unit tests to verify they pass.

I do not like chasing the ruff version. The latest ruff is now 0.5.2, so this PR is already obsolete. Running ruff 0.5.2 on this branch still shows two files that need fixing. Feel free to fix it or not. Just saying.

Also, making ruff commits with other changes makes it harder to see the relevant changes.

@taldcroft
Copy link
Member Author

I do not like chasing the ruff version. The latest ruff is now 0.5.2, so this PR is already obsolete. Running ruff 0.5.2 on this branch still shows two files that need fixing. Feel free to fix it or not. Just saying.

Chasing ruff is indeed annoying. The one bit that is hard to freeze is the VS code extension. The least-work path is just keeping all those extensions updated, so if I see warnings in VS code linting then I want to fix them. Sigh.

About the 0.5.2 changes, I am not seeing any problems now with 0.5.2:

(ska3) ➜  agasc git:(cone-by-columns-with-caching) pip install --upgrade ruff          
Requirement already satisfied: ruff in /Users/aldcroft/miniconda3-arm/envs/ska3/lib/python3.11/site-packages (0.5.1)
Collecting ruff
  Using cached ruff-0.5.2-py3-none-macosx_11_0_arm64.whl.metadata (24 kB)
Using cached ruff-0.5.2-py3-none-macosx_11_0_arm64.whl (8.2 MB)
Installing collected packages: ruff
  Attempting uninstall: ruff
    Found existing installation: ruff 0.5.1
    Uninstalling ruff-0.5.1:
      Successfully uninstalled ruff-0.5.1
Successfully installed ruff-0.5.2
(ska3) ➜  agasc git:(cone-by-columns-with-caching) ruff check
All checks passed!
(ska3) ➜  agasc git:(cone-by-columns-with-caching) ruff format --diff
30 files already formatted

Also, making ruff commits with other changes makes it harder to see the relevant changes.

I hear you, but I did at least do all the ruff in one commit so it is easy to see them and confirm that they are trivial.

@taldcroft taldcroft merged commit aeeda59 into master Jul 15, 2024
2 checks passed
@taldcroft taldcroft deleted the cone-by-columns-with-caching branch July 15, 2024 20:54
This was referenced Jul 16, 2024
@javierggt javierggt mentioned this pull request Jul 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants