Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python implementation #100

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Conversation

piyushrpt
Copy link
Collaborator

@piyushrpt piyushrpt commented Nov 11, 2024

  1. pysolid.py_solid is the python only implementation of pysolid
  2. The entire folder should be a drop-in replacement of pysolid
  3. point_pyimpl.py and grid_pyimpl.py have been added to tests. These essentially are same as the original tests except they use py_solid and have a couple of minor changes related to the function signatures in py_solid.

Summary by Sourcery

Implement a Python-only version of the 'pysolid' library, named 'py_solid', to replace the original implementation. Update import statements to use relative paths and add new tests to verify the functionality of the new implementation.

New Features:

  • Introduce a Python-only implementation of the 'pysolid' library, named 'py_solid', which serves as a drop-in replacement for the original 'pysolid'.

Enhancements:

  • Refactor import statements in 'pysolid/init.py' to use relative imports for better modularity.

Tests:

  • Add new test files 'point_pyimpl.py' and 'grid_pyimpl.py' to validate the functionality of the 'py_solid' implementation, ensuring compatibility with existing tests.

Copy link

sourcery-ai bot commented Nov 11, 2024

Reviewer's Guide by Sourcery

This pull request implements a pure Python version of the solid Earth tides calculation functionality, previously implemented in Fortran. The implementation includes core calculation functions and wrappers to maintain API compatibility with the existing codebase. The changes are organized into a new py_solid package that serves as a drop-in replacement for the original implementation.

Class diagram for the new py_solid module

Loading
classDiagram
    class py_solid {
        +calc_solid_earth_tides_grid(datetime, dict, float, bool, bool) np.ndarray
        +plot_solid_earth_tides_grid(np.ndarray, np.ndarray, np.ndarray, datetime, str, bool, bool)
        +calc_solid_earth_tides_point(float, float, datetime, datetime, int, bool, bool) tuple
        +plot_solid_earth_tides_point(np.ndarray, np.ndarray, np.ndarray, np.ndarray, list, str, bool, bool)
        +plot_power_spectral_density4tides(np.ndarray, float, str, int, int)
    }
    class solid {
        +solid_point(LLH, date, int) tuple
        +solid_grid(datetime, npt.ArrayLike, npt.ArrayLike) npt.ArrayLike
    }
    class LLH {
        +float lat
        +float lon
        +float hte
        +geoxyz() XYZ
    }
    class XYZ {
        +float x
        +float y
        +float z
        +enorm8() float
        +rot3(float) XYZ
        +rot1(float) XYZ
        +rge(LLH) XYZ
        +lhsaaz() XYZ
    }
    py_solid --> solid
    solid --> LLH
    solid --> XYZ
    LLH --> XYZ

File-Level Changes

Change Details Files
Added pure Python implementation of solid Earth tides calculation core functionality
  • Implemented core calculation functions for solid Earth tides in Python
  • Added data structures and constants required for calculations
  • Implemented utility functions for time system conversions
  • Added functions for coordinate system transformations
  • Implemented mathematical models for tidal force calculations
src/pysolid/py_solid/solid.py
Created Python wrappers for point-based calculations
  • Implemented functions for calculating tides at a single point
  • Added plotting utilities for point-based results
  • Defined tidal constituent data structures
  • Added power spectral density analysis functionality
src/pysolid/py_solid/point.py
Created Python wrappers for grid-based calculations
  • Implemented functions for calculating tides on a spatial grid
  • Added plotting utilities for grid-based results
  • Implemented grid interpolation functionality
src/pysolid/py_solid/grid.py
Added package initialization and integration with existing codebase
  • Created package initialization file
  • Updated imports to include new Python implementation
  • Maintained backward compatibility with existing API
src/pysolid/__init__.py
src/pysolid/py_solid/__init__.py
Added test files for the Python implementation
  • Created test for point-based calculations
  • Created test for grid-based calculations
  • Added reference data for validation
tests/point_pyimpl.py
tests/grid_pyimpl.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@piyushrpt piyushrpt requested a review from yunjunz November 11, 2024 05:59
Copy link

codeautopilot bot commented Nov 11, 2024

PR summary

This Pull Request introduces a Python-only implementation of the pysolid library, named py_solid, which serves as a drop-in replacement for the original Fortran-based implementation. The purpose of this change is to provide a more accessible and maintainable version of the library by eliminating the dependency on Fortran. The PR includes modifications to import statements to use relative paths, ensuring better modularity. Additionally, new test files (point_pyimpl.py and grid_pyimpl.py) are added to validate the functionality of the py_solid implementation, ensuring compatibility with existing tests. The impact of this PR is significant as it simplifies the installation process and enhances the maintainability of the library.

Suggestion

To further improve this PR, consider adding documentation that highlights the differences between the original and the new Python-only implementation, if any, and provide guidance on how users can transition to using py_solid. Additionally, ensure that performance benchmarks are conducted to compare the Python implementation with the original Fortran version, as this could be a concern for users relying on the library for computationally intensive tasks.

Disclaimer: This comment was entirely generated using AI. Be aware that the information provided may be incorrect.

Current plan usage: 0.00%

Have feedback or need help?
Discord
Documentation
support@codeautopilot.com

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @piyushrpt - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider adding a note in the documentation about which implementation (Fortran vs Python) is recommended for different use cases, e.g. performance vs portability.
Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟡 Testing: 2 issues found
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

src/pysolid/py_solid/solid.py Show resolved Hide resolved
Comment on lines +32 to +41
# reference
# calculated based on version 0.3.2.post6 on Jun 24, 2024
# env: macOS with python-3.10, numpy-1.24
# install: manual compilation via f2py
tide_e_80_100 = np.array(
[[0.01628786, 0.01630887, 0.01633078, 0.01635247, 0.01637394],
[0.01633248, 0.01635348, 0.01637538, 0.01639706, 0.01641851],
[0.01638009, 0.01640107, 0.01642296, 0.01644462, 0.01646606],
[0.01642767, 0.01644864, 0.01647052, 0.01649217, 0.01651359],
[0.01647523, 0.01649619, 0.01651805, 0.01653968, 0.01656109]],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider parameterizing test reference data

The reference data arrays could be moved to a separate test data file or fixture to improve maintainability and readability. This would also make it easier to update reference values in the future.

# tests/test_data/grid_reference.py
import numpy as np

TIDE_E_80_100 = np.array([
    [0.01628786, 0.01630887, 0.01633078, 0.01635247, 0.01637394],
    [0.01633248, 0.01635348, 0.01637538, 0.01639706, 0.01641851], 
    [0.01638009, 0.01640107, 0.01642296, 0.01644462, 0.01646606],
    [0.01642767, 0.01644864, 0.01647052, 0.01649217, 0.01651359],
    [0.01647523, 0.01649619, 0.01651805, 0.01653968, 0.01656109]
])

Comment on lines +63 to +66
# compare
assert np.allclose(tide_e[::80, ::100], tide_e_80_100)
assert np.allclose(tide_n[::80, ::100], tide_n_80_100)
assert np.allclose(tide_u[::80, ::100], tide_u_80_100)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add tolerance values to np.allclose assertions

Consider explicitly specifying rtol and atol values in np.allclose() to make the test's precision requirements clear. This helps catch subtle numerical differences that may be important for this scientific calculation.

Suggested change
# compare
assert np.allclose(tide_e[::80, ::100], tide_e_80_100)
assert np.allclose(tide_n[::80, ::100], tide_n_80_100)
assert np.allclose(tide_u[::80, ::100], tide_u_80_100)
# compare
assert np.allclose(tide_e[::80, ::100], tide_e_80_100, rtol=1e-10, atol=1e-12)
assert np.allclose(tide_n[::80, ::100], tide_n_80_100, rtol=1e-10, atol=1e-12)
assert np.allclose(tide_u[::80, ::100], tide_u_80_100, rtol=1e-10, atol=1e-12)

return xcorsta


def detide(xsta: XYZ,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider refactoring the tidal computation code to extract shared calculations into helper classes and functions

The code would benefit from extracting shared calculations and improving organization while preserving the core algorithms. Suggested changes:

  1. Extract shared trigonometric calculations into helper functions:
def calc_tidal_angles(t: float) -> TidalAngles:
    """Calculate common angles used in tidal computations"""
    s = 218.31664563 + 481267.88194 * t - 0.0014663889 * t * t \
        + 0.00000185139 * t ** 3
    h = 280.46645 + 36000.7697489 * t + 0.00030322222 * t * t \
        + 0.000000020 * t ** 3 - 0.00000000654 * t ** 4
    # ... other shared angle calculations
    return TidalAngles(s=s, h=h, ...)
  1. Create a TidalComponents class to handle coordinate transforms:
class TidalComponents:
    def __init__(self, xsta: XYZ):
        self.rsta = xsta.enorm8()
        self.sinphi = xsta.z / self.rsta
        self.cosphi = np.sqrt(xsta.x**2 + xsta.y**2) / self.rsta
        self.sinla = xsta.y / self.cosphi / self.rsta
        self.cosla = xsta.x / self.cosphi / self.rsta

    def transform_tide(self, dr: float, de: float, dn: float) -> XYZ:
        """Transform tidal components to XYZ coordinates"""
        return XYZ(
            dr * self.cosla * self.cosphi - de * self.sinla 
                - dn * self.sinphi * self.cosla,
            dr * self.sinla * self.cosphi + de * self.cosla 
                - dn * self.sinphi * self.sinla,
            dr * self.sinphi + dn * self.cosphi
        )
  1. Break detide() into smaller focused functions:
def compute_love_corrections(components: TidalComponents, 
                           sun_moon: SunMoonPositions) -> XYZ:
    """Compute corrections for love number frequency dependence"""
    diurnal = st1idiu(components, sun_moon)
    semidiurnal = st1isem(components, sun_moon) 
    latitude = st1l1(components, sun_moon)
    return diurnal + semidiurnal + latitude

These changes maintain the exact same calculations while reducing code duplication and improving maintainability.

@yunjunz
Copy link
Member

yunjunz commented Nov 11, 2024

Thank you for the PR, @piyushrpt! I am occupied by work this week, but I will try to get to this PR, mainly the testing, as said in #99, at the weekend. @scottstanie, please feel free to step in whenever you got a chance.

@piyushrpt
Copy link
Collaborator Author

All the unit tests etc have been updated. All we now need is someone to setup a bunch of tests - random times, locations etc and compare the 2 sets of numbers. There is no rush to this.

The python code can also be further cleaned up as suggested by the bot here - but that would make the python implementation look significantly different from the original fortran code - which may / may not be desired. I will leave it up to you both - no strong preferences.

@scottstanie
Copy link
Collaborator

scottstanie commented Dec 18, 2024

I started to check it out and converted the script in tests/ to one using pytest and no plotting.
One odd point on the relative speeds:

In [15]: %time tide_e, tide_n, tide_u = pysolid.calc_solid_earth_tides_grid( grid_ref_data["dt_obj"], grid_ref_data["atr"], verbose=False)
CPU times: user 36.6 ms, sys: 1.45 ms, total: 38.1 ms
Wall time: 37.1 ms

In [16]: %time tide_e, tide_n, tide_u = py_solid.calc_solid_earth_tides_grid( grid_ref_data["dt_obj"], grid_ref_data["atr"], verbose=False)
CPU times: user 415 ms, sys: 10.8 ms, total: 425 ms
Wall time: 419 ms

grid speed looks fine for python 👍 I can live with a 10x slow down if it's still just half a second for one date's output

In [13]: %time dt_out, tide_e, tide_n, tide_u = pysolid.calc_solid_earth_tides_point( point_ref_data["lat"], point_ref_data["lon"], point_ref_data["dt_obj0"], point_ref_data["dt_obj1"], verbose=False,)
PYSOLID: calculate solid Earth tides in east/north/up direction
PYSOLID: lot/lon: 34.0/-118.0 degree
PYSOLID: start UTC: 2020-11-05T12:00:00
PYSOLID: end   UTC: 2020-12-31T00:00:00
PYSOLID: time step: 60 seconds
CPU times: user 245 ms, sys: 2.58 ms, total: 248 ms
Wall time: 248 ms

In [14]: %time dt_out, tide_e, tide_n, tide_u = py_solid.calc_solid_earth_tides_point( point_ref_data["lat"], point_ref_data["lon"], point_ref_data["dt_obj0"], point_ref_data["dt_obj1"], verbose=False,)
PYSOLID: calculate solid Earth tides in east/north/up direction
PYSOLID: lot/lon: 34.0/-118.0 degree
PYSOLID: start UTC: 2020-11-05T12:00:00
PYSOLID: end   UTC: 2020-12-31T00:00:00
PYSOLID: time step: 60 seconds
CPU times: user 22.7 s, sys: 38 ms, total: 22.7 s
Wall time: 22.7 s

somehow the point one is much slower? haven't tried to see why, maybe the time loop is just slower...

@piyushrpt
Copy link
Collaborator Author

The slow down for single point is expected since it has to recompute sun / moon positions for each time epoch. When running stuff on a grid for a single epoch - these are done only once.

@scottstanie
Copy link
Collaborator

ok gotcha, so for the insar application that only calculates ephemerides a handful times, it's not a real problem

@piyushrpt
Copy link
Collaborator Author

The python code could be used as a starting point for cython - and this speed could be gained back. The primary goal was to remove dependence on fortran

@scottstanie
Copy link
Collaborator

scottstanie commented Dec 19, 2024

Here's a first pass at a validation script. Takes about 15 minutes to run, seems to have agreement for a global run for a few times per year from 2010 to 2024.

image

There's the question Piyush raised in the discussion about best ways to handle the switch to not disrupt, but also have a pure-python version for easier installs (pip install PyPySolid?).

@piyushrpt
Copy link
Collaborator Author

piyushrpt commented Dec 19, 2024

Maybe path of least resistance

  1. Include python implementation as is in subfolder called py_solid or pyimpl or something that makes more sense.
  2. Change __init__ to import from python implementation if fortran one is not available. Use fortran one always when available.
  3. Add an option to install pysolid[pyonly] that can skip the fortran stuff during installation. Let default be the usual fortran stuff + python implementation but python stuff is not used. This will allow for side-by-side comparisons to be done easily as well with the default /full installation.

@piyushrpt
Copy link
Collaborator Author

piyushrpt commented Dec 20, 2024

Something like this in CMakeLists.txt in addition to just detecting the extension in __init__.py. Dont have a system with a fortran compiler to test this out

cmake_minimum_required(VERSION 3.17.2...3.29)
project(${SKBUILD_PROJECT_NAME} LANGUAGES C)

include(CheckLanguage)
check_language(Fortran)
if(CMAKE_Fortran_COMPILER)
  enable_language(Fortran)

...
else()
   message(STATUS "Skipping building Fortran extension")
endif()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants