Skip to content

Commit

Permalink
climada.hazard.trop_cyclone.TropCytlone.from_tracks: option for prede…
Browse files Browse the repository at this point in the history
…fined dist_coast in centroids
  • Loading branch information
emanuel-schmid committed Feb 15, 2024
1 parent e59eb18 commit 0ad0804
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion climada/hazard/trop_cyclone.py
Original file line number Diff line number Diff line change
Expand Up @@ -268,8 +268,12 @@ def from_tracks(
[idx_centr_filter] = (np.abs(centroids.lat) <= max_latitude).nonzero()
else:
# Select centroids which are inside max_dist_inland_km and lat <= max_latitude
if 'dist_coast' not in centroids.gdf.columns:
dist_coast = centroids.get_dist_coast()
else:
dist_coast = centroids.gdf.dist_coast

This comment has been minimized.

Copy link
@tovogt

tovogt Feb 16, 2024

Collaborator

@emanuel-schmid What do you think about attribute-style access to DataFrame columns in general? After years of working with pandas, I feel like this is very error-prone and should be avoided outside of private 10-line scripts that are written for a single purpose on your own machine. Reasons are: Improved syntax highlighting, more consistency (since in many cases you cannot use attribute-style access, so you are forced to fall back to item-style access), avoid mixing up attribute and column names. Actually, I have been wondering for a long time why pandas does not drop this anti-feature. As a maintainer, I would enforce item-style access in my code base. What my users do privately is up to them. But it's much easier to maintain the code if you have item-style access to DataFrame columns.

In this case, that would mean centroids.gdf["dist_coast"], but there are cases in CLIMADA where the benefits would be really obvious. For example, in Exposures you have a column called "value" and it is so easy to mix this up with the attribute values. Also, as a maintainer, you never know which built-in attributes will be added by pandas in the future that might eventually override your project-specific column names.

If you agree, I would propose to add a user story in Taiga that enforces an item-style access policy throughout CLIMADA and goes through the existing code base to clean up the current mess. By the way, the same applies to variables and attributes of xarray Datasets and DataArrays.

This comment has been minimized.

Copy link
@tovogt

tovogt Feb 19, 2024

Collaborator

This comment has been minimized.

Copy link
@emanuel-schmid

emanuel-schmid Feb 19, 2024

Author Collaborator

In general I fully agree. This has been discussed at some point and we took the short way out - which was probably not the short way in the long run. Thanks for the user story! 🙌

[idx_centr_filter] = (
(centroids.get_dist_coast() <= max_dist_inland_km * 1000)
(dist_coast <= max_dist_inland_km * 1000)
& (np.abs(centroids.lat) <= max_latitude)
).nonzero()

Expand Down

0 comments on commit 0ad0804

Please sign in to comment.