Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add get_name patch/spool method #471

Merged
merged 9 commits into from
Dec 30, 2024
Merged

add get_name patch/spool method #471

merged 9 commits into from
Dec 30, 2024

Conversation

d-chambers
Copy link
Contributor

Description

This PR implements a get_name method for the patch and the spool to return a series of names. The names are either generated by columns provided to the function (which has sensible defaults to maintain the old behavior) or by existing columns such as name or path.

This PR supersede #461.

Checklist

I have (if applicable):

  • referenced the GitHub issue this PR closes.
  • documented the new feature with docstrings or appropriate doc page.
  • included a test. See testing guidelines.
  • your name has been added to the contributors page (docs/contributors.md).
  • added the "ready_for_review" tag once the PR is ready to be reviewed.

Copy link

codecov bot commented Dec 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.85%. Comparing base (2138fee) to head (6044f67).
Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #471   +/-   ##
=======================================
  Coverage   99.85%   99.85%           
=======================================
  Files         115      115           
  Lines        9507     9548   +41     
=======================================
+ Hits         9493     9534   +41     
  Misses         14       14           
Flag Coverage Δ
unittests 99.85% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@d-chambers d-chambers mentioned this pull request Dec 24, 2024
5 tasks
@d-chambers d-chambers added the ready_for_review PR is ready for review label Dec 24, 2024
@d-chambers
Copy link
Contributor Author

@ahmadtourei, mind taking this for a spin to see if it behaves how you wanted ?

dascore/utils/patch.py Outdated Show resolved Hide resolved
Comment on lines 465 to 470
Examples
--------
>>> import dascore as dc
>>> from dascore.utils.patch import get_patch_names
>>> patch = dc.get_example_patch()
>>> name = get_patch_names(patch)
Copy link
Collaborator

@ahmadtourei ahmadtourei Dec 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name is:

0    DAS_______random__2017_09_18__2017_09_18T00_00_07
Name: network, dtype: object

We get DAS_______random__2017_09_18__2017_09_18T00_00_07 using name[0] because the function returns a <class 'pandas.core.series.Series'>. Is there a specific reason that it returns a pandas series with Name: network, dtype: object (a little confusing) instead of the names in string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for the function to work with a spool it needs to return multiple names. I avoided returning multiple types (based on if the input is a Patch or not) from get_patch_names as that is a bit of an anti-pattern.

I understand the confusion though. I will add get_patch_name as a method of Patch that will do this. I think with the different name (get_patch_name vs get_patch_names) it will be clear.

Comment on lines +435 to +439
patch_data: pd.DataFrame | dc.Patch | dc.BaseSpool,
prefix="DAS",
attrs=("network", "station", "tag"),
coords=("time",),
sep="__",
Copy link
Collaborator

@ahmadtourei ahmadtourei Dec 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need test(s) for optional inputs . For example, get_patch_names(patch, patch) does not raise an error.

Comment on lines 388 to 394
@pytest.fixture(scope="session")
def random_spool_directory(tmp_path_factory):
"""A directory with a few patch files."""
out = Path(tmp_path_factory.mktemp("one_file_file_spool"))
spool = ex.get_example_spool("random_das")
out_path = ex.spool_to_directory(spool, path=out)
return dc.spool(out_path).update()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have this as an example in dascore.examples as all of the example spools are memory spools.

def _get_filename(path_ser):
"""Get the file name from a path series."""
ser = path_ser.astype(str)
file_names = [x[-1].split(".")[0] for x in ser.str.split("/")]
Copy link
Collaborator

@ahmadtourei ahmadtourei Dec 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we introduce an optional argument (such as keep_extension) so the user can decide whether to get patch's name with or without file extension? I can work on this as I already implemented this in my version #461. What do you think?

Copy link
Collaborator

@ahmadtourei ahmadtourei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I have posted my comments below.

One general comment that I have is because this can be a very useful function, it would be great to create a shortcut for it (something like patch.name and spool.name so the user can quickly get name(s) in string format. Then we can include it in the Shortcuts section of dascore/docs/tutorial/patch.qmd. I assume users with limited DASCore experience cannot quickly locate it in dascore.utils.patch.

@d-chambers
Copy link
Contributor Author

I assume users with limited DASCore experience cannot quickly locate it in dascore.utils.patch.

I agree. I think it should be a patch/spool method. Because there are optional parameters though, probably best not to make it an attribute (because then the users are always stuck with the defaults).

@d-chambers
Copy link
Contributor Author

Thanks @ahmadtourei for the great review. I have addressed all of your suggestions/concerns. Anything else you want to see in here? I also included a fix for the doc build so I am keen to merge this soon to see if the fix works.

@@ -451,6 +451,24 @@ def random_spool(time_gap=0, length=3, time_min=np.datetime64("2020-01-03"), **k
return dc.spool(out)


@register_func(EXAMPLE_SPOOLS, key="radom_directory_das")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*random

Comment on lines +230 to +240
The "name" of the patch, which is the filename that would be used if the patch were saved to disk by DASCore, can be generated via the [`Spool.get_patch_names`](`dascore.BaseSpool.get_patch_names`) to get a series of the names of the managed patches, or [`Patch.get_patch_name`](`dascore.Patch.get_patch_name`) to get a string of the patch name.

```python
import dascore as dc

patch = dc.get_example_patch()
spool = dc.get_example_spool()

patch_name = patch.get_patch_name()
patch_names = spool.get_patch_names()
```
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great!

@ahmadtourei
Copy link
Collaborator

Thanks @ahmadtourei for the great review. I have addressed all of your suggestions/concerns. Anything else you want to see in here? I also included a fix for the doc build so I am keen to merge this soon to see if the fix works.

Thanks, Derrick, for your work on this PR! It looks great. The only new comment I have is a typo I listed below.

@d-chambers d-chambers merged commit 4f48bd6 into master Dec 30, 2024
18 checks passed
@d-chambers d-chambers deleted the get_name branch December 30, 2024 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready_for_review PR is ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants