Allow passing in pandas dataframes to x2sys_cross #591
Implemented by storing pandas.DataFrame data in a temporary file and passing this intermediate file to x2sys_cross. Need to do some regex file parsing to get the right file extension (suffix) for this to work.
So that the tests will pass on macOS and Windows too.
Because Windows (and macOS?) might not support opening the same temporary file twice.
pygmt/x2sys.py
Outdated
        )  # e.g. "-Dxyz -Etsv -I1/1"
    try:
        # 1st try to match file extension after -E
        suffix = re.search(pattern=r"-E(\S*)", string=lastline).group(1)
    except AttributeError:  # 'NoneType' object has no attribute 'group'
        # 2nd try to match file extension after -D
        suffix = re.search(pattern=r"-D(\S*)", string=lastline).group(1)
Is there a better way to check if -Exyz is in the string, and if not, fall back to parsing from -Dxyz?
What about this one?
lastline = "-Dxyz -Etsv -I1/1"
# lastline = "-Dxyz -I1/1"
for item in lastline.split():
    for key in ["-E", "-D"]:
        if item.startswith(key):
            suffix = item[2:]
            break
print(suffix)
Note: the code may be wrong, but maybe something along these lines.
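One way to make the precedence explicit is sketched below. In the loop above, `break` only leaves the inner loop, so the result depends on the order of the flags in the string. The sketch instead searches the whole string for -E first and only then for -D. The function name `parse_suffix` and its `default` parameter are hypothetical and not part of pygmt.

```python
import re


def parse_suffix(lastline, default=None):
    """Return the extension given after -E, else after -D, else `default`."""
    for flag in ("-E", "-D"):  # -E takes precedence over -D
        # The flag must start the string or follow whitespace
        match = re.search(rf"(?:^|\s){flag}(\S+)", lastline)
        if match:
            return match.group(1)
    return default


print(parse_suffix("-Dxyz -Etsv -I1/1"))  # tsv
print(parse_suffix("-Dxyz -I1/1"))  # xyz
```

Returning a default instead of raising keeps the caller free to decide what a missing flag should mean.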
Hmm, this gives me some ideas. I'll play around with it, thanks!
Also rename 'result' to 'table' to prevent pylint complaining about R0914: Too many local variables (16/15) (too-many-locals)
Ok, ready for review! I understand that this isn't easy to review properly, but I would really appreciate getting this in for v0.2.0 tomorrow as I'll be using it for my PhD research. There are two unit tests added, one for internal crossovers (i.e. on 1 track) and one for external crossovers (2 tracks). I'll add a tutorial example for this over the weekend to explain things better.
    try:
        tmpfilename = f"track-{unique_name()[:7]}.{suffix}"
        track.to_csv(
            path_or_buf=tmpfilename,
            sep="\t",
            index=False,
            date_format="%Y-%m-%dT%H:%M:%S.%fZ",
        )
        yield tmpfilename
    finally:
        os.remove(tmpfilename)
The original implementation using GMTTempFile/NamedTemporaryFile didn't work because of some permissions issues (on macOS/Windows), which is why this try-finally block is used.
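A minimal, standard-library-only sketch of this pattern follows, for readers who want to try it without pygmt or pandas. It makes several assumptions: `uuid.uuid4()` stands in for `unique_name()`, the `csv` module stands in for `DataFrame.to_csv`, and the name `temporary_track_file` is hypothetical. The file is written and closed before its name is yielded, so nothing has to open an already-open file a second time.

```python
import csv
import os
import uuid
from contextlib import contextmanager
from datetime import datetime


@contextmanager
def temporary_track_file(rows, header, suffix="tsv"):
    """Write rows to a uniquely named TSV file in the current working
    directory, yield its name, and delete the file afterwards."""
    # Assumption: uuid4 stands in for the unique_name() helper
    tmpfilename = f"track-{uuid.uuid4().hex[:7]}.{suffix}"
    try:
        # Write and close the file before handing out its name
        with open(tmpfilename, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
            writer.writerow(header)
            for row in rows:
                writer.writerow(
                    [
                        # Same datetime layout as the date_format above
                        value.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
                        if isinstance(value, datetime)
                        else value
                        for value in row
                    ]
                )
        yield tmpfilename
    finally:
        # Clean up even if the caller's block raised
        if os.path.exists(tmpfilename):
            os.remove(tmpfilename)


rows = [(datetime(2020, 9, 10, 12, 0, 0), 1.5, 2.5)]
with temporary_track_file(rows, header=["t", "x", "y"]) as name:
    with open(name, encoding="utf-8") as handle:
        print(handle.read())
```

The relative file name means the file lands in the current working directory rather than a system temporary directory.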
The code quality looks good. As you're the one who develops and uses these functions, we have to trust you. 😄
Just one suggestion: add a comment to the code explaining why you use unique_name here. That was my first question when I read your code, before I saw your comment here.
> The code quality looks good. As you're the one who develops and uses these functions, we have to trust you. 😄

It's all Paul's work done a decade ago, I'm just wrapping it in Python so more people can use it easily 😃 You won't believe how many 'crossover analysis' tools have been written again and again, but that's another story.

> Just one suggestion, add the comment to the codes, explaining why you use unique_name here. That's the first question when I read your codes before I see your comment here.

Ok, will do.
Bumps [pygmt](https://github.com/GenericMappingTools/pygmt) from 0.1.2-36-g4939ee2a to 0.2.0.
- [Release notes](https://github.com/GenericMappingTools/pygmt/releases)
- [Changelog](https://github.com/GenericMappingTools/pygmt/blob/master/doc/changes.rst)
- [Commits](GenericMappingTools/pygmt@v0.1.2-36-g4939ee2a...v0.2.0)

This includes several enhancements such as 'Sensible array outputs for pygmt info' (GenericMappingTools/pygmt#575) and 'Allow passing in pandas dataframes to x2sys_cross' (GenericMappingTools/pygmt#591) that will make our crossover analysis work and figure generation easier! Also edited the GitHub Actions workflow to only run the Docker build on Pull Requests when ready to review or when review is requested (i.e. not when the PR is in draft mode).
Description of proposed changes
Run crossover analysis directly on pandas.DataFrame inputs instead of having to write to tab-separated value (TSV) files first!

Example code:
This isn't a trivial thing to implement, because:
- x2sys requires those TSV files in quite a specific format (especially for the datetime columns).
- The temporary file can't be placed just anywhere, such as /tmp; it must be stored either in the current working directory or in specific locations listed in the TAG_paths.txt file.

Support for pandas DataFrame inputs into x2sys_cross was left out of the original implementation at #546 because we wanted to wait for GenericMappingTools/gmt#3717. But seeing as it's not a trivial matter, this is an interim solution.

Fixes #
Reminders
- Run make format and make check to make sure the code follows the style guide.
- ... doc/api/index.rst.